Story #8459

closed

As a user I want to reclaim disk space for a list of repositories

Added by ipanova@redhat.com over 1 year ago. Updated about 1 year ago.

Status: CLOSED - CURRENTRELEASE
Priority: Normal
Category: -
Sprint/Milestone:
Start date:
Due date:
% Done: 100%
Estimated time:
Platform Release:
Groomed: Yes
Sprint Candidate: Yes
Tags: GalaxyNG, Katello
Sprint: Sprint 102
Quarter:

Description

Problem Statement/Use cases:

  • As a user, I might have content that I no longer need to serve but want to keep in the repo for history purposes. I still want to be able to free up some disk space in that case.
  • As a user, I have repos with the on_demand download policy and I want to clear out the downloaded files for those repos. A repository with the on_demand download policy stores artifacts locally after they have been requested, but there is currently no way to have Pulp delete the locally stored files and free disk space when these packages are unlikely to be used again.

Solution

Provide a separate endpoint, pulp/api/v3/repositories/<reclaim-disk-space>/, which accepts a list of repo hrefs to reclaim disk space for.

  • Only Artifacts that belong exclusively to that list of repos will be removed. No repository versions will be created or removed, because the content set will not change.
  • A keeplist option will contain a list of repo_version hrefs to be excluded from Artifact removal.
  • By default, the task will only remove Artifacts that have a corresponding Remote Artifact (RA).
  • There will, however, be a force flag that also removes Artifacts that have no RA (i.e. uploaded content). This will not be the default behavior.
  • The task will trim Artifacts regardless of the download policy. (The content app will stream an Artifact if it is available locally; otherwise, if an RA is available, it will download the Artifact from the remote source.)
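
The selection rules above can be sketched as a small pure function. This is a minimal illustration under stated assumptions, not Pulp's implementation: the data model (plain dicts and sets of artifact ids), the function name artifacts_to_reclaim, and all of its parameters are hypothetical.

```python
from typing import Dict, Iterable, Set


def artifacts_to_reclaim(
    repo_versions: Dict[str, Dict[str, Set[str]]],
    repos_to_reclaim: Iterable[str],
    has_remote_artifact: Set[str],
    keeplist: Iterable[str] = (),
    force: bool = False,
) -> Set[str]:
    """Return the artifact ids whose local files could be deleted.

    repo_versions maps repo href -> {version href -> set of artifact ids}.
    All names are hypothetical; this only models the rules in the story.
    """
    targets = set(repos_to_reclaim)
    keep = set(keeplist)

    candidates: Set[str] = set()
    protected: Set[str] = set()
    for repo, versions in repo_versions.items():
        for version, artifacts in versions.items():
            if repo in targets and version not in keep:
                candidates |= artifacts
            else:
                # Used by a repo outside the list, or by a kept version.
                protected |= artifacts

    # Only artifacts exclusive to the requested repos are removable.
    removable = candidates - protected
    if not force:
        # Default: only drop artifacts that can be re-fetched through an RA.
        removable &= has_remote_artifact
    return removable


if __name__ == "__main__":
    versions = {
        "repoA": {"repoA/v1": {"a1", "a2", "a3"}, "repoA/v2": {"a2"}},
        "repoB": {"repoB/v1": {"a3"}},
    }
    # a3 is shared with repoB, so it is never a candidate here.
    print(sorted(artifacts_to_reclaim(versions, ["repoA"], {"a1", "a2"})))
```

In this model an Artifact shared with a repo outside the list, or referenced by a keeplisted version, is never removed, and uploaded content (no RA) is only removed when force is set.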

https://hackmd.io/e78_-T2rQRWvXaGjHT6Kyw


Related issues

Related to Pulp - Story #8324: As a user, I can have Pulp download content for a repository version (CLOSED - DUPLICATE)
Has duplicate Pulp - Story #5926: As a user, I can clear out downloaded files from on_demand repos (CLOSED - DUPLICATE)

Actions #1

Updated by ipanova@redhat.com over 1 year ago

  • Description updated (diff)
Actions #2

Updated by ipanova@redhat.com over 1 year ago

  • Description updated (diff)
Actions #4

Updated by ttereshc over 1 year ago

I suggest changing the language from "deferred download policy" to "on_demand download policy".

Other than that it looks good, and +1 to a keeplist or some kind of labeling to preserve repo versions of interest. If a user wants to reclaim disk space on a regular basis, it might not be convenient to provide the keeplist each time, so labeling repo versions in some way may be a better approach.

Actions #5

Updated by mdellweg over 1 year ago

This is kind of the anti-feature of the repoversion-repair. Maybe it's wise to give them similar APIs. I think that would mean the atom to work on would not be the repository but the repository_version. You'd probably need to pass a list of versions to get a meaningful feature here, but it would give you the keep_list functionality for free (just don't specify the ones to preserve).

Actions #6

Updated by ipanova@redhat.com over 1 year ago

ttereshc wrote:

I suggest changing the language from "deferred download policy" to "on_demand download policy".

Other than that it looks good, and +1 to a keeplist or some kind of labeling to preserve repo versions of interest. If a user wants to reclaim disk space on a regular basis, it might not be convenient to provide the keeplist each time, so labeling repo versions in some way may be a better approach.

To my knowledge, labeling is available only on repos and not on repo versions. I agree that labels are more user-friendly.

Actions #7

Updated by ipanova@redhat.com over 1 year ago

  • Description updated (diff)
Actions #8

Updated by ipanova@redhat.com over 1 year ago

  • Groomed changed from No to Yes
  • Sprint Candidate changed from No to Yes
Actions #9

Updated by newswangerd over 1 year ago

Will there be some other mechanism to do this, but for a specific set of content in a repository? We'd also like to be able to clean up a list of content in a repository without completely wiping out all content in previous versions of a repo.

Actions #10

Updated by newswangerd over 1 year ago

I guess we can accomplish this by removing the content from the latest version of the repo and then running the task to purge content on all versions except the latest.
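
The workaround in this comment amounts to reclaiming space for the repo while passing its latest version in the keeplist. A hedged sketch of building such a request body follows; the helper name build_reclaim_request, the field names repo_hrefs and repo_versions_keeplist, and the example hrefs are assumptions for illustration and are not specified in this story.

```python
import json
from typing import Dict, List


def build_reclaim_request(
    repo_href: str, latest_version_href: str
) -> Dict[str, List[str]]:
    """Build a request body that protects only the latest repo version.

    Field names are hypothetical; check the endpoint's actual schema.
    """
    return {
        # Repos whose exclusively owned artifacts may be removed.
        "repo_hrefs": [repo_href],
        # Versions whose artifacts must stay on disk.
        "repo_versions_keeplist": [latest_version_href],
    }


if __name__ == "__main__":
    body = build_reclaim_request(
        "/example/repositories/my-repo/",
        "/example/repositories/my-repo/versions/7/",
    )
    print(json.dumps(body, indent=2))
```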

Actions #11

Updated by ipanova@redhat.com over 1 year ago

  • Tags GalaxyNG added
Actions #12

Updated by ipanova@redhat.com over 1 year ago

  • Tags Katello added
Actions #13

Updated by ipanova@redhat.com over 1 year ago

  • Sprint/Milestone set to Content/disk space management
Actions #14

Updated by ttereshc over 1 year ago

Should #5926 be closed as a dupe of this one?

Actions #15

Updated by ipanova@redhat.com over 1 year ago

ttereshc wrote:

Should #5926 be closed as a dupe of this one?

Yes, this can be closed. Thanks for the reminder.

Actions #16

Updated by ttereshc over 1 year ago

  • Has duplicate Story #5926: As a user, I can clear out downloaded files from on_demand repos added
Actions #17

Updated by ipanova@redhat.com over 1 year ago

  • Description updated (diff)
Actions #18

Updated by dalley over 1 year ago

When we do this, we should possibly consider (or at least think about the use case for) doing the opposite: "download everything in this repository, even if it originally came from multiple remotes".

Actions #19

Updated by daviddavis over 1 year ago

dalley, I filed a feature request that I think is similar to your suggestion: https://pulp.plan.io/issues/8324

Actions #20

Updated by daviddavis over 1 year ago

  • Sprint/Milestone changed from Content/disk space management to 3.15.0
Actions #21

Updated by dalley over 1 year ago

  • Related to Story #8324: As a user, I can have Pulp download content for a repository version added
Actions #22

Updated by ipanova@redhat.com over 1 year ago

  • Status changed from NEW to ASSIGNED
  • Assignee set to ipanova@redhat.com
  • Sprint set to Sprint 99
Actions #23

Updated by pulpbot over 1 year ago

  • Status changed from ASSIGNED to POST
Actions #24

Updated by rchan about 1 year ago

  • Sprint changed from Sprint 99 to Sprint 100
Actions #25

Updated by rchan about 1 year ago

  • Sprint changed from Sprint 100 to Sprint 101
Actions #26

Updated by ipanova@redhat.com about 1 year ago

  • Sprint changed from Sprint 101 to Sprint 102

Added by ipanova@redhat.com about 1 year ago

Revision b1c9dac9

Reclaim disk space story.

closes #8459
Required PR: https://github.com/pulp/pulp_file/pull/546

Actions #27

Updated by ipanova@redhat.com about 1 year ago

  • Status changed from POST to MODIFIED
  • % Done changed from 0 to 100
Actions #28

Updated by pulpbot about 1 year ago

  • Status changed from MODIFIED to CLOSED - CURRENTRELEASE
