Story #8459
closed
As a user I want to reclaim disk space for a list of repositories
Status:
CLOSED - CURRENTRELEASE
Description
Problem Statement/Use cases:
- As a user I might have content that I no longer need to serve, but I want to keep it in the repo for history purposes. I still want to be able to free up some disk space in such a case.
- As a user I have repos that have the on_demand download policy and I want to clear out downloaded files for those repos. A repository with the on_demand download policy will store artifacts locally after they have been requested, but there currently isn't a way to have Pulp delete the locally stored files and free disk space if these packages are unlikely to be used again.
Solution
pulp/api/v3/repositories/<reclaim-disk-space>/
Provide a separate endpoint which will accept a list of repo hrefs to reclaim disk space for.
- Only Artifacts that belong exclusively to that list of repos will be removed. No repository versions will be created or removed because the content set will not change.
- A keeplist option will contain a list of repo_version hrefs which will be excluded from Artifact removal.
- The task will only remove those Artifacts that have a corresponding Remote Artifact (RA).
- There will, however, be a force flag that will also remove Artifacts that have no RA (i.e. uploaded content). This will not be the default behavior.
- The task will trim Artifacts regardless of the download policy. (The content app will stream an artifact if it is available locally; otherwise, if an RA is available, it will download the artifact from the remote source.)
https://hackmd.io/e78_-T2rQRWvXaGjHT6Kyw
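The selection rules above can be illustrated with a small in-memory model. This is only a sketch of the rules as described in this story, not Pulp's implementation: the `Artifact` class, the `artifacts_to_reclaim` function, and the tuple keys standing in for repo_version hrefs are all hypothetical names made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Artifact:
    """Hypothetical stand-in for a locally stored file."""
    name: str
    has_remote_artifact: bool  # True if an RA exists, so it can be re-downloaded


def artifacts_to_reclaim(repo_versions, reclaim_repos, keeplist=(), force=False):
    """Return the set of artifacts the task would remove under the story's rules.

    repo_versions: dict mapping (repo, version_number) -> set of Artifact
    reclaim_repos: repos whose disk space should be reclaimed
    keeplist:      (repo, version_number) keys excluded from removal
    force:         also remove artifacts without an RA (uploaded content)
    """
    reclaim_repos = set(reclaim_repos)
    keeplist = set(keeplist)
    candidates, protected = set(), set()
    for (repo, version), artifacts in repo_versions.items():
        if repo in reclaim_repos and (repo, version) not in keeplist:
            candidates |= artifacts
        else:
            # Used by a repo outside the list or by a kept version: must stay.
            protected |= artifacts
    removable = candidates - protected
    if not force:
        # Default behavior: only drop what can be fetched again through an RA.
        removable = {a for a in removable if a.has_remote_artifact}
    return removable


# Example data (assumed for illustration only).
a = Artifact("a.rpm", True)
b = Artifact("b.rpm", True)
c = Artifact("c.iso", False)   # uploaded, no RA
e = Artifact("e.rpm", True)    # shared with a repo that is not in the list

versions = {
    ("repo-a", 1): {a, b, c, e},
    ("repo-a", 2): {b},
    ("repo-b", 1): {e},
}

default_names = sorted(
    x.name for x in artifacts_to_reclaim(versions, ["repo-a"], keeplist=[("repo-a", 2)])
)
forced_names = sorted(
    x.name
    for x in artifacts_to_reclaim(versions, ["repo-a"], keeplist=[("repo-a", 2)], force=True)
)
print(default_names)
print(forced_names)
```

In this model, b.rpm survives because version 2 of repo-a is on the keeplist, e.rpm survives because repo-b was not in the list, and c.iso is only removed when force is set.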
- Description updated (diff)
- Description updated (diff)
I suggest changing the language from "deferred download policy" to "on_demand download policy".
Other than that it looks good and +1 to keeplist or some kind of labeling to preserve repo versions of interest.
If a user wants to reclaim disk space on a regular basis, it might not be convenient to provide the keeplist each time, so maybe labeling repo versions in some way would be a better approach.
This is kind of the anti-feature of the repoversion-repair.
Maybe it's wise to try to give them similar APIs. I think that would mean the atom to work on would not be the repository, but the repository_version. You'd probably need to pass a list of versions to get a meaningful feature here.
But it would give you the keep_list functionality for free (Just don't specify the ones to preserve).
ttereshc wrote:
I suggest changing the language from "deferred download policy" to "on_demand download policy".
Other than that it looks good and +1 to keeplist or some kind of labeling to preserve repo versions of interest.
If a user wants to reclaim disk space on a regular basis, it might not be convenient to provide the keeplist each time, so maybe labeling repo versions in some way would be a better approach.
To my knowledge, labeling is available only on repos and not on repo versions. I agree that labels are more user friendly.
- Description updated (diff)
- Groomed changed from No to Yes
- Sprint Candidate changed from No to Yes
Will there be some other mechanism to do this, but for a specific set of content in a repository? We'd also like to be able to clean up a list of content in a repository without completely wiping out all content in previous versions of a repo.
I guess we can accomplish this by removing the content from the latest version of the repo and then running the task to purge content on all versions except the latest.
- Sprint/Milestone set to Content/disk space management
Should this #5926 be closed as a dupe of this one?
ttereshc wrote:
Should this #5926 be closed as a dupe of this one?
Yes, this can be closed. Thanks for the reminder.
- Has duplicate Story #5926: As a user, I can clear out downloaded files from on_demand repos added
- Description updated (diff)
When we do this, we should consider (or at least think about) the opposite use case: "download everything in this repository, even if it came from multiple remotes originally".
- Sprint/Milestone changed from Content/disk space management to 3.15.0
- Related to Story #8324: As a user, I can have Pulp download content for a repository version added
- Status changed from NEW to ASSIGNED
- Assignee set to ipanova@redhat.com
- Sprint set to Sprint 99
- Status changed from ASSIGNED to POST
- Sprint changed from Sprint 99 to Sprint 100
- Sprint changed from Sprint 100 to Sprint 101
- Sprint changed from Sprint 101 to Sprint 102
- Status changed from POST to MODIFIED
- % Done changed from 0 to 100
- Status changed from MODIFIED to CLOSED - CURRENTRELEASE
Reclaim disk space story.
closes #8459 Required PR: https://github.com/pulp/pulp_file/pull/546