As a user, I can delete already-downloaded files for particular repo(s) that use the on_demand policy
There is a desire to be able to reduce disk usage by deleting the files of units associated with a particular repository (or a set of repositories) if the files aren't actually needed. Those files can be downloaded again at the next request.
Several points to take into consideration:
- File(s) of a content unit shouldn't be deleted if there is at least one repository with immediate policy which the content unit is associated with.
- The content unit should be marked as not downloaded again. Mark it as not downloaded first and only then actually delete the files, to avoid the race condition where a unit is marked as downloaded but its files are no longer on disk.
- Since only one copy of the file(s) for a particular content unit is stored on a disk, they will be deleted for all the repositories which the content unit is associated with.
- Even if a resource reservation is used for the repo of interest, multiple repos may be affected (see previous point ^). This may lead to race conditions:
  - between this deletion task and a sync of another repo which has just switched from on_demand to immediate policy. In this case the files of at least one unit may be deleted for a repo with immediate policy when they should not be.
  - between this deletion task and the periodic download_deferred task, which downloads recently requested files.
- A greedy use case to consider: delete all files for all on_demand repositories in Pulp.
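The points above can be sketched roughly as follows. This is only an in-memory illustration, not Pulp code: `Repo`, `Unit`, `purge_files` and the `storage` dict standing in for files on disk are all hypothetical names. It skips units that belong to any immediate-policy repo and clears the downloaded flag before removing the files. It does not address the race conditions listed above.

```python
from dataclasses import dataclass, field


@dataclass
class Repo:
    # Hypothetical stand-in for a repository record.
    repo_id: str
    policy: str  # "on_demand" or "immediate"
    unit_ids: set = field(default_factory=set)


@dataclass
class Unit:
    # Hypothetical stand-in for a content unit record.
    unit_id: str
    downloaded: bool = True


def purge_files(target_repo_ids, repos, units, storage):
    """Delete files of units in the target on_demand repos.

    repos:   dict of repo_id -> Repo
    units:   dict of unit_id -> Unit
    storage: dict of unit_id -> bytes, standing in for files on disk
    Returns the sorted list of unit ids whose files were removed.
    """
    # Units in at least one immediate-policy repo must keep their files.
    protected = set()
    for repo in repos.values():
        if repo.policy == "immediate":
            protected |= repo.unit_ids

    # Only on_demand repos among the targets contribute candidates.
    candidates = set()
    for repo_id in target_repo_ids:
        repo = repos[repo_id]
        if repo.policy == "on_demand":
            candidates |= repo.unit_ids

    purged = []
    for unit_id in sorted(candidates - protected):
        unit = units[unit_id]
        if not unit.downloaded:
            continue
        # Mark as not downloaded first, and only then delete the files.
        unit.downloaded = False
        storage.pop(unit_id, None)
        purged.append(unit_id)
    return purged
```

Because only one copy of a unit's files is kept, purging through one repo in this sketch also removes the files for every other on_demand repo that contains the same unit.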
#2 Updated by bmbouter over 2 years ago
There is still a separate race condition for a single repo. Maybe this is one of the race conditions described in the ticket, but those seemed to focus on multi-repo race conditions.
1. The user says to delete the content in a specific repo, say foo
2. The unit abc is marked as not downloaded
3. A client machine requests unit abc, which causes the unit to be marked as downloaded again <----- this is the race part
4. Unit abc's files are deleted
This would leave the unit abc in the 'downloaded' state, yet it's deleted from disk.
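To make the interleaving concrete, here is a small deterministic replay of steps 2 to 4. It is an illustration only: the `downloaded` and `storage` dicts and the function name are hypothetical stand-ins for the unit's database flag and its files on disk, and the steps run in the racy order by construction rather than through real concurrency.

```python
def simulate_single_repo_race():
    """Replay the racy ordering and return (downloaded_flag, file_on_disk)."""
    storage = {"abc": b"payload"}   # stands in for files on disk
    downloaded = {"abc": True}      # stands in for the unit's flag

    # Step 2: the deletion task marks the unit as not downloaded.
    downloaded["abc"] = False

    # Step 3: a client request arrives before the files are removed;
    # the content is fetched and the unit is marked as downloaded again.
    storage["abc"] = b"payload"
    downloaded["abc"] = True

    # Step 4: the deletion task resumes and deletes the files.
    del storage["abc"]

    return downloaded["abc"], "abc" in storage
```

The function returns a flag of `True` with no file present, which is the inconsistent state described above.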
I think before starting this story the race conditions need to be written out more fully. They are kind of written out, but kind of vague. Also, how will these race conditions be solved? I believe planning how to address each race condition ahead of time would be better than hoping the implementer addresses all race conditions perfectly.