Issue #3551
closed
RemoveOldRepodataStep for yum publisher not checking repomd.xml to remove old files
Description
The current code in the yum distributor publish step doesn't check the files listed in repomd.xml to decide which files should be kept and which should be deleted. Instead, it assumes that the latest one of each file type subject to removal is present and removes the files older than the threshold (default 14 days). Because mtime can change when someone touches a file or moves it on the file system, this may lead to deletion of the wrong files.
related code:
for key, val in to_remove.iteritems():
    # preserve at least one file of each kind - pop out latest
    if not set(groupped[key]) - set(val):
        val.pop(0)
    for f in val:
        self.remove_repodata_file(f[0])
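To illustrate what "checking repomd.xml" could look like, here is a minimal sketch that collects the file names a repomd.xml document references through its `location` elements. It is not the Pulp code; the function name `referenced_repodata_files` is hypothetical, and the sketch assumes the standard `http://linux.duke.edu/metadata/repo` namespace used by yum repository metadata.

```python
import posixpath
import xml.etree.ElementTree as ET

# Namespace used by repomd.xml in yum repositories.
REPO_NS = "{http://linux.duke.edu/metadata/repo}"


def referenced_repodata_files(repomd_xml):
    """Return the base names of all files referenced by a repomd.xml string.

    Hypothetical helper for illustration only; not part of pulp_rpm.
    """
    root = ET.fromstring(repomd_xml)
    names = set()
    for location in root.iter(REPO_NS + "location"):
        href = location.get("href")
        if href:
            # href values look like "repodata/<checksum>-primary.xml.gz"
            names.add(posixpath.basename(href))
    return names
```

Any file in the repodata directory whose name is in the returned set is still part of the published metadata, regardless of its mtime.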
Updated by bmbouter over 6 years ago
- Triaged changed from No to Yes
I think in order to do this more accurately we would need to store a record of the published data. Where would we store this type of data?
The rpm devs are really Pulp3 focused right now, and Pulp2 is nearing its maintenance phase. For Pulp2 we could work with you to help prepare a fix. Is there any possibility you would be able to contribute an improvement in this area?
Updated by mtahir over 6 years ago
I have created a fix[1] and pushed it for review. With regard to storing a record of the published data, I created the fix on the assumption that any file that is not in repomd.xml and is older than the threshold should be removed.
[1] https://github.com/mztahir/pulp_rpm/commit/0b1c9edaa853b28bdead8af6e5ed36686aa6caa6
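The rule described in this comment (remove a file only if it is not referenced by repomd.xml and is older than the threshold) can be sketched as follows. This is an illustration, not the code from the linked commit: the function name `files_to_remove` and its parameters are hypothetical, the 14-day default comes from the issue description, and the sketch assumes repomd.xml itself must always be kept.

```python
import os
import time

DEFAULT_THRESHOLD = 14 * 24 * 60 * 60  # 14 days, in seconds


def files_to_remove(repodata_dir, referenced, threshold=DEFAULT_THRESHOLD, now=None):
    """Return paths in repodata_dir that are unreferenced and older than threshold.

    `referenced` is the set of file names listed in repomd.xml.
    Hypothetical helper for illustration only; not part of pulp_rpm.
    """
    now = time.time() if now is None else now
    stale = []
    for name in sorted(os.listdir(repodata_dir)):
        # Assumption: repomd.xml is never listed in itself and must be kept.
        if name == "repomd.xml" or name in referenced:
            continue
        path = os.path.join(repodata_dir, name)
        if not os.path.isfile(path):
            continue
        if now - os.path.getmtime(path) > threshold:
            stale.append(path)
    return stale
```

With this rule, a referenced file survives even if its mtime is old, and a recently touched unreferenced file is only kept until it ages past the threshold.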
Updated by ipanova@redhat.com over 6 years ago
@mtahir, can you actually submit this as a PR to upstream pulp_rpm? thank you.
Updated by bmcivor over 6 years ago
@ipanova, PR has been created on github for this issue. If there's any more information you need, feel free to ask.
Updated by ipanova@redhat.com over 6 years ago
- Status changed from NEW to POST
Updated by ipanova@redhat.com over 6 years ago
- Related to Story #2788: As a user i can configure removal of old published repodata added
Added by mtahir over 6 years ago
Updated by mtahir over 6 years ago
- Status changed from POST to MODIFIED
Applied in changeset 3774e3870e2ec1b485ee5cd1f49e98d7c696399a.
Updated by ipanova@redhat.com over 6 years ago
Updated by dkliban@redhat.com about 6 years ago
- Status changed from MODIFIED to CLOSED - CURRENTRELEASE
- Platform Release set to 2.17.0
Updated by mihai.ibanescu@gmail.com about 5 years ago
- Related to Issue #5573: Publish won't create multiple checksummed copies of primary.xml, filelists.xml etc even when in fast-forward mode added
Fix RemoveOldRepodataStep for yum publisher
Previously, RemoveOldRepodataStep didn't actually check the files listed in repomd.xml to find the files that should be deleted. Instead, it assumed that the latest one of each file type (to be removed) was present, and removed the files older than the threshold.
fixes #3551 https://pulp.plan.io/issues/3551
Co-authored-by: Blake McIvor <bmcivor@redhat.com>