Issue #3551

RemoveOldRepodataStep for yum publisher not checking repomd.xml to remove old files

Added by mtahir over 6 years ago. Updated over 5 years ago.

Status: CLOSED - CURRENTRELEASE
Priority: Normal
Assignee: -
Sprint/Milestone: -
Start date:
Due date:
Estimated time:
Severity: 3. High
Version: Master
Platform Release: 2.17.0
OS:
Triaged: Yes
Groomed: No
Sprint Candidate: No
Tags: Pulp 2
Sprint:
Quarter:

Description

The current yum distributor publish code does not check the files listed in repomd.xml to decide which files should be kept and which should be deleted. Instead, it assumes that the latest file of each type (among those to be removed) is present, and it removes the files older than the threshold (default 14 days). Because mtime can change when someone touches a file or moves it on the file system, this may lead to the wrong files being deleted.

Related code:

        for key, val in to_remove.iteritems():
            # preserve at least one file of each kind - pop out latest
            if not set(groupped[key]) - set(val):
                val.pop(0)
            for f in val:
                self.remove_repodata_file(f[0])
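
For illustration only, here is a minimal sketch of how the set of files listed in repomd.xml could be collected, which is the check the excerpt above skips. It is not the pulp_rpm implementation: the function name `referenced_files` is hypothetical, and the sketch is written for Python 3 with only the standard library.

```python
import os
import xml.etree.ElementTree as ET

# XML namespace used by the elements of repomd.xml.
REPO_NS = "{http://linux.duke.edu/metadata/repo}"


def referenced_files(repomd_xml_text):
    """Return the basenames of all files listed in a repomd.xml document.

    Hypothetical helper: it reads the href of every <location> element,
    e.g. "repodata/<checksum>-primary.xml.gz", and keeps the file name.
    """
    root = ET.fromstring(repomd_xml_text)
    names = set()
    for location in root.iter(REPO_NS + "location"):
        href = location.get("href")
        if href:
            names.add(os.path.basename(href))
    return names
```

Any file in the repodata directory whose name is in this set is still in use by the published repository, regardless of its mtime.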

Related issues

Related to RPM Support - Story #2788: As a user i can configure removal of old published repodata (CLOSED - CURRENTRELEASE, jluza)

Related to Pulp - Issue #5573: Publish won't create multiple checksummed copies of primary.xml, filelists.xml etc even when in fast-forward mode (CLOSED - CURRENTRELEASE)
Actions #1

Updated by bmbouter over 6 years ago

  • Triaged changed from No to Yes

I think in order to do this more accurately we would need to store a record of the published data. Where would we store this type of data?

The RPM devs are heavily Pulp3-focused right now, and Pulp2 is nearing its maintenance phase. For Pulp2, we could work with you to help prepare a fix. Is there any possibility you would be able to contribute an improvement in this area?

Actions #2

Updated by mtahir over 6 years ago

I have created a fix [1] and pushed it for review. With regard to storing a record of the published data, I created the fix on the assumption that any file that is not listed in repomd.xml and is older than the threshold should be removed.

[1] https://github.com/mztahir/pulp_rpm/commit/0b1c9edaa853b28bdead8af6e5ed36686aa6caa6
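
The rule described in this comment can be sketched roughly as follows. This is an illustrative Python 3 sketch, not the code from the linked commit: the function name `files_to_remove` and its parameters are hypothetical, and the set of names listed in repomd.xml is assumed to be computed elsewhere and passed in as `referenced`.

```python
import os
import time


def files_to_remove(repodata_dir, referenced, threshold_seconds, now=None):
    """Return paths in repodata_dir that are unreferenced and past the threshold.

    referenced: set of file names listed in repomd.xml (assumed input).
    threshold_seconds: minimum age, by mtime, before a file may be removed.
    """
    now = time.time() if now is None else now
    stale = []
    for name in sorted(os.listdir(repodata_dir)):
        path = os.path.join(repodata_dir, name)
        # Never remove the index itself or anything it still lists.
        if name == "repomd.xml" or name in referenced:
            continue
        if not os.path.isfile(path):
            continue
        # Only unreferenced files are subject to the age check.
        if now - os.path.getmtime(path) > threshold_seconds:
            stale.append(path)
    return stale
```

With this rule, mtime only decides when an unreferenced file is removed. A file that repomd.xml still lists is kept even if its mtime makes it look old.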

Actions #3

Updated by ipanova@redhat.com over 6 years ago

@mtahir, can you submit this as a PR to upstream pulp_rpm? Thank you.

Actions #4

Updated by bmcivor over 6 years ago

@ipanova, a PR has been created on GitHub for this issue. If you need any more information, feel free to ask.

Actions #5

Updated by ipanova@redhat.com over 6 years ago

  • Status changed from NEW to POST
Actions #6

Updated by ipanova@redhat.com over 6 years ago

  • Related to Story #2788: As a user i can configure removal of old published repodata added

Added by mtahir over 6 years ago

Revision 3774e387

Fix RemoveOldRepodataStep for yum publisher

Previously, RemoveOldRepodataStep did not check the files listed in repomd.xml to find the files that should be deleted. Instead, it assumed that the latest file of each type (among those to be removed) was present, and it removed the files older than the threshold.

fixes #3551 https://pulp.plan.io/issues/3551

Co-authored-by: Blake McIvor

Actions #8

Updated by mtahir over 6 years ago

  • Status changed from POST to MODIFIED
Actions #10

Updated by dkliban@redhat.com about 6 years ago

  • Status changed from MODIFIED to CLOSED - CURRENTRELEASE
  • Platform Release set to 2.17.0
Actions #11

Updated by bmbouter over 5 years ago

  • Tags Pulp 2 added
Actions #12

Updated by mihai.ibanescu@gmail.com about 5 years ago

  • Related to Issue #5573: Publish won't create multiple checksummed copies of primary.xml, filelists.xml etc even when in fast-forward mode added