Task #3954
closed
Prevent duplicate Package content in repos
Status:
CLOSED - CURRENTRELEASE
Description
Packages are unique in a repo by NEVRA (name, epoch, version, release, arch). When implemented, this will keep the most recently added package for each NEVRA and remove any other packages in the repo that share that NEVRA. Packages with the same NEVRA can have different checksums.
[{
'model': Package,
'field_names': ['name', 'epoch', 'version', 'release', 'arch']
}]
Use a RemoveDuplicates stage provided by pulpcore-plugin.
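The intended behavior can be illustrated with a minimal, framework-free sketch. It does not use the pulpcore-plugin API: the `Package` namedtuple, the `sha256` field, and the `remove_duplicates` function are hypothetical names chosen for illustration, and the input list is assumed to be ordered from oldest to most recently added.

```python
from collections import namedtuple

# Hypothetical stand-in for the real Package model; only the fields needed here.
Package = namedtuple(
    "Package", ["name", "epoch", "version", "release", "arch", "sha256"]
)

# The uniqueness key from the description above.
NEVRA_FIELDS = ("name", "epoch", "version", "release", "arch")


def remove_duplicates(packages, field_names=NEVRA_FIELDS):
    """Keep only the most recently added package for each NEVRA.

    Assumes `packages` is ordered from oldest to most recently added.
    """
    latest = {}
    for pkg in packages:
        key = tuple(getattr(pkg, field) for field in field_names)
        # Drop any older entry first so the newer package also takes
        # the newer position in the resulting order.
        latest.pop(key, None)
        latest[key] = pkg
    return list(latest.values())


# Two packages share a NEVRA but differ in checksum; the later one wins.
pkg_a = Package("foo", "0", "1.0", "1", "x86_64", "aaa")
pkg_c = Package("bar", "0", "2.0", "1", "noarch", "ccc")
pkg_b = Package("foo", "0", "1.0", "1", "x86_64", "bbb")
kept = remove_duplicates([pkg_a, pkg_c, pkg_b])
```

Here `kept` contains `pkg_c` and `pkg_b`; `pkg_a` is dropped because `pkg_b` has the same NEVRA and was added later.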
- Blocked by Story #3934: As a plugin writer, I can have a stage that removes duplicates added
- Description updated (diff)
- Description updated (diff)
- Sprint/Milestone deleted (Pulp 3 RPM MVP)
- Subject changed from Remove duplicate RPM and Erratum content from repos after sync to Prevent duplicate RPM and Erratum content in repos
This isn't specific to sync. We also need to handle the case where content is associated with a repo by other means (i.e., via upload or copy).
- Subject changed from Prevent duplicate RPM and Erratum content in repos to Prevent duplicate Package and Erratum content in repos
- Description updated (diff)
- Description updated (diff)
- Subject changed from Prevent duplicate Package and Erratum content in repos to Prevent duplicate Package content in repos
- Description updated (diff)
- Groomed changed from No to Yes
- Status changed from NEW to POST
- Assignee set to ttereshc
- Sprint set to Sprint 46
- Related to Test #4297: Prevent duplicate Package content in RPM repos added
- Status changed from POST to MODIFIED
- % Done changed from 0 to 100
@Daviddavis
For testing this, do we need a fixture that has a duplicate file with the same NEVRA? If so, how do we ensure that the most recently added package is the one that is present?
Or is this about syncing a repository and then trying to upload the same package using single-request upload? If so, is https://pulp.plan.io/issues/4536 related?
Please let me know your thoughts on this.
ragbalak, a fixture couldn't contain two packages with the same NEVRA: the filenames in the repo are based on NEVRA, so the two packages couldn't both be served.
What I would probably do for testing is use two different RPMs with the same NEVRA but different checksums. I would upload one (package A) and then sync down the second (package B). You should then see both in Pulp, and you can check which one is associated with your repo. It should be package B and only package B.
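The test scenario described above could be sketched roughly as follows. This is only a toy in-memory model, not the Pulp API or the pulp-smash test helpers: `ToyPulp`, `upload`, and `sync` are hypothetical names, and the NEVRA and checksum values are made up for illustration.

```python
class ToyPulp:
    """Hypothetical in-memory stand-in for a Pulp instance with one repo."""

    def __init__(self):
        # Every (nevra, checksum) content unit Pulp knows about.
        self.all_content = set()
        # NEVRA -> checksum of the unit currently associated with the repo.
        self.repo = {}

    def _associate(self, nevra, checksum):
        self.all_content.add((nevra, checksum))
        # A later unit with the same NEVRA replaces the earlier one in the repo.
        self.repo[nevra] = checksum

    def upload(self, nevra, checksum):
        self._associate(nevra, checksum)

    def sync(self, remote_packages):
        for nevra, checksum in remote_packages:
            self._associate(nevra, checksum)


nevra = ("foo", "0", "1.0", "1", "x86_64")
pulp = ToyPulp()
pulp.upload(nevra, "checksum-a")       # package A, uploaded first
pulp.sync([(nevra, "checksum-b")])     # package B, synced afterwards

# Both units exist in Pulp, but only package B is in the repo.
assert (nevra, "checksum-a") in pulp.all_content
assert (nevra, "checksum-b") in pulp.all_content
assert pulp.repo == {nevra: "checksum-b"}
```

A real test would replace the toy model with actual upload and sync calls against a running Pulp and compare the content of the latest repository version.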
Awesome. Thanks @Daviddavis
- Status changed from MODIFIED to CLOSED - CURRENTRELEASE
Remove RPM duplicates from a repo version
closes #3954 https://pulp.plan.io/issues/3954