Task #3954

Prevent duplicate Package content in repos

Added by daviddavis about 1 year ago. Updated 7 months ago.

Status: MODIFIED
Priority: Normal
Assignee:
Category: -
Sprint/Milestone: -
Start date:
Due date:
% Done: 100%
Platform Release:
Blocks Release:
Backwards Incompatible: No
Groomed: Yes
Sprint Candidate: No
Tags:
QA Contact:
Complexity:
Smash Test:
Verified: No
Verification Required: No
Sprint: Sprint 46

Description

Packages in a repo are unique by NEVRA (name, epoch, version, release, arch). When implemented, the most recently added package with a given NEVRA will be kept and any other packages with the same NEVRA will be removed from the repo. Packages with the same NEVRA can have different checksums.

[{
    'model': Package,
    'field_names': ['name', 'epoch', 'version', 'release', 'arch']
}]

Use a RemoveDuplicates stage provided by pulpcore-plugin.
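
The intended behavior can be illustrated with a minimal, self-contained sketch. This is not the pulpcore-plugin stage itself: `PackageRecord`, `nevra`, and `remove_duplicates` are hypothetical names used only for illustration, and the sketch assumes the packages are given in the order they were added to the repo.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple

# Same field names as in the description above.
NEVRA_FIELDS = ("name", "epoch", "version", "release", "arch")


@dataclass(frozen=True)
class PackageRecord:
    """Hypothetical stand-in for a Package; not the real Pulp model."""
    name: str
    epoch: str
    version: str
    release: str
    arch: str
    sha256: str


def nevra(pkg: PackageRecord) -> Tuple[str, ...]:
    """Return the uniqueness key for a package."""
    return tuple(getattr(pkg, field) for field in NEVRA_FIELDS)


def remove_duplicates(packages_in_add_order: Iterable[PackageRecord]) -> List[PackageRecord]:
    """Keep only the most recently added package for each NEVRA."""
    latest: Dict[Tuple[str, ...], PackageRecord] = {}
    for pkg in packages_in_add_order:
        key = nevra(pkg)
        # Drop any older package with the same NEVRA, then record the newer one.
        latest.pop(key, None)
        latest[key] = pkg
    return list(latest.values())


older = PackageRecord("foo", "0", "1.0", "1", "x86_64", sha256="aaa")
unrelated = PackageRecord("bar", "0", "2.0", "1", "noarch", sha256="ccc")
newer = PackageRecord("foo", "0", "1.0", "1", "x86_64", sha256="bbb")

result = remove_duplicates([older, unrelated, newer])
```

Here `older` and `newer` share a NEVRA but differ in checksum, so only `newer` remains alongside `unrelated`.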


Related issues

Related to RPM Support - Test #4297: Prevent duplicate Package content in RPM repos CLOSED - COMPLETE
Blocked by Pulp - Story #3934: As a plugin writer, I can have a stage that removes duplicates MODIFIED

Associated revisions

Revision 94132db1
Added by ttereshc 11 months ago

Remove RPM duplicates from a repo version

closes #3954
https://pulp.plan.io/issues/3954

History

#1 Updated by daviddavis about 1 year ago

  • Blocked by Story #3934: As a plugin writer, I can have a stage that removes duplicates added

#2 Updated by daviddavis about 1 year ago

  • Description updated (diff)

#3 Updated by daviddavis about 1 year ago

  • Description updated (diff)

#4 Updated by daviddavis about 1 year ago

  • Sprint/Milestone deleted (Pulp 3 RPM MVP)

#5 Updated by daviddavis about 1 year ago

  • Subject changed from Remove duplicate RPM and Erratum content from repos after sync to Prevent duplicate RPM and Erratum content in repos

This isn't specific to sync. We also need to handle the case where content is associated with a repo by other means (i.e., via upload or copy).

#6 Updated by bmbouter 12 months ago

  • Subject changed from Prevent duplicate RPM and Erratum content in repos to Prevent duplicate Package and Erratum content in repos
  • Description updated (diff)

#7 Updated by bmbouter 12 months ago

  • Description updated (diff)

#8 Updated by ttereshc 11 months ago

  • Subject changed from Prevent duplicate Package and Erratum content in repos to Prevent duplicate Package content in repos
  • Description updated (diff)
  • Groomed changed from No to Yes

#9 Updated by ttereshc 11 months ago

  • Status changed from NEW to POST
  • Assignee set to ttereshc
  • Sprint set to Sprint 46

#10 Updated by bherring 11 months ago

  • Related to Test #4297: Prevent duplicate Package content in RPM repos added

#11 Updated by ttereshc 10 months ago

  • Status changed from POST to MODIFIED
  • % Done changed from 0 to 100

#12 Updated by ragbalak 7 months ago

@Daviddavis

For testing this, do we need a fixture that has a duplicate file with the same NEVRA? If so, how do we ensure that the most recently added package is the one that is present?

Or is this about syncing a repository and then trying to upload the same package using single-request upload? If so, is this related to https://pulp.plan.io/issues/4536?

Please let me know your thoughts on this.

#13 Updated by daviddavis 7 months ago

@ragbalak, a fixture couldn't contain two packages with the same NEVRA because the filenames in the repo are based on NEVRA, so the two packages couldn't both be served.

For testing, I would probably use two different RPMs with the same NEVRA but different checksums. I would upload one (package A) and then sync down the second (package B). You should then see both in Pulp, and you could check which one is associated with your repo. It should be package B and only package B.
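
The final check in that scenario could look roughly like the sketch below. It is only an illustration under stated assumptions: the repo content is modeled as a plain list of dicts rather than fetched from Pulp, and `packages_with_same_nevra`, the package values, and the `sha256` key are all hypothetical.

```python
NEVRA_KEYS = ("name", "epoch", "version", "release", "arch")


def packages_with_same_nevra(repo_content, package):
    """Return every entry in repo_content whose NEVRA matches package."""
    wanted = tuple(package[key] for key in NEVRA_KEYS)
    return [
        entry for entry in repo_content
        if tuple(entry[key] for key in NEVRA_KEYS) == wanted
    ]


# Package A (uploaded first) and package B (synced second): same NEVRA,
# different checksums. The values are made up for illustration.
package_a = {"name": "foo", "epoch": "0", "version": "1.0",
             "release": "1", "arch": "x86_64", "sha256": "aaa"}
package_b = dict(package_a, sha256="bbb")
other = {"name": "bar", "epoch": "0", "version": "2.0",
         "release": "1", "arch": "noarch", "sha256": "ccc"}

# What the repo is expected to contain after the upload and the sync.
repo_content = [other, package_b]

matches = packages_with_same_nevra(repo_content, package_a)
# Package B, and only package B, should be associated with the repo.
assert [entry["sha256"] for entry in matches] == ["bbb"]
```

In a real test, `repo_content` would come from the repo's latest version, while both A and B would still be present in Pulp as content.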

#14 Updated by ragbalak 7 months ago

Awesome. Thanks @Daviddavis

#15 Updated by bmbouter 7 months ago

  • Tags deleted (Pulp 3)
