Story #5008

Updated by ttereshc over 5 years ago

The RemoveDuplicates stage makes it possible to enforce uniqueness constraints on content at sync time. 
 However, duplicates can appear whenever content is added to a repository, not only at sync time. 
 For example, content may have been uploaded, or synced as part of another repo, and is then added or copied to a new repository. 

 It would be good if some check/validation happened for any added content, e.g. "at this stage":https://github.com/pulp/pulpcore/blob/aef490e201f89fc005ba3239fda3a79c05e28fd7/pulpcore/app/models/repository.py#L343 

 Examples for where uniqueness might be needed in a repo version: 
  - only one content unit with certain characteristics should be present in a repo (only one advisory with a given id, only one module_defaults for a module, etc.) 
  - relative paths, if they are part of a content unit, should not collide (for pulp file, paths 'a' and 'a/b' should not be in the same repo version) 
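
 The relative path case can be illustrated with a minimal sketch. It assumes that a "collision" means two paths are equal, or that one path is a directory prefix of the other; the function names @paths_collide@ and @find_collisions@ are hypothetical and do not exist in pulpcore.

```python
from itertools import combinations
from pathlib import PurePosixPath


def paths_collide(a: str, b: str) -> bool:
    """Return True if the paths are equal or one is a directory prefix of the other."""
    parts_a = PurePosixPath(a).parts
    parts_b = PurePosixPath(b).parts
    shorter = min(len(parts_a), len(parts_b))
    # Compare whole path components, so 'a' collides with 'a/b' but not with 'ab'.
    return parts_a[:shorter] == parts_b[:shorter]


def find_collisions(paths):
    """Return every colliding pair among the given relative paths."""
    return [
        (a, b)
        for a, b in combinations(sorted(paths), 2)
        if paths_collide(a, b)
    ]


if __name__ == "__main__":
    print(find_collisions(["a", "a/b", "c"]))
```

 Comparing path components rather than raw string prefixes avoids flagging unrelated names such as 'a' and 'ab'.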

 This issue was discussed on this thread: https://www.redhat.com/archives/pulp-dev/2019-May/msg00061.html 

 h3. Solution 

 Plugins: 
 On the plugin content model, define a @repo_key@: one or more fields which must be unique within a repo version. 

 Pulpcore: 
 Check the uniqueness of the @repo_key@ at repository version creation time: https://github.com/pulp/pulpcore/blob/aef490e201f89fc005ba3239fda3a79c05e28fd7/pulpcore/app/models/repository.py#L343 
 Whether content is added by sync, copy, or upload, @repo_key@ uniqueness will be ensured as long as core/plugin developers use the @with RepositoryVersion.create(...)@ context manager.
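
 The proposal could be sketched roughly as follows, in plain Python without Django. This is not pulpcore code: the @Advisory@ class, @find_repo_key_duplicates@, and the @new_version@ context manager (standing in for @RepositoryVersion.create(...)@) are hypothetical names, and raising an error on a collision is an assumption; the real implementation might instead remove the older duplicate.

```python
from collections import defaultdict
from contextlib import contextmanager
from dataclasses import dataclass


@dataclass(frozen=True)
class Advisory:
    """Hypothetical plugin content model."""

    # Fields that must be unique within a repo version (the proposed repo_key).
    # Unannotated, so it is a class attribute and not a dataclass field.
    repo_key = ("advisory_id",)

    advisory_id: str
    revision: int


def find_repo_key_duplicates(content):
    """Group content units by (type, repo_key values); return groups with more than one unit."""
    seen = defaultdict(list)
    for unit in content:
        key = (type(unit).__name__, tuple(getattr(unit, f) for f in unit.repo_key))
        seen[key].append(unit)
    return {key: units for key, units in seen.items() if len(units) > 1}


@contextmanager
def new_version(existing_content):
    """Stand-in for a version-creating context manager; validates on normal exit."""
    staged = list(existing_content)
    yield staged
    duplicates = find_repo_key_duplicates(staged)
    if duplicates:
        raise ValueError(f"repo_key collision: {list(duplicates)}")


if __name__ == "__main__":
    with new_version([Advisory("RHSA-1", 1)]) as version:
        version.append(Advisory("RHSA-2", 1))  # distinct id, accepted
    print(len(version))
```

 Because the check runs when the context manager exits, it applies the same way regardless of whether the content arrived through sync, copy, or upload.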
