Project

Profile

Help

Story #4295

As a user, a repository version has no advisories with the same id

Added by ttereshc 11 months ago. Updated 20 days ago.

Status:
MODIFIED
Priority:
Normal
Assignee:
Category:
-
Sprint/Milestone:
-
Start date:
Due date:
% Done:

100%

Platform Release:
Blocks Release:
Backwards Incompatible:
No
Groomed:
No
Sprint Candidate:
Yes
Tags:
Pulp 3 RPM blocker
QA Contact:
Complexity:
Smash Test:
Verified:
No
Verification Required:
No
Sprint:

Description

Motivation

Currently it is possible to have multiple Advisories(UpdateRecord content units) with the same id in one repo version. Those content units are not full duplicates, they have the same id but different content. That leads to a creation of a bad repository at a publication time (id of an advisory should be unique per yum repo, so Pulp needs to publish only one advisory per id).

Possible cases

At sync time in additive mode:
  1. Repo version 1 contains an updateA, in repo version 2 a newer version of the updateA is being added
  2. Repo version 1 contains an updateA, in repo version 2 an older version of the updateA is being added
  3. Repo version 1 contains an updateA, in repo version 2 an alternative version of the updateA (e.g. for different distribution) is being added

In case of a mirror mode for sync, always the incoming version of the updateA is taken, regardless of any criteria.

Suggested solution

For any addition of content to an RPM repo version ensure that there is no more than one UpdateRecord with the same id.

Decide which UpdateRecord to keep based on the criteria defined below.

Criteria

In case of mirror sync, just pick the incoming UpdateRecord.

In all other cases:

  • updated_dates are the same, pkglist intersection is empty (e.g. base repo merged with debuginfo repo)
    -> new UpdateRecord content unit with combined pkglist is created and added to a repo, old UpdateRecord is removed form a repo.
  • updated_dates differ, pkglist intersection is non-empty (update/re-sync/upload-new case) -> UpdateRecord with newer updated_date should be in a repo.
  • updated_dates differ, pkglist intersection is empty - ERROR CONDITION! (base and -debuginfo buit repos are from different versions, not at same date)
    • tell to go make sure that the merging repos are up-to-date, and then retry
  • update_dates are the same, pkglist intersection is non-empty and not equal to either pkglist - ERROR CONDITION!
    • never-happen case - "something is Terribly Wrong Here"

A relevant functional test which is currently being skipped.


Checklist


Related issues

Duplicated by RPM Support - Story #5084: As a user, after copy or repo version creation there are no advisories with the same id CLOSED - DUPLICATE Actions

Associated revisions

Revision 8e1c00d2 View on GitHub
Added by ttereshc 20 days ago

Identify previous repo version to use in finalization steps

re #4295
https://pulp.plan.io/issues/4295

Revision 1d507db4 View on GitHub
Added by ttereshc 20 days ago

Extend finalize_new_version to resolve advisory duplicates

Also move createrepo_c object generation for advisory to corresponding
models. Add ability to generate createrpeo_c object with additional
collecitons.

closes #4295
https://pulp.plan.io/issues/4295

History

#1 Updated by ttereshc 11 months ago

  • Description updated (diff)

#2 Updated by ttereshc 11 months ago

  • Subject changed from Remove UpdateRecord duplicates based on the `updated` timestamp to Remove UpdateRecord duplicates based on the `updated_date`
  • Description updated (diff)

#3 Updated by bmbouter 11 months ago

The technical proposal confuses me a bit, but I probably don't understand it. If the original duplicate is already associated with the repo version the pipeline will only handle the newly mutated content unit which will be a distinct content unit. So the only one you could "not pass along" would be the new one.

In thinking a bit about this, it seems the RemoveDuplicates stage isn't able to provide a "comparison based" removal. A Q() object is effectively built for each content unit, one-by-one.

In considering how we could generalize RemoveDuplicates a bit to allow the plugin writer to have more control over the Q() object there are a few options, but at that point RemoveDuplicates would have almost no code in it. So that gets met thinking that this issue should just make a brand-new custom stage for handling this that just builds the Q() object that RPM needs and use that in the custom pipeline for RPM.

#4 Updated by ttereshc 11 months ago

  • Description updated (diff)

If the original duplicate is already associated with the repo version the pipeline will only handle the newly mutated content unit which will be a distinct content unit. So the only one you could "not pass along" would be the new one.

Yes, that sounds right to me. You "pass along" the mutated content unit if you want to have it in a new repo version instead of the original one.
You "don't pass along" the mutated content unit if you want to keep the original one.

So that gets met thinking that this issue should just make a brand-new custom stage for handling this that just builds the Q() object that RPM needs and use that in the custom pipeline for RPM.

+1 to a new custom stage

I updated the description.

#5 Updated by bmbouter 11 months ago

  • Description updated (diff)

The description looks right. I touched it up a bit for clarity. I also added the part where the RemoveDuplicates stage was still planned to be used.

Maybe see the diff if it's right?

#6 Updated by bmbouter 11 months ago

  • Description updated (diff)

After some IRC discussion we are making the new stage pass or not pass along UpdateRecords so that all unassociating can happen in one place in the pipeline, i.e. RemoveDuplicates.

#7 Updated by daviddavis 11 months ago

  • Groomed changed from No to Yes
  • Sprint Candidate changed from No to Yes

#8 Updated by ttereshc 11 months ago

  • Checklist item Create a new stage ErratumContentUnitFilter added
  • Checklist item Add/enable RemoveDuplicates stage for UpdateRecord content added
  • Checklist item Unskip and adjust the relevant functional test added
  • Description updated (diff)

#9 Updated by ttereshc 9 months ago

  • Sprint set to Sprint 50

#10 Updated by rchan 8 months ago

  • Sprint changed from Sprint 50 to Sprint 51

#11 Updated by daviddavis 8 months ago

  • Sprint changed from Sprint 51 to Sprint 52

#12 Updated by bmbouter 7 months ago

  • Tags deleted (Pulp 3)

#13 Updated by rchan 7 months ago

  • Sprint changed from Sprint 52 to Sprint 53

#14 Updated by amacdona@redhat.com 6 months ago

  • Sprint changed from Sprint 53 to Sprint 54

#15 Updated by ttereshc 6 months ago

  • Sprint changed from Sprint 54 to Sprint 55

#16 Updated by ttereshc 5 months ago

  • Checklist item deleted (Create a new stage ErratumContentUnitFilter)
  • Checklist item deleted (Add/enable RemoveDuplicates stage for UpdateRecord content)
  • Checklist item AdvisoryContentUnitMerger stage for the sync case which applies the described criteria added
  • Checklist item Rename ErratumContentUnitSaver to AdvisoryContentUnitSaver added
  • Subject changed from Remove UpdateRecord duplicates based on the `updated_date` to As a user, after sync a repository version has no advisories with the same id
  • Description updated (diff)
  • Groomed changed from Yes to No

#17 Updated by ttereshc 5 months ago

  • Description updated (diff)

#18 Updated by dkliban@redhat.com 5 months ago

  • Sprint changed from Sprint 55 to Sprint 56

#19 Updated by rchan 4 months ago

  • Sprint changed from Sprint 56 to Sprint 57

#20 Updated by rchan 3 months ago

  • Sprint changed from Sprint 57 to Sprint 58

#21 Updated by rchan 3 months ago

  • Sprint deleted (Sprint 58)

#22 Updated by ttereshc 2 months ago

  • Tags Pulp 3 RPM blocker added

#23 Updated by daviddavis 2 months ago

  • Related to Story #5084: As a user, after copy or repo version creation there are no advisories with the same id added

#24 Updated by ttereshc about 1 month ago

  • Checklist item deleted (Rename ErratumContentUnitSaver to AdvisoryContentUnitSaver)
  • Checklist item changed from AdvisoryContentUnitMerger stage for the sync case which applies the described criteria to Implement advisory merge based on the described criteria
  • Checklist item Add documentation describing advisory behaviour added
  • Subject changed from As a user, after sync a repository version has no advisories with the same id to As a user, a repository version has no advisories with the same id
  • Description updated (diff)

#25 Updated by ttereshc about 1 month ago

  • Status changed from NEW to ASSIGNED
  • Assignee set to ttereshc

#26 Updated by ttereshc about 1 month ago

  • Related to deleted (Story #5084: As a user, after copy or repo version creation there are no advisories with the same id)

#27 Updated by ttereshc about 1 month ago

  • Duplicated by Story #5084: As a user, after copy or repo version creation there are no advisories with the same id added

#28 Updated by ttereshc 20 days ago

  • Checklist item Implement advisory merge based on the described criteria set to Done
  • Status changed from ASSIGNED to POST

#29 Updated by ttereshc 20 days ago

  • Status changed from POST to MODIFIED
  • % Done changed from 0 to 100

Please register to edit this issue

Also available in: Atom PDF