Project

Profile

Help

Story #4295

closed

As a user, a repository version has no advisories with the same id

Added by ttereshc almost 6 years ago. Updated over 4 years ago.

Status:
CLOSED - CURRENTRELEASE
Priority:
Normal
Assignee:
Sprint/Milestone:
-
Start date:
Due date:
% Done:

100%

Estimated time:
Platform Release:
Groomed:
No
Sprint Candidate:
Yes
Tags:
Sprint:
Quarter:

Description

Motivation

Currently it is possible to have multiple Advisories(UpdateRecord content units) with the same id in one repo version. Those content units are not full duplicates, they have the same id but different content. That leads to a creation of a bad repository at a publication time (id of an advisory should be unique per yum repo, so Pulp needs to publish only one advisory per id).

Possible cases

At sync time in additive mode:

  1. Repo version 1 contains an updateA, in repo version 2 a newer version of the updateA is being added
  2. Repo version 1 contains an updateA, in repo version 2 an older version of the updateA is being added
  3. Repo version 1 contains an updateA, in repo version 2 an alternative version of the updateA (e.g. for different distribution) is being added

In case of a mirror mode for sync, always the incoming version of the updateA is taken, regardless of any criteria.

Suggested solution

For any addition of content to an RPM repo version ensure that there is no more than one UpdateRecord with the same id.

Decide which UpdateRecord to keep based on the criteria defined below.

Criteria

In case of mirror sync, just pick the incoming UpdateRecord.

In all other cases:

  • updated_dates are the same, pkglist intersection is empty (e.g. base repo merged with debuginfo repo)
    -> new UpdateRecord content unit with combined pkglist is created and added to a repo, old UpdateRecord is removed form a repo.
  • updated_dates differ, pkglist intersection is non-empty (update/re-sync/upload-new case) -> UpdateRecord with newer updated_date should be in a repo.
  • updated_dates differ, pkglist intersection is empty - ERROR CONDITION! (base and -debuginfo buit repos are from different versions, not at same date)
    • tell to go make sure that the merging repos are up-to-date, and then retry
  • update_dates are the same, pkglist intersection is non-empty and not equal to either pkglist - ERROR CONDITION!
    • never-happen case - "something is Terribly Wrong Here"

A relevant functional test which is currently being skipped.


Related issues

Has duplicate RPM Support - Story #5084: As a user, after copy or repo version creation there are no advisories with the same idCLOSED - DUPLICATE

Actions
Actions #1

Updated by ttereshc almost 6 years ago

  • Description updated (diff)
Actions #2

Updated by ttereshc almost 6 years ago

  • Subject changed from Remove UpdateRecord duplicates based on the `updated` timestamp to Remove UpdateRecord duplicates based on the `updated_date`
  • Description updated (diff)
Actions #3

Updated by bmbouter almost 6 years ago

The technical proposal confuses me a bit, but I probably don't understand it. If the original duplicate is already associated with the repo version the pipeline will only handle the newly mutated content unit which will be a distinct content unit. So the only one you could "not pass along" would be the new one.

In thinking a bit about this, it seems the RemoveDuplicates stage isn't able to provide a "comparison based" removal. A Q() object is effectively built for each content unit, one-by-one.

In considering how we could generalize RemoveDuplicates a bit to allow the plugin writer to have more control over the Q() object there are a few options, but at that point RemoveDuplicates would have almost no code in it. So that gets met thinking that this issue should just make a brand-new custom stage for handling this that just builds the Q() object that RPM needs and use that in the custom pipeline for RPM.

Actions #4

Updated by ttereshc almost 6 years ago

  • Description updated (diff)

If the original duplicate is already associated with the repo version the pipeline will only handle the newly mutated content unit which will be a distinct content unit. So the only one you could "not pass along" would be the new one.

Yes, that sounds right to me. You "pass along" the mutated content unit if you want to have it in a new repo version instead of the original one.
You "don't pass along" the mutated content unit if you want to keep the original one.

So that gets met thinking that this issue should just make a brand-new custom stage for handling this that just builds the Q() object that RPM needs and use that in the custom pipeline for RPM.

+1 to a new custom stage

I updated the description.

Actions #5

Updated by bmbouter almost 6 years ago

  • Description updated (diff)

The description looks right. I touched it up a bit for clarity. I also added the part where the RemoveDuplicates stage was still planned to be used.

Maybe see the diff if it's right?

Actions #6

Updated by bmbouter almost 6 years ago

  • Description updated (diff)

After some IRC discussion we are making the new stage pass or not pass along UpdateRecords so that all unassociating can happen in one place in the pipeline, i.e. RemoveDuplicates.

Actions #7

Updated by daviddavis almost 6 years ago

  • Groomed changed from No to Yes
  • Sprint Candidate changed from No to Yes
Actions #8

Updated by ttereshc almost 6 years ago

  • Description updated (diff)
Actions #9

Updated by ttereshc over 5 years ago

  • Sprint set to Sprint 50
Actions #10

Updated by rchan over 5 years ago

  • Sprint changed from Sprint 50 to Sprint 51
Actions #11

Updated by daviddavis over 5 years ago

  • Sprint changed from Sprint 51 to Sprint 52
Actions #12

Updated by bmbouter over 5 years ago

  • Tags deleted (Pulp 3)
Actions #13

Updated by rchan over 5 years ago

  • Sprint changed from Sprint 52 to Sprint 53
Actions #14

Updated by amacdona@redhat.com over 5 years ago

  • Sprint changed from Sprint 53 to Sprint 54
Actions #15

Updated by ttereshc over 5 years ago

  • Sprint changed from Sprint 54 to Sprint 55
Actions #16

Updated by ttereshc over 5 years ago

  • Subject changed from Remove UpdateRecord duplicates based on the `updated_date` to As a user, after sync a repository version has no advisories with the same id
  • Description updated (diff)
  • Groomed changed from Yes to No
Actions #17

Updated by ttereshc over 5 years ago

  • Description updated (diff)
Actions #18

Updated by dkliban@redhat.com over 5 years ago

  • Sprint changed from Sprint 55 to Sprint 56
Actions #19

Updated by rchan over 5 years ago

  • Sprint changed from Sprint 56 to Sprint 57
Actions #20

Updated by rchan over 5 years ago

  • Sprint changed from Sprint 57 to Sprint 58
Actions #21

Updated by rchan about 5 years ago

  • Sprint deleted (Sprint 58)
Actions #22

Updated by ttereshc about 5 years ago

  • Tags Pulp 3 RPM blocker added
Actions #23

Updated by daviddavis about 5 years ago

  • Related to Story #5084: As a user, after copy or repo version creation there are no advisories with the same id added
Actions #24

Updated by ttereshc about 5 years ago

  • Subject changed from As a user, after sync a repository version has no advisories with the same id to As a user, a repository version has no advisories with the same id
  • Description updated (diff)
Actions #25

Updated by ttereshc about 5 years ago

  • Status changed from NEW to ASSIGNED
  • Assignee set to ttereshc
Actions #26

Updated by ttereshc about 5 years ago

  • Related to deleted (Story #5084: As a user, after copy or repo version creation there are no advisories with the same id)
Actions #27

Updated by ttereshc about 5 years ago

  • Has duplicate Story #5084: As a user, after copy or repo version creation there are no advisories with the same id added

Added by ttereshc about 5 years ago

Revision 8e1c00d2 | View on GitHub

Identify previous repo version to use in finalization steps

re #4295 https://pulp.plan.io/issues/4295

Actions #28

Updated by ttereshc about 5 years ago

  • Status changed from ASSIGNED to POST

Added by ttereshc about 5 years ago

Revision 1d507db4 | View on GitHub

Extend finalize_new_version to resolve advisory duplicates

Also move createrepo_c object generation for advisory to corresponding models. Add ability to generate createrpeo_c object with additional collecitons.

closes #4295 https://pulp.plan.io/issues/4295

Actions #29

Updated by ttereshc about 5 years ago

  • Status changed from POST to MODIFIED
  • % Done changed from 0 to 100
Actions #30

Updated by ttereshc almost 5 years ago

Actions #31

Updated by ttereshc almost 5 years ago

  • Status changed from MODIFIED to CLOSED - CURRENTRELEASE
Actions #32

Updated by bmbouter over 4 years ago

  • Tags deleted (Pulp 3 RPM blocker)

Also available in: Atom PDF