Issue #858

As a user, I would like to receive updated errata metadata

Added by cduryee over 5 years ago. Updated over 1 year ago.

Status: CLOSED - CURRENTRELEASE
Priority: High
Assignee:
Sprint/Milestone: -
Start date:
Due date:
Estimated time:
Severity: 2. Medium
Version:
Platform Release: 2.8.5
OS:
Triaged: Yes
Groomed: No
Sprint Candidate: No
Tags: Pulp 2
Sprint: Sprint 1
Quarter:

Description

Pulp currently does not update already-synced errata metadata. For example, if erratum ABC-2014:100 is released, Pulp will sync it down once and then never check for updates to that erratum. This is incorrect behavior, since errata can be updated for a variety of reasons.

This story is for users to receive updates to already-synced errata. Pulp can check the 'updated' timestamp field to see if the existing unit is out of date. If so, Pulp will need to use the erratum from the updateinfo.xml instead of what's in the units collection. Note that we do not always want to overwrite the pkglist entirely. We may want to look at the 'shortname' on the individual elements in the pkglist to determine if we can overwrite just one set of packages. This is useful when an erratum is in multiple repos with different package lists per repo.
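The timestamp check described above can be sketched roughly as follows. This is only an illustration, not pulp_rpm code: the function name `is_out_of_date`, the dictionary layout, and the timestamp format are all assumptions, and real updateinfo.xml files may use other formats.

```python
from datetime import datetime

# Assumed timestamp layout; real updateinfo.xml files may differ.
UPDATED_FORMAT = "%Y-%m-%d %H:%M:%S"


def is_out_of_date(existing_updated, incoming_updated):
    """Return True when the incoming erratum is newer than the stored one."""
    existing = datetime.strptime(existing_updated, UPDATED_FORMAT)
    incoming = datetime.strptime(incoming_updated, UPDATED_FORMAT)
    return incoming > existing


# Hypothetical stored unit and freshly parsed erratum with the same id.
stored = {"id": "ABC-2014:100", "updated": "2014-01-10 00:00:00"}
synced = {"id": "ABC-2014:100", "updated": "2014-03-02 12:30:00"}

if is_out_of_date(stored["updated"], synced["updated"]):
    # Prefer the erratum parsed from updateinfo.xml over the stored unit.
    stored = synced
```

Comparing parsed datetimes rather than raw strings avoids surprises when two timestamps differ only in formatting.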

Deliverables:

  • pulp_rpm changes to check for the 'updated' timestamp when importing errata
  • release note for this feature
  • additional zoo repos of the same erratum with different timestamps to allow demoing and QE testing of this feature
  • testing that this change works correctly when the same erratum is in different repos (example: RHEL6 and 7)

Associated revisions

Revision a8b0d87a
Added by ttereshc over 4 years ago

Fix sync and upload of the same erratum

Handle pkglist for the same errata in different repositories. Update errata metadata based on updated field.

closes #858 https://pulp.plan.io/issues/858

History

#2 Updated by mhrivnak about 5 years ago

  • Platform Release set to 2.8.0
  • Groomed set to No
  • Sprint Candidate set to Yes

#3 Updated by jortel@redhat.com over 4 years ago

  • Priority changed from Normal to High
  • Platform Release deleted (2.8.0)

#4 Updated by jortel@redhat.com over 4 years ago

  • Tracker changed from Story to Issue
  • Severity set to 2. Medium
  • Platform Release set to 2.8.1
  • Triaged set to No

#5 Updated by bmbouter over 4 years ago

  • Triaged changed from No to Yes

#6 Updated by ttereshc over 4 years ago

  • Status changed from NEW to ASSIGNED
  • Assignee set to ttereshc

#7 Updated by ttereshc over 4 years ago

This issue turned out to be a tricky one.
Below are my observations; please correct me if I am wrong.

Assumptions:
- the erratum id does not change when its metadata is updated
- the same erratum can be in different repos (example: RHEL6 and 7)
- any errata metadata can be changed
- if errata metadata is updated there is no guarantee that it will be updated simultaneously in both repos (RHEL6 and 7) or even in the same way.

Right now Pulp stores errata in the collection `units_erratum`, and the database holds no information about either the Pulp repo or the feed related to each erratum.
If the erratum id is the same in different repos but the pkglist name is different, Pulp updates the pkglist by adding a new pkglist entry to it (this change was introduced by this commit).

So we have one record for each erratum (with erratum_id as the unit_key) even if it is present in different repositories.
So it looks like it is not safe to update errata metadata if it could differ between repositories (RHEL6 and 7).

Any thoughts?

#8 Updated by semyers over 4 years ago

ttereshc wrote:

> Assumptions:
> - the erratum id does not change when its metadata is updated
> - the same erratum can be in different repos (example: RHEL6 and 7)

I believe these are definitely true.

> - any errata metadata can be changed
> - if errata metadata is updated there is no guarantee that it will be updated simultaneously in both repos (RHEL6 and 7) or even in the same way.

I don't know if these are true, but I think if assumptions are to be made, it's best to assume that they are. So, any errata metadata can be changed, and it is possible to update only the errata metadata in the rhel6 repo without making any changes to the rhel7 repo.

> Right now Pulp stores errata in the collection `units_erratum`, and the database holds no information about either the Pulp repo or the feed related to each erratum.
> If the erratum id is the same in different repos but the pkglist name is different, Pulp updates the pkglist by adding a new pkglist entry to it (this change was introduced by this commit).

To elaborate on this: because we don't keep track (that I know of) of which repository provides which packages, we don't have a reliable way of choosing which packages to remove from existing package lists. So, if the assumption above is true (any errata metadata can be changed), then that includes the package list. If this assumption is false, and specifically if the package list is guaranteed not to change for an erratum, then this particular problem goes away.

> So we have one record for each erratum (with erratum_id as the unit_key) even if it is present in different repositories.
> So it looks like it is not safe to update errata metadata if it could differ between repositories (RHEL6 and 7).
>
> Any thoughts?

Getting solid answers about the two uncertain assumptions is probably the best choice, which I think means we need the answers to these questions before we can know the best way to proceed:

- Can all errata metadata be changed, or only specific fields? If it's only specific fields, which fields can be changed?
- When errata metadata is changed, is it changed for all errata with that ID in all repos where it exists in updateinfo, or are changes potentially made in only a single repo (or subset of repos) containing that erratum?

Orthogonally related thought:
This is drastic, but one way forward is to make the errata primary key be a composite key including the errata_id and repo_id. I have some ideas about how to make this work, but would prefer simpler options for a more immediate solution and plan a much-needed (in my opinion) errata refactor for a later release.

#9 Updated by rbarlow over 4 years ago

On Tuesday, March 22, 2016 7:45:01 PM EDT you wrote:

> one way forward is to make the errata primary key be a composite key
> including the errata_id and repo_id.

FWIW, this is how I solved the Tag uniqueness problem in pulp_docker.

#10 Updated by ttereshc over 4 years ago

Answers from jluza:

> - Can any errata metadata be changed, or only specific fields? (in case
> advisory id has not changed)

They can. I would say almost all fields that are filled in by a human are potentially
vulnerable to mistakes, so they should be changeable.

> Can pkglist also be changed without changing the advisory id?

I suppose it shouldn't happen, but I guess it can.

> - The same erratum can be in different repositories (for example, in
> RHEL6 and RHEL7).

Yes. But it should contain only packages that are in the repository.
So you can have a multi-product advisory, but for RHEL-6 repos, generated
advisories in updateinfo should contain only rhel-6 packages in the packagelist.

> When errata metadata is changed, is it changed for all errata with the
> same id in all repositories?

yes

#11 Updated by ttereshc over 4 years ago

So for now I am going to overwrite the relevant pkglist (based on its name) and the rest of the errata metadata. Does anyone think it is still not safe to update errata metadata?

#12 Updated by semyers over 4 years ago

  • Platform Release changed from 2.8.1 to 2.8.2

#13 Updated by ttereshc over 4 years ago

New info (at least for me).
Currently, the same erratum in different repos (RHEL6 and 7) may have the same name for its pkglist (but the packages will be different, as expected).
That means the fix for handling errata from different repositories, which concatenates pkglists based on their names, does not work :(
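A small sketch may make the collision concrete. Everything here is hypothetical (the collection name `collection-0`, the package names, and both helper functions); it only illustrates why keying pkglists by name alone loses data, while keying by repo and name does not.

```python
def merge_by_name(collections):
    """Index pkglist collections by name only (the approach that breaks)."""
    merged = {}
    for coll in collections:
        # A later collection with the same name silently replaces the earlier one.
        merged[coll["name"]] = coll["packages"]
    return merged


def merge_by_repo_and_name(collections):
    """Index pkglist collections by (repo_id, name) so both repos survive."""
    merged = {}
    for coll in collections:
        merged[(coll["repo_id"], coll["name"])] = coll["packages"]
    return merged


# Same erratum, same pkglist name, different packages per repo.
rhel6 = {"repo_id": "rhel6", "name": "collection-0", "packages": ["foo-1.0-1.el6"]}
rhel7 = {"repo_id": "rhel7", "name": "collection-0", "packages": ["foo-1.0-1.el7"]}

by_name = merge_by_name([rhel6, rhel7])
by_repo = merge_by_repo_and_name([rhel6, rhel7])

print(len(by_name))  # 1: the RHEL6 packages were overwritten
print(len(by_repo))  # 2: both package lists are kept
```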

#14 Updated by rbarlow over 4 years ago

Hello Tanya!

I think it might help if we change the uniqueness constraint on Errata to be a compound key of its name and the repo it is part of. Here's an example of how I solved a similar problem with the Tag model in pulp_docker:

https://github.com/pulp/pulp_docker/blob/a902085/plugins/pulp_docker/plugins/models.py#L253

If you follow this approach, it does introduce some other problems however. Now whenever Errata are "copied" between repositories, you actually need to create a new one that is like the old one, but has a different repo_id. I did that for the Tag here:

https://github.com/pulp/pulp_docker/blob/a90208581de1601c02b280175f774dbbca4f251d/plugins/pulp_docker/plugins/importers/importer.py#L260-L266

So my proposal isn't all roses and ponies, but it worked out OK for the Tag. Just a thought!
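For illustration, the compound-key idea and its copy side effect could look something like the sketch below. This is not the pulp_docker or pulp_rpm code: `Erratum`, `ErrataStore`, and the in-memory dictionary standing in for the database are all hypothetical.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Erratum:
    erratum_id: str
    repo_id: str
    title: str
    pkglist: tuple = ()

    @property
    def unit_key(self):
        # Compound key: the same erratum id may exist once per repository.
        return (self.erratum_id, self.repo_id)


class ErrataStore:
    """In-memory stand-in for a collection with a compound uniqueness constraint."""

    def __init__(self):
        self._units = {}

    def save(self, erratum):
        self._units[erratum.unit_key] = erratum

    def get(self, erratum_id, repo_id):
        return self._units.get((erratum_id, repo_id))

    def copy(self, erratum_id, source_repo, dest_repo):
        # "Copying" cannot reuse the unit: it creates a new record that
        # differs from the original only in its repo_id.
        original = self._units[(erratum_id, source_repo)]
        clone = replace(original, repo_id=dest_repo)
        self.save(clone)
        return clone


store = ErrataStore()
store.save(Erratum("ABC-2014:100", "rhel6", "Fix foo", ("foo-1.0-1.el6",)))
store.save(Erratum("ABC-2014:100", "rhel7", "Fix foo", ("foo-1.0-1.el7",)))
copied = store.copy("ABC-2014:100", "rhel6", "rhel6-mirror")
```

The trade-off shows up in `copy`: per-repo records can diverge safely, but every copy now writes a new unit instead of adding an association to an existing one.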

#15 Updated by mhrivnak over 4 years ago

  • Sprint/Milestone set to 19

#16 Updated by semyers over 4 years ago

  • Platform Release changed from 2.8.2 to 2.8.3

#18 Updated by semyers over 4 years ago

  • Platform Release changed from 2.8.3 to 2.8.4

#19 Updated by ttereshc over 4 years ago

  • Status changed from ASSIGNED to POST

To handle the same erratum in different repositories, `_pulp_repo_id` is added to each pkglist collection in the database. The unit key is still errata_id; nothing changed in that regard.

  • `_pulp_repo_id` may be added to an erratum pkglist collection, or a new erratum pkglist collection may be added, only during sync or upload.
  • During copy, no modifications are made to the erratum unit in the database.
  • Pkglists are not removed during repository removal; only the erratum as a whole can be removed, during orphan removal, if no repository contains it.
  • During publish, duplicates and empty collections may appear in pkglists in the updateinfo file.

Errata metadata is updated based on the `updated` field. If the `updated` field is in an unknown format, the erratum won't be updated.
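As a rough sketch of this behavior (not the code from the pull request): the names `parse_updated`, `sync_erratum`, and `KNOWN_FORMATS`, the dictionary layout, and the list of accepted timestamp formats are assumptions. The policy of replacing collections previously tagged with the same repo id is also an assumption made for the example.

```python
from datetime import datetime

# Assumed set of recognized timestamp formats.
KNOWN_FORMATS = ("%Y-%m-%d %H:%M:%S", "%Y-%m-%d")


def parse_updated(value):
    """Return a datetime, or None when the value is missing or in an unknown format."""
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(value, fmt)
        except (TypeError, ValueError):
            continue
    return None


def sync_erratum(existing, incoming, repo_id):
    """Merge an incoming erratum into the stored one and return the result."""
    result = dict(existing)

    old = parse_updated(existing.get("updated"))
    new = parse_updated(incoming.get("updated"))
    # Metadata is refreshed only when both timestamps parse and the
    # incoming one is strictly newer.
    if old is not None and new is not None and new > old:
        for field, value in incoming.items():
            if field != "pkglist":
                result[field] = value

    # Keep collections that belong to other repos (or carry no tag), and
    # replace the ones tagged with this repo (assumed policy).
    kept = [c for c in existing.get("pkglist", [])
            if c.get("_pulp_repo_id") != repo_id]
    tagged = [dict(c, _pulp_repo_id=repo_id) for c in incoming.get("pkglist", [])]
    result["pkglist"] = kept + tagged
    return result
```

Because the unit key stays `errata_id`, a single stored record ends up carrying one tagged pkglist collection per repository that synced or uploaded it.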

https://github.com/pulp/pulp_rpm/pull/852

#20 Updated by ttereshc over 4 years ago

  • Status changed from POST to MODIFIED
  • % Done changed from 0 to 100

#21 Updated by semyers over 4 years ago

  • Status changed from MODIFIED to 5

#22 Updated by semyers over 4 years ago

  • Platform Release changed from 2.8.4 to 2.8.5

#23 Updated by semyers over 4 years ago

  • Status changed from 5 to MODIFIED

#24 Updated by semyers over 4 years ago

  • Status changed from MODIFIED to 5

#25 Updated by semyers over 4 years ago

  • Status changed from 5 to CLOSED - CURRENTRELEASE

#27 Updated by bmbouter over 2 years ago

  • Sprint set to Sprint 1

#28 Updated by bmbouter over 2 years ago

  • Sprint/Milestone deleted (19)

#29 Updated by bmbouter over 1 year ago

  • Tags Pulp 2 added
