Issue #7092
closed
Re-migration does not migrate errata that have since been copied to new repositories
Description
If an erratum is shared between two repositories, it is migrated properly for both repos. If, however, you sync the errata into one repository, run the migration, copy the errata to another repository, and run the migration again, the errata aren't migrated for the new repository.
Steps to reproduce:
- create a yum repo in pulp2, sync some repository with errata
- migrate the repository
- Create a 2nd yum repo in pulp2, copy the errata to it
- run the migration with both repositories
Results: the errata for the 2nd repository will not be migrated.
If you reset the pulp3 db and re-migrate, it all works fine and the errata are migrated for both repos.
This was tested with 0.2.0b5
Updated by jsherril@redhat.com over 4 years ago
The repos I reproduced with are el8 baseos and el8 appstream. I actually had 2 repos in the initial migration and copied the contents to two more. My final migration plan is:
{
  "plugins": [
    {
      "type": "docker",
      "repositories": []
    },
    {
      "type": "iso",
      "repositories": []
    },
    {
      "type": "rpm",
      "repositories": [
        {
          "name": "02fe9557-8cc2-4965-a314-06845cfc4c9e",
          "repository_versions": [
            {
              "pulp2_repository_id": "02fe9557-8cc2-4965-a314-06845cfc4c9e",
              "pulp2_distributor_repository_ids": [
                "02fe9557-8cc2-4965-a314-06845cfc4c9e"
              ]
            }
          ],
          "pulp2_importer_repository_id": "02fe9557-8cc2-4965-a314-06845cfc4c9e"
        },
        {
          "name": "99ab9d40-45ea-4d63-b1c4-941ca2c28fc5",
          "repository_versions": [
            {
              "pulp2_repository_id": "99ab9d40-45ea-4d63-b1c4-941ca2c28fc5",
              "pulp2_distributor_repository_ids": [
                "99ab9d40-45ea-4d63-b1c4-941ca2c28fc5"
              ]
            }
          ],
          "pulp2_importer_repository_id": "99ab9d40-45ea-4d63-b1c4-941ca2c28fc5"
        },
        {
          "name": "cv2-Red_Hat_Enterprise_Linux_8_for_x86_64_-_AppStream_RPMs_8",
          "repository_versions": [
            {
              "pulp2_repository_id": "1-cv2-v1_0-99ab9d40-45ea-4d63-b1c4-941ca2c28fc5",
              "pulp2_distributor_repository_ids": [
                "1-cv2-Library-99ab9d40-45ea-4d63-b1c4-941ca2c28fc5",
                "1-cv2-v1_0-99ab9d40-45ea-4d63-b1c4-941ca2c28fc5"
              ]
            }
          ]
        },
        {
          "name": "cv2-Red_Hat_Enterprise_Linux_8_for_x86_64_-_BaseOS_RPMs_8",
          "repository_versions": [
            {
              "pulp2_repository_id": "1-cv2-v1_0-02fe9557-8cc2-4965-a314-06845cfc4c9e",
              "pulp2_distributor_repository_ids": [
                "1-cv2-Library-02fe9557-8cc2-4965-a314-06845cfc4c9e",
                "1-cv2-v1_0-02fe9557-8cc2-4965-a314-06845cfc4c9e"
              ]
            }
          ]
        }
      ]
    }
  ]
}
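For anyone trying to reproduce with their own repositories, a plan with the same layout as the one above could be generated rather than written by hand. This is only a sketch: the helper names `rpm_repository_entry` and `build_plan` are made up for illustration, and only the keys visible in the plan above are used.

```python
import json


def rpm_repository_entry(name, pulp2_repo_id, distributor_repo_ids, importer_repo_id=None):
    """Build one item of the "repositories" list, using the keys shown in the plan above."""
    entry = {
        "name": name,
        "repository_versions": [
            {
                "pulp2_repository_id": pulp2_repo_id,
                "pulp2_distributor_repository_ids": list(distributor_repo_ids),
            }
        ],
    }
    # Only the two synced repos in the plan above carry an importer reference.
    if importer_repo_id is not None:
        entry["pulp2_importer_repository_id"] = importer_repo_id
    return entry


def build_plan(rpm_repositories):
    """Wrap RPM repository entries in the same three-plugin layout as the plan above."""
    return {
        "plugins": [
            {"type": "docker", "repositories": []},
            {"type": "iso", "repositories": []},
            {"type": "rpm", "repositories": list(rpm_repositories)},
        ]
    }


base = "02fe9557-8cc2-4965-a314-06845cfc4c9e"
plan = build_plan(
    [
        rpm_repository_entry(base, base, [base], importer_repo_id=base),
        rpm_repository_entry(
            "cv2-Red_Hat_Enterprise_Linux_8_for_x86_64_-_BaseOS_RPMs_8",
            "1-cv2-v1_0-" + base,
            ["1-cv2-Library-" + base, "1-cv2-v1_0-" + base],
        ),
    ]
)
print(json.dumps(plan, indent=2))
```

The second entry corresponds to the copied repository, which is the one whose errata go missing on re-migration.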
Updated by dalley over 4 years ago
- Status changed from NEW to ASSIGNED
- Assignee set to dalley
Updated by dalley over 4 years ago
I'm not sure if I did something wrong, but I wasn't immediately able to reproduce this. Here's my two Pulp 2 repos.
[vagrant@pulp2-nightly-pulp3-source-centos7 devel]$ pulp-admin rpm repo content errata --repo-id test1
Description: ParthaBird_Erratum
Id: RHEA-2012:0056
Severity:
Summary:
Title: Bird_Erratum
Type: security
Description: Bear_Erratum
Id: RHEA-2012:0057
Severity:
Summary:
Title: Bear_ErratumPARTHA
Type: security
Description: Gorilla_Erratum
Id: RHEA-2012:0058
Severity:
Summary:
Title: Gorilla_Erratum
Type: enhancement
Description: Sea_Erratum
Id: RHEA-2012:0055
Severity:
Summary:
Title: Sea_Erratum
Type: security
[vagrant@pulp2-nightly-pulp3-source-centos7 devel]$ pulp-admin rpm repo content errata --repo-id test2
Description: Bear_Erratum
Id: RHEA-2012:0057
Severity:
Summary:
Title: Bear_ErratumPARTHA
Type: security
I created & synced test1, and created test2 (but left it empty and didn't include it in the migration plan), then ran the migration, then copied the errata, then migrated again with both repos. At the end, I have 2 repos in Pulp 3, and both have the full set of content they are supposed to have:
...
"present": {
"rpm.advisory": {
"count": 4,
"href": "/pulp/api/v3/content/rpm/advisories/?repository_version=/pulp/api/v3/repositories/rpm/rpm/ece83dc3-b0b9-4ab3-b91c-4588a0ab7d4d/versions/1/"
},
...
"present": {
"rpm.advisory": {
"count": 1,
"href": "/pulp/api/v3/content/rpm/advisories/?repository_version=/pulp/api/v3/repositories/rpm/rpm/33d543be-e407-46de-a8ed-0503f670b478/versions/1/"
}
},
This was with the normal test fixture - there may be something different about the Centos 8 repos (and also I could have missed something). Maybe creating the repo early influenced something despite not having it in the migration plan. Will keep trying tomorrow.
Updated by jsherril@redhat.com over 4 years ago
The issue I was seeing was that the pulp2content objects for the errata would have blank 'pulp3_repository_version' attributes. I didn't confirm whether or not they actually ended up in the 'new' repo version.
Updated by dalley over 4 years ago
Strange. So what I see is, the erratum ends up in both repos as expected, and it's not duplicated or anything (content unit has the same href).
Here's what my /pulp2content/ looks like
{
  "pulp_href": "/pulp/api/v3/pulp2content/fa0ad407-c842-41ac-a7bf-dbbf96de1a85/",
  "pulp_created": "2020-07-15T15:45:52.141120Z",
  "pulp2_id": "238b6545-d4af-4727-9391-92530070d54f",
  "pulp2_content_type_id": "erratum",
  "pulp2_last_updated": 1594827881,
  "pulp2_storage_path": null,
  "downloaded": false,
  "pulp3_content": "/pulp/api/v3/content/rpm/advisories/676aeb6f-854f-4942-8642-ebd1291468e1/",
  "pulp3_repository_version": "/pulp/api/v3/repositories/rpm/rpm/d995c4d3-7dae-44f7-9ba0-87e6298370ff/versions/1/"
},
...
...
{
  "pulp_href": "/pulp/api/v3/pulp2content/95395826-43c9-407c-919f-34ea343c7fc4/",
  "pulp_created": "2020-07-15T15:45:51.989257Z",
  "pulp2_id": "18e4f45a-8522-47b7-979c-03d842a8f01a",
  "pulp2_content_type_id": "rpm",
  "pulp2_last_updated": 1594827881,
  "pulp2_storage_path": "/var/lib/pulp/content/units/rpm/f8/d7cba1691f3bc7e29a7c8966be06d8418de488f4ffa7f84ad472de6d604e9b/zebra-0.1-2.noarch.rpm",
  "downloaded": false,
  "pulp3_content": "/pulp/api/v3/content/rpm/packages/6347ad7c-8df8-4bbd-9d9b-8f8a54047c6a/"
},
A few things stand out:
- Not all content types provide the same fields here, RPMs don't have a "pulp3_repository_version", which surprises me but I can see why there might be a reason for it
- This is "wrong" because the same errata can be in multiple repository versions simultaneously, so it's nonsensical to just have a field for a single repo version
- Unless, there are supposed to be multiple entries for the same errata in here per-repo, in which case, there are not
I don't have entries with that field empty though.
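To make that check repeatable, something like the following could be run over the entries returned by /pulp2content/. It is a rough sketch, not plugin code: it assumes the entries have already been fetched and decoded into a list of dicts, the function name `erratum_report` is invented here, and the third sample entry is hypothetical (it mimics the blank field Justin described).

```python
from collections import Counter


def erratum_report(pulp2content):
    """Return (ids of errata lacking a repo version, number of entries per erratum id)."""
    errata = [e for e in pulp2content if e.get("pulp2_content_type_id") == "erratum"]
    # A missing key, null, or empty string all count as "blank" here.
    missing = [e["pulp2_id"] for e in errata if not e.get("pulp3_repository_version")]
    per_unit = Counter(e["pulp2_id"] for e in errata)
    return missing, per_unit


entries = [
    {  # abridged from the erratum entry above
        "pulp2_id": "238b6545-d4af-4727-9391-92530070d54f",
        "pulp2_content_type_id": "erratum",
        "pulp3_repository_version": "/pulp/api/v3/repositories/rpm/rpm/d995c4d3-7dae-44f7-9ba0-87e6298370ff/versions/1/",
    },
    {  # abridged from the rpm entry above; RPMs have no repo version field
        "pulp2_id": "18e4f45a-8522-47b7-979c-03d842a8f01a",
        "pulp2_content_type_id": "rpm",
    },
    {  # hypothetical entry showing the blank field from the report
        "pulp2_id": "hypothetical-erratum-id",
        "pulp2_content_type_id": "erratum",
        "pulp3_repository_version": None,
    },
]

missing, per_unit = erratum_report(entries)
print("errata without a repository version:", missing)
print("entries per erratum:", dict(per_unit))
```

The per-erratum count is there because an erratum that lives in two Pulp 2 repos but has only one entry is the other symptom to look for.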
Updated by dalley over 4 years ago
I'm finding new bugs. Wiping the system and re-migrating gives this:
(pulp) [vagrant@pulp2-nightly-pulp3-source-centos7 devel]$ pulp-admin rpm repo content errata --repo-id test2
Description: Bear_Erratum
Id: RHEA-2012:0057
Severity:
Summary:
Title: Bear_ErratumPARTHA
Type: security
(pulp) [vagrant@pulp2-nightly-pulp3-source-centos7 devel]$ http GET :24817/pulp/api/v3/content/rpm/advisories/?repository_version=/pulp/api/v3/repositories/rpm/rpm/d18c672a-d6e9-4e6a-938f-3cf5a811db06/versions/1/
HTTP/1.1 200 OK
Allow: GET, POST, HEAD, OPTIONS
Connection: close
Content-Length: 1356
Content-Type: application/json
Date: Thu, 16 Jul 2020 20:01:07 GMT
Server: gunicorn/20.0.4
Vary: Accept, Cookie
X-Frame-Options: SAMEORIGIN
{
"count": 2,
"next": null,
"previous": null,
"results": [
{
"description": "Bear_Erratum",
"fromstr": "errata@redhat.com",
"id": "RHEA-2012:0057",
"issued_date": "2013-01-27 16:08:05",
"pkglist": [
{
"name": "collection-0",
"packages": [
{
"arch": "noarch",
"epoch": "0",
"filename": "bear-4.1-1.noarch.rpm",
"name": "bear",
"reboot_suggested": false,
"release": "1",
"relogin_suggested": false,
"restart_suggested": false,
"src": "http://www.fedoraproject.org",
"sum": "",
"sum_type": "",
"version": "4.1"
}
],
"shortname": ""
}
],
"pulp_created": "2020-07-16T19:55:54.463062Z",
"pulp_href": "/pulp/api/v3/content/rpm/advisories/0bb6563b-6c6b-437e-b86c-95b75e2e1625/",
"pushcount": "",
"reboot_suggested": false,
"references": [],
"release": "1",
"rights": "",
"severity": "",
"solution": "",
"status": "stable",
"summary": "",
"title": "Bear_ErratumPARTHA",
"type": "security",
"updated_date": "2013-01-27 16:08:05",
"version": "1"
},
{
"description": "Bear_Erratum",
"fromstr": "errata@redhat.com",
"id": "RHEA-2012:0057",
"issued_date": "2013-01-27 16:08:05",
"pkglist": [],
"pulp_created": "2020-07-16T19:55:54.456933Z",
"pulp_href": "/pulp/api/v3/content/rpm/advisories/930cd7e6-0945-40aa-81c9-cd29c9c056c4/",
"pushcount": "",
"reboot_suggested": false,
"references": [],
"release": "1",
"rights": "",
"severity": "",
"solution": "",
"status": "stable",
"summary": "",
"title": "Bear_ErratumPARTHA",
"type": "security",
"updated_date": "2013-01-27 16:08:05",
"version": "1"
}
]
}
Updated by dalley over 4 years ago
^^ Is just because there's no RPM packages in the current repo, so it ends up constructing an entirely different content unit with an empty package list.
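Since extra content is easy to miss by eye, a quick way to spot it is to group the advisories in a repository version by their "id" and see which fields differ. A minimal sketch, assuming the "results" list from a response like the one above has been loaded into Python; the helper names are invented, and the sample data is abridged from that response.

```python
from collections import defaultdict


def duplicated_advisories(results):
    """Group advisories by their erratum id and keep only ids that occur more than once."""
    by_id = defaultdict(list)
    for advisory in results:
        by_id[advisory["id"]].append(advisory)
    return {adv_id: group for adv_id, group in by_id.items() if len(group) > 1}


def differing_fields(a, b, ignore=("pulp_href", "pulp_created")):
    """List the fields whose values differ, skipping per-unit bookkeeping fields."""
    return sorted(k for k in a.keys() | b.keys() if k not in ignore and a.get(k) != b.get(k))


results = [
    {
        "id": "RHEA-2012:0057",
        "title": "Bear_ErratumPARTHA",
        "pkglist": [{"name": "collection-0", "packages": [{"filename": "bear-4.1-1.noarch.rpm"}]}],
        "pulp_href": "/pulp/api/v3/content/rpm/advisories/0bb6563b-6c6b-437e-b86c-95b75e2e1625/",
    },
    {
        "id": "RHEA-2012:0057",
        "title": "Bear_ErratumPARTHA",
        "pkglist": [],
        "pulp_href": "/pulp/api/v3/content/rpm/advisories/930cd7e6-0945-40aa-81c9-cd29c9c056c4/",
    },
]

for adv_id, group in duplicated_advisories(results).items():
    print(adv_id, "appears", len(group), "times; differing fields:",
          differing_fields(group[0], group[1]))
```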
When everything is done in one batch, we get two Pulp2Content and two UpdateRecords (because the pkglist is different, which changes the digest), and somehow both content units end up in the same repository even though only one of them was in the original.
Problems:
- Two advisories in the pulp3 repo, identical except for the package list. It should only be one.
- Probably it is simply wrong to be looking at RPMs in the repo to reconstruct the pkglist, because it causes issues like this where the pkglist is actually modified. This causes 2 content units to be created by accident.
@Justin I'm curious if you can reproduce this? Extra content seems easy to overlook compared to missing content.
When it's done in two steps as it was in the description, what happens is that it doesn't create a new pulp2content, and because it does this it doesn't try to create a new pulp3content from the pulp2content either (probably a good thing since it would be a wrong one if it did), but the content unit ends up associated with the repository anyway.
Problems:
- No new pulp2content is created
All of this needs to be fixed in this issue since if we fix the pulp2content issue it's going to immediately start having the problem w/ multiple content units and that is, if anything, far more broken.
Updated by ttereshc over 4 years ago
A few things stand out:
Not all content types provide the same fields here, RPMs don't have a "pulp3_repository_version", which surprises me but I can see why there might be a reason for it
This is for advisory only (and it will be the same for distribution trees). One pulp2 erratum is migrated into multiple advisories in pulp3, thus the repo version reference. For consistency, we can show an empty value for RPMs in the API.
This is "wrong" because the same errata can be in multiple repository versions simultaneously, so it's nonsensical to just have a field for a single repo version. Unless, there are supposed to be multiple entries for the same errata in here per-repo, in which case, there are not
Yes, that's the case. If you look at the pulp2content the unique constraint is pulp2repo + unit_id. If both errata ended up in one pulp3 repo version (only the case when 2 pulp2 repos are exactly the same), one will see "duplicated" entries in the API response.
Is just because there's no RPM packages in the current repo, so it ends up constructing an entirely different content unit with an empty package list.
It's correct that it creates an erratum with an empty pkglist. It's exactly how Pulp 2 would publish it. There is no way for us to identify which package list should be used for a repository without looking at the packages. We can't create one advisory with all pkglists available in pulp2. It would break clients, and users would need to do a mirror sync to fix that, which is a bad upgrade experience. By looking at the RPMs, they get exactly what they had in pulp2. Even if it's wrong, it's the same "wrong", there is no surprise and no clients are broken. Everything will be right with the next sync if it ever happens.
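To illustrate why one pulp2 erratum turns into different advisories per repo, here is a toy model of the idea. It is not the plugin's code: the real matching and digest logic are certainly more involved, and `filter_pkglist`, `advisory_digest`, matching on filename, and the SHA-256 over JSON are all assumptions made for the example.

```python
import hashlib
import json


def filter_pkglist(pkglist, repo_filenames):
    """Keep only packages that exist in the repo; drop collections that end up empty."""
    filtered = []
    for collection in pkglist:
        packages = [p for p in collection["packages"] if p["filename"] in repo_filenames]
        if packages:
            filtered.append({**collection, "packages": packages})
    return filtered


def advisory_digest(advisory):
    """Stand-in digest: hash of the advisory's JSON with sorted keys."""
    payload = json.dumps(advisory, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


erratum = {
    "id": "RHEA-2012:0057",
    "title": "Bear_ErratumPARTHA",
    "pkglist": [
        {"name": "collection-0", "packages": [{"name": "bear", "filename": "bear-4.1-1.noarch.rpm"}]}
    ],
}

# test1 was synced and contains the RPM; test2 only received the copied erratum.
repo_rpms = {"test1": {"bear-4.1-1.noarch.rpm"}, "test2": set()}

advisories = {
    repo: {**erratum, "pkglist": filter_pkglist(erratum["pkglist"], filenames)}
    for repo, filenames in repo_rpms.items()
}

for repo, advisory in advisories.items():
    print(repo, "pkglist collections:", len(advisory["pkglist"]),
          "digest:", advisory_digest(advisory)[:12])
```

Under this model the two repos get two distinct content units for the same erratum id, which is expected; the bug is only that both end up in one repo version.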
What is not correct is that they end up in the same repo version.
When everything is done in one batch, we get two Pulp2Content and two UpdateRecords (because the pkglist is different, which changes the digest), and somehow both content units end up in the same repository even though only one of them was in the original.
Problems:
Two advisories in the pulp3 repo, identical except for the package list. It should only be one.
Agreed
Probably it is simply wrong to be looking at RPMs in the repo to reconstruct the pkglist, because it causes issues like this where the pkglist is actually modified.
That's correct behaviour.
This causes 2 content units to be created by accident.
Root cause of the problem is likely different. I'd check the resolve_advisories in the RPM plugin which is called in the finalization of the repo version creation.
When it's done in two steps as it was in the description, what happens is that it doesn't create a new pulp2content, and because it does this it doesn't try to create a new pulp3content from the pulp2content either (probably a good thing since it would be a wrong one if it did), but the content unit ends up associated with the repository anyway.
Agreed that it's the main problem to solve here and it looks to me exactly like the one discussed on irc right before you started digging into it.
"so I think we need to find a way for mutable types to check whether there are new repos containing them"
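The idea in that quote can be shown with a toy model of the bookkeeping. This is only an illustration under loud assumptions: Pulp2Content records are reduced to (pulp2 repo, unit id) pairs, following the unique constraint mentioned above, and both functions are invented for the example rather than taken from the plugin.

```python
def premigrate_skip_known_units(pulp2_repos, records):
    """Toy version of the bug: a unit that already has any record is skipped entirely."""
    known_units = {unit for _repo, unit in records}
    updated = set(records)
    for repo, units in pulp2_repos.items():
        for unit in units:
            if unit not in known_units:
                updated.add((repo, unit))
    return updated


def premigrate_check_new_repos(pulp2_repos, records):
    """Toy version of the fix: mutable units get one record per repo that contains them."""
    updated = set(records)
    for repo, units in pulp2_repos.items():
        for unit in units:
            updated.add((repo, unit))  # set semantics keep (repo, unit) unique
    return updated


erratum = "RHEA-2012:0057"

# First migration: only test1 exists. Then the erratum is copied to test2.
first_run = premigrate_skip_known_units({"test1": [erratum]}, set())
after_copy = {"test1": [erratum], "test2": [erratum]}

buggy = premigrate_skip_known_units(after_copy, first_run)
fixed = premigrate_check_new_repos(after_copy, first_run)

print("record for test2 with the skip logic:", ("test2", erratum) in buggy)
print("record for test2 with the per-repo check:", ("test2", erratum) in fixed)
```

Running both repos through the skip logic in a single batch from a clean state still produces two records, which matches the observation that resetting the pulp3 db and re-migrating works.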
Updated by dalley over 4 years ago
Agreed that it's the main problem to solve here and it looks to me exactly like the one discussed on irc right before you started digging into it.
Yup, I was just re-summarizing the problem.
Thanks for answering the other questions though, it helps to clear things up a bit.
Added by dalley over 4 years ago
Revision 8a29b6f4 | View on GitHub
Fix a bug where errata were not always migrated for new repositories
If an errata has already been migrated and is then added to a new repository, which is then migrated, the new repository does not contain the affected errata.
Ensure that new Pulp2Content are created when existing content are migrated into new repos for the errata corner case (needs one per each repo).
Updated by dalley over 4 years ago
- Status changed from ASSIGNED to MODIFIED
Applied in changeset pulp:pulp-2to3-migration|8a29b6f4f0f04ae655c518a8ce29937fecd9af27.
Added by dalley over 4 years ago
Revision e52b964c | View on GitHub
Fix a bug where errata were not always migrated for new repositories
If an errata has already been migrated and is then added to a new repository, which is then migrated, the new repository does not contain the affected errata.
Ensure that new Pulp2Content are created when existing content are migrated into new repos for the errata corner case (needs one per each repo).
closes: #7092 https://pulp.plan.io/issues/7092 (cherry picked from commit 8a29b6f4f0f04ae655c518a8ce29937fecd9af27)
Updated by dalley over 4 years ago
Applied in changeset pulp:pulp-2to3-migration|e52b964c62d4607a054f01fbef3584bafcf4898d.
Updated by ttereshc over 4 years ago
- Status changed from MODIFIED to CLOSED - CURRENTRELEASE