https://pulp.plan.io/https://pulp.plan.io/favicon.ico2020-07-06T19:18:18ZPulpMigration Plugin - Issue #7092: re-migration of errata that have now been copied to new repositories are not migrated for those new repositorieshttps://pulp.plan.io/issues/7092?journal_id=591162020-07-06T19:18:18Zjsherril@redhat.comjsherril@redhat.com
<ul></ul><p>The repos I reproduced with are EL8 BaseOS and EL8 AppStream. I actually had two repos in the initial migration and copied their contents to two more. My final migration plan is:</p>
<pre><code>{
    "plugins": [
        {
            "type": "docker",
            "repositories": []
        },
        {
            "type": "iso",
            "repositories": []
        },
        {
            "type": "rpm",
            "repositories": [
                {
                    "name": "02fe9557-8cc2-4965-a314-06845cfc4c9e",
                    "repository_versions": [
                        {
                            "pulp2_repository_id": "02fe9557-8cc2-4965-a314-06845cfc4c9e",
                            "pulp2_distributor_repository_ids": [
                                "02fe9557-8cc2-4965-a314-06845cfc4c9e"
                            ]
                        }
                    ],
                    "pulp2_importer_repository_id": "02fe9557-8cc2-4965-a314-06845cfc4c9e"
                },
                {
                    "name": "99ab9d40-45ea-4d63-b1c4-941ca2c28fc5",
                    "repository_versions": [
                        {
                            "pulp2_repository_id": "99ab9d40-45ea-4d63-b1c4-941ca2c28fc5",
                            "pulp2_distributor_repository_ids": [
                                "99ab9d40-45ea-4d63-b1c4-941ca2c28fc5"
                            ]
                        }
                    ],
                    "pulp2_importer_repository_id": "99ab9d40-45ea-4d63-b1c4-941ca2c28fc5"
                },
                {
                    "name": "cv2-Red_Hat_Enterprise_Linux_8_for_x86_64_-_AppStream_RPMs_8",
                    "repository_versions": [
                        {
                            "pulp2_repository_id": "1-cv2-v1_0-99ab9d40-45ea-4d63-b1c4-941ca2c28fc5",
                            "pulp2_distributor_repository_ids": [
                                "1-cv2-Library-99ab9d40-45ea-4d63-b1c4-941ca2c28fc5",
                                "1-cv2-v1_0-99ab9d40-45ea-4d63-b1c4-941ca2c28fc5"
                            ]
                        }
                    ]
                },
                {
                    "name": "cv2-Red_Hat_Enterprise_Linux_8_for_x86_64_-_BaseOS_RPMs_8",
                    "repository_versions": [
                        {
                            "pulp2_repository_id": "1-cv2-v1_0-02fe9557-8cc2-4965-a314-06845cfc4c9e",
                            "pulp2_distributor_repository_ids": [
                                "1-cv2-Library-02fe9557-8cc2-4965-a314-06845cfc4c9e",
                                "1-cv2-v1_0-02fe9557-8cc2-4965-a314-06845cfc4c9e"
                            ]
                        }
                    ]
                }
            ]
        }
    ]
}
</code></pre> Migration Plugin - Issue #7092: re-migration of errata that have now been copied to new repositories are not migrated for those new repositorieshttps://pulp.plan.io/issues/7092?journal_id=592142020-07-08T20:58:11Zdalleydalley@redhat.com
<ul><li><strong>Status</strong> changed from <i>NEW</i> to <i>ASSIGNED</i></li><li><strong>Assignee</strong> set to <i>dalley</i></li></ul> Migration Plugin - Issue #7092: re-migration of errata that have now been copied to new repositories are not migrated for those new repositorieshttps://pulp.plan.io/issues/7092?journal_id=594652020-07-13T16:42:35Zttereshcttereshc@redhat.com
<ul><li><strong>Sprint/Milestone</strong> set to <i>0.2.0</i></li></ul> Migration Plugin - Issue #7092: re-migration of errata that have now been copied to new repositories are not migrated for those new repositorieshttps://pulp.plan.io/issues/7092?journal_id=594762020-07-14T02:20:18Zdalleydalley@redhat.com
<ul></ul><p>I'm not sure if I did something wrong, but I wasn't immediately able to reproduce this. Here are my two Pulp 2 repos.</p>
<pre><code>[vagrant@pulp2-nightly-pulp3-source-centos7 devel]$ pulp-admin rpm repo content errata --repo-id test1
Description: ParthaBird_Erratum
Id: RHEA-2012:0056
Severity:
Summary:
Title: Bird_Erratum
Type: security
Description: Bear_Erratum
Id: RHEA-2012:0057
Severity:
Summary:
Title: Bear_ErratumPARTHA
Type: security
Description: Gorilla_Erratum
Id: RHEA-2012:0058
Severity:
Summary:
Title: Gorilla_Erratum
Type: enhancement
Description: Sea_Erratum
Id: RHEA-2012:0055
Severity:
Summary:
Title: Sea_Erratum
Type: security
[vagrant@pulp2-nightly-pulp3-source-centos7 devel]$ pulp-admin rpm repo content errata --repo-id test2
Description: Bear_Erratum
Id: RHEA-2012:0057
Severity:
Summary:
Title: Bear_ErratumPARTHA
Type: security
</code></pre>
<p>I created and synced test1, and created test2 (but left it empty and didn't include it in the migration plan). Then I ran the migration, copied the errata, and migrated again with both repos. At the end, I have two repos in Pulp 3, and both have the full set of content they are supposed to have:</p>
<pre><code>...
"present": {
    "rpm.advisory": {
        "count": 4,
        "href": "/pulp/api/v3/content/rpm/advisories/?repository_version=/pulp/api/v3/repositories/rpm/rpm/ece83dc3-b0b9-4ab3-b91c-4588a0ab7d4d/versions/1/"
    },
...
"present": {
    "rpm.advisory": {
        "count": 1,
        "href": "/pulp/api/v3/content/rpm/advisories/?repository_version=/pulp/api/v3/repositories/rpm/rpm/33d543be-e407-46de-a8ed-0503f670b478/versions/1/"
    }
},
</code></pre>
<p>This was with the normal test fixture, so there may be something different about the CentOS 8 repos (and I could also have missed something). Maybe creating the repo early influenced something despite it not being in the migration plan. I'll keep trying tomorrow.</p> Migration Plugin - Issue #7092: re-migration of errata that have now been copied to new repositories are not migrated for those new repositorieshttps://pulp.plan.io/issues/7092?journal_id=595502020-07-15T15:44:30Zjsherril@redhat.comjsherril@redhat.com
<ul></ul><p>The issue I was seeing was that the pulp2content objects for the errata would have blank 'pulp3_repository_version' attributes. I didn't confirm whether they actually ended up in the 'new' repo version.</p> Migration Plugin - Issue #7092: re-migration of errata that have now been copied to new repositories are not migrated for those new repositorieshttps://pulp.plan.io/issues/7092?journal_id=595512020-07-15T16:07:51Zdalleydalley@redhat.com
<ul></ul><p>Strange. So what I see is, the erratum ends up in both repos as expected, and it's not duplicated or anything (content unit has the same href).</p>
<p>Here's what my /pulp2content/ looks like:</p>
<pre><code>  {
    "pulp_href": "/pulp/api/v3/pulp2content/fa0ad407-c842-41ac-a7bf-dbbf96de1a85/",
    "pulp_created": "2020-07-15T15:45:52.141120Z",
    "pulp2_id": "238b6545-d4af-4727-9391-92530070d54f",
    "pulp2_content_type_id": "erratum",
    "pulp2_last_updated": 1594827881,
    "pulp2_storage_path": null,
    "downloaded": false,
    "pulp3_content": "/pulp/api/v3/content/rpm/advisories/676aeb6f-854f-4942-8642-ebd1291468e1/",
    "pulp3_repository_version": "/pulp/api/v3/repositories/rpm/rpm/d995c4d3-7dae-44f7-9ba0-87e6298370ff/versions/1/"
  },
  ...
  ...
  {
    "pulp_href": "/pulp/api/v3/pulp2content/95395826-43c9-407c-919f-34ea343c7fc4/",
    "pulp_created": "2020-07-15T15:45:51.989257Z",
    "pulp2_id": "18e4f45a-8522-47b7-979c-03d842a8f01a",
    "pulp2_content_type_id": "rpm",
    "pulp2_last_updated": 1594827881,
    "pulp2_storage_path": "/var/lib/pulp/content/units/rpm/f8/d7cba1691f3bc7e29a7c8966be06d8418de488f4ffa7f84ad472de6d604e9b/zebra-0.1-2.noarch.rpm",
    "downloaded": false,
    "pulp3_content": "/pulp/api/v3/content/rpm/packages/6347ad7c-8df8-4bbd-9d9b-8f8a54047c6a/"
  },
</code></pre>
<p>A few things stand out:</p>
<ul>
<li>Not all content types provide the same fields here, RPMs don't have a "pulp3_repository_version", which surprises me but I can see why there might be a reason for it</li>
<li>This is "wrong" because the same errata can be in multiple repository versions simultaneously, so it's nonsensical to just have a field for a single repo version</li>
<li>Unless, there are supposed to be multiple entries for the same errata in here per-repo, in which case, there are not</li>
</ul>
<p>I don't have entries with that field empty though.</p> Migration Plugin - Issue #7092: re-migration of errata that have now been copied to new repositories are not migrated for those new repositorieshttps://pulp.plan.io/issues/7092?journal_id=596142020-07-16T20:04:03Zdalleydalley@redhat.com
<ul></ul><p>I'm finding new bugs. Wiping the system and re-migrating gives this:</p>
<pre><code>(pulp) [vagrant@pulp2-nightly-pulp3-source-centos7 devel]$ pulp-admin rpm repo content errata --repo-id test2
Description: Bear_Erratum
Id: RHEA-2012:0057
Severity:
Summary:
Title: Bear_ErratumPARTHA
Type: security
(pulp) [vagrant@pulp2-nightly-pulp3-source-centos7 devel]$ http GET :24817/pulp/api/v3/content/rpm/advisories/?repository_version=/pulp/api/v3/repositories/rpm/rpm/d18c672a-d6e9-4e6a-938f-3cf5a811db06/versions/1/
HTTP/1.1 200 OK
Allow: GET, POST, HEAD, OPTIONS
Connection: close
Content-Length: 1356
Content-Type: application/json
Date: Thu, 16 Jul 2020 20:01:07 GMT
Server: gunicorn/20.0.4
Vary: Accept, Cookie
X-Frame-Options: SAMEORIGIN
{
    "count": 2,
    "next": null,
    "previous": null,
    "results": [
        {
            "description": "Bear_Erratum",
            "fromstr": "errata@redhat.com",
            "id": "RHEA-2012:0057",
            "issued_date": "2013-01-27 16:08:05",
            "pkglist": [
                {
                    "name": "collection-0",
                    "packages": [
                        {
                            "arch": "noarch",
                            "epoch": "0",
                            "filename": "bear-4.1-1.noarch.rpm",
                            "name": "bear",
                            "reboot_suggested": false,
                            "release": "1",
                            "relogin_suggested": false,
                            "restart_suggested": false,
                            "src": "http://www.fedoraproject.org",
                            "sum": "",
                            "sum_type": "",
                            "version": "4.1"
                        }
                    ],
                    "shortname": ""
                }
            ],
            "pulp_created": "2020-07-16T19:55:54.463062Z",
            "pulp_href": "/pulp/api/v3/content/rpm/advisories/0bb6563b-6c6b-437e-b86c-95b75e2e1625/",
            "pushcount": "",
            "reboot_suggested": false,
            "references": [],
            "release": "1",
            "rights": "",
            "severity": "",
            "solution": "",
            "status": "stable",
            "summary": "",
            "title": "Bear_ErratumPARTHA",
            "type": "security",
            "updated_date": "2013-01-27 16:08:05",
            "version": "1"
        },
        {
            "description": "Bear_Erratum",
            "fromstr": "errata@redhat.com",
            "id": "RHEA-2012:0057",
            "issued_date": "2013-01-27 16:08:05",
            "pkglist": [],
            "pulp_created": "2020-07-16T19:55:54.456933Z",
            "pulp_href": "/pulp/api/v3/content/rpm/advisories/930cd7e6-0945-40aa-81c9-cd29c9c056c4/",
            "pushcount": "",
            "reboot_suggested": false,
            "references": [],
            "release": "1",
            "rights": "",
            "severity": "",
            "solution": "",
            "status": "stable",
            "summary": "",
            "title": "Bear_ErratumPARTHA",
            "type": "security",
            "updated_date": "2013-01-27 16:08:05",
            "version": "1"
        }
    ]
}
</code></pre> Migration Plugin - Issue #7092: re-migration of errata that have now been copied to new repositories are not migrated for those new repositorieshttps://pulp.plan.io/issues/7092?journal_id=596212020-07-17T02:45:35Zdalleydalley@redhat.com
<ul></ul><p>^^ Is just because there's no RPM packages in the current repo, so it ends up constructing an entirely different content unit with an empty package list.</p>
<p><a href="https://github.com/pulp/pulp-2to3-migration/blob/master/pulp_2to3_migration/app/plugin/rpm/pulp_2to3_models.py#L357-L367" class="external">https://github.com/pulp/pulp-2to3-migration/blob/master/pulp_2to3_migration/app/plugin/rpm/pulp_2to3_models.py#L357-L367</a></p>
<p>When everything is done in one batch, we get two Pulp2Content and two UpdateRecords (because the pkglist is different, which changes the digest), and somehow both content units end up in the same repository even though only one of them was in the original.</p>
<p>Problems:</p>
<ul>
<li>Two advisories in the pulp3 repo, identical except for the package list. It should only be one.</li>
<li>Probably it is simply wrong to be looking at RPMs in the repo to reconstruct the pkglist, because it causes issues like this where the pkglist is actually modified. This causes 2 content units to be created by accident.</li>
</ul>
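<p>To illustrate why a different pkglist yields a second content unit, here is a minimal sketch. It assumes the digest is a hash over all advisory fields including the pkglist; the function <code>advisory_digest</code> and the exact hashing scheme are hypothetical and are not taken from the plugin's code.</p>

```python
import hashlib
import json


def advisory_digest(advisory: dict) -> str:
    """Hash every field of the advisory, pkglist included (hypothetical scheme)."""
    canonical = json.dumps(advisory, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Fields shared by both variants of the erratum seen in the API response.
base = {"id": "RHEA-2012:0057", "title": "Bear_ErratumPARTHA", "type": "security"}
bear = {"name": "bear", "version": "4.1", "release": "1", "arch": "noarch"}

# Same erratum, once with its package and once with an empty pkglist.
with_packages = dict(base, pkglist=[{"name": "collection-0", "packages": [bear]}])
without_packages = dict(base, pkglist=[])

# Two distinct digests means two distinct content units.
digests = {advisory_digest(with_packages), advisory_digest(without_packages)}
print(len(digests))
```

<p>Under this assumption, any scheme that includes the pkglist in the digest will treat the two variants as unrelated content, which matches the two advisories seen above.</p>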
<p>@Justin I'm curious if you can reproduce this? Extra content seems easy to overlook compared to missing content.</p>
<p>When it's done in two steps as it was in the description, what happens is that it doesn't create a new pulp2content, and because it does this it doesn't try to create a new pulp3content from the pulp2content either (probably a good thing since it would be a wrong one if it did), but the content unit ends up associated with the repository anyway.</p>
<p>Problems:</p>
<ul>
<li>No new pulp2content is created</li>
</ul>
<p>All of this needs to be fixed in this issue: if we fix only the pulp2content problem, it will immediately start hitting the problem with multiple content units, and that is, if anything, far more broken.</p> Migration Plugin - Issue #7092: re-migration of errata that have now been copied to new repositories are not migrated for those new repositorieshttps://pulp.plan.io/issues/7092?journal_id=596282020-07-17T10:26:27Zttereshcttereshc@redhat.com
<ul></ul><blockquote>
<p>A few things stand out:</p>
<p>Not all content types provide the same fields here, RPMs don't have a "pulp3_repository_version", which surprises me but I can see why there might be a reason for it</p>
</blockquote>
<p>This is for advisories only (and it will be the same for distribution trees).
One Pulp 2 erratum is migrated into multiple advisories in Pulp 3, hence the repo version reference.
For consistency, we can show an empty value for RPMs in the API.</p>
<blockquote>
<p>This is "wrong" because the same errata can be in multiple repository versions simultaneously, so it's nonsensical to just have a field for a single repo version.
Unless, there are supposed to be multiple entries for the same errata in here per-repo, in which case, there are not</p>
</blockquote>
<p>Yes, that's the case. If you look at pulp2content, the unique constraint is pulp2repo + unit_id.
If both errata end up in one Pulp 3 repo version (which only happens when two Pulp 2 repos are exactly the same), one will see "duplicated" entries in the API response.</p>
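<p>A rough sketch of that per-repository keying, and of what a re-migration would need to detect when an erratum is copied into a new repo. The class <code>Pulp2ContentKey</code>, the function <code>missing_entries</code>, the field names and the unit ID are all hypothetical and only model the pulp2repo + unit_id constraint; they are not the plugin's models.</p>

```python
from typing import Dict, List, NamedTuple, Set


class Pulp2ContentKey(NamedTuple):
    """Hypothetical stand-in for the (pulp2 repo, unit id) unique constraint."""
    pulp2_repo_id: str
    pulp2_unit_id: str


def missing_entries(
    existing: Set[Pulp2ContentKey], repo_units: Dict[str, List[str]]
) -> Set[Pulp2ContentKey]:
    """Return (repo, unit) pairs present in Pulp 2 that have no entry yet."""
    wanted = {
        Pulp2ContentKey(repo_id, unit_id)
        for repo_id, unit_ids in repo_units.items()
        for unit_id in unit_ids
    }
    return wanted - existing


# The first migration covered only test1; the erratum was later copied to test2.
existing = {Pulp2ContentKey("test1", "erratum-unit-1")}
repo_units = {"test1": ["erratum-unit-1"], "test2": ["erratum-unit-1"]}

new_pairs = missing_entries(existing, repo_units)
print(new_pairs)
```

<p>Keying on the unit ID alone would report nothing new here, which is one way the copied erratum could be skipped on re-migration.</p>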
<blockquote>
<p>Is just because there's no RPM packages in the current repo, so it ends up constructing an entirely different content unit with an empty package list.</p>
</blockquote>
<p>It's correct that it creates an erratum with an empty pkglist. That's exactly how Pulp 2 would publish it. There is no way for us to identify which package list should be used for a repository without looking at the packages. We can't create one advisory with all the pkglists available in Pulp 2: it would break clients, users would need to do a mirror sync to fix that, and it's a bad upgrade experience. By looking at the RPMs, they get exactly what they had in Pulp 2. Even if it's wrong, it's the same "wrong": there is no surprise and no clients are broken. Everything will be right after the next sync, if it ever happens.</p>
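<p>As an illustration of deriving a per-repository pkglist from the RPMs actually present, here is a small sketch. The function <code>filter_pkglist</code> is hypothetical, and matching packages by filename is an assumption made for the example; the plugin may match on other fields.</p>

```python
from typing import Dict, List, Set


def filter_pkglist(pkglist: List[Dict], repo_filenames: Set[str]) -> List[Dict]:
    """Keep only the packages that exist in the repository (hypothetical helper)."""
    filtered = []
    for collection in pkglist:
        packages = [
            pkg for pkg in collection["packages"] if pkg["filename"] in repo_filenames
        ]
        # A collection with no remaining packages is dropped entirely.
        if packages:
            filtered.append({**collection, "packages": packages})
    return filtered


pkglist = [
    {
        "name": "collection-0",
        "packages": [{"name": "bear", "filename": "bear-4.1-1.noarch.rpm"}],
    }
]

# A repo that contains the RPM keeps the full pkglist.
in_synced_repo = filter_pkglist(pkglist, {"bear-4.1-1.noarch.rpm"})
# A repo holding only the copied erratum and no RPMs ends up with an empty one.
in_empty_repo = filter_pkglist(pkglist, set())
print(in_synced_repo, in_empty_repo)
```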
<p>What is not correct is that they end up in the same repo version.</p>
<blockquote>
<p><a href="https://github.com/pulp/pulp-2to3-migration/blob/master/pulp_2to3_migration/app/plugin/rpm/pulp_2to3_models.py#L357-L367" class="external">https://github.com/pulp/pulp-2to3-migration/blob/master/pulp_2to3_migration/app/plugin/rpm/pulp_2to3_models.py#L357-L367</a></p>
<p>When everything is done in one batch, we get two Pulp2Content and two UpdateRecords (because the pkglist is different, which changes the digest), and somehow both content units end up in the same repository even though only one of them was in the original.</p>
<p>Problems:</p>
<p>Two advisories in the pulp3 repo, identical except for the package list. It should only be one.</p>
</blockquote>
<p>Agreed</p>
<blockquote>
<p>Probably it is simply wrong to be looking at RPMs in the repo to reconstruct the pkglist, because it causes issues like this where the pkglist is actually modified.</p>
</blockquote>
<p>That's correct behaviour.</p>
<blockquote>
<p>This causes 2 content units to be created by accident.</p>
</blockquote>
<p>The root cause of the problem is likely different. I'd check resolve_advisories in the RPM plugin, which is called during finalization of the repo version creation.</p>
<blockquote>
<p>When it's done in two steps as it was in the description, what happens is that it doesn't create a new pulp2content, and because it does this it doesn't try to create a new pulp3content from the pulp2content either (probably a good thing since it would be a wrong one if it did), but the content unit ends up associated with the repository anyway.</p>
</blockquote>
<p>Agreed that it's the main problem to solve here and it looks to me exactly like the one discussed on irc right before you started digging into it.</p>
<blockquote>
<blockquote>
<blockquote>
<p>"so I <em>think</em> we need to find a way for mutable types to check whether there are new repos containing them"</p>
</blockquote>
</blockquote>
</blockquote> Migration Plugin - Issue #7092: re-migration of errata that have now been copied to new repositories are not migrated for those new repositorieshttps://pulp.plan.io/issues/7092?journal_id=596372020-07-17T12:21:32Zdalleydalley@redhat.com
<ul></ul><blockquote>
<p>Agreed that it's the main problem to solve here and it looks to me exactly like the one discussed on irc right before you started digging into it.</p>
</blockquote>
<p>Yup, I was just re-summarizing the problem.</p>
<p>Thanks for answering the other questions though, it helps to clear things up a bit.</p> Migration Plugin - Issue #7092: re-migration of errata that have now been copied to new repositories are not migrated for those new repositorieshttps://pulp.plan.io/issues/7092?journal_id=597412020-07-21T18:46:21Zdalleydalley@redhat.com
<ul><li><strong>Status</strong> changed from <i>ASSIGNED</i> to <i>MODIFIED</i></li></ul><p>Applied in changeset <a class="changeset" title="Fix a bug where errata were not always migrated for new repositories If an errata has already be..." href="https://pulp.plan.io/projects/pulp/repository/pulp-2to3-migration/revisions/8a29b6f4f0f04ae655c518a8ce29937fecd9af27">pulp:pulp-2to3-migration|8a29b6f4f0f04ae655c518a8ce29937fecd9af27</a>.</p> Migration Plugin - Issue #7092: re-migration of errata that have now been copied to new repositories are not migrated for those new repositorieshttps://pulp.plan.io/issues/7092?journal_id=598822020-07-24T11:20:30Zdalleydalley@redhat.com
<ul></ul><p>Applied in changeset <a class="changeset" title="Fix a bug where errata were not always migrated for new repositories If an errata has already be..." href="https://pulp.plan.io/projects/pulp/repository/pulp-2to3-migration/revisions/e52b964c62d4607a054f01fbef3584bafcf4898d">pulp:pulp-2to3-migration|e52b964c62d4607a054f01fbef3584bafcf4898d</a>.</p> Migration Plugin - Issue #7092: re-migration of errata that have now been copied to new repositories are not migrated for those new repositorieshttps://pulp.plan.io/issues/7092?journal_id=612322020-08-20T18:06:40Zttereshcttereshc@redhat.com
<ul><li><strong>Status</strong> changed from <i>MODIFIED</i> to <i>CLOSED - CURRENTRELEASE</i></li></ul>