Issue #9583
closed
Distribution tree uniqueness constraint is not enough for a suboptimal .treeinfo
Description
Ticket moved to GitHub: "pulp/pulp_rpm/2305":https://github.com/pulp/pulp_rpm/issues/2305
Currently Pulp requires the combination of the following fields to be unique:
unique_together = (
"header_version",
"release_name",
"release_short",
"release_version",
"arch",
"build_timestamp",
)
In some cases this does not seem to be enough. For some reason, multiple repositories can have all of those fields exactly the same and differ only in their variants definition. These are not proper distribution trees; the majority do not have any images associated.
Examples provided by the always helpful @gdve in https://pulp.plan.io/issues/8566#note-33:
CentOS/8-stream/AppStream/x86_64/os/.treeinfo:build_timestamp = 1625615144
CentOS/8-stream/BaseOS/x86_64/os/.treeinfo:build_timestamp = 1625615155
CentOS/8-stream/HighAvailability/x86_64/os/.treeinfo:build_timestamp = 1625026406
CentOS/8-stream/PowerTools/x86_64/os/.treeinfo:build_timestamp = 1625026406
CentOS/8-stream/RT/x86_64/os/.treeinfo:build_timestamp = 1625026406
AlmaLinux/8.4/AppStream/x86_64/kickstart/.treeinfo:build_timestamp = 1622014553
AlmaLinux/8.4/AppStream/x86_64/os/.treeinfo:build_timestamp = 1622014553
AlmaLinux/8.4/BaseOS/x86_64/kickstart/.treeinfo:build_timestamp = 1622014553
AlmaLinux/8.4/BaseOS/x86_64/os/.treeinfo:build_timestamp = 1622014553
AlmaLinux/8.4/HighAvailability/x86_64/kickstart/.treeinfo:build_timestamp = 1622014558
AlmaLinux/8.4/HighAvailability/x86_64/os/.treeinfo:build_timestamp = 1622014558
AlmaLinux/8.4/PowerTools/x86_64/kickstart/.treeinfo:build_timestamp = 1622014558
AlmaLinux/8.4/PowerTools/x86_64/os/.treeinfo:build_timestamp = 1622014558
AlmaLinux/8.4/extras/x86_64/kickstart/.treeinfo:build_timestamp = 1622014558
AlmaLinux/8.4/extras/x86_64/os/.treeinfo:build_timestamp = 1622014558
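To make the overlap in the listing above easier to see, here is a minimal sketch that groups the lines by distro, release, arch and build_timestamp. It assumes that the other constrained fields (header_version, release_name, release_short, release_version) are identical within one release, which the listing itself does not show. The function name colliding_variants is made up for this illustration.

```python
from collections import defaultdict

# Lines copied from the listing above (path:build_timestamp = value).
LISTING = """\
CentOS/8-stream/AppStream/x86_64/os/.treeinfo:build_timestamp = 1625615144
CentOS/8-stream/BaseOS/x86_64/os/.treeinfo:build_timestamp = 1625615155
CentOS/8-stream/HighAvailability/x86_64/os/.treeinfo:build_timestamp = 1625026406
CentOS/8-stream/PowerTools/x86_64/os/.treeinfo:build_timestamp = 1625026406
CentOS/8-stream/RT/x86_64/os/.treeinfo:build_timestamp = 1625026406
AlmaLinux/8.4/AppStream/x86_64/os/.treeinfo:build_timestamp = 1622014553
AlmaLinux/8.4/BaseOS/x86_64/os/.treeinfo:build_timestamp = 1622014553
AlmaLinux/8.4/HighAvailability/x86_64/os/.treeinfo:build_timestamp = 1622014558
AlmaLinux/8.4/PowerTools/x86_64/os/.treeinfo:build_timestamp = 1622014558
AlmaLinux/8.4/extras/x86_64/os/.treeinfo:build_timestamp = 1622014558
"""


def colliding_variants(listing):
    """Return {(distro, release, arch, timestamp): [variants]} for shared timestamps."""
    groups = defaultdict(set)
    for line in listing.splitlines():
        if not line.strip():
            continue
        path, _, value = line.partition(":build_timestamp = ")
        distro, release, variant, arch = path.split("/")[:4]
        groups[(distro, release, arch, int(value))].add(variant)
    # Only groups with more than one variant are potential conflicts.
    return {key: sorted(names) for key, names in groups.items() if len(names) > 1}


collisions = colliding_variants(LISTING)
for key, names in collisions.items():
    print(key, names)
```

Under that assumption, every group printed here is a set of repositories that would map to the same DistributionTree.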
Pulp 2to3 migration fails with No declared artifact with relative path ".treeinfo" for content "<DistributionTree: pk=64f44866-0207-4005-9c06-0f45e52cbdd1>".
I would expect sync to behave similarly; this needs testing, though.
Related issues
Updated by ttereshc about 3 years ago
- Subject changed from Sub repos uniqueness constraint is not enough to Distribution tree uniqueness constraint is not enough for a suboptimal .treeinfo
Updated by ttereshc about 3 years ago
- Related to Issue #8566: Content Migration to Pulp 3 with Katello fails (similar to #8377) added
Updated by ttereshc about 3 years ago
Before trying to fix it, I suggest approaching the CentOS folks and asking about the use case for such stripped .treeinfo files. Do they merge them all at some point?
It would also be good to check with the productmd folks whether this is a valid use according to the specs.
Updated by quba42 about 3 years ago
We have a test system where 2to3 migration fails with:
No declared artifact with relative path \"images/boot.iso\" for content \"<DistributionTree: pk=f4651a50-5f7a-49a9-8fe5-3247a82362f1>\"
Is it meaningful that this fails on images/boot.iso and not on .treeinfo? (I know nothing about distribution trees...)
Updated by quba42 about 3 years ago
I queried for the DistributionTree that is throwing the error, and found that it comes from an AlmaLinux 8 repo. (Just posting this in case that is useful information.)
Updated by dalley about 3 years ago
- Priority changed from Normal to High
- Triaged changed from No to Yes
- Sprint set to Sprint 111
Updated by quba42 about 3 years ago
Updated by ttereshc about 3 years ago
- Status changed from NEW to ASSIGNED
- Assignee set to ttereshc
Updated by ttereshc about 3 years ago
I believe the root cause is as described: the uniqueness constraint is not enough for CentOS and AlmaLinux repos.
It does not work because of these 2 factors:
- each repo that is supposed to be an addon or variant is a standalone repo with its own partial .treeinfo.
- the .treeinfo files are sometimes generated in the same second, so the build_timestamp does not help in such cases.
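A minimal sketch of how these two factors combine. The .treeinfo contents below are illustrative and not copied from the mirrors, and the mapping of the constrained fields to the [header], [release] and [tree] sections is an assumption about the file layout. The plain dict only stands in for a unique index; it is not how Pulp stores content.

```python
import configparser

# Illustrative partial .treeinfo (hypothetical contents, no images section).
HA_TREEINFO = """\
[header]
type = productmd.treeinfo
version = 1.2

[release]
name = CentOS Stream
short = CentOS
version = 8

[tree]
arch = x86_64
build_timestamp = 1625026406
variants = HighAvailability
"""

# Same release, same second, different variant.
POWERTOOLS_TREEINFO = HA_TREEINFO.replace("HighAvailability", "PowerTools")


def parse(text):
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return parser


def natural_key(text):
    """Build the tuple of fields listed in unique_together (assumed mapping)."""
    p = parse(text)
    return (
        p["header"]["version"],
        p["release"]["name"],
        p["release"]["short"],
        p["release"]["version"],
        p["tree"]["arch"],
        p["tree"]["build_timestamp"],
    )


def variants(text):
    return parse(text)["tree"]["variants"]


store = {}
for text in (HA_TREEINFO, POWERTOOLS_TREEINFO):
    # Both files produce the same key, so the second write replaces the first.
    store[natural_key(text)] = variants(text)

print(len(store), store)
```

Only one entry survives, and it carries the variants of whichever file was processed last.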
As of yesterday/today, these repos have conflicts and I experimented with them:
- CentOS8 HA and CentOS8 Power Tools (both are partial .treeinfos, i.e. no images associated or present in metadata)
http://mirror.centos.org/centos/8-stream/HighAvailability/x86_64/os/.treeinfo
http://mirror.centos.org/centos/8-stream/PowerTools/x86_64/os/.treeinfo
- Alma8 BaseOS and Alma8 AppStream (BaseOS is the one with images, AppStream has a partial .treeinfo)
https://repo.almalinux.org/almalinux/8/BaseOS/x86_64/os/.treeinfo
https://repo.almalinux.org/almalinux/8/AppStream/x86_64/os/.treeinfo
On the main branch (basically, current pulpcore 3.17 and pulp_rpm 3.16):
- I synced those repositories and in different order.
- CentOS8 HA and CentOS8 Power Tools both get the .treeinfo from the one which is synced last, and no error(!) :(
- same for Alma8 BaseOS and Alma8 AppStream
On the pulpcore 3.7 and pulp_rpm 3.11 (they correspond to Katello 3.18):
- I synced those repositories and in different order.
- CentOS8 HA and CentOS8 Power Tools both get merged into one DistributionTree object but have the .treeinfo from the one which is synced last, and no error(!) :(
- if I sync Alma8 AppStream first, and then Alma8 BaseOS, I get
'DistributionTree' object has no attribute 'filename'
With the same repos synced in pulp2 and pulp-2to3-migration run afterwards (order is not controlled):
- CentOS8 HA and CentOS8 Power Tools get merged into one DistributionTree object but have the .treeinfo from the one which is migrated last, and no error(!) :(
- Alma8 BaseOS and Alma8 AppStream gave me
No declared artifact with relative path \"images/boot.iso\" for content \"<DistributionTree: pk=f4651a50-5f7a-49a9-8fe5-3247a82362f1>\"
I could not reproduce the No declared artifact with relative path ".treeinfo" one, but I believe that the root cause is the same and that it's a result of a different order and/or something being removed in between runs.
I'm preparing a patch for the main branch with a database migration. This is not backportable and won't help those on pulp_rpm 3.11.
I'm trying to think of some solution for Katello users.
So far I have only a very hacky idea:
- Check the variants or the whole .treeinfo and if it differs, tweak the build_timestamp, basically force the timestamps to differ, so they become 2 different objects in the database.
We do not preserve the original timestamp but everything else works.
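A rough sketch of that idea, not the actual patch: the registry dict, the function name register_tree and the one-second increment are all hypothetical choices made for this illustration. It bumps the timestamp until the key is either free or already holds the same variants.

```python
def register_tree(registry, key, variants):
    """Store variants under key, bumping build_timestamp (last element) on conflict.

    registry maps a key tuple to a frozenset of variant names.
    Returns the key that was actually used.
    """
    *rest, timestamp = key
    variants = frozenset(variants)
    while True:
        candidate = (*rest, timestamp)
        existing = registry.get(candidate)
        if existing is None:
            # Free slot: this tree becomes its own object.
            registry[candidate] = variants
            return candidate
        if existing == variants:
            # Same tree seen again: reuse the existing object.
            return candidate
        # Same key but different variants: force the timestamps to differ.
        timestamp += 1


registry = {}
base = ("1.2", "CentOS Stream", "CentOS", "8", "x86_64", 1625026406)
ha_key = register_tree(registry, base, {"HighAvailability"})
pt_key = register_tree(registry, base, {"PowerTools"})
print(ha_key[-1], pt_key[-1], len(registry))
```

Note that in this sketch the tweaked timestamp depends on the order in which the trees are seen, so two systems syncing in a different order would end up with different values.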
Any thoughts or ideas are welcome.
Updated by pulpbot about 3 years ago
- Status changed from ASSIGNED to POST
*** WARNING ***
DO NOT try to PATCH your system with these changes. This fix contains a database migration.
It's hard to revert the changes. You will BREAK YOUR UPGRADE PATH if you use this patch.
***************
Updated by fao89 about 3 years ago
- Description updated (diff)
- Status changed from POST to CLOSED - DUPLICATE