Issue #9583

closed

Distribution tree uniqueness constraint is not enough for a suboptimal .treeinfo

Added by ttereshc over 2 years ago. Updated over 2 years ago.

Status:
CLOSED - DUPLICATE
Priority:
High
Assignee:
Sprint/Milestone:
-
Start date:
Due date:
Estimated time:
Severity:
2. Medium
Version:
Platform Release:
OS:
Triaged:
Yes
Groomed:
No
Sprint Candidate:
No
Tags:
Sprint:
Sprint 111
Quarter:

Description

Ticket moved to GitHub: "pulp/pulp_rpm/2305":https://github.com/pulp/pulp_rpm/issues/2305


Currently Pulp requires the combination of the following fields to be unique:

        unique_together = (
            "header_version",
            "release_name",
            "release_short",
            "release_version",
            "arch",
            "build_timestamp",
        )

In some cases this is not enough. Multiple repositories can have all of those fields exactly the same and differ only in their variants definition. These are not proper distribution trees; the majority do not have any images associated.

Examples provided by the always helpful @gdve, from https://pulp.plan.io/issues/8566#note-33:

CentOS/8-stream/AppStream/x86_64/os/.treeinfo:build_timestamp = 1625615144
CentOS/8-stream/BaseOS/x86_64/os/.treeinfo:build_timestamp = 1625615155
CentOS/8-stream/HighAvailability/x86_64/os/.treeinfo:build_timestamp = 1625026406
CentOS/8-stream/PowerTools/x86_64/os/.treeinfo:build_timestamp = 1625026406
CentOS/8-stream/RT/x86_64/os/.treeinfo:build_timestamp = 1625026406
AlmaLinux/8.4/AppStream/x86_64/kickstart/.treeinfo:build_timestamp = 1622014553
AlmaLinux/8.4/AppStream/x86_64/os/.treeinfo:build_timestamp = 1622014553
AlmaLinux/8.4/BaseOS/x86_64/kickstart/.treeinfo:build_timestamp = 1622014553
AlmaLinux/8.4/BaseOS/x86_64/os/.treeinfo:build_timestamp = 1622014553
AlmaLinux/8.4/HighAvailability/x86_64/kickstart/.treeinfo:build_timestamp = 1622014558
AlmaLinux/8.4/HighAvailability/x86_64/os/.treeinfo:build_timestamp = 1622014558
AlmaLinux/8.4/PowerTools/x86_64/kickstart/.treeinfo:build_timestamp = 1622014558
AlmaLinux/8.4/PowerTools/x86_64/os/.treeinfo:build_timestamp = 1622014558
AlmaLinux/8.4/extras/x86_64/kickstart/.treeinfo:build_timestamp = 1622014558
AlmaLinux/8.4/extras/x86_64/os/.treeinfo:build_timestamp = 1622014558
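
To make the collision concrete, here is a minimal sketch that reads the six constrained fields out of two .treeinfo files that differ only in their variants. It assumes the usual productmd INI layout (`[header]`, `[release]`, `[tree]` sections). The sample file content is hypothetical except for the build_timestamp, which is taken from the listing above, and `uniqueness_key` is an illustrative helper, not Pulp code.

```python
import configparser


def uniqueness_key(treeinfo_text):
    """Return the six fields covered by unique_together, read from .treeinfo text."""
    parser = configparser.ConfigParser()
    parser.read_string(treeinfo_text)
    return (
        parser.get("header", "version"),
        parser.get("release", "name"),
        parser.get("release", "short"),
        parser.get("release", "version"),
        parser.get("tree", "arch"),
        parser.get("tree", "build_timestamp"),
    )


# Hypothetical partial .treeinfo; only build_timestamp comes from the listing above.
SAMPLE = """\
[header]
version = 1.2

[release]
name = CentOS Stream
short = CentOS-Stream
version = 8-stream

[tree]
arch = x86_64
build_timestamp = 1625026406
variants = {variant}
"""

ha_key = uniqueness_key(SAMPLE.format(variant="HighAvailability"))
powertools_key = uniqueness_key(SAMPLE.format(variant="PowerTools"))

# The variants differ, but the constrained fields do not, so the keys are equal.
keys_collide = ha_key == powertools_key
```

Because `variants` is not part of the key, both files map to the same database row under the current constraint.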

Pulp 2to3 migration fails with: No declared artifact with relative path ".treeinfo" for content "<DistributionTree: pk=64f44866-0207-4005-9c06-0f45e52cbdd1>". I would expect sync to behave similarly, but that needs testing.


Related issues

Related to Migration Plugin - Issue #8566: Content Migration to Pulp 3 with Katello fails (similar to #8377) (CLOSED - CURRENTRELEASE, ttereshc)
Actions #1

Updated by ttereshc over 2 years ago

  • Subject changed from Sub repos uniqueness constraint is not enough to Distribution tree uniqueness constraint is not enough for a suboptimal .treeinfo
Actions #2

Updated by ttereshc over 2 years ago

  • Related to Issue #8566: Content Migration to Pulp 3 with Katello fails (similar to #8377) added
Actions #3

Updated by ttereshc over 2 years ago

  • Description updated (diff)
Actions #4

Updated by ttereshc over 2 years ago

Before trying to fix it, I suggest approaching the CentOS folks and asking about the use case for such stripped .treeinfo files. Do they merge them all at some point?
It would also be good to check with the productmd folks whether this is a valid use according to the specs.

Actions #5

Updated by quba42 over 2 years ago

We have a test system where 2to3 migration fails with:

No declared artifact with relative path \"images/boot.iso\" for content \"<DistributionTree: pk=f4651a50-5f7a-49a9-8fe5-3247a82362f1>\"

Is it meaningful that this fails on images/boot.iso and not on .treeinfo? (I know nothing about distribution trees...)

Actions #6

Updated by quba42 over 2 years ago

I queried for the DistributionTree that is throwing the error, and found that it comes from an AlmaLinux 8 repo. (Just posting this in case that is useful information.)

Actions #7

Updated by dalley over 2 years ago

  • Priority changed from Normal to High
  • Triaged changed from No to Yes
  • Sprint set to Sprint 111
Actions #9

Updated by ttereshc over 2 years ago

  • Status changed from NEW to ASSIGNED
  • Assignee set to ttereshc
Actions #10

Updated by ttereshc over 2 years ago

I believe the root cause is as described: the uniqueness constraint is not enough for CentOS and AlmaLinux repos.
It does not work because of these 2 factors:

  • each repo which is supposed to be an addon or variant is a standalone repo which has its own partial .treeinfo.
  • the .treeinfo files are sometimes generated at the same second, so the build_timestamp does not help in such cases.

As of yesterday/today, these repos have conflicts and I experimented with them:

On the main branch (basically, current pulpcore 3.17 and pulp_rpm 3.16):

  • I synced those repositories in different orders.
    • CentOS8 HA and CentOS8 Power Tools both get the .treeinfo from whichever one is synced last, and no error(!) :(
    • same for Alma8 BaseOS and Alma8 AppStream

On the pulpcore 3.7 and pulp_rpm 3.11 (they correspond to Katello 3.18):

  • I synced those repositories in different orders.
    • CentOS8 HA and CentOS8 Power Tools both get merged into one DistributionTree object but have the .treeinfo from whichever one is synced last, and no error(!) :(
    • if I sync Alma8 AppStream first, and then Alma8 BaseOS, I get "'DistributionTree' object has no attribute 'filename'"
  • Synced the same repos in pulp2 and ran pulp-2to3-migration (order is not controlled)
    • CentOS8 HA and CentOS8 Power Tools get merged into one DistributionTree object but have the .treeinfo from whichever one is migrated last, and no error(!) :(
    • Alma8 BaseOS and Alma8 AppStream gave me No declared artifact with relative path \"images/boot.iso\" for content \"<DistributionTree: pk=f4651a50-5f7a-49a9-8fe5-3247a82362f1>\"

I could not reproduce the No declared artifact with relative path ".treeinfo" error, but I believe the root cause is the same and that it results from a different order and/or something being removed between runs.

I'm preparing a patch for the main branch with a database migration. This is not backportable and won't help those on pulp_rpm 3.11.

I'm trying to think of some solution for Katello users.
So far I have only a very hacky idea:

  • Check the variants or the whole .treeinfo and, if it differs, tweak the build_timestamp, i.e. force the timestamps to differ so that they become 2 different objects in the database.
    We do not preserve the original timestamp, but everything else works.
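
A rough sketch of that idea, outside of any Pulp code: compare a digest of the whole .treeinfo against what is already stored under the same timestamp, and bump the timestamp until it no longer clashes. The function name, the `existing` mapping, and the one-second increment are all assumptions made for illustration; the other key fields are taken to be equal already.

```python
import hashlib


def disambiguate_timestamp(build_timestamp, treeinfo_bytes, existing):
    """Pick a build_timestamp that does not clash with a different .treeinfo.

    `existing` maps build_timestamp -> sha256 hex digest of the .treeinfo
    already stored under that timestamp (hypothetical bookkeeping).
    Returns (timestamp_to_use, digest_of_this_treeinfo).
    """
    digest = hashlib.sha256(treeinfo_bytes).hexdigest()
    timestamp = build_timestamp
    # Same timestamp with the same content is the same tree: keep it.
    # Same timestamp with different content is a collision: shift by one second.
    while timestamp in existing and existing[timestamp] != digest:
        timestamp += 1
    return timestamp, digest


# Hypothetical content standing in for two partial .treeinfo files.
baseos = b"variants = BaseOS\n"
appstream = b"variants = AppStream\n"

stored = {}
ts_baseos, digest_baseos = disambiguate_timestamp(1622014553, baseos, stored)
stored[ts_baseos] = digest_baseos

ts_appstream, digest_appstream = disambiguate_timestamp(1622014553, appstream, stored)
stored[ts_appstream] = digest_appstream
```

With this approach the second tree ends up one second later than its real build time, which is the loss of the original timestamp mentioned above; re-processing an identical .treeinfo keeps its timestamp unchanged.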

Any thoughts or ideas are welcome.

Actions #12

Updated by pulpbot over 2 years ago

  • Status changed from ASSIGNED to POST
*** WARNING ***  
DO NOT try to PATCH your system with these changes. This fix contains a database migration.  
It's hard to revert the changes. You will BREAK YOUR UPGRADE PATH if you use this patch.  
*************** 

PR: https://github.com/pulp/pulp_rpm/pull/2202

Actions #13

Updated by fao89 over 2 years ago

  • Description updated (diff)
  • Status changed from POST to CLOSED - DUPLICATE
