Issue #8622
closed
CentOS 8 BaseOS .treeinfo's [general] section says AppStream instead of BaseOS
Description
I synced CentOS 8 BaseOS today and noticed that the .treeinfo's [general] section wasn't right:
[checksums]
images/boot.iso = sha256:2b801bc5801816d0cf27fc74552cf058951c42c7b72b1fe313429b1070c3876c
images/efiboot.img = sha256:3b400994d93956d1a3c59d3c7fbefe1a73e287b2b2b0dedea7d16ef5042cb3c9
images/install.img = sha256:60115a4c57495f4ae4fa2a8b169523267b07b47b8333c9189a4586f54a8cc29f
images/pxeboot/initrd.img = sha256:31cb34ff174707a7846483a81579a124c0beea4cc1030f1884103a9a0622de47
images/pxeboot/vmlinuz = sha256:40fa3404fa9686065d95b9dc2caa97a08680b7e8566baa2c3f09ff48fd660d48

[general]
; WARNING.0 = This section provides compatibility with pre-productmd treeinfos.
; WARNING.1 = Read productmd documentation for details about new format.
arch = x86_64
family = CentOS Linux
name = CentOS Linux 8
packagedir = AppStream/Packages
platforms = x86_64,xen
repository = AppStream
timestamp = 1605735523
variant = AppStream
variants = AppStream,BaseOS
version = 8

[header]
type = productmd.treeinfo
version = 1.2

[images-x86_64]
boot.iso = images/boot.iso
efiboot.img = images/efiboot.img
initrd = images/pxeboot/initrd.img
kernel = images/pxeboot/vmlinuz

[images-xen]
initrd = images/pxeboot/initrd.img
kernel = images/pxeboot/vmlinuz

[release]
name = CentOS Linux
short = CentOS
version = 8

[stage2]
mainimage = images/install.img

[tree]
arch = x86_64
build_timestamp = 1605735523
platforms = x86_64,xen
variants = AppStream,BaseOS

[variant-AppStream]
id = AppStream
name = AppStream
packages = AppStream/Packages
repository = AppStream
type = variant
uid = AppStream

[variant-BaseOS]
id = BaseOS
name = BaseOS
packages = Packages
repository = .
type = variant
uid = BaseOS
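For anyone who wants to check their own published .treeinfo for this symptom, here is a minimal sketch using Python's standard configparser. It assumes that the [general] section is supposed to mirror the [variant-BaseOS] section when the BaseOS tree is being served. The function name general_mismatches and the trimmed SAMPLE text are illustrative only and are not part of Pulp or productmd.

```python
import configparser

# Trimmed-down copy of the relevant sections from the .treeinfo shown above.
SAMPLE = """\
[general]
packagedir = AppStream/Packages
repository = AppStream
variant = AppStream
variants = AppStream,BaseOS

[variant-AppStream]
id = AppStream
packages = AppStream/Packages
repository = AppStream

[variant-BaseOS]
id = BaseOS
packages = Packages
repository = .
"""


def general_mismatches(treeinfo_text, expected_variant):
    """Return {key: (found, wanted)} for [general] keys that disagree
    with the section of the expected variant.

    Assumption: [general] should describe the same variant as
    [variant-<expected_variant>].
    """
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep keys case-sensitive
    parser.read_string(treeinfo_text)

    general = parser["general"]
    variant = parser[f"variant-{expected_variant}"]
    wanted = {
        "variant": expected_variant,
        "repository": variant.get("repository"),
        "packagedir": variant.get("packages"),
    }
    return {
        key: (general.get(key), value)
        for key, value in wanted.items()
        if general.get(key) != value
    }


if __name__ == "__main__":
    for key, (found, want) in general_mismatches(SAMPLE, "BaseOS").items():
        print(f"[general] {key}: found {found!r}, expected {want!r}")
```

An empty result means the [general] section agrees with the chosen variant section.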
In case it isn't always reproducible, I also synced the CentOS 8 BaseOS kickstart, AppStream OS, and AppStream kickstart repos.
Versions:
pulp-certguard (1.1.0)
pulp-container (2.2.1)
pulp-deb (2.8.0)
pulp-file (1.5.0)
pulp-rpm (3.10.0)
pulpcore (3.9.1)
Additionally (moved from #9123):
When .treeinfo contains relative paths to a location outside of the repository, as is the case with CentOS 8, Pulp cannot serve those sub-repos precisely as they are. So it syncs all of them and publishes all of them into one repository with subdirectories for the sub-repos, and writes the locations of these sub-repos into the .treeinfo metadata.
In the mirrored metadata case, the .treeinfo file will be pointing to the wrong locations, so we need to rewrite the .treeinfo file just like we do during a standard publish.
As .treeinfo isn't checksummed or signed we aren't prevented from doing this.
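A rough sketch of the kind of rewrite described above, using only the standard library. This is not Pulp's actual implementation. The sample paths, the function name rewrite_variant_paths, and the choice to name each subdirectory after the variant's id are all assumptions made for illustration.

```python
import configparser
import io

# Hypothetical upstream-style sections where one variant lives outside the tree.
SAMPLE = """\
[variant-AppStream]
id = AppStream
packages = ../../../AppStream/x86_64/os/Packages
repository = ../../../AppStream/x86_64/os

[variant-BaseOS]
id = BaseOS
packages = Packages
repository = .
"""


def rewrite_variant_paths(treeinfo_text):
    """Point variants located outside the tree at a local subdirectory.

    Assumption: each such sub-repo is published under a subdirectory
    named after the variant's id.
    """
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep keys case-sensitive
    parser.read_string(treeinfo_text)

    for section in parser.sections():
        if not section.startswith("variant-"):
            continue
        variant = parser[section]
        repo = variant.get("repository", ".")
        if not repo.startswith("../"):
            continue  # already inside the tree, leave untouched

        new_repo = variant.get("id", section[len("variant-"):])
        prefix = repo.rstrip("/") + "/"
        packages = variant.get("packages", "")
        if packages.startswith(prefix):
            # Keep the part of the path below the old repository root.
            variant["packages"] = f"{new_repo}/{packages[len(prefix):]}"
        variant["repository"] = new_repo

    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


if __name__ == "__main__":
    print(rewrite_variant_paths(SAMPLE))
```

Variants whose repository is already inside the tree (such as BaseOS with "repository = .") are passed through unchanged.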
Related issues
Updated by dalley over 3 years ago
- Triaged changed from No to Yes
- Sprint set to Sprint 95
Updated by Aant over 3 years ago
Hi there, I'm also experiencing this issue since I changed my remote url from http://mirror.centos.org/centos-8/8.3.2011/BaseOS/x86_64/os/ to http://mirror.centos.org/centos-8/8/BaseOS/x86_64/os/. Before that it was still working (the 8.3 repo had been created several months ago with some older pulpcore/pulp_rpm version; unfortunately I have since deleted the original repo, so I don't know exactly which version).
I have tried to delete and recreate the repo/remote/publication/distribution several different ways (e.g. create distribution first without publication, PATCH later to add publication). I also synchronise the AppStream repo. I tried to delete all AppStream objects: repo, remote, pub, distro and recreate the Base repo but it still contains this AppStream part. Switching Base to http://mirror.centos.org/centos-8/8.4.2105/BaseOS/x86_64/os/ does not solve the problem either.
The problem seems to affect the directory structure, too:
curl http://localhost:8000/pulp/content/centos8_base/
<!DOCTYPE html>
<html>
<body>
<ul>
<li><a href=".treeinfo">.treeinfo</a></li>
<li><a href="AppStream/">AppStream/</a></li>
<li><a href="Packages/">Packages/</a></li>
<li><a href="config.repo">config.repo</a></li>
<li><a href="images/">images/</a></li>
<li><a href="repodata/">repodata/</a></li>
</ul>
</body>
</html>
The biggest problem is that this repo causes a dependency hell on the clients, e.g.:
dnf update -y
...
Error:
Problem: The operation would result in removing the following protected packages: systemd, systemd-udev
(try to add '--allowerasing' to command line to replace conflicting packages or '--skip-broken' to skip uninstallable packages or '--nobest' to use not only best candidate packages)
If I connect the client directly to the URL I synchronise from (mirror.centos.org...) then I can update without any issues.
Current versions: pulpcore 3.13.0, pulp_rpm 3.13.2
I only use the API, no CLI.
As the problem renders the distribution unusable is there a chance to increase the priority of this ticket?
Updated by mgoddard over 3 years ago
Hi, also hitting this when trying to publish CentOS Linux 8 and CentOS Stream 8 repos.
I'm using a pulp-in-one setup. I tried going back to a few old tags, but hit various issues.
3.10 failed with "'utf-8' codec can't decode byte 0xfd in position 0: invalid start byte" when syncing.
3.11 and 3.12 fail on https://pulp.plan.io/issues/8807 when publishing.
3.13 fails with this issue.
I went back to the 3.11 image and ran the following in the container to work around #8807:
pip install --no-deps 'productmd<1.33'
cd /var/run/s6/services
s6-svc -r new-pulpcore-worker@1
s6-svc -r new-pulpcore-worker@2
s6-svc -r pulpcore-api
s6-svc -r pulpcore-content/
s6-svc -r pulpcore-resource-manager/
s6-svc -r pulpcore-worker@1
s6-svc -r pulpcore-worker@2
Finally, it worked. So the problem was introduced after 3.11.0.
Updated by dalley over 3 years ago
Thanks mgoddard, that's really helpful info.
Updated by dalley over 3 years ago
- Status changed from NEW to ASSIGNED
- Assignee set to dalley
Updated by dalley over 3 years ago
@Aant I see no (substantive) difference between the CentOS 8.3 repo treeinfo and the CentOS 8 (rolling) treeinfo, so the difference would have to be down to the Pulp version or one of its dependencies, as mgoddard mentioned.
Updated by dalley over 3 years ago
mgoddard I have just attempted using both 3.10 and 3.11 (backporting patches as needed) and the output on all of these versions seems identical to the newer versions (and still wrong). So I'm not sure how to explain the experience of it having worked previously, unless it was an even earlier version.
In any case I have made some progress in tracking down where the issue is, so I'll skip bisecting any further.
Updated by mgoddard over 3 years ago
dalley, I think I can explain it. Using the setup I described earlier, I still see the issues with .treeinfo, however I no longer see the issue reported by @Aant. So I think these are two separate bugs.
Updated by dalley over 3 years ago
@Aant did you have pulp_rpm 3.13.0 installed at any point? There was a data corruption bug that was fixed within a couple of days, but if some of your RPMs were synced during that window it might explain the "dependency hell". And if that's the case I can help you clean up, which hopefully shouldn't be too difficult.
Updated by Aant over 3 years ago
dalley, yes I have had 3.13.0 and I think I synchronised that repo before updating to 3.13.2. Maybe that's the reason why I have this problem not the change in the remote url (8.3 vs 8). I would appreciate if you could help me fix it.
Updated by dalley over 3 years ago
No problem. What you need to do is delete any repository versions created during the time period when 3.13.0 was running, which lets all the new packages that were created during that time be deleted.
That can be done with
pulp rpm repository version destroy --repository $name --version $number
If you have a lot of repositories affected, let me know and we can do something more sophisticated
Then, immediately afterwards, delete the "orphan" content that no longer belongs to any repository. This will include all of the content that was created along with those repository versions.
pulp orphans delete
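If several repositories are affected, the two commands above could be generated in bulk. The following is only a dry-run sketch: it builds and prints the command lines without running anything. The function name cleanup_commands, the repository names, and the version numbers are made up for illustration.

```python
import shlex


def cleanup_commands(affected):
    """Build the cleanup command lines for several repositories.

    `affected` maps a repository name to the version numbers created
    while pulp_rpm 3.13.0 was installed (hypothetical input).
    """
    commands = []
    for name, versions in affected.items():
        for number in versions:
            commands.append([
                "pulp", "rpm", "repository", "version", "destroy",
                "--repository", name,
                "--version", str(number),
            ])
    # Orphan cleanup runs once, after all the versions are destroyed.
    commands.append(["pulp", "orphans", "delete"])
    return commands


if __name__ == "__main__":
    # Example names and version numbers only.
    example = {"centos8_base": [3, 4], "centos8_appstream": [2]}
    for command in cleanup_commands(example):
        print(shlex.join(command))  # review before running by hand
```

Printing the commands first makes it easy to review the list before anything is destroyed.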
Does your workflow include any copying of RPMs?
Updated by Aant over 3 years ago
Hi dalley, previously I had tried to delete and recreate both the appstream and the base repos, remotes, publications, distributions, then delete the orphans and recreate repo, remote, publication, distribution but it always leads to the same issue. I guess deleting a repo also deletes the repo versions. The /AppStream directory is there in the Base repo even if I don't have an appstream repo.
In the meantime I realised that the same issue also happens with an AlmaLinux repo, which I synchronise from the remote https://repo.almalinux.org/almalinux/8/BaseOS/x86_64/os/.
Updated by dalley over 3 years ago
It sounds like that issue is not related to the 3.13.0 bug, then. I will try to reproduce that separately.
Updated by dalley over 3 years ago
- Related to Issue #9123: Mirrored .treeinfo metadata needs to be rewritten in cases where the relative location of the sub-repos moved added
Updated by dalley over 3 years ago
Hey @Aant, following up about that separate bug (not the one in the title).
I appreciate you bringing it up, because it helped push us to look further into the original 3.13.0 bug. It appears that the fix for that issue was flawed and didn't work properly on Python <3.8. There is a new issue filed here to track it: https://pulp.plan.io/issues/9107
Any further discussions about it should probably happen over there. But FYI, I think at this point we will need to provide a proper in-place repair script rather than simply suggesting to blow the repository versions away again (although once we have a new release out with the fix, it would be the fastest way to fix things, because it will take a few more days to get that script ready).
Regarding the original issue written up here -- it's my immediate priority after that is handled. I've been looking into it already but other things have... intervened.
Updated by dalley over 3 years ago
@Aant See https://pulp.plan.io/issues/9107#note-19
Updated by dalley over 3 years ago
- Description updated (diff)
- Status changed from ASSIGNED to POST
Updated by ipanova@redhat.com over 3 years ago
- Sprint changed from Sprint 101 to Sprint 102
Updated by dalley over 3 years ago
- Related to Issue #9208: Published .treeinfo metadata not matching expectations (remaining issues) added
Added by dalley over 3 years ago
Updated by dalley over 3 years ago
- Status changed from POST to MODIFIED
Applied in changeset 00853cbb4b882528abb63953a238aa05e886f24e.
Updated by dalley over 3 years ago
- Copied to Backport #9218: Backport #8622 "CentOS 8 BaseOS .treeinfo's [general] section says AppStream instead of BaseOS" to 3.14.z added
Updated by pulpbot about 3 years ago
- Status changed from MODIFIED to CLOSED - CURRENTRELEASE
Fix .treeinfo metadata being written improperly for Pulp users
closes: #8622 https://pulp.plan.io/issues/8622