Issue #3585

closed

Pulp sync/publish relies on unique filenames per repo, corrupting repository

Added by dekimsey about 6 years ago. Updated about 5 years ago.

Status: CLOSED - WONTFIX
Priority: Normal
Assignee: -
Sprint/Milestone: -
Start date:
Due date:
Estimated time:
Severity: 3. High
Version: 2.12.2
Platform Release:
OS: RHEL 7
Triaged: Yes
Groomed: No
Sprint Candidate: No
Tags: Pulp 2
Sprint:
Quarter:

Description

We have an upstream vendor, Centrify, that publishes packages with unique NEVRAs in their yum repo but reuses the same filename for them, effectively unpublishing the old copy. While this is poor practice on their part, it breaks Pulp's RPM content management.

On publish, Pulp doesn't maintain these packages as separate entities, so the symlinks and the metadata do not match. The packages are stored correctly on sync in Pulp's content/units/rpm directory tree. However, the published symlinks use the upstream filename rather than a computed NEVRA. This results in "content does not match metadata" errors from yum and general corruption of the repository. Not only are the multiple versions not all accessible, but the sync process doesn't seem to handle this gracefully either: the result is new metadata alongside an old package on disk.

# Sync'd content organized by digest (yay).
# sync'd filenames aren't nevra (whatevs, we got us some digests)
[root@katello ~]$ find /var/lib/pulp/content/units/rpm -name 'CentrifyDC-5.4.3-*x86_64.rpm'
/var/lib/pulp/content/units/rpm/56/0e778319ec52a6aa75e7b72345bfd1fa11a91f4b112b36d30ed2c256ce15e5/CentrifyDC-5.4.3-rhel5.x86_64.rpm
/var/lib/pulp/content/units/rpm/a4/39ee653ddd3a4203ec949de3c8ff69494b5bf5d00798536bbbbd5eaeb46863/CentrifyDC-5.4.3-rhel5.x86_64.rpm
[root@katello ~]$ find /var/lib/pulp/content/units/rpm -name 'CentrifyDC-5.4.3-*x86_64.rpm' -execdir rpm -qp {} +
CentrifyDC-5.4.3-905.x86_64
CentrifyDC-5.4.3-887.x86_64

# Published content symlinks are named by filename (boo)
[root@katello ~]# sudo find -L /var/lib/pulp/published/yum/https/repos/Trustwave/Library -name 'CentrifyDC-5.4.3*.x86_64.rpm'
/var/lib/pulp/published/yum/https/repos/Trustwave/Library/custom/centrify/centrify-centrifydc-rpms/Packages/c/CentrifyDC-5.4.3-rhel5.x86_64.rpm
[root@katello ~]# readlink /var/lib/pulp/published/yum/https/repos/Trustwave/Library/custom/centrify/centrify-centrifydc-rpms/Packages/c/CentrifyDC-5.4.3-rhel5.x86_64.rpm
/var/lib/pulp/content/units/rpm/56/0e778319ec52a6aa75e7b72345bfd1fa11a91f4b112b36d30ed2c256ce15e5/CentrifyDC-5.4.3-rhel5.x86_64.rpm
[root@katello ~]# rpm -qp /var/lib/pulp/published/yum/https/repos/Trustwave/Library/custom/centrify/centrify-centrifydc-rpms/Packages/c/CentrifyDC-5.4.3-rhel5.x86_64.rpm
CentrifyDC-5.4.3-905.x86_64

# Example yum repoquery attempt (though this only shows two versions, sorry)
[root@sandbox ~]# repoquery --location -a CentrifyDC
https://smartproxy.com/pulp/repos/Trustwave/development/ccv-biz-portal-el7/custom/centrify/centrify-centrifydc-rpms/Packages/c/CentrifyDC-5.4.1-rhel4.i386.rpm
https://smartprox.com/pulp/repos/Trustwave/development/ccv-biz-portal-el7/custom/centrify/centrify-centrifydc-rpms/Packages/c/CentrifyDC-5.4.3-rhel5.x86_64.rpm
[root@sandbox ~]# repoquery -a CentrifyDC
CentrifyDC-0:5.4.1-455.i386
CentrifyDC-0:5.4.3-905.x86_64
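
The collision shown in the transcripts above can be detected mechanically. The following is a minimal sketch in Python, not Pulp's actual code: `RpmUnit` and `find_filename_collisions` are hypothetical names, and the epoch of the 887 build is assumed to be 0 (the transcript only confirms it for 905 and 455). It groups units by upstream filename and reports any filename claimed by more than one distinct NEVRA.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple


class RpmUnit(NamedTuple):
    """Hypothetical, simplified stand-in for a synced RPM unit."""
    name: str
    epoch: str
    version: str
    release: str
    arch: str
    filename: str  # filename as advertised by the upstream repo

    @property
    def nevra(self) -> str:
        return f"{self.name}-{self.epoch}:{self.version}-{self.release}.{self.arch}"


def find_filename_collisions(units: Iterable[RpmUnit]) -> Dict[str, List[str]]:
    """Map each filename shared by several distinct NEVRAs to those NEVRAs."""
    by_filename = defaultdict(set)
    for unit in units:
        by_filename[unit.filename].add(unit.nevra)
    return {
        filename: sorted(nevras)
        for filename, nevras in by_filename.items()
        if len(nevras) > 1
    }


# Units taken from the transcripts; epoch "0" for build 887 is an assumption.
units = [
    RpmUnit("CentrifyDC", "0", "5.4.3", "905", "x86_64", "CentrifyDC-5.4.3-rhel5.x86_64.rpm"),
    RpmUnit("CentrifyDC", "0", "5.4.3", "887", "x86_64", "CentrifyDC-5.4.3-rhel5.x86_64.rpm"),
    RpmUnit("CentrifyDC", "0", "5.4.1", "455", "i386", "CentrifyDC-5.4.1-rhel4.i386.rpm"),
]

collisions = find_filename_collisions(units)
for filename, nevras in collisions.items():
    print(filename, "->", ", ".join(nevras))
```

A check like this could run before publish to flag repositories where filename-based symlinks would silently drop a package.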

While I've reported the issue to Centrify, Pulp should not depend on third parties following correct package naming standards. I think published content should have its symlinks renamed to "$nevra.rpm" to ensure correctness.
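
To illustrate the "$nevra.rpm" idea, here is a rough sketch in Python of how publish-time link names could be derived from package identity instead of the upstream filename. This is not Pulp's implementation: `nevra_filename` and `plan_symlinks` are hypothetical helpers, the mapping of the `a4/...` storage path to build 887 is inferred from the order of the `find` output above, and the epoch of that build is assumed to be 0.

```python
from typing import Dict, List, Tuple

Nevra = Tuple[str, str, str, str, str]  # (name, epoch, version, release, arch)


def nevra_filename(name: str, epoch: str, version: str, release: str, arch: str) -> str:
    """Build a link name from package identity rather than the upstream filename."""
    return f"{name}-{epoch}:{version}-{release}.{arch}.rpm"


def plan_symlinks(units: List[Tuple[Nevra, str]]) -> Dict[str, str]:
    """Return {link name: storage path}, refusing to overwrite a different target."""
    links: Dict[str, str] = {}
    for nevra, storage_path in units:
        link_name = nevra_filename(*nevra)
        if link_name in links and links[link_name] != storage_path:
            raise ValueError(f"two different files claim {link_name}")
        links[link_name] = storage_path
    return links


# Storage paths copied from the transcript; pairing a4/... with build 887 is assumed.
UNITS_DIR = "/var/lib/pulp/content/units/rpm"
units = [
    (("CentrifyDC", "0", "5.4.3", "905", "x86_64"),
     UNITS_DIR + "/56/0e778319ec52a6aa75e7b72345bfd1fa11a91f4b112b36d30ed2c256ce15e5/CentrifyDC-5.4.3-rhel5.x86_64.rpm"),
    (("CentrifyDC", "0", "5.4.3", "887", "x86_64"),
     UNITS_DIR + "/a4/39ee653ddd3a4203ec949de3c8ff69494b5bf5d00798536bbbbd5eaeb46863/CentrifyDC-5.4.3-rhel5.x86_64.rpm"),
]

links = plan_symlinks(units)
for link_name, target in sorted(links.items()):
    print(link_name, "->", target)
```

With names derived this way, both builds get their own published link even though the upstream filename is identical. Including the epoch puts a colon in the filename, which may be awkward in URLs; dropping the epoch (name-version-release.arch) would be an alternative as long as no two packages differ only by epoch.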

In our case, this has made our content inaccessible. I don't know when content from the units directory will be removed, but we've basically lost access to the older packages, and yum cannot handle the upgrade because the published files are the wrong packages.

In case anyone else runs into this with Centrify, feel free to reference the case we filed, "180419-159679: rpm repo doesn't use unique rpm names for packages".
