Issue #4541
closed
Repository publishing duplicates RPM files under Packages and Packages/<LETTER> structure
Description
Some repositories are getting their content duplicated when published. For example, the CentOS 7 Base repository gets duplicated symlinks, as follows:
[root@pulp ~]# pulp-admin rpm repo list --details --repo-id centos-7-base
+----------------------------------------------------------------------+
RPM Repositories
+----------------------------------------------------------------------+
Id: centos-7-base
Display Name: None
Description: None
Content Unit Counts:
Distribution: 1
Package Category: 11
Package Environment: 10
Package Group: 88
Package Langpacks: 1
Rpm: 10019
Notes:
Scratchpad:
Checksum Type: sha256
Importers:
Config:
Feed: http://centos.mirror.constant.com/7/os/x86_64/
Id: yum_importer
Importer Type Id: yum_importer
Last Override Config:
Last Sync: 2019-03-13T19:58:14Z
Last Updated: 2019-03-13T19:54:40Z
Repo Id: centos-7-base
Scratchpad:
Repomd Revision: 1543161601
Distributors:
Auto Publish: True
Config:
Http: True
Https: True
Relative URL: 7/os/x86_64/
Distributor Type Id: yum_distributor
Id: yum_distributor
Last Override Config:
Last Publish: 2019-03-14T04:17:40Z
Last Updated: 2019-03-13T19:54:40Z
Repo Id: centos-7-base
Scratchpad:
Auto Publish: False
Config:
Http: True
Https: True
Relative URL: 7/os/x86_64/
Distributor Type Id: export_distributor
Id: export_distributor
Last Override Config:
Last Publish: None
Last Updated: 2019-03-13T19:54:40Z
Repo Id: centos-7-base
Scratchpad:
Once the repository is published, it ends up with duplicated RPMs:
pulp-admin rpm repo publish run --repo-id df05f52b-431e-483d-9878-9ffef41d70c8 --force-full
○ → tree /var/lib/pulp/published/yum/https/repos/7/os/x86_64/Packages/ | grep 389-ds-base
│ ├── 389-ds-base-1.3.8.4-15.el7.x86_64.rpm -> /var/lib/pulp/content/units/rpm/07/cb4bf5d6199cdc7192cbdbbf283cccdef38f18751c53660107b14232d2d349/389-ds-base-1.3.8.4-15.el7.x86_64.rpm
│ ├── 389-ds-base-devel-1.3.8.4-15.el7.x86_64.rpm -> /var/lib/pulp/content/units/rpm/ab/e6f96db9ad2c0f7f097f71b506b25a0d83a4a9926e904100f006361ef66d1f/389-ds-base-devel-1.3.8.4-15.el7.x86_64.rpm
│ ├── 389-ds-base-libs-1.3.8.4-15.el7.x86_64.rpm -> /var/lib/pulp/content/units/rpm/94/5011e802dbf9d470522eac943442a8718e08ba321dc0649a55a0d7f130e363/389-ds-base-libs-1.3.8.4-15.el7.x86_64.rpm
│ └── 389-ds-base-snmp-1.3.8.4-15.el7.x86_64.rpm -> /var/lib/pulp/content/units/rpm/e9/b6b32a0421e87ab83a6ea7346f51e3929746e10eeb2c9b5f05e8278c9e7b61/389-ds-base-snmp-1.3.8.4-15.el7.x86_64.rpm
├── 389-ds-base-1.3.8.4-15.el7.x86_64.rpm -> /var/lib/pulp/content/units/rpm/07/cb4bf5d6199cdc7192cbdbbf283cccdef38f18751c53660107b14232d2d349/389-ds-base-1.3.8.4-15.el7.x86_64.rpm
├── 389-ds-base-devel-1.3.8.4-15.el7.x86_64.rpm -> /var/lib/pulp/content/units/rpm/ab/e6f96db9ad2c0f7f097f71b506b25a0d83a4a9926e904100f006361ef66d1f/389-ds-base-devel-1.3.8.4-15.el7.x86_64.rpm
├── 389-ds-base-libs-1.3.8.4-15.el7.x86_64.rpm -> /var/lib/pulp/content/units/rpm/94/5011e802dbf9d470522eac943442a8718e08ba321dc0649a55a0d7f130e363/389-ds-base-libs-1.3.8.4-15.el7.x86_64.rpm
├── 389-ds-base-snmp-1.3.8.4-15.el7.x86_64.rpm -> /var/lib/pulp/content/units/rpm/e9/b6b32a0421e87ab83a6ea7346f51e3929746e10eeb2c9b5f05e8278c9e7b61/389-ds-base-snmp-1.3.8.4-15.el7.x86_64.rpm
That creates a big problem when exporting the content as an ISO, because the repository will be twice as big.
The repodata is created correctly: it only references the files under the new structure, Packages/<first_letter>/.
○ → zcat /var/lib/pulp/published/yum/https/repos/7/os/x86_64/Packages/repodata/a3f98f3b850725e0ac52472653efd309be6ba436296e66b1b997ca92c95fcdf1-primary.xml.gz | grep 'href' | grep 389-ds-base
<location href="Packages/3/389-ds-base-libs-1.3.8.4-15.el7.x86_64.rpm"/>
<location href="Packages/3/389-ds-base-1.3.8.4-15.el7.x86_64.rpm"/>
<location href="Packages/3/389-ds-base-snmp-1.3.8.4-15.el7.x86_64.rpm"/>
<location href="Packages/3/389-ds-base-devel-1.3.8.4-15.el7.x86_64.rpm"/>
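The mismatch between what the metadata references and what sits on disk can be checked mechanically. The following is only a sketch, not part of Pulp: the helper names `referenced_paths` and `unreferenced_rpms` are made up for illustration, and it assumes the hrefs are relative to the directory passed as `repo_root`. The demo builds a throwaway directory rather than touching a real published repository.

```python
import gzip
import os
import re
import tempfile

# Matches entries such as <location href="Packages/3/foo.rpm"/>
HREF_RE = re.compile(r'<location href="([^"]+)"')


def referenced_paths(primary_xml_gz):
    """Return the set of relative paths named in <location href="..."/> entries."""
    with gzip.open(primary_xml_gz, "rt", encoding="utf-8") as handle:
        return set(HREF_RE.findall(handle.read()))


def unreferenced_rpms(repo_root, primary_xml_gz):
    """List .rpm paths under repo_root (relative to it) that the metadata never mentions."""
    referenced = referenced_paths(primary_xml_gz)
    extra = []
    for dirpath, _dirnames, filenames in os.walk(repo_root):
        for name in filenames:
            if not name.endswith(".rpm"):
                continue
            rel = os.path.relpath(os.path.join(dirpath, name), repo_root)
            rel = rel.replace(os.sep, "/")
            if rel not in referenced:
                extra.append(rel)
    return sorted(extra)


def demo():
    """Build a tiny fake repo with one duplicated RPM and report the extra copy."""
    name = "389-ds-base-1.3.8.4-15.el7.x86_64.rpm"
    with tempfile.TemporaryDirectory() as root:
        for rel_dir in ("Packages/3", "Packages"):
            os.makedirs(os.path.join(root, rel_dir), exist_ok=True)
            open(os.path.join(root, rel_dir, name), "wb").close()
        os.makedirs(os.path.join(root, "repodata"))
        primary = os.path.join(root, "repodata", "primary.xml.gz")
        with gzip.open(primary, "wt", encoding="utf-8") as handle:
            handle.write('<location href="Packages/3/%s"/>\n' % name)
        return unreferenced_rpms(root, primary)


if __name__ == "__main__":
    print(demo())
```

On a real system, `repo_root` would be the published repository directory and `primary_xml_gz` the primary metadata file shown in the `zcat` command above.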
Updated by tchellomello over 5 years ago
Troubleshooting steps
Adding an rpdb breakpoint in process_main() of PublishRpmStep, we can see the following:
/usr/lib/python2.7/site-packages/pulp_rpm/plugins/distributors/yum/publish.py
-----
482     def process_main(self, item=None):
483         """
484         Link the unit to the content directory and the package_dir
485
486         :param item: The item to process or none if this get_iterator is not defined
487         :type item: pulp_rpm.plugins.db.models.RPM or pulp_rpm.plugins.db.models.SRPM
488         """
489         import rpdb; rpdb.set_trace()
490         unit = item
491         source_path = unit._storage_path
492         relative_path = file_utils.make_packages_relative_path(unit.filename)
493         destination_path = os.path.join(self.get_working_dir(), relative_path)
494         plugin_misc.create_symlink(source_path, destination_path)
495         for package_dir in self.dist_step.package_dirs:
496             destination_path = os.path.join(package_dir, unit.filename)
497             plugin_misc.create_symlink(source_path, destination_path)
498
499         for context in (self.file_lists_context, self.other_context, self.primary_context):
500             context.add_unit_metadata(unit)
Running the code, we get the following.
source_path points correctly to the content unit file:
(Pdb) source_path
u'/var/lib/pulp/content/units/rpm/40/a9bfdeba456727e151d53adc41cc688d4360a71dde8e73a18150d72d6d300b/tigervnc-license-1.8.0-13.el7.noarch.rpm'
relative_path has the correct, expected location for that file:
(Pdb) relative_path
u'Packages/t/tigervnc-license-1.8.0-13.el7.noarch.rpm'
destination_path is in the temporary worker location:
(Pdb) destination_path
u'/var/cache/pulp/reserved_resource_worker-2@pulp.mmello.local/c8c776b9-71a5-418c-8b20-bdcb2aa55016/Packages/t/tigervnc-license-1.8.0-13.el7.noarch.rpm'
Looking at the temporary worker directory, nothing has been placed in the Packages directory yet:
[root@pulp c8c776b9-71a5-418c-8b20-bdcb2aa55016]# tree
.
├── images
│ ├── boot.iso -> /var/lib/pulp/content/units/distribution/de/d64602f1a2165af46f6cd4c8c2645a0422607173fb34617458c9f419e2c7bf/images/boot.iso
│ └── pxeboot
│ ├── initrd.img -> /var/lib/pulp/content/units/distribution/de/d64602f1a2165af46f6cd4c8c2645a0422607173fb34617458c9f419e2c7bf/images/pxeboot/initrd.img
│ └── vmlinuz -> /var/lib/pulp/content/units/distribution/de/d64602f1a2165af46f6cd4c8c2645a0422607173fb34617458c9f419e2c7bf/images/pxeboot/vmlinuz
├── LiveOS
│ └── squashfs.img -> /var/lib/pulp/content/units/distribution/de/d64602f1a2165af46f6cd4c8c2645a0422607173fb34617458c9f419e2c7bf/LiveOS/squashfs.img
└── repodata
├── filelists.xml.gz
├── other.xml.gz
├── primary.xml.gz
└── repomd.xml
When `plugin_misc.create_symlink(source_path, destination_path)` is called, the symlink is created as expected:
Every 1.0s: tree Packages/ Thu Mar 14 00:10:57 2019
Packages/
└── t
└── tigervnc-license-1.8.0-13.el7.noarch.rpm -> /var/lib/pulp/content/units/rpm/40/a9bfdeba456727e151d53adc41cc688d4360a71dde8e73a18150d72d6d300b/tigervnc-license-1.8.0-13.el7.noarch.rpm
1 directory, 1 file
At this point, a loop calls create_symlink again for every member of `self.dist_step.package_dirs`. In this case, the list contains the following:
(Pdb) self.dist_step.package_dirs
[u'/var/cache/pulp/reserved_resource_worker-2@pulp.mmello.local/c8c776b9-71a5-418c-8b20-bdcb2aa55016/Packages']
So running the loop we have:
495 for package_dir in self.dist_step.package_dirs:
496 -> destination_path = os.path.join(package_dir, unit.filename)
497 plugin_misc.create_symlink(source_path, destination_path)
This loop is where the problem happens:
(Pdb) destination_path
u'/var/cache/pulp/reserved_resource_worker-2@pulp.mmello.local/c8c776b9-71a5-418c-8b20-bdcb2aa55016/Packages/tigervnc-license-1.8.0-13.el7.noarch.rpm'
Every 1.0s: tree Packages/ Thu Mar 14 00:14:29 2019
Packages/
├── t
│ └── tigervnc-license-1.8.0-13.el7.noarch.rpm -> /var/lib/pulp/content/units/rpm/40/a9bfdeba456727e151d53adc41cc688d4360a71dde8e73a18150d72d6d300b/tigervnc-license-1.8.0-13.el7.noarch.rpm
└── tigervnc-license-1.8.0-13.el7.noarch.rpm -> /var/lib/pulp/content/units/rpm/40/a9bfdeba456727e151d53adc41cc688d4360a71dde8e73a18150d72d6d300b/tigervnc-license-1.8.0-13.el7.noarch.rpm
1 directory, 2 files
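The double link can be reproduced outside Pulp with a few lines. This is a minimal sketch under stated assumptions, not the plugin code: `make_packages_relative_path` and `create_symlink` below are stand-ins written for this example (the real helpers live in `file_utils` and `plugin_misc`), and the lowercasing of the first letter is an assumption based on the paths seen in the debugger output.

```python
import os
import tempfile


def make_packages_relative_path(filename):
    # Stand-in for file_utils.make_packages_relative_path (assumed behavior):
    # place the file under Packages/<first letter, lowercased>/.
    return os.path.join("Packages", filename[0].lower(), filename)


def create_symlink(source_path, link_path):
    # Stand-in for plugin_misc.create_symlink: create parent dirs, then link.
    os.makedirs(os.path.dirname(link_path), exist_ok=True)
    if not os.path.lexists(link_path):
        os.symlink(source_path, link_path)


def publish_unit(working_dir, package_dirs, source_path, filename):
    # Same two steps as process_main: the nested link first,
    # then one flat link per entry in package_dirs.
    nested = os.path.join(working_dir, make_packages_relative_path(filename))
    create_symlink(source_path, nested)
    for package_dir in package_dirs:
        create_symlink(source_path, os.path.join(package_dir, filename))


def list_links(root):
    # Collect every file-like entry under root as a sorted list of relative paths.
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            rel = os.path.relpath(os.path.join(dirpath, name), root)
            found.append(rel.replace(os.sep, "/"))
    return sorted(found)


def demo():
    name = "tigervnc-license-1.8.0-13.el7.noarch.rpm"
    with tempfile.TemporaryDirectory() as tmp:
        content = os.path.join(tmp, "content")
        working = os.path.join(tmp, "working")
        os.makedirs(content)
        os.makedirs(working)
        source = os.path.join(content, name)
        open(source, "wb").close()
        # package_dirs holds the working directory's own Packages directory,
        # as in the (Pdb) output above.
        publish_unit(working, [os.path.join(working, "Packages")], source, name)
        return list_links(working)


if __name__ == "__main__":
    for path in demo():
        print(path)
```

Because `package_dirs` contains the same `Packages` directory that the nested path is built under, the unit ends up linked both at `Packages/t/<file>` and at `Packages/<file>`.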
As a quick test, I commented out the for loop over self.dist_step.package_dirs, and the repository was created correctly.
[root@pulp ~]# pulp-admin rpm repo publish run --repo-id centos-7-base --force-full
[...SNIP...]
Publishing RPMs
[==================================================] 100%
10019 of 10019 items
... completed
Then the repo got published as expected:
[root@pulp ~]# ls -la /var/lib/pulp/published/yum/https/repos/7/os/x86_64/Packages/*.rpm 2>/dev/null| wc -l
0
[root@pulp ~]# ls -la /var/lib/pulp/published/yum/https/repos/7/os/x86_64/Packages/*/*.rpm 2>/dev/null| wc -l
10019
[root@pulp ~]# tree /var/lib/pulp/published/yum/https/repos/7/os/x86_64/Packages/ | head
/var/lib/pulp/published/yum/https/repos/7/os/x86_64/Packages/
├── 3
│ ├── 389-ds-base-1.3.8.4-15.el7.x86_64.rpm -> /var/lib/pulp/content/units/rpm/07/cb4bf5d6199cdc7192cbdbbf283cccdef38f18751c53660107b14232d2d349/389-ds-base-1.3.8.4-15.el7.x86_64.rpm
│ ├── 389-ds-base-devel-1.3.8.4-15.el7.x86_64.rpm -> /var/lib/pulp/content/units/rpm/ab/e6f96db9ad2c0f7f097f71b506b25a0d83a4a9926e904100f006361ef66d1f/389-ds-base-devel-1.3.8.4-15.el7.x86_64.rpm
│ ├── 389-ds-base-libs-1.3.8.4-15.el7.x86_64.rpm -> /var/lib/pulp/content/units/rpm/94/5011e802dbf9d470522eac943442a8718e08ba321dc0649a55a0d7f130e363/389-ds-base-libs-1.3.8.4-15.el7.x86_64.rpm
│ └── 389-ds-base-snmp-1.3.8.4-15.el7.x86_64.rpm -> /var/lib/pulp/content/units/rpm/e9/b6b32a0421e87ab83a6ea7346f51e3929746e10eeb2c9b5f05e8278c9e7b61/389-ds-base-snmp-1.3.8.4-15.el7.x86_64.rpm
├── a
│ ├── a2ps-4.14-23.el7.i686.rpm -> /var/lib/pulp/content/units/rpm/20/8453cf3eda8583d4c009b38311588a571e8833f9d557f65c1b55c35bc11ee1/a2ps-4.14-23.el7.i686.rpm
Basically, for the fix we need to either validate whether that loop is still necessary or add a check that the file has not already been linked, to avoid the duplication.
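One possible shape for such a check is sketched below. It is only an illustration of the idea, not necessarily how the eventual changeset addresses it: the function name `extra_package_dirs` is hypothetical, and it assumes the nested layout always lives under `<working_dir>/Packages`.

```python
import os


def extra_package_dirs(working_dir, package_dirs):
    """Drop any package_dir that is the working directory's own Packages directory.

    Units are already linked under <working_dir>/Packages/<letter>/, so linking
    them flat into that same directory would only produce duplicates.
    """
    nested_root = os.path.normpath(os.path.join(working_dir, "Packages"))
    return [d for d in package_dirs if os.path.normpath(d) != nested_root]


if __name__ == "__main__":
    # Only the directory that is not the nested Packages root survives.
    print(extra_package_dirs("/w", ["/w/Packages/", "/w/Server"]))
```

The loop in process_main would then iterate over the filtered list, so other package directories declared by a distribution keep their flat links.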
mmello
Updated by bherring over 5 years ago
- Copied to Test #4542: [pulp_rpm] - Repository publishing duplicates RPM files under Packages and Packages/<LETTER> structure added
Updated by ttereshc over 5 years ago
- Project changed from Pulp to RPM Support
- Subject changed from [pulp_rpm] - Repository publishing duplicates RPM files under Packages and Packages/<LETTER> structure to Repository publishing duplicates RPM files under Packages and Packages/<LETTER> structure
- Category deleted (14)
- Severity changed from 4. Urgent to 3. High
- Tags deleted (Release Engineering)
Added by dkliban@redhat.com over 5 years ago
Updated by dkliban@redhat.com over 5 years ago
- Status changed from NEW to POST
- Assignee set to dkliban@redhat.com
Updated by dkliban@redhat.com over 5 years ago
- Status changed from POST to MODIFIED
Applied in changeset c4c2ea5b2aeb585972f390f62a23c4efaa426052.
Updated by kersom over 5 years ago
Manually tested. Not suitable for automation.
Pulp Version
[root@localhost ~]# rpm -qa | grep pulp | sort
pulp-admin-client-2.19.0-0.1.rc.el7.noarch
pulp-deb-admin-extensions-1.9.0-0.2.rc.git.110.c3056d7.el7.noarch
pulp-deb-plugins-1.9.0-0.2.rc.git.110.c3056d7.el7.noarch
pulp-docker-admin-extensions-3.2.3-0.1.rc.el7.noarch
pulp-docker-plugins-3.2.3-0.1.rc.el7.noarch
pulp-ostree-admin-extensions-1.4.0-1.el7.noarch
pulp-ostree-plugins-1.4.0-1.el7.noarch
pulp-puppet-admin-extensions-2.19.0-0.1.rc.el7.noarch
pulp-puppet-plugins-2.19.0-0.1.rc.el7.noarch
pulp-puppet-tools-2.19.0-0.1.rc.el7.noarch
pulp-python-admin-extensions-2.0.3-1.el7.noarch
pulp-python-plugins-2.0.3-1.el7.noarch
pulp-rpm-admin-extensions-2.19.0-0.1.rc.el7.noarch
pulp-rpm-plugins-2.19.0-0.1.rc.el7.noarch
pulp-selinux-2.19.0-0.1.rc.el7.noarch
pulp-server-2.19.0-0.1.rc.el7.noarch
python-isodate-0.5.0-4.pulp.el7.noarch
python-pulp-bindings-2.19.0-0.1.rc.el7.noarch
python-pulp-client-lib-2.19.0-0.1.rc.el7.noarch
python-pulp-common-2.19.0-0.1.rc.el7.noarch
python-pulp-deb-common-1.9.0-0.2.rc.git.110.c3056d7.el7.noarch
python-pulp-docker-common-3.2.3-0.1.rc.el7.noarch
python-pulp-oid_validation-2.19.0-0.1.rc.el7.noarch
python-pulp-ostree-common-1.4.0-1.el7.noarch
python-pulp-puppet-common-2.19.0-0.1.rc.el7.noarch
python-pulp-python-common-2.0.3-1.el7.noarch
python-pulp-repoauth-2.19.0-0.1.rc.el7.noarch
python-pulp-rpm-common-2.19.0-0.1.rc.el7.noarch
python-pulp-streamer-2.19.0-0.1.rc.el7.noarch
commands:
pulp-admin login -u admin -p admin
pulp-admin rpm repo create --repo-id foo --feed http://centos.mirror.constant.com/7/os/x86_64/
pulp-admin rpm repo sync run --repo-id foo
pulp-admin rpm repo publish run --repo-id foo
pulp-admin rpm repo list --details --repo-id foo
pulp-admin rpm repo publish run --repo-id foo --force-full
[root@localhost ~]# pulp-admin rpm repo list --details --repo-id foo
+----------------------------------------------------------------------+
RPM Repositories
+----------------------------------------------------------------------+
Id: foo
Display Name: None
Description: None
Content Unit Counts:
Distribution: 1
Package Category: 11
Package Environment: 10
Package Group: 88
Package Langpacks: 1
Rpm: 10016
Notes:
Scratchpad:
Checksum Type: sha256
Importers:
Config:
Feed: http://centos.mirror.constant.com/7/os/x86_64/
Id: yum_importer
Importer Type Id: yum_importer
Last Override Config:
Last Sync: 2019-01-29T22:45:35Z
Last Updated: 2019-01-29T22:30:38Z
Repo Id: foo
Scratchpad: None
Distributors:
Auto Publish: True
Config:
Http: False
Https: True
Relative URL: 7/os/x86_64/
Distributor Type Id: yum_distributor
Id: yum_distributor
Last Override Config:
Last Publish: 2019-01-29T22:49:52Z
Last Updated: 2019-01-29T22:30:38Z
Repo Id: foo
Scratchpad:
Auto Publish: False
Config:
Http: False
Https: True
Relative URL: 7/os/x86_64/
Distributor Type Id: export_distributor
Id: export_distributor
Last Override Config:
Last Publish: None
Last Updated: 2019-01-29T22:30:38Z
Repo Id: foo
Scratchpad:
No duplicate RPMs present.
[root@localhost ~]# tree /var/lib/pulp/published/yum/https/repos/7/os/x86_64/Packages/ | grep 389-ds-base
│ ├── 389-ds-base-1.3.8.4-15.el7.x86_64.rpm -> /var/lib/pulp/content/units/rpm/07/cb4bf5d6199cdc7192cbdbbf283cccdef38f18751c53660107b14232d2d349/389-ds-base-1.3.8.4-15.el7.x86_64.rpm
│ ├── 389-ds-base-devel-1.3.8.4-15.el7.x86_64.rpm -> /var/lib/pulp/content/units/rpm/ab/e6f96db9ad2c0f7f097f71b506b25a0d83a4a9926e904100f006361ef66d1f/389-ds-base-devel-1.3.8.4-15.el7.x86_64.rpm
│ ├── 389-ds-base-libs-1.3.8.4-15.el7.x86_64.rpm -> /var/lib/pulp/content/units/rpm/94/5011e802dbf9d470522eac943442a8718e08ba321dc0649a55a0d7f130e363/389-ds-base-libs-1.3.8.4-15.el7.x86_64.rpm
│ └── 389-ds-base-snmp-1.3.8.4-15.el7.x86_64.rpm -> /var/lib/pulp/content/units/rpm/e9/b6b32a0421e87ab83a6ea7346f51e3929746e10eeb2c9b5f05e8278c9e7b61/389-ds-base-snmp-1.3.8.4-15.el7.x86_64.rpm
Updated by ttereshc over 5 years ago
- Status changed from 5 to CLOSED - CURRENTRELEASE
Problem: RPMs from distribution published twice
Solution: stop publishing Distributions in Packages directory
fixes: #4541 https://pulp.plan.io/issues/4541