Issue #4541

closed

Repository publishing duplicates RPM files under Packages and Packages/<LETTER> structure

Added by tchellomello over 5 years ago. Updated over 5 years ago.

Status: CLOSED - CURRENTRELEASE
Priority: Normal
Sprint/Milestone:
Start date:
Due date:
Estimated time:
Severity: 3. High
Version:
Platform Release: 2.19.0
OS:
Triaged: No
Groomed: No
Sprint Candidate: No
Tags: Pulp 2
Sprint:
Quarter:

Description

Some repositories get their content duplicated when they are published. For example, the CentOS 7 Base repository ends up with duplicated symlinks, as follows:

[root@pulp ~]# pulp-admin  rpm repo list --details --repo-id centos-7-base
+----------------------------------------------------------------------+
                            RPM Repositories
+----------------------------------------------------------------------+

Id:                   centos-7-base
Display Name:         None
Description:          None
Content Unit Counts:  
  Distribution:        1
  Package Category:    11
  Package Environment: 10
  Package Group:       88
  Package Langpacks:   1
  Rpm:                 10019
Notes:                
Scratchpad:           
  Checksum Type: sha256
Importers:            
  Config:               
    Feed: http://centos.mirror.constant.com/7/os/x86_64/
  Id:                   yum_importer
  Importer Type Id:     yum_importer
  Last Override Config: 
  Last Sync:            2019-03-13T19:58:14Z
  Last Updated:         2019-03-13T19:54:40Z
  Repo Id:              centos-7-base
  Scratchpad:           
    Repomd Revision: 1543161601
Distributors:         
  Auto Publish:         True
  Config:               
    Http:         True
    Https:        True
    Relative URL: 7/os/x86_64/
  Distributor Type Id:  yum_distributor
  Id:                   yum_distributor
  Last Override Config: 
  Last Publish:         2019-03-14T04:17:40Z
  Last Updated:         2019-03-13T19:54:40Z
  Repo Id:              centos-7-base
  Scratchpad:           
  Auto Publish:         False
  Config:               
    Http:         True
    Https:        True
    Relative URL: 7/os/x86_64/
  Distributor Type Id:  export_distributor
  Id:                   export_distributor
  Last Override Config: 
  Last Publish:         None
  Last Updated:         2019-03-13T19:54:40Z
  Repo Id:              centos-7-base
  Scratchpad:           

Once the repository is published, it ends up with duplicated RPMs:

pulp-admin  rpm repo  publish run  --repo-id df05f52b-431e-483d-9878-9ffef41d70c8 --force-full

○ → tree /var/lib/pulp/published/yum/https/repos/7/os/x86_64/Packages/  | grep 389-ds-base
│   ├── 389-ds-base-1.3.8.4-15.el7.x86_64.rpm -> /var/lib/pulp/content/units/rpm/07/cb4bf5d6199cdc7192cbdbbf283cccdef38f18751c53660107b14232d2d349/389-ds-base-1.3.8.4-15.el7.x86_64.rpm
│   ├── 389-ds-base-devel-1.3.8.4-15.el7.x86_64.rpm -> /var/lib/pulp/content/units/rpm/ab/e6f96db9ad2c0f7f097f71b506b25a0d83a4a9926e904100f006361ef66d1f/389-ds-base-devel-1.3.8.4-15.el7.x86_64.rpm
│   ├── 389-ds-base-libs-1.3.8.4-15.el7.x86_64.rpm -> /var/lib/pulp/content/units/rpm/94/5011e802dbf9d470522eac943442a8718e08ba321dc0649a55a0d7f130e363/389-ds-base-libs-1.3.8.4-15.el7.x86_64.rpm
│   └── 389-ds-base-snmp-1.3.8.4-15.el7.x86_64.rpm -> /var/lib/pulp/content/units/rpm/e9/b6b32a0421e87ab83a6ea7346f51e3929746e10eeb2c9b5f05e8278c9e7b61/389-ds-base-snmp-1.3.8.4-15.el7.x86_64.rpm
├── 389-ds-base-1.3.8.4-15.el7.x86_64.rpm -> /var/lib/pulp/content/units/rpm/07/cb4bf5d6199cdc7192cbdbbf283cccdef38f18751c53660107b14232d2d349/389-ds-base-1.3.8.4-15.el7.x86_64.rpm
├── 389-ds-base-devel-1.3.8.4-15.el7.x86_64.rpm -> /var/lib/pulp/content/units/rpm/ab/e6f96db9ad2c0f7f097f71b506b25a0d83a4a9926e904100f006361ef66d1f/389-ds-base-devel-1.3.8.4-15.el7.x86_64.rpm
├── 389-ds-base-libs-1.3.8.4-15.el7.x86_64.rpm -> /var/lib/pulp/content/units/rpm/94/5011e802dbf9d470522eac943442a8718e08ba321dc0649a55a0d7f130e363/389-ds-base-libs-1.3.8.4-15.el7.x86_64.rpm
├── 389-ds-base-snmp-1.3.8.4-15.el7.x86_64.rpm -> /var/lib/pulp/content/units/rpm/e9/b6b32a0421e87ab83a6ea7346f51e3929746e10eeb2c9b5f05e8278c9e7b61/389-ds-base-snmp-1.3.8.4-15.el7.x86_64.rpm

This creates a big problem when exporting the content as an ISO, because the exported repo will be twice as big.

The repodata is created correctly, as it only references the files under the new structure, Packages/<first_letter>/:

○ → zcat  /var/lib/pulp/published/yum/https/repos/7/os/x86_64/Packages/repodata/a3f98f3b850725e0ac52472653efd309be6ba436296e66b1b997ca92c95fcdf1-primary.xml.gz  | grep 'href' | grep 389-ds-base
  <location href="Packages/3/389-ds-base-libs-1.3.8.4-15.el7.x86_64.rpm"/>
  <location href="Packages/3/389-ds-base-1.3.8.4-15.el7.x86_64.rpm"/>
  <location href="Packages/3/389-ds-base-snmp-1.3.8.4-15.el7.x86_64.rpm"/>
  <location href="Packages/3/389-ds-base-devel-1.3.8.4-15.el7.x86_64.rpm"/>
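
The same check can be scripted. Below is a minimal Python sketch, assuming a primary.xml.gz in the usual yum metadata layout. The sample document is hand-written for illustration (it is not the real metadata file), and the helper names `location_hrefs` and `flat_hrefs` are hypothetical:

```python
import gzip
import io
import xml.etree.ElementTree as ET


def location_hrefs(primary_xml_gz_bytes):
    """Return every <location href="..."> value found in a gzipped primary.xml."""
    with gzip.GzipFile(fileobj=io.BytesIO(primary_xml_gz_bytes)) as fh:
        root = ET.parse(fh).getroot()
    hrefs = []
    for elem in root.iter():
        # Strip the XML namespace, if any, so the tag is matched by local name.
        if elem.tag.rsplit('}', 1)[-1] == 'location' and 'href' in elem.attrib:
            hrefs.append(elem.attrib['href'])
    return hrefs


def flat_hrefs(hrefs):
    """Hrefs pointing directly into Packages/ rather than Packages/<letter>/."""
    return [h for h in hrefs if h.startswith('Packages/') and h.count('/') == 1]


# Hand-written sample standing in for the real metadata file.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<metadata xmlns="http://linux.duke.edu/metadata/common" packages="2">
  <package type="rpm">
    <location href="Packages/3/389-ds-base-1.3.8.4-15.el7.x86_64.rpm"/>
  </package>
  <package type="rpm">
    <location href="Packages/3/389-ds-base-libs-1.3.8.4-15.el7.x86_64.rpm"/>
  </package>
</metadata>
"""

buf = io.BytesIO()
with gzip.GzipFile(fileobj=buf, mode='wb') as gz:
    gz.write(SAMPLE)

hrefs = location_hrefs(buf.getvalue())
print(hrefs)
print(flat_hrefs(hrefs))  # an empty list means no href points at the flat layout
```

If `flat_hrefs` comes back empty, as it does for the sample, nothing in the metadata refers to the symlinks placed directly under Packages/.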

Related issues

Copied to Pulp - Test #4542: [pulp_rpm] - Repository publishing duplicates RPM files under Packages and Packages/<LETTER> structure (CLOSED - COMPLETE, kersom)
Actions #1

Updated by tchellomello over 5 years ago

Troubleshooting steps:

**** Adding an rpdb breakpoint to process_main() of PublishRpmStep, we can see the following:

/usr/lib/python2.7/site-packages/pulp_rpm/plugins/distributors/yum/publish.py
-----
  482     def process_main(self, item=None):
  483         """
  484         Link the unit to the content directory and the package_dir
  485 
  486         :param item: The item to process or none if this get_iterator is not defined
  487         :type item: pulp_rpm.plugins.db.models.RPM or pulp_rpm.plugins.db.models.SRPM
  488         """
  489         import rpdb; rpdb.set_trace()
  490         unit = item
  491         source_path = unit._storage_path
  492         relative_path = file_utils.make_packages_relative_path(unit.filename)
  493         destination_path = os.path.join(self.get_working_dir(), relative_path)
  494         plugin_misc.create_symlink(source_path, destination_path)
  495         for package_dir in self.dist_step.package_dirs:
  496             destination_path = os.path.join(package_dir, unit.filename)
  497             plugin_misc.create_symlink(source_path, destination_path)
  498 
  499         for context in (self.file_lists_context, self.other_context, self.primary_context):
  500             context.add_unit_metadata(unit)

Running the code, we get:

**** source_path correctly points to the content unit file

    (Pdb) source_path 
    u'/var/lib/pulp/content/units/rpm/40/a9bfdeba456727e151d53adc41cc688d4360a71dde8e73a18150d72d6d300b/tigervnc-license-1.8.0-13.el7.noarch.rpm'

**** relative_path has the correct, expected location for that file

    (Pdb) relative_path
    u'Packages/t/tigervnc-license-1.8.0-13.el7.noarch.rpm'

**** destination_path has the temporary worker location

    (Pdb) destination_path
    u'/var/cache/pulp/reserved_resource_worker-2@pulp.mmello.local/c8c776b9-71a5-418c-8b20-bdcb2aa55016/Packages/t/tigervnc-license-1.8.0-13.el7.noarch.rpm'
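
The three values above fit together roughly as in the following sketch. It is only an illustration: `packages_relative_path` is a hypothetical stand-in for `file_utils.make_packages_relative_path`, written to match the observed output (`Packages/<first character>/<filename>`). Lowercasing that character is an assumption, and the working directory is a placeholder.

```python
import posixpath


def packages_relative_path(filename):
    """Hypothetical stand-in: place a package under Packages/<first character>/."""
    # Assumption: the bucket is the lowercased first character of the file name.
    return posixpath.join('Packages', filename[0].lower(), filename)


working_dir = '/var/cache/pulp/WORKER/TASK'  # placeholder, not a real path
filename = 'tigervnc-license-1.8.0-13.el7.noarch.rpm'

relative_path = packages_relative_path(filename)
destination_path = posixpath.join(working_dir, relative_path)

print(relative_path)
print(destination_path)
```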

Looking at the temporary worker directory, nothing has been placed in the Packages directory yet:

        [root@pulp c8c776b9-71a5-418c-8b20-bdcb2aa55016]# tree
        .
        ├── images
        │   ├── boot.iso -> /var/lib/pulp/content/units/distribution/de/d64602f1a2165af46f6cd4c8c2645a0422607173fb34617458c9f419e2c7bf/images/boot.iso
        │   └── pxeboot
        │       ├── initrd.img -> /var/lib/pulp/content/units/distribution/de/d64602f1a2165af46f6cd4c8c2645a0422607173fb34617458c9f419e2c7bf/images/pxeboot/initrd.img
        │       └── vmlinuz -> /var/lib/pulp/content/units/distribution/de/d64602f1a2165af46f6cd4c8c2645a0422607173fb34617458c9f419e2c7bf/images/pxeboot/vmlinuz
        ├── LiveOS
        │   └── squashfs.img -> /var/lib/pulp/content/units/distribution/de/d64602f1a2165af46f6cd4c8c2645a0422607173fb34617458c9f419e2c7bf/LiveOS/squashfs.img
        └── repodata
            ├── filelists.xml.gz
            ├── other.xml.gz
            ├── primary.xml.gz
            └── repomd.xml

When `plugin_misc.create_symlink(source_path, destination_path)` is called, the symlink is created as expected:

        Every 1.0s: tree Packages/                                                                                                                                                                         Thu Mar 14 00:10:57 2019

        Packages/
        └── t
            └── tigervnc-license-1.8.0-13.el7.noarch.rpm -> /var/lib/pulp/content/units/rpm/40/a9bfdeba456727e151d53adc41cc688d4360a71dde8e73a18150d72d6d300b/tigervnc-license-1.8.0-13.el7.noarch.rpm

       1 directory, 1 file

At this point, a loop runs that calls create_symlink again for every member of `self.dist_step.package_dirs`. In this case, that list contains the following:

        (Pdb) self.dist_step.package_dirs 
        [u'/var/cache/pulp/reserved_resource_worker-2@pulp.mmello.local/c8c776b9-71a5-418c-8b20-bdcb2aa55016/Packages']

    ** So running the loop we have:

        495              for package_dir in self.dist_step.package_dirs:
        496  ->                destination_path = os.path.join(package_dir, unit.filename)
        497                  plugin_misc.create_symlink(source_path, destination_path)

This loop is where the problem happens:

        (Pdb) destination_path
        u'/var/cache/pulp/reserved_resource_worker-2@pulp.mmello.local/c8c776b9-71a5-418c-8b20-bdcb2aa55016/Packages/tigervnc-license-1.8.0-13.el7.noarch.rpm'

        Every 1.0s: tree Packages/                                                                                                                                                                         Thu Mar 14 00:14:29 2019

        Packages/
        ├── t
        │   └── tigervnc-license-1.8.0-13.el7.noarch.rpm -> /var/lib/pulp/content/units/rpm/40/a9bfdeba456727e151d53adc41cc688d4360a71dde8e73a18150d72d6d300b/tigervnc-license-1.8.0-13.el7.noarch.rpm
        └── tigervnc-license-1.8.0-13.el7.noarch.rpm -> /var/lib/pulp/content/units/rpm/40/a9bfdeba456727e151d53adc41cc688d4360a71dde8e73a18150d72d6d300b/tigervnc-license-1.8.0-13.el7.noarch.rpm

        1 directory, 2 files
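
The sequence observed in the debugger can be reproduced outside Pulp. Below is a minimal sketch that works in a throwaway temporary directory. `create_symlink` is a simplified stand-in for `plugin_misc.create_symlink`, `publish_unit` is a hypothetical helper that mirrors the two linking steps, and lowercasing the first character is an assumption:

```python
import os
import tempfile


def create_symlink(source_path, link_path):
    """Simplified stand-in: make the parent directory if needed, then link."""
    parent = os.path.dirname(link_path)
    if not os.path.isdir(parent):
        os.makedirs(parent)
    os.symlink(source_path, link_path)


def publish_unit(source_path, filename, working_dir, package_dirs):
    """Mirror the two linking steps of process_main() shown above."""
    # Step 1: link under Packages/<first character>/ (what repodata references).
    nested = os.path.join(working_dir, 'Packages', filename[0].lower(), filename)
    create_symlink(source_path, nested)
    # Step 2: link again, flat, into every package_dir.
    for package_dir in package_dirs:
        create_symlink(source_path, os.path.join(package_dir, filename))


root = tempfile.mkdtemp()
content = os.path.join(root, 'content', 'tigervnc-license-1.8.0-13.el7.noarch.rpm')
os.makedirs(os.path.dirname(content))
open(content, 'w').close()  # empty file standing in for the RPM

working_dir = os.path.join(root, 'work')
packages = os.path.join(working_dir, 'Packages')
publish_unit(content, os.path.basename(content), working_dir, [packages])

links = []
for dirpath, _dirs, files in os.walk(packages):
    for name in files:
        links.append(os.path.relpath(os.path.join(dirpath, name), working_dir))
print(sorted(links))
```

Because the only entry in `package_dirs` is the same `Packages` directory that already holds the nested link, the unit ends up linked twice.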

As a quick test, after commenting out the for loop over self.dist_step.package_dirs, the repository was created correctly.

        [root@pulp ~]# pulp-admin  rpm repo  publish run  --repo-id centos-7-base --force-full 

            [...SNIP...]

        Publishing RPMs
        [==================================================] 100%
        10019 of 10019 items
        ... completed

Then the repo got published as expected:

        [root@pulp ~]# ls -la /var/lib/pulp/published/yum/https/repos/7/os/x86_64/Packages/*.rpm  2>/dev/null| wc -l 
        0

        [root@pulp ~]# ls -la /var/lib/pulp/published/yum/https/repos/7/os/x86_64/Packages/*/*.rpm  2>/dev/null| wc -l 
        10019

[root@pulp ~]# tree /var/lib/pulp/published/yum/https/repos/7/os/x86_64/Packages/ | head
/var/lib/pulp/published/yum/https/repos/7/os/x86_64/Packages/
├── 3
│   ├── 389-ds-base-1.3.8.4-15.el7.x86_64.rpm -> /var/lib/pulp/content/units/rpm/07/cb4bf5d6199cdc7192cbdbbf283cccdef38f18751c53660107b14232d2d349/389-ds-base-1.3.8.4-15.el7.x86_64.rpm
│   ├── 389-ds-base-devel-1.3.8.4-15.el7.x86_64.rpm -> /var/lib/pulp/content/units/rpm/ab/e6f96db9ad2c0f7f097f71b506b25a0d83a4a9926e904100f006361ef66d1f/389-ds-base-devel-1.3.8.4-15.el7.x86_64.rpm
│   ├── 389-ds-base-libs-1.3.8.4-15.el7.x86_64.rpm -> /var/lib/pulp/content/units/rpm/94/5011e802dbf9d470522eac943442a8718e08ba321dc0649a55a0d7f130e363/389-ds-base-libs-1.3.8.4-15.el7.x86_64.rpm
│   └── 389-ds-base-snmp-1.3.8.4-15.el7.x86_64.rpm -> /var/lib/pulp/content/units/rpm/e9/b6b32a0421e87ab83a6ea7346f51e3929746e10eeb2c9b5f05e8278c9e7b61/389-ds-base-snmp-1.3.8.4-15.el7.x86_64.rpm
├── a
│   ├── a2ps-4.14-23.el7.i686.rpm -> /var/lib/pulp/content/units/rpm/20/8453cf3eda8583d4c009b38311588a571e8833f9d557f65c1b55c35bc11ee1/a2ps-4.14-23.el7.i686.rpm
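
The two `ls | wc -l` checks above can also be expressed in Python, which may be handy in a test. This is a sketch only: `count_rpm_links` is a hypothetical helper, and the directory it inspects here is a throwaway tree, not a real published repository.

```python
import glob
import os
import tempfile


def count_rpm_links(packages_dir):
    """Return (flat, nested): *.rpm entries directly in packages_dir vs one level down."""
    flat = glob.glob(os.path.join(packages_dir, '*.rpm'))
    nested = glob.glob(os.path.join(packages_dir, '*', '*.rpm'))
    return len(flat), len(nested)


# Throwaway tree standing in for a published Packages/ directory.
packages_dir = os.path.join(tempfile.mkdtemp(), 'Packages')
for name in ('389-ds-base-1.3.8.4-15.el7.x86_64.rpm', 'a2ps-4.14-23.el7.i686.rpm'):
    bucket = os.path.join(packages_dir, name[0])
    os.makedirs(bucket)
    open(os.path.join(bucket, name), 'w').close()

flat, nested = count_rpm_links(packages_dir)
print(flat, nested)
```

A non-zero first number would indicate files placed directly under Packages/, which is the duplication described in this issue.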

Basically, for the fix we need to either validate whether that loop is still necessary or add a check that verifies whether the file already exists, to avoid duplication.
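
The second option could look roughly like the sketch below. It is an illustration under stated assumptions, not the fix that was eventually merged: `link_into_package_dirs` and its `already_linked` parameter are hypothetical names. The idea is to skip the flat link whenever a package directory is the same `Packages` directory that already holds the `Packages/<first character>/` link.

```python
import os
import tempfile


def link_into_package_dirs(source_path, filename, package_dirs, already_linked):
    """Create flat links, skipping directories that already hold a nested link.

    already_linked is the Packages/<first character>/<filename> link created
    earlier for the same unit (hypothetical parameter).
    """
    created = []
    # Packages/<char>/<filename> -> Packages
    nested_root = os.path.dirname(os.path.dirname(already_linked))
    for package_dir in package_dirs:
        if os.path.realpath(package_dir) == os.path.realpath(nested_root):
            continue  # the unit is already published below this directory
        destination = os.path.join(package_dir, filename)
        if not os.path.lexists(destination):
            os.symlink(source_path, destination)
            created.append(destination)
    return created


# Throwaway layout: one Packages directory and one unrelated package directory.
root = tempfile.mkdtemp()
source = os.path.join(root, 'unit.rpm')
open(source, 'w').close()
packages = os.path.join(root, 'Packages')
nested = os.path.join(packages, 'u', 'unit.rpm')
os.makedirs(os.path.dirname(nested))
os.symlink(source, nested)
other = os.path.join(root, 'Other')
os.makedirs(other)

created = link_into_package_dirs(source, 'unit.rpm', [packages, other], nested)
print(created)
```

With this guard, only the unrelated directory receives a flat link; `Packages` keeps just the nested one.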

mmello

Actions #3

Updated by bherring over 5 years ago

  • Copied to Test #4542: [pulp_rpm] - Repository publishing duplicates RPM files under Packages and Packages/<LETTER> structure added
Actions #4

Updated by ttereshc over 5 years ago

  • Project changed from Pulp to RPM Support
  • Subject changed from [pulp_rpm] - Repository publishing duplicates RPM files under Packages and Packages/<LETTER> structure to Repository publishing duplicates RPM files under Packages and Packages/<LETTER> structure
  • Category deleted (14)
  • Severity changed from 4. Urgent to 3. High
  • Tags deleted (Release Engineering)

Added by dkliban@redhat.com over 5 years ago

Revision c4c2ea5b

Problem: RPMs from distribution published twice

Solution: stop publishing Distributions in Packages directory

fixes: #4541 https://pulp.plan.io/issues/4541

Actions #5

Updated by dkliban@redhat.com over 5 years ago

  • Status changed from NEW to POST
  • Assignee set to dkliban@redhat.com
Actions #6

Updated by dkliban@redhat.com over 5 years ago

  • Platform Release set to 2.19.0
Actions #7

Updated by ttereshc over 5 years ago

  • Sprint/Milestone set to 2.19.0
Actions #9

Updated by dkliban@redhat.com over 5 years ago

  • Status changed from POST to MODIFIED
Actions #11

Updated by ttereshc over 5 years ago

  • Status changed from MODIFIED to 5
Actions #12

Updated by kersom over 5 years ago

Manually tested. Not suitable for automation.

Pulp Version

[root@localhost ~]# rpm -qa | grep pulp | sort
pulp-admin-client-2.19.0-0.1.rc.el7.noarch
pulp-deb-admin-extensions-1.9.0-0.2.rc.git.110.c3056d7.el7.noarch
pulp-deb-plugins-1.9.0-0.2.rc.git.110.c3056d7.el7.noarch
pulp-docker-admin-extensions-3.2.3-0.1.rc.el7.noarch
pulp-docker-plugins-3.2.3-0.1.rc.el7.noarch
pulp-ostree-admin-extensions-1.4.0-1.el7.noarch
pulp-ostree-plugins-1.4.0-1.el7.noarch
pulp-puppet-admin-extensions-2.19.0-0.1.rc.el7.noarch
pulp-puppet-plugins-2.19.0-0.1.rc.el7.noarch
pulp-puppet-tools-2.19.0-0.1.rc.el7.noarch
pulp-python-admin-extensions-2.0.3-1.el7.noarch
pulp-python-plugins-2.0.3-1.el7.noarch
pulp-rpm-admin-extensions-2.19.0-0.1.rc.el7.noarch
pulp-rpm-plugins-2.19.0-0.1.rc.el7.noarch
pulp-selinux-2.19.0-0.1.rc.el7.noarch
pulp-server-2.19.0-0.1.rc.el7.noarch
python-isodate-0.5.0-4.pulp.el7.noarch
python-pulp-bindings-2.19.0-0.1.rc.el7.noarch
python-pulp-client-lib-2.19.0-0.1.rc.el7.noarch
python-pulp-common-2.19.0-0.1.rc.el7.noarch
python-pulp-deb-common-1.9.0-0.2.rc.git.110.c3056d7.el7.noarch
python-pulp-docker-common-3.2.3-0.1.rc.el7.noarch
python-pulp-oid_validation-2.19.0-0.1.rc.el7.noarch
python-pulp-ostree-common-1.4.0-1.el7.noarch
python-pulp-puppet-common-2.19.0-0.1.rc.el7.noarch
python-pulp-python-common-2.0.3-1.el7.noarch
python-pulp-repoauth-2.19.0-0.1.rc.el7.noarch
python-pulp-rpm-common-2.19.0-0.1.rc.el7.noarch
python-pulp-streamer-2.19.0-0.1.rc.el7.noarch

commands:


pulp-admin login -u admin -p admin
pulp-admin rpm repo create --repo-id foo --feed http://centos.mirror.constant.com/7/os/x86_64/
pulp-admin rpm repo sync run --repo-id foo
pulp-admin rpm repo publish run --repo-id foo
pulp-admin  rpm repo list --details --repo-id foo
pulp-admin  rpm repo  publish run  --repo-id foo --force-full
[root@localhost ~]# pulp-admin  rpm repo list --details --repo-id foo
+----------------------------------------------------------------------+
                            RPM Repositories
+----------------------------------------------------------------------+

Id:                   foo
Display Name:         None
Description:          None
Content Unit Counts:  
  Distribution:        1
  Package Category:    11
  Package Environment: 10
  Package Group:       88
  Package Langpacks:   1
  Rpm:                 10016
Notes:                
Scratchpad:           
  Checksum Type: sha256
Importers:            
  Config:               
    Feed: http://centos.mirror.constant.com/7/os/x86_64/
  Id:                   yum_importer
  Importer Type Id:     yum_importer
  Last Override Config: 
  Last Sync:            2019-01-29T22:45:35Z
  Last Updated:         2019-01-29T22:30:38Z
  Repo Id:              foo
  Scratchpad:           None
Distributors:         
  Auto Publish:         True
  Config:               
    Http:         False
    Https:        True
    Relative URL: 7/os/x86_64/
  Distributor Type Id:  yum_distributor
  Id:                   yum_distributor
  Last Override Config: 
  Last Publish:         2019-01-29T22:49:52Z
  Last Updated:         2019-01-29T22:30:38Z
  Repo Id:              foo
  Scratchpad:           
  Auto Publish:         False
  Config:               
    Http:         False
    Https:        True
    Relative URL: 7/os/x86_64/
  Distributor Type Id:  export_distributor
  Id:                   export_distributor
  Last Override Config: 
  Last Publish:         None
  Last Updated:         2019-01-29T22:30:38Z
  Repo Id:              foo
  Scratchpad:           

No duplicate RPMs present.

[root@localhost ~]# tree /var/lib/pulp/published/yum/https/repos/7/os/x86_64/Packages/  | grep 389-ds-base
│   ├── 389-ds-base-1.3.8.4-15.el7.x86_64.rpm -> /var/lib/pulp/content/units/rpm/07/cb4bf5d6199cdc7192cbdbbf283cccdef38f18751c53660107b14232d2d349/389-ds-base-1.3.8.4-15.el7.x86_64.rpm
│   ├── 389-ds-base-devel-1.3.8.4-15.el7.x86_64.rpm -> /var/lib/pulp/content/units/rpm/ab/e6f96db9ad2c0f7f097f71b506b25a0d83a4a9926e904100f006361ef66d1f/389-ds-base-devel-1.3.8.4-15.el7.x86_64.rpm
│   ├── 389-ds-base-libs-1.3.8.4-15.el7.x86_64.rpm -> /var/lib/pulp/content/units/rpm/94/5011e802dbf9d470522eac943442a8718e08ba321dc0649a55a0d7f130e363/389-ds-base-libs-1.3.8.4-15.el7.x86_64.rpm
│   └── 389-ds-base-snmp-1.3.8.4-15.el7.x86_64.rpm -> /var/lib/pulp/content/units/rpm/e9/b6b32a0421e87ab83a6ea7346f51e3929746e10eeb2c9b5f05e8278c9e7b61/389-ds-base-snmp-1.3.8.4-15.el7.x86_64.rpm
Actions #13

Updated by ttereshc over 5 years ago

  • Status changed from 5 to CLOSED - CURRENTRELEASE
Actions #14

Updated by bmbouter over 5 years ago

  • Tags Pulp 2 added
