modify API with both add_content_units and remove_content_units isn't always correct
Ticket moved to GitHub: "pulp/pulpcore/2046":https://github.com/pulp/pulpcore/issues/2046
In Katello, I was trying to filter the CentOS 7 OS repository (http://mirror.centos.org/centos-7/7/os/x86_64/) to have all content except the RPMs with the name kernel. The repo has just over 10K RPMs. Here are the general steps in the Katello code:
- Create a new repository
- Copy all content units except kernel into the new repo using the modify API:
api.repositories_api.modify(repository_reference.repository_href, add_content_units: content_unit_array, remove_content_units: ['*'])
-> Note: we used two calls to modify for this repo because we only wanted to copy at most 10,000 content units per call. The first call uses remove_content_units, but the second does not.
- Save the version and index from it
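To make the chunking in the steps above concrete, here is a rough sketch of how the calls get split. It is not the actual Katello code: FakeRepositoriesApi, copy_in_chunks, and the hrefs are hypothetical stand-ins that only record calls, so the logic can be run without a Pulp server. The body keys and the 10,000 limit are the ones described above.

```ruby
# Hypothetical stand-in for the Pulp repositories API client. It only records
# the calls it receives; it does not talk to Pulp.
class FakeRepositoriesApi
  attr_reader :calls

  def initialize
    @calls = []
  end

  def modify(repository_href, body)
    @calls << [repository_href, body]
  end
end

# Copy content units in slices of at most chunk_size. Only the first call
# carries remove_content_units: ['*']; later calls only add.
def copy_in_chunks(api, repository_href, content_unit_hrefs, chunk_size: 10_000)
  content_unit_hrefs.each_slice(chunk_size).with_index do |slice, index|
    body = { add_content_units: slice }
    body[:remove_content_units] = ['*'] if index.zero?
    api.modify(repository_href, body)
  end
end

api = FakeRepositoriesApi.new
# Placeholder hrefs; 10,071 matches the number of units we asked for.
units = (1..10_071).map { |i| "/example/content/#{i}/" }
copy_in_chunks(api, '/example/repository/', units)

api.calls.each do |_href, body|
  puts "add=#{body[:add_content_units].length} remove=#{body[:remove_content_units].inspect}"
end
```

With 10,071 units this produces two modify calls, and only the first one combines add_content_units with remove_content_units.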
The first copy looks fine, but the next set of calls to the modify API leaves the last repo version with a very small subset of the content units we asked for (in this case, 146 instead of 10071). The third set of copy actions, however, is correct, and the pattern repeats: every second set of copies (content view publish in Katello) is wrong.
We've seen this also with the RHEL 7.9 Kickstart repository. The first copy is fine, but the next has no RPMs at all. This repo is small enough (~5,000 RPMs) to only require one modify call, so the chunking that we're doing does not seem related.
Our workaround for this is to run the content unit removal as its own modify call instead of combining it with the add in one API call. It seems the issue only relates to calls with both add_content_units and remove_content_units.
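A minimal sketch of the workaround's shape, under the same assumptions as the ticket's snippet: RecordingApi and replace_content are hypothetical names, the hrefs are placeholders, and the recorder only logs calls. Against a real Pulp server each modify call starts an asynchronous task, so real code would presumably wait for the removal task to finish before adding content.

```ruby
# Hypothetical recorder standing in for the Pulp repositories API client.
RecordingApi = Struct.new(:calls) do
  def modify(repository_href, body)
    calls << [repository_href, body]
  end
end

# Workaround shape: never send add_content_units and remove_content_units
# in the same modify call.
def replace_content(api, repository_href, content_unit_hrefs, chunk_size: 10_000)
  # Step 1: clear the repository in a call of its own.
  api.modify(repository_href, { remove_content_units: ['*'] })

  # Step 2: add the wanted units in chunks, with no removal attached.
  content_unit_hrefs.each_slice(chunk_size) do |slice|
    api.modify(repository_href, { add_content_units: slice })
  end
end

api = RecordingApi.new([])
units = (1..10_071).map { |i| "/example/content/#{i}/" } # placeholder hrefs
replace_content(api, '/example/repository/', units)

api.calls.each { |_href, body| puts body.keys.inspect }
```

This costs one extra modify call (and one extra repository version) per copy, but no single call mixes the two parameters.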
Versions:
- pulp-ansible (0.9.0)
- pulp-certguard (1.4.0)
- pulp-container (2.7.1)
- pulp-deb (2.14.1)
- pulp-file (1.8.2)
- pulp-python (3.4.0)
- pulp-rpm (3.14.2)
- pulpcore (3.14.5)