Issue #8890
closed
Publishing a repository can take a long time to finish if the same errata are in many synced repositories
Description
Description of problem: Pulp can take more than an hour to publish a repository when a large number of repositories have been synced from upstream and the same errata exist in many of them, for example across RHEL 7.x, RHEL 7 EUS, and different arches.
The more repositories share the same erratum, the more "erratum pkglist" entries are created in MongoDB for it, which degrades performance.
For example:
db.erratum_pkglists.find({errata_id: "RHSA-2018:2557"}).count()
387
db.erratum_pkglists.find({errata_id: "RHBA-2019:2180"}).count()
217
When publishing errata, Pulp uses a query like the ones above to fetch every package list of each erratum. This takes a long time when the query returns many package lists and each package list consists of many packages.
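To make the cost concrete, here is a minimal in-memory sketch (it does not touch MongoDB or Pulp code). The document layout with `errata_id`, `repo_id`, and `packages` keys, the helper names, and the figure of 50 packages per list are all assumptions for illustration; only the count of 387 package lists comes from the query output above. The deduplication shown is one possible way to cut repeated work, not necessarily what the eventual changeset does.

```python
def make_pkglists(errata_id, repo_count, packages):
    """Build one pkglist document per repository (hypothetical layout)."""
    return [
        {"errata_id": errata_id, "repo_id": f"repo-{i}", "packages": list(packages)}
        for i in range(repo_count)
    ]


def find_by_errata(pkglists, errata_id):
    """Stand-in for querying the collection by errata_id only."""
    return [doc for doc in pkglists if doc["errata_id"] == errata_id]


def packages_scanned(docs):
    """Number of package entries a naive publish would iterate over."""
    return sum(len(doc["packages"]) for doc in docs)


def unique_packages(docs):
    """Collapse identical package entries, keeping first-seen order."""
    seen = set()
    ordered = []
    for doc in docs:
        for name in doc["packages"]:
            if name not in seen:
                seen.add(name)
                ordered.append(name)
    return ordered


# Assumption: every repository carries the same 50 packages for this erratum.
packages = [f"pkg-{n}.rpm" for n in range(50)]
store = make_pkglists("RHSA-2018:2557", 387, packages)

docs = find_by_errata(store, "RHSA-2018:2557")
print(len(docs), packages_scanned(docs), len(unique_packages(docs)))
```

Under these assumptions a single erratum forces the publisher to walk 387 x 50 package entries even though only 50 are distinct, and that multiplier is paid again for every erratum in the repository.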
As shown below, the "Publishing Errata" step is very slow. After 53 minutes it has processed only 2073 of 4789 errata, so it will take more than an hour to finish.
...
{
  "description": "Publishing Errata",
  "details": "",
  "error_details": [],
  "items_total": 4789,
  "num_failures": 0,
  "num_processed": 2073,
  "num_success": 2073,
  "state": "IN_PROGRESS",
  "step_id": "2f09190d-013a-4300-9445-eccb52ad94fe",
  "step_type": "errata"
},
...
"start_time": "2021-06-09T12:41:13Z",
date
Wed Jun 9 13:32:41 UTC 2021
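The "more than an hour" estimate can be checked with a quick calculation. This is a rough sketch that assumes the step keeps processing errata at the same average rate; the function name is made up for illustration, and the inputs are the 53 minutes, 2073 processed, and 4789 total reported above.

```python
def estimate_remaining_minutes(elapsed_minutes, processed, total):
    """Linear extrapolation: assumes a constant average processing rate."""
    if processed <= 0:
        raise ValueError("need at least one processed item to extrapolate")
    rate = processed / elapsed_minutes  # errata per minute so far
    return (total - processed) / rate


elapsed = 53.0      # minutes, as stated in the report
processed = 2073    # "num_processed" from the task status
total = 4789        # "items_total" from the task status

remaining = estimate_remaining_minutes(elapsed, processed, total)
print(f"rate: {processed / elapsed:.1f} errata/min")
print(f"remaining: {remaining:.0f} min, total: {elapsed + remaining:.0f} min")
```

At roughly 39 errata per minute, the remaining 2716 errata need about 69 more minutes, which puts the whole step at around two hours.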
Updated by pulpbot over 3 years ago
- Status changed from NEW to POST
Updated by dkliban@redhat.com over 3 years ago
- Project changed from Pulp to RPM Support
Added by hyu over 3 years ago
Updated by hyu over 3 years ago
- Status changed from POST to MODIFIED
Applied in changeset 221bf77b5177c4ec6002f2ccdad281a1a8a83c44.
Fix slow publish when errata are associated to many repos
closes: #8890 https://pulp.plan.io/issues/8890