Issue #8890

closed

Publishing a repository can take a long time to finish if the same errata exist in many synced repositories

Added by hyu over 3 years ago. Updated over 3 years ago.

Status: MODIFIED
Priority: Normal
Assignee: -
Sprint/Milestone: -
Start date:
Due date:
Estimated time:
Severity: 2. Medium
Version: 2.21.1
Platform Release:
OS: RHEL 7
Triaged: No
Groomed: No
Sprint Candidate: No
Tags: Performance, Pulp 2
Sprint:
Quarter:

Description

Description of problem: Pulp can take more than an hour to publish a repository when a large number of repositories have been synced from upstream and the same errata exist in many of them, for example across RHEL 7.x, RHEL 7 EUS, and different arches.

The more repositories that contain the same erratum, the more "erratum pkglist" entries are created in MongoDB, which can degrade performance.

For example:

db.erratum_pkglists.find({errata_id: "RHSA-2018:2557"}).count()
387
db.erratum_pkglists.find({errata_id: "RHBA-2019:2180"}).count()
217

When publishing errata, Pulp uses the query above to get all package lists for each erratum. This takes a long time when the query returns many package lists and each package list consists of many packages.
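To make the scaling concrete, here is a minimal sketch of the cost model, not Pulp's actual publish code. It assumes each pkglist document has a "collections" list whose items hold a "packages" list, and that packages are identified by "filename". The function name count_package_entries and the figure of 40 packages per pkglist are hypothetical. Only the count of 387 pkglists comes from the query output above.

```python
def count_package_entries(pkglists):
    """Return (entries_scanned, unique_packages) for one erratum's pkglists.

    Assumed document shape (an illustration, not confirmed from Pulp's schema):
    {"errata_id": ..., "collections": [{"packages": [{"filename": ...}, ...]}]}
    """
    scanned = 0
    unique = set()
    for pkglist in pkglists:
        for collection in pkglist["collections"]:
            for pkg in collection["packages"]:
                scanned += 1                 # every duplicate entry is still visited
                unique.add(pkg["filename"])  # but only distinct packages matter
    return scanned, len(unique)


# Synthetic data: 387 identical pkglists (count taken from the report),
# each with 40 packages (an assumed number, for illustration only).
packages = [{"filename": "pkg-%d.rpm" % i} for i in range(40)]
pkglists = [
    {"errata_id": "RHSA-2018:2557", "collections": [{"packages": packages}]}
    for _ in range(387)
]

scanned, unique = count_package_entries(pkglists)
print("entries scanned:", scanned, "unique packages:", unique)
```

Under these assumptions the work per erratum grows with the number of repositories that carry it, even though the set of distinct packages stays the same.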

As we can see below, the "Publish Errata" step is very slow. After 53 minutes, it has processed only 2073 of 4789 errata. It will take more than an hour to finish.

...
{
  "description": "Publishing Errata",
  "details": "",
  "error_details": [],
  "items_total": 4789,
  "num_failures": 0,
  "num_processed": 2073,
  "num_success": 2073,
  "state": "IN_PROGRESS",
  "step_id": "2f09190d-013a-4300-9445-eccb52ad94fe",
  "step_type": "errata"
},
...
"start_time": "2021-06-09T12:41:13Z",

date

Wed Jun 9 13:32:41 UTC 2021
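As a rough check on the "more than an hour" estimate, the remaining time can be extrapolated from the task's start_time, the date output, and the progress counters in the report. This is only a sketch that assumes the per-erratum rate stays constant, which may not hold in practice. The helper name estimate_remaining_seconds is hypothetical.

```python
from datetime import datetime, timezone


def estimate_remaining_seconds(start, now, processed, total):
    """Linear extrapolation; assumes the per-item rate stays constant."""
    if processed <= 0:
        raise ValueError("no items processed yet, cannot extrapolate")
    elapsed = (now - start).total_seconds()
    return elapsed * (total - processed) / processed


# Values taken from the task output and the `date` command in the report.
start = datetime(2021, 6, 9, 12, 41, 13, tzinfo=timezone.utc)
now = datetime(2021, 6, 9, 13, 32, 41, tzinfo=timezone.utc)

elapsed = (now - start).total_seconds()
remaining = estimate_remaining_seconds(start, now, processed=2073, total=4789)

print("elapsed minutes: %.1f" % (elapsed / 60))
print("estimated remaining minutes: %.1f" % (remaining / 60))
```

With these numbers the step has been running for a little over 51 minutes and would need roughly 67 more at the same rate, so the errata step alone would take close to two hours.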
