Issue #8306

closed

Improve the speed of syncing repository

Added by hyu about 3 years ago. Updated about 3 years ago.

Status: CLOSED - WONTFIX
Priority: Normal
Assignee: -
Sprint/Milestone: -
Start date:
Due date:
Estimated time:
Severity: 2. Medium
Version: 2.21.1
Platform Release:
OS: RHEL 7
Triaged: No
Groomed: No
Sprint Candidate: No
Tags: Performance, Pulp 2
Sprint:
Quarter:

Description

Cloned from bugzilla https://bugzilla.redhat.com/show_bug.cgi?id=1932735

Description of problem: Because we have been seeing more and more slow capsule sync issues, I decided to take some time to fix the Pulp 2 code and improve sync speed, as we still have some time before migrating to Pulp 3.

Below are the changes I made. With these changes, the RHEL 7 Server RPM repository syncs 40% - 50% faster. Some small repositories, such as the Satellite Tools and RHEL 7 Extras repositories, which currently take a few minutes to sync, will take 1 minute or less to finish.

The code changes address the following issues:

  • Avoid reading unwanted repo metadata fields (such as Primary.xml, Updateinfo.xml) while determining units to download
  • Avoid reading unwanted repo metadata fields (such as Primary.xml, Updateinfo.xml) when removing missing units (Mirror on Sync)
  • Improve the query to purge duplicate units. Previously, it read the whole units_rpm collection (the largest collection). Once the collection reached millions of records, the query became very slow.
  • Skip repository publishing if Errata, Yum repo metadata and Comps have not changed. Previously, publishing was triggered on every full sync.
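
To illustrate the last point only, here is a minimal sketch of a "skip publish when nothing changed" check. This is not the actual patch and does not use Pulp's API: the function names, the fingerprint approach, and the idea of storing one digest per repository are all assumptions made for illustration.

```python
import hashlib


def metadata_fingerprint(metadata_files):
    """Return one digest covering every metadata file's name and content.

    metadata_files is a hypothetical mapping of file name
    (e.g. "updateinfo.xml") to the raw bytes of that file.
    """
    digest = hashlib.sha256()
    for name in sorted(metadata_files):  # sorted so dict ordering does not matter
        digest.update(name.encode("utf-8"))
        digest.update(b"\0")  # separator between name and content hash
        digest.update(hashlib.sha256(metadata_files[name]).digest())
    return digest.hexdigest()


def needs_publish(previous_fingerprint, metadata_files):
    """Decide whether a publish is needed after a sync.

    Returns (publish_needed, current_fingerprint). The caller is assumed
    to store current_fingerprint and pass it back in on the next sync.
    """
    current = metadata_fingerprint(metadata_files)
    return current != previous_fingerprint, current


# Example: the first sync publishes, an unchanged re-sync does not.
synced = {"updateinfo.xml": b"<updates/>", "comps.xml": b"<comps/>"}
first_publish, stored = needs_publish(None, synced)
second_publish, _ = needs_publish(stored, synced)
```

Comparing a single stored digest keeps the check cheap on every sync; a real implementation would also need to account for unit additions and removals, which this sketch ignores.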
