Issue #9553

Publishing repository with large metadata may consume high memory when calculating checksum of the metadata.

Added by hyu 3 months ago. Updated 2 months ago.

Status: MODIFIED
Priority: Normal
Assignee: -
Sprint/Milestone: -
Start date:
Due date:
Estimated time:
Severity: 2. Medium
Version: 2.21.1
Platform Release:
OS:
Triaged: Yes
Groomed: No
Sprint Candidate: No
Tags: Katello, Performance, Pulp 2
Sprint:
Quarter:

Description

Pulp consumes a large amount of memory when publishing the RHEL 7 repository. This happens while Pulp is calculating the checksum of the metadata: it reads the whole metadata file into memory at once to calculate the checksum. For example, the other.xml.gz (compressed) for the RHEL 7 repository is about 837MB in size. Reading the entire file into memory causes the Pulp worker to consume more than 1GB of RAM.

https://github.com/pulp/pulp/blob/2-master/server/pulp/plugins/util/metadata_writer.py#L99-L101
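The difference between reading the whole file and reading it in pieces can be illustrated with a short sketch. This is not the code from the linked file or from the eventual fix; the function name `file_checksum`, the `CHUNK_SIZE` constant, and the 1 MiB default are hypothetical, chosen only for illustration.

```python
import hashlib

# Hypothetical chunk size; any bounded value keeps memory use flat.
CHUNK_SIZE = 1024 * 1024


def file_checksum(path, checksum_type="sha256", chunk_size=CHUNK_SIZE):
    """Return the hex digest of a file without loading it all into memory."""
    hasher = hashlib.new(checksum_type)
    with open(path, "rb") as handle:
        # read() returns b"" at end of file, which stops the iteration.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()
```

With this pattern, peak memory is bounded by roughly one chunk regardless of file size, whereas `hashlib.new(...).update(handle.read())` needs a buffer as large as the file itself.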

How to reproduce:

  1. Sync the RHEL 7 repository.
  2. After the sync completes, manually force a full publish and run the command below to observe the memory usage.

watch -n 1 'ps -aux | grep resource_worker'

  3. Memory usage stays between 200MB and 350MB for most of the publish, but suddenly climbs to about 1.1GB for about 3 seconds (around the time the publish rpms step is finalized) and then drops back to just over 200MB.
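The same spike can be observed in isolation, without a Pulp installation. The sketch below is only an illustration of the effect and assumes Python 3 (for `tracemalloc`); all function names are hypothetical, and the 8 MiB zero-filled temporary file stands in for the real metadata file. It compares the peak traced allocation when hashing a file in one read versus in 64 KiB chunks.

```python
import hashlib
import os
import tempfile
import tracemalloc


def peak_bytes(func, *args):
    """Run func(*args) and return the peak memory traced by tracemalloc."""
    tracemalloc.start()
    try:
        func(*args)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak


def checksum_whole(path):
    # Loads the entire file into memory before hashing.
    with open(path, "rb") as handle:
        return hashlib.sha256(handle.read()).hexdigest()


def checksum_chunked(path, chunk_size=64 * 1024):
    # Hashes the file one bounded chunk at a time.
    hasher = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def compare_peaks(size=8 * 1024 * 1024):
    """Return (whole_file_peak, chunked_peak) in bytes for a temp file."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(b"\0" * size)
        return peak_bytes(checksum_whole, path), peak_bytes(checksum_chunked, path)
    finally:
        os.remove(path)
```

Calling `compare_peaks()` returns two numbers: the first should be at least the size of the file, while the second should stay close to the chunk size.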

Associated revisions

Revision a58bca3c View on GitHub
Added by hyu 3 months ago

Reduce the memory usage when calculating checksum

Read the metadata file in chunk when calculating its checksum to save memory.

closes: #9553 https://pulp.plan.io/issues/9553

Revision 03459527 View on GitHub
Added by hyu 2 months ago

Reduce the memory usage when calculating checksum

Read the metadata file in chunk when calculating its checksum to save memory.

closes: #9553 https://pulp.plan.io/issues/9553

History

#1 Updated by pulpbot 3 months ago

  • Status changed from NEW to POST

#2 Updated by dalley 3 months ago

  • Triaged changed from No to Yes

#3 Updated by hyu 2 months ago

  • Status changed from POST to MODIFIED
