Issue #3172: Celery worker consumes large number of memory when regenerating applicability for a consumer that binds to many repositories with many errata. - RPM Support - Pulp

Issue #3172

closed

Added by hyu about 7 years ago. Updated over 5 years ago.

Status: CLOSED - CURRENTRELEASE
Priority: Normal
Assignee: ttereshc
Sprint/Milestone:
Start date:
Due date:
Estimated time:
Severity: 2. Medium
Version: 2.8.7
Platform Release: 2.16.2
OS: RHEL 7
Triaged: Yes
Groomed:
Sprint Candidate: Yes
Tags: Pulp 2
Sprint: Sprint 37
Quarter:

Description

The Celery worker initially consumes about 60 MB of RAM. After regenerating applicability for a consumer that is bound to 9 repositories, usage grows to 350 MB or more, and that memory is never freed.

I think the following is the reason for the high memory consumption.

Pulp fetches the pkglists from all the repositories that a particular erratum is associated with. This is expensive, and the results may contain many duplicate pkglists.

For example, Pulp makes this query:

db.erratum_pkglists.find({"errata_id": "RHBA-2016:1886"}).count()

Instead of doing the following:

db.erratum_pkglists.find({"errata_id": "RHBA-2016:1886", "repo_id" : "my_org-Red_Hat_Enterprise_Linux_Server-Red_Hat_Satellite_Tools_6_2_for_RHEL_7_Server_RPMs_x86_64"}).count()
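To make the difference between the two queries concrete, here is a minimal sketch that runs both filters against an in-memory list instead of MongoDB. The `find` helper, the repository names (`repo-a` and so on), the package names, and the second erratum ID are all made up for illustration; only the query keys `errata_id` and `repo_id` come from the queries above.

```python
# Hypothetical in-memory stand-in for the erratum_pkglists collection.
# Repository names, package names and RHBA-2016:9999 are invented for illustration.
ERRATUM_PKGLISTS = [
    {"errata_id": "RHBA-2016:1886", "repo_id": "repo-a", "packages": ["foo-1.0-1.x86_64"]},
    {"errata_id": "RHBA-2016:1886", "repo_id": "repo-b", "packages": ["foo-1.0-1.x86_64"]},
    {"errata_id": "RHBA-2016:1886", "repo_id": "repo-c", "packages": ["foo-1.0-1.x86_64"]},
    {"errata_id": "RHBA-2016:9999", "repo_id": "repo-a", "packages": ["bar-2.0-1.x86_64"]},
]


def find(collection, query):
    """Return the documents whose fields equal every key/value pair in query."""
    return [doc for doc in collection
            if all(doc.get(key) == value for key, value in query.items())]


# Query by erratum only: one document per repository that carries the erratum.
unfiltered = find(ERRATUM_PKGLISTS, {"errata_id": "RHBA-2016:1886"})

# Query by erratum and repository: only the document for that repository.
filtered = find(ERRATUM_PKGLISTS, {"errata_id": "RHBA-2016:1886", "repo_id": "repo-a"})

print(len(unfiltered), len(filtered))
```

In this toy data the unfiltered query returns three documents with identical package lists, while the filtered one returns a single document.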

After amending the "erratum_pkglists" query to filter the errata by repository, both the memory consumption and the processing time are reduced by 80%.

I think I understand why Pulp doesn't filter the pkglist by repository when regenerating applicability: a single entry may not contain the complete pkglist, because an erratum can be copied across repositories.
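A small sketch of that situation, using invented data: each repository holds only part of the erratum's pkglist, so a per-repository lookup misses packages that the merged view contains. The repository names, package names, and both helper functions are hypothetical and are not taken from Pulp.

```python
# Hypothetical pkglist entries for one erratum, keyed by repository.
PKGLISTS = {
    "repo-a": {"foo-1.0-1.x86_64", "bar-2.0-1.x86_64"},
    "repo-b": {"foo-1.0-1.x86_64", "baz-3.0-1.noarch"},
}


def packages_for_repo(repo_id):
    """Packages listed by the erratum entry of a single repository."""
    return set(PKGLISTS.get(repo_id, set()))


def packages_for_all_repos():
    """Union of the packages listed across every repository's entry."""
    merged = set()
    for packages in PKGLISTS.values():
        merged |= packages
    return merged


# Packages that the repo-a entry alone would miss.
missing_from_repo_a = packages_for_all_repos() - packages_for_repo("repo-a")
print(sorted(missing_from_repo_a))
```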

I made the following change so that only the "nevra" of each erratum pkglist entry is retrieved when regenerating applicability for a consumer. This patch can reduce the memory consumption by ~50% (350 MB to 150 MB) for a consumer with 9 repositories.

https://github.com/hao-yu/pulp_rpm/commit/9f5a52823afee80b31c1e3aa14f4f65fc85f9be9
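The idea of keeping only the NEVRA can be sketched as follows. This is not the code from the commit linked above. It is a minimal illustration under the assumption that each pkglist document carries a `packages` list of dicts with extra fields; the document layout, the `iter_nevras` helper, and the sample data are all hypothetical.

```python
from collections import namedtuple

# Name, epoch, version, release, architecture: the fields that identify an RPM.
NEVRA = namedtuple("NEVRA", ["name", "epoch", "version", "release", "arch"])


def iter_nevras(pkglist_docs):
    """Yield each distinct NEVRA once, dropping all other package fields."""
    seen = set()
    for doc in pkglist_docs:
        for pkg in doc.get("packages", []):
            nevra = NEVRA(pkg["name"], pkg["epoch"], pkg["version"],
                          pkg["release"], pkg["arch"])
            if nevra not in seen:
                seen.add(nevra)
                yield nevra


# Hypothetical documents: the same package appears in two repositories.
DOCS = [
    {"repo_id": "repo-a", "packages": [
        {"name": "foo", "epoch": "0", "version": "1.0", "release": "1",
         "arch": "x86_64", "filename": "foo-1.0-1.x86_64.rpm", "sum": "abc"},
    ]},
    {"repo_id": "repo-b", "packages": [
        {"name": "foo", "epoch": "0", "version": "1.0", "release": "1",
         "arch": "x86_64", "filename": "foo-1.0-1.x86_64.rpm", "sum": "abc"},
        {"name": "bar", "epoch": "0", "version": "2.0", "release": "1",
         "arch": "noarch", "filename": "bar-2.0-1.noarch.rpm", "sum": "def"},
    ]},
]

nevras = list(iter_nevras(DOCS))
print(nevras)
```

Because the helper is a generator and stores only small tuples in `seen`, the full package dicts do not need to be kept in memory after they have been read.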

Updated by dalley about 7 years ago

Updated by rchan about 7 years ago

Updated by hyu about 7 years ago

Updated by ttereshc about 7 years ago

Updated by rchan almost 7 years ago

Updated by jortel@redhat.com almost 7 years ago

Updated by rchan almost 7 years ago

Updated by bmbouter almost 7 years ago

Updated by bmbouter almost 7 years ago

Updated by jortel@redhat.com almost 7 years ago

Added by ttereshc over 6 years ago

Updated by ttereshc over 6 years ago

Updated by ttereshc over 6 years ago

Updated by rchan over 6 years ago

Added by ttereshc over 6 years ago

Added by ttereshc over 6 years ago

Updated by ttereshc over 6 years ago

Added by ttereshc over 6 years ago

Updated by dkliban@redhat.com over 6 years ago

Added by ttereshc over 6 years ago

Added by ttereshc over 6 years ago

Updated by ttereshc over 6 years ago

Updated by ipanova@redhat.com over 6 years ago

Updated by bmbouter over 5 years ago