pulp-admin rpm repo content rpm with the RHEL 6 repository uses too much RAM
Create a repository named rhel-6-server, set its feed to the RHEL 6 upstream, and sync it. Then, run this command:
$ pulp-admin rpm repo content rpm --repo-id rhel-6-server
On my machine, it ran out of RAM before it returned, and took a very long time to execute. According to top, it was using more than 2.1 GB of virtual memory (sum of shared libs, swapped memory, and physical RAM) which is probably too much.
+ This bug was cloned from Bugzilla Bug #1011192 +
#1 Updated by cduryee almost 7 years ago
This is still outstanding as of pulp 2.5.0-0.17.rc. I tried to run this on a system with 4GB mem and my httpd was killed by the OOM killer. I ran this after syncing http://public-yum.oracle.com/repo/OracleLinux/OL6/latest/x86_64/ which has 22975 rpm units.
+ This comment was cloned from Bugzilla #1011192 comment 1 +
#2 Updated by darkfader over 5 years ago
I was just pointed at a similar error; the repo in question is about the same size, 20-25k RPMs.
What puzzles me is that our server has 32GB of RAM. No pun intended, but that would be enough to hold all those RPMs in RAM.
Even considering mongodb is also running, something must be really going wrong there.
I'll post once/if I have more findings that help.
#3 Updated by darkfader over 5 years ago
I had this nagging feeling that I'd joked about this before, and yes, I did.
I remember now: the way to make this query work is to enable memory overcommit, so the process can stretch out a little and believes it will have enough RAM to handle the query. In my experience it should not actually use that extra memory.
I'm testing it now:
#4 Updated by darkfader over 5 years ago
In the end... it died.
So, the one thing I don't get, at all:
This query should return the basic info on all the rpms in a repository.
This metadata can't be bigger than the spec files' contents.
A normal spec is about 20KB or less, a large one maybe 200KB.
So, even ignoring that this information should already be available...
That means it should be between 500MB and 3GB of data to query and move.
It seems to require a lot more to get this queried.
It is also not clear why delivering that info should be in one large drop of data anyway, and that's the only reason I can imagine for using more than, say, 100MB.
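To illustrate the point about not delivering everything in one drop, here is a minimal Python sketch of batched iteration. It is not Pulp code: `iter_batches` and `fake_unit_cursor` are hypothetical names, the batch size of 500 is arbitrary, and the generator only stands in for whatever lazy cursor the server side would use. The unit count of 22975 is taken from the Oracle Linux repository mentioned earlier in this ticket.

```python
from itertools import islice


def iter_batches(units, batch_size):
    """Yield lists of at most batch_size items from any iterable."""
    iterator = iter(units)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def fake_unit_cursor(count):
    """Hypothetical stand-in for a database cursor; yields units lazily."""
    for i in range(count):
        yield {"name": f"pkg-{i}", "version": "1.0"}


total = 0
largest_batch = 0
# Only one batch of units is held in memory at any time.
for batch in iter_batches(fake_unit_cursor(22975), 500):
    total += len(batch)
    largest_batch = max(largest_batch, len(batch))

print(total, largest_batch)
```

With this shape, peak memory scales with the batch size rather than with the number of units in the repository.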
#5 Updated by darkfader over 5 years ago
This is the memory info at time of crash:
[270003.447218] Out of memory: Kill process 29492 (httpd) score 472 or sacrifice child
[270003.447240] Killed process 29492 (httpd) total-vm:17015216kB, anon-rss:15688384kB, file-rss:24kB
I'm not 100% sure, but as far as I can tell it really needs more RAM than the combined size of all the RPMs :-/
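For scale, the figures from the kernel log above can be set against the 500MB to 3GB estimate from the previous comment. The sketch below is only arithmetic on numbers already in this ticket. The 25,000 unit count is an assumption taken from the upper end of the "20-25k" range, and the httpd resident set also holds more than the query result, so the per-unit figure is a loose upper bound.

```python
KIB = 1024  # the kernel reports these values in units of 1024 bytes


def kib_to_gib(kib):
    """Convert a size in KiB (as printed by the OOM killer) to GiB."""
    return kib * KIB / 1024**3


anon_rss_kib = 15688384   # anon-rss from the log line above
total_vm_kib = 17015216   # total-vm from the log line above

estimate_high_bytes = 3 * 1000**3   # upper end of the 500MB to 3GB estimate
assumed_unit_count = 25_000         # assumption: upper end of "20-25k"

rss_bytes = anon_rss_kib * KIB
ratio_to_estimate = rss_bytes / estimate_high_bytes
bytes_per_unit = rss_bytes / assumed_unit_count

print(f"anon-rss: {kib_to_gib(anon_rss_kib):.2f} GiB")
print(f"total-vm: {kib_to_gib(total_vm_kib):.2f} GiB")
print(f"ratio to 3GB estimate: {ratio_to_estimate:.1f}x")
print(f"per unit: {bytes_per_unit / 1000:.0f} kB")
```

The resident memory at the time of the kill is roughly 15 GiB, several times the upper end of the estimate.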
#6 Updated by bmbouter over 5 years ago
- Severity set to 1. Low
In situations where I want to troubleshoot memory issues I capture a cProfile report of the codepath I'm interested in. Pulp has a little info on doing this documented. For an RPM sync you want to profile this method, I believe.
Once the cProfile is in hand, analyze the memory usage with RunSnake.
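As a rough sketch of what capturing such a report can look like with the Python standard library: `build_unit_list` below is a hypothetical stand-in for the code path under investigation, not a Pulp function, and the output file name is arbitrary. Note that cProfile itself records call counts and timings. To get allocation figures directly, the sketch also uses `tracemalloc`, which is a substitution on my part and not something suggested in this ticket.

```python
import cProfile
import io
import os
import pstats
import tempfile
import tracemalloc


def build_unit_list(count):
    """Hypothetical stand-in for the code path being investigated."""
    return [{"name": f"pkg-{i}", "files": list(range(50))} for i in range(count)]


# 1. Capture a cProfile report (call counts and timings).
profiler = cProfile.Profile()
profiler.enable()
build_unit_list(2000)
profiler.disable()

# Save the raw stats so that a viewer can load the file later.
prof_path = os.path.join(tempfile.mkdtemp(), "unit_list.prof")
profiler.dump_stats(prof_path)

# Print the five most expensive entries by cumulative time.
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(5)
report = stream.getvalue()
print(report)

# 2. Measure peak Python-level allocations for the same call.
tracemalloc.start()
build_unit_list(2000)
current, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()
print(f"peak traced memory: {peak / 1024:.0f} KiB")
```

The saved `.prof` file is the kind of artifact a profile viewer would open. The `tracemalloc` peak only covers allocations made through Python's allocator, so it will undercount memory held by C extensions.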
#7 Updated by email@example.com almost 5 years ago
I have a pulp repo with http://mirror.centos.org/centos/7/updates/x86_64/
When I run
$ pulp-admin rpm repo content rpm --repo-id=centos7_64-updates --match 'name=^kernel'
it fills up the RAM (5GB). Normally it uses less than 1GB.
However,
$ pulp-admin rpm repo content rpm --repo-id=centos7_64-updates --match 'name=^kernel' --fields=name
works and does not crash.
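One plausible reading, not confirmed anywhere in this ticket, is that restricting the fields shrinks each returned unit so much that the whole result fits in memory. The sketch below only illustrates that effect with made-up data: `make_unit` and `project` are hypothetical helpers, the changelog and file-list sizes are invented, and the serialized JSON length is used as a crude proxy for payload size.

```python
import json


def project(unit, fields):
    """Keep only the requested keys, as a field projection would."""
    return {key: unit[key] for key in fields if key in unit}


def make_unit(i):
    """Hypothetical unit with bulky fields; all sizes are invented."""
    return {
        "name": f"kernel-{i}",
        "version": "3.10.0",
        "changelog": ["entry"] * 200,
        "files": [f"/usr/lib/modules/{i}/file{n}" for n in range(300)],
    }


units = [make_unit(i) for i in range(100)]

full_size = len(json.dumps(units))
name_only_size = len(json.dumps([project(u, ["name"]) for u in units]))

print(f"all fields: {full_size} bytes")
print(f"name only:  {name_only_size} bytes")
```

If the real units carry similarly bulky fields, requesting only `name` would cut the payload by orders of magnitude, which would be consistent with the behaviour reported here.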
This is RHEL 7.3 with pulp Version 2.11.0
#9 Updated by bmbouter over 2 years ago
Pulp 2 is approaching maintenance mode, and this Pulp 2 ticket is not being actively worked on. As such, it is being closed as WONTFIX. Pulp 2 is still accepting contributions though, so if you want to contribute a fix for this ticket, please reopen or comment on it. If you don't have permissions to reopen this ticket, or you want to discuss an issue, please reach out via the developer mailing list.