Workers go OOM while trying to sync RHEL 7
- CentOS 7 Vagrant box
- Memory: 8GB
- Running Katello w/ Pulp 3.11
- 2 workers present
When I attempt to sync RHEL 7 Server x86_64, the sync fails every time with the Pulp workers going OOM. When I initiate the sync, the VM has 3.5GB of available memory.
@Eric 3.5GB isn't enough for syncing large repositories. The RHEL 7 metadata is 800-900MB compressed, but reading it requires decompressing it, and it inflates to roughly 4GB. There's some overhead on top of that, so Pulp's memory consumption when syncing RHEL 7 is somewhere around 4.3GB.
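For illustration only (this is not Pulp's actual code, and the file name is hypothetical): `gzip.open(path).read()` materializes the entire decompressed metadata document at once, while reading it in fixed-size chunks keeps memory flat regardless of file size. A minimal sketch:

```python
import gzip

def stream_decompress(path, chunk_size=1 << 20):
    """Yield the decompressed stream in 1 MiB chunks.

    Peak memory stays near chunk_size, versus holding the whole
    multi-GB decompressed document with gzip.open(path).read().
    """
    with gzip.open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            yield chunk
```

Of course, streaming the bytes only helps if the parser consuming them is also incremental and doesn't accumulate everything in memory anyway.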
We definitely recommend having more than 8GB available if large repositories are going to be synced. I think the Satellite requirement is >20GB? Unfortunately there's no real way to fix this, due to the way createrepo_c works.
I guess I can mention this: as a hobby project, I've been working on a Rust library for parsing and writing RPM metadata, and one of my design decisions is to avoid this problem by allowing packages to be streamed from the metadata one by one, without holding everything in memory at the same time.
Technically speaking, Pulp 2 rolled its own metadata manipulation code, so it's not totally unprecedented, but it's not a great idea for Pulp 3 at the present time IMO. createrepo_c is "official" and we get a ton of benefit from piggybacking off of their work and not needing to implement every new feature ourselves. The bus factor of a big complex self-maintained library written by one person in a completely different language (it would have Python bindings) is quite bad.
It's something we could only consider using if the team / product as a whole thought the benefits (reducing memory consumption to near-zero) were worth the associated long-term maintenance burden. I'm not convinced that they would be, but I'll throw the idea out there for completeness.
- Status changed from CLOSED - NOTABUG to NEW
So this is going to be a problem: not necessarily with a single sync, but with many running at once. I'm re-opening this because we may need to make urgent changes in this direction.
The best short-term option is not what was described above in note 3, but rather porting over the other.xml (and potentially filelists.xml) parsing code from Pulp 2, which apparently does do iterative parsing. We would still use createrepo_c to parse primary.xml, because it is by far the most complex metadata file and not especially large.
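This is not the Pulp 2 code itself, but a minimal sketch of the same iterative idea using `xml.etree.ElementTree.iterparse`: yield one `<package>` element from filelists.xml at a time and discard it, so memory is bounded by a single package rather than the whole file. The function name is mine; the namespace URI is the standard filelists one.

```python
import xml.etree.ElementTree as ET

NS = "{http://linux.duke.edu/metadata/filelists}"

def iter_packages(xml_file):
    """Yield (pkgid, file list) for each package, iteratively.

    Each <package> subtree is cleared after it is yielded, so peak
    memory is roughly one package instead of the full document.
    """
    for _event, elem in ET.iterparse(xml_file, events=("end",)):
        if elem.tag == NS + "package":
            yield elem.get("pkgid"), [f.text for f in elem.findall(NS + "file")]
            elem.clear()  # drop the subtree we just processed
```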
#7 Updated by firstname.lastname@example.org 4 months ago
After bumping the memory by 4GB on the same setup I encounter the same issue. I could try going higher to find an upper limit (if one exists). Are there memory requirement guidelines available?
ehelms it has been impressed upon us that we need to fix this :)
We're working on it https://github.com/pulp/pulp_rpm/pull/2016
I haven't run any tests with Pulp but outside of Pulp it used about 25x less RAM.
I'm not moving this to POST yet because we need to have some discussions between Pulp / Katello / createrepo_c first. But ultimately I think this will end up being merged, at least until we can get createrepo_c into a more usable state: the current state is not going to be acceptable for 6.10, and I don't think we can get createrepo_c ready in time either.