Task #6856
closed
Document Pulp3 Hardware Requirements recommendations
Start date:
Due date:
% Done:
100%
Estimated time:
Platform Release:
Groomed:
Yes
Sprint Candidate:
Yes
Tags:
Documentation
Sprint:
Sprint 79
Quarter:
Description
This question came up in our channel. We should put this info into the docs on the Architecture and Deploying page, in a new section called "Hardware Requirements".
Here's some text that was written in the channel about it:
13:26 <cognifloyd> For a single VM install of Pulp 3 (using a django-storages backend for artifact storage so that artifacts aren't in the VM) how much CPU/RAM/Disk should I expect to need in that VM? There will be yum repos for CentOS 6/7/8 + EPEL 6/7/8, and the pypi index, and a few custom file repos. Are there any rules of thumb to help me initially size this thing?
15:05 <cognifloyd> Next question:
15:08 <cognifloyd> Once I get the basic pulp set up, I'll be looking at building a pulp 3 plugin for a file-like artifact I have to deal with that has some annoying encryption requirements. i.e. the artifact should be encrypted in the django-storages backend, and pulp must not have the key to decrypt it. Clients will be given a key to decrypt those artifacts. Has anything like this been done? I think a plain file repo would work, but I'm wondering if pulp would need special support since these would be encrypted.
15:26 <bmbouter> cognifloyd: we don't have sizing recommendations unfortunately, but I can give some anecdotal info
15:26 <bmbouter> cpu count should equal the number of pulp workers you start, which allows you to perform N repository operations concurrently
15:26 <bmbouter> so 2 cpus, you can sync 2 repos concurrently
15:28 <bmbouter> RAM tends to hit its high-water mark during sync and then go back down to nominal levels, so for N workers I'd say plan on a gig for each and then maybe 1 gig for postgres as a start
15:28 <bmbouter> so for 2 workers, 3 gigs total (2 for sync use, 1 for postgresql)
15:28 <bmbouter> our dev machines typically have 2-4 G and we never oom
15:29 <bmbouter> for disk it's the size of the repos you want all added together. pulp de-duplicates content so even as you sync those over time they tend not to grow very much
15:29 <bmbouter> I'm not sure what centos6/7/8 + el 6/7/8 is these days but maybe 400G?
15:30 <cognifloyd> 400G (ish) for the artifacts or the metadata?
15:30 <bmbouter> in terms of the encryption requirements I think pulp_file would work just fine for you, pulp doesn't need to read/parse the binary data it stores ever, it just needs to calculate the checksums and it can do that on the encrypted data
15:31 <bmbouter> 400G ish for the artifacts
15:31 <bmbouter> the metadata is very small and lives in the db
15:31 <cognifloyd> I'm not concerned about the filesize of the artifacts as I'll have them stored in azure blob storage.
15:31 <bmbouter> oh right
15:31 <bmbouter> you said that
15:31 * cognifloyd would prefer to use GCP, but a client demanded we use azure instead. Bummer
15:31 <cognifloyd> ;)
15:32 <bmbouter> your disk can be small enough to provide working storage during sync prior to blobs being placed on the backend, so maybe 50G would do it all
15:32 <bmbouter> pulp verifies checksum data locally and artifacts download/verify in parallel so 50G is probably more than you'll need but it's a bit hard to predict
15:32 <cognifloyd> ah. ok. Thanks for some starting point rules of thumb. I should be able to adjust from there :)
15:33 <bmbouter> yw, if you can share what you find with us we'd love to hear. also let us know if anything could be better or doesn't work
15:33 <cognifloyd> will do
15:34 <cognifloyd> I really like the pulp 3 architecture with versioned repos (an entire repo metadata rollback sounds awesome). And I hate running Java, so a lot of the other artifact repositories left me with a horrible taste in my mouth. Python is awesome.
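The rules of thumb from the chat above can be written down as a small calculation, which may be a useful starting point for the "Hardware Requirements" docs section. This is only a sketch of the anecdotal numbers bmbouter gave (one CPU and roughly 1 GB of RAM per worker, roughly 1 GB for PostgreSQL, disk equal to the sum of the repo sizes, or about 50 GB of working space when artifacts live in an external storage backend). The function name, the dataclass, and all parameter names are hypothetical and are not part of Pulp.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class SizingEstimate:
    cpus: int
    ram_gb: int
    disk_gb: int


def estimate_pulp_vm(
    workers: int,
    repo_sizes_gb: Iterable[int] = (),
    external_storage: bool = False,
    ram_per_worker_gb: int = 1,   # assumption from the chat: "a gig for each"
    postgres_ram_gb: int = 1,     # assumption from the chat: "1 gig for postgres"
    working_disk_gb: int = 50,    # assumption from the chat: "maybe 50G would do it"
) -> SizingEstimate:
    """Rough starting-point sizing for a single-VM install (anecdotal, not official)."""
    if workers < 1:
        raise ValueError("at least one worker is required")

    # One CPU per worker allows N repository operations to run concurrently.
    cpus = workers

    # RAM peaks during sync: per-worker allowance plus one allowance for PostgreSQL.
    ram_gb = workers * ram_per_worker_gb + postgres_ram_gb

    # With an external backend the local disk only holds working storage during
    # sync; otherwise it must hold all repos added together.
    disk_gb = working_disk_gb if external_storage else sum(repo_sizes_gb)

    return SizingEstimate(cpus=cpus, ram_gb=ram_gb, disk_gb=disk_gb)


if __name__ == "__main__":
    # The case from the chat: 2 workers, artifacts kept in external blob storage.
    print(estimate_pulp_vm(workers=2, external_storage=True))
```

For the 2-worker example discussed in the chat, this gives 2 CPUs and 3 GB of RAM, matching "2 for sync use, 1 for postgresql". The defaults are guesses to adjust once real usage is observed.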
Added hardware requirements.
closes #6856 https://pulp.plan.io/issues/6856