As a user I want pulp to fail early when there isn't enough disk space
It would be nice if the sync mechanism, after reading the metadata, could anticipate the (possibly) needed disk space, compare that to the available space, and take appropriate action before starting to download the content units.
Another option might be to add a "fake_sync" or "dry_run_sync" that reports whether disk space is sufficient.
It is clearly nontrivial to anticipate the actual size a sync will require, because some artifacts may already exist as duplicates. Nevertheless, it is not an unreasonable user expectation to have some feature like this. Downloading three quarters of some large repo only to fail with zero remaining disk space is no fun.
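To make the request more concrete, here is a minimal sketch of what such a pre-check could look like. It is not Pulp code: `RemoteArtifact`, `estimate_required_bytes`, `check_disk_space`, and the `headroom` parameter are all hypothetical names, and it assumes the remote metadata advertises a checksum and a size for every artifact. Only `shutil.disk_usage` is a real standard-library call.

```python
import shutil
from dataclasses import dataclass


@dataclass(frozen=True)
class RemoteArtifact:
    """Hypothetical metadata entry: the checksum and size the remote advertises."""
    sha256: str
    size: int  # bytes


def estimate_required_bytes(remote_artifacts, existing_sha256):
    """Sum the sizes of artifacts not already stored, counting each checksum once."""
    seen = set(existing_sha256)
    total = 0
    for artifact in remote_artifacts:
        if artifact.sha256 in seen:
            continue  # already on disk, or a duplicate within this sync
        seen.add(artifact.sha256)
        total += artifact.size
    return total


def check_disk_space(required_bytes, path, headroom=0.1):
    """Return (ok, free_bytes); ok is False when required plus headroom exceeds free."""
    free = shutil.disk_usage(path).free
    needed = int(required_bytes * (1 + headroom))
    return needed <= free, free
```

A sync (or a dry run) could call `estimate_required_bytes` once the metadata is parsed and abort early if `check_disk_space` returns `False`. The result is only an estimate taken at one point in time, so it cannot account for anything else that writes to the same storage afterwards.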
Intended to resurrect this pulp2 issue for pulp3: https://pulp.plan.io/issues/4668
My concern is that a check like this will have so many false negatives that it won't help users much in practice. Also, failing when the sync runs out of space or failing up front (to me) isn't meaningfully that different since the user's operation failed in both cases.
Here are some of the concerning scenarios that make it practically hard to get an up-front check correct:
1. N syncs running in parallel: each of them checks early on and sees enough space, but all of them fail.
2. The shared storage filling up due to clients triggering download-and-save events with policy='on_demand' repositories. Here the check would pass and the sync would later fail.
3. Export operations or uploads running concurrently with other syncs.
4. Other non-Pulp uses eating up that same shared storage. This one isn't as likely if the shared storage is deployed correctly, but if the volume mount is actually oversubscribed it's beyond Pulp's control. The check passes, but the sync later fails.
I agree that this is actually much harder to do than a user might naively expect.
However, I do not agree with all your points, so let me just raise a couple of counterpoints:
"failing when the sync runs out of space or failing up front (to me) isn't meaningfully that different since the user's operation failed in both cases"
I disagree with this one, since large syncs can take a very long time, and failing early is clearly a valid design goal. In addition, the suggestion for this entire issue came from a real-world customer of ours communicating a pain point, so I feel we shouldn't dismiss the issue too easily.
"My concern with a check like this will have so many false-negatives that it won't be helping users out much in-practice"
A possible counterargument is that right now we have a "false-negative" rate of 100% (on the proposed check), so reducing this by any amount could deliver at least some value to users (an early fail for cases that quite obviously won't work, such as a single sync that is already going to be much too large).
I do agree the scenarios you outlined in 1.-4. make it difficult to provide a really good solution to this.
Perhaps an alternative approach might be to provide some kind of documentation for monitoring when disk space is running low in general? In practice different users are going to have different sizes for what constitutes "low disk space" depending on how large their sync jobs regularly are.
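As a rough illustration of what such documentation could suggest, here is a small sketch of a per-deployment low-space check. The function names and the `safety_factor` default are hypothetical, and deriving the threshold from the largest recent sync is just one possible heuristic, not something Pulp provides.

```python
import shutil


def suggested_threshold(recent_sync_sizes, safety_factor=2.0):
    """Derive a low-space threshold in bytes from the largest recent sync.

    Returns 0 when there is no sync history to base the threshold on.
    """
    if not recent_sync_sizes:
        return 0
    return int(max(recent_sync_sizes) * safety_factor)


def is_disk_space_low(path, min_free_bytes):
    """Return True when the filesystem holding `path` has less free space than the threshold."""
    return shutil.disk_usage(path).free < min_free_bytes
```

A cron job or monitoring agent could call `is_disk_space_low` on the storage path and raise an alert, with each user picking a threshold that matches how large their sync jobs regularly are.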
I feel like we should be able to give some kind of answer to users that have experienced running out of disk space as a pain point. (Just my 2 cents).