Project

Profile

Help

Task #5855

closed

Remove auto-throttling of ProgressReport updates

Added by bmbouter over 4 years ago. Updated over 3 years ago.

Status:
CLOSED - WONTFIX
Priority:
Normal
Assignee:
-
Category:
-
Sprint/Milestone:
-
Start date:
Due date:
% Done:

0%

Estimated time:
Platform Release:
Groomed:
No
Sprint Candidate:
No
Tags:
Sprint:
Quarter:

Description

Pulpcore's auto-throttling was designed to allow plugin writers to not think about how frequently they manage progress report updates. We should remove the auto-throttling for these reasons:

1. Plugin writers probably need to think in batches. We can't adequately abstract them from that. As we make batching a first-class aspect of plugin implementation, auto-throttling becomes an anti-feature.

2. It's magical and implicit. pulpcore magically saves some of your things at times and other times ... it doesn't.

3. Solving the performance issue of progress updates should have one way to solve it. Having two ways when you need one is a design concern.

Actions #1

Updated by bmbouter over 4 years ago

  • Sprint/Milestone set to 3.1.0

Adding for consideration in 3.1 so it doesn't get lost. Not a commitment in any way. It needs discussion and input from others.

Actions #2

Updated by gmbnomis over 4 years ago

I suggest to include a pointer to what should be removed ;-) I think we are talking about https://github.com/pulp/pulpcore/blob/4a55a4411db1a5904716233852b5837e5f7a01aa/pulpcore/app/models/progress.py#L139-L142, right?

My 2 cents:

I agree that the implementation is implicit and somewhat surprising, but I still find the functionality very useful.

There are stages that work best in batches, but there are also stages that don't. The downloader is one example; parsing meta-data from the remote in the first stage is another. As a plugin writer, I wouldn't like to be forced to work in arbitrary batches just to optimize reporting. Also, for some workloads like downloading, the time based approach fits better than an approach based on batches (what batch size to choose to get decent reporting?).

My proposal is to go from implicit to explicit:

  • save() just saves the object (and possibly records a timestamp needed by other methods)
  • update_progress(increment) increments by the given increment and saves the object (hence the update in the name)
  • lazy_update_progress(increment) increments by the given number and may save using time-based throttling (the interval could be made explicit by adding it as a default parameter to the constructor).
  • remove iter() and increment() (there is no use case to increment by 1 and save)
  • possibly add an iterator for batches (would wrap the batch iterator in stages). It saves after every batch.
Actions #3

Updated by bmbouter over 4 years ago

  • Sprint/Milestone deleted (3.1.0)
Actions #4

Updated by dkliban@redhat.com over 3 years ago

  • Status changed from NEW to CLOSED - WONTFIX

Also available in: Atom PDF