Task #2092

Updated by mhrivnak almost 4 years ago

Create Django models for the functionality that includes progress reporting. The model should support reporting multiple metrics at a time, for example both the number of RPMs and the number of bytes downloaded. It should also support reporting progress at multiple stages of a workflow concurrently, such as a case where one step feeds into the next, pipeline-style.

Once the model is done, make any required tasks to create an internal API for plugins to use it. It's possible that a helper API won't be valuable depending on how easy it is to use the model directly.

At least one helper feature could still be valuable, though: throttling. Many plugins or other workflows may want to report progress as often as they like, but have the framework attempt a DB write only once or twice a second. When it's hard to predict whether each download will take 10s or 10ms, it's very helpful to have one bit of tooling that meters the expensive DB writes for you.
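A minimal sketch of the throttling idea, in plain Python. The class name ThrottledReporter, the write callback, and the min_interval parameter are all hypothetical and not part of any existing plugin API; the callback stands in for whatever actually saves to the DB.

```python
import time
from typing import Callable, Optional


class ThrottledReporter:
    """Accept updates as often as the caller likes, but forward at most
    one write per min_interval seconds (hypothetical helper)."""

    def __init__(
        self,
        write: Callable[[int], None],
        min_interval: float = 0.5,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._write = write              # stand-in for the expensive DB write
        self._min_interval = min_interval
        self._clock = clock              # injectable so tests can fake time
        self._last_write: Optional[float] = None
        self._pending: Optional[int] = None

    def update(self, done: int) -> None:
        """Record the latest value; write it only if enough time has passed."""
        self._pending = done
        now = self._clock()
        if self._last_write is None or now - self._last_write >= self._min_interval:
            self.flush()

    def flush(self) -> None:
        """Write the latest unsaved value, if any. Call once at the end of
        the work so the final count is not lost."""
        if self._pending is not None:
            self._write(self._pending)
            self._last_write = self._clock()
            self._pending = None
```

A caller would invoke update() after every download and flush() once when the work finishes. Only the most recent value is kept between writes, so intermediate values are dropped on purpose.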

Notes from the team on desired behaviors:

Q: What specific features will this have?
Tasks which show progress (total_count, current_completed, state)
Tasks that don't show progress and only have a state
Q: How will the data be modeled? Presumably something associated with a Task.
+1 to a Task but can we please not call it a task. We have a lot of Tasks already in the data layer. Maybe a ProgressReport or ActivityReport?
Q: Are we sticking with linear progress reporting, which can only show the state for one Task at a time?
I think supporting parallel progress reporting will be mostly easy.
Having parallel progress reporting will enable higher-performing plugins that use stream processing or other concurrency models.
I (mhrivnak) think it's a great idea to make a model for the data that supports this, and I think using it will then be pretty straightforward.
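
To make the desired behaviors above concrete, here is a rough sketch of the data shape using plain Python dataclasses rather than real Django models. Everything in it is an assumption for illustration: the class names (ProgressReport is borrowed from the suggestion above, TaskProgress is invented), the stage and metric fields, and the state values "waiting", "running", and "completed". The total_count, current_completed, and state fields come from the notes.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ProgressReport:
    """One metric for one stage of a task (hypothetical shape)."""

    stage: str                          # e.g. "download"; assumed field
    metric: str                         # e.g. "rpms" or "bytes"; assumed field
    state: str = "waiting"              # assumed state names
    total_count: Optional[int] = None   # None means state-only, no counts
    current_completed: int = 0

    def increment(self, amount: int = 1) -> None:
        """Advance the counter and mark the report completed at the total."""
        self.state = "running"
        self.current_completed += amount
        if self.total_count is not None and self.current_completed >= self.total_count:
            self.state = "completed"


@dataclass
class TaskProgress:
    """All reports for one task; several may be active at the same time."""

    task_id: str
    reports: List[ProgressReport] = field(default_factory=list)

    def report(
        self, stage: str, metric: str, total_count: Optional[int] = None
    ) -> ProgressReport:
        """Create, attach, and return a new report for this task."""
        new_report = ProgressReport(stage=stage, metric=metric, total_count=total_count)
        self.reports.append(new_report)
        return new_report
```

Because each report is its own record keyed by stage and metric, a download stage can carry both an RPM count and a byte count while a later stage reports only a state, and nothing forces the reports to advance one at a time.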