Issue #6590

closed

task_group API is missing details of progress reports

Added by dkliban@redhat.com almost 4 years ago. Updated almost 4 years ago.

Status: CLOSED - WONTFIX
Priority: Normal
Assignee:
Category: -
Sprint/Milestone: -
Start date:
Due date:
Estimated time:
Severity: 2. Medium
Version:
Platform Release:
OS:
Triaged: No
Groomed: No
Sprint Candidate: No
Tags:
Sprint:
Quarter:

Description

The Task Group API currently returns only the number of tasks in each state. Users of this API need to be able to easily determine the overall state of the Task Group. Users of this API would also like to be able to see progress reporting from the individual tasks summarized.

The TaskGroupSerializer should include a new field called 'state'. This field should be dynamically generated. The possible states are: waiting (if all tasks are waiting), running (if any tasks are running), failed (if all tasks are in a terminal state and any of them are in a failed state), completed (if all tasks are in a completed state), or canceled (if the task group was canceled).
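The rules above could be sketched roughly as follows. This is only an illustration, not the serializer's actual code: the function name `derive_group_state`, the shape of the `counts` mapping, the `group_canceled` flag, the order in which the rules are checked, and the "unknown" fallback are all assumptions, since the description does not say which rule wins when several apply or what to report for mixed cases.

```python
from typing import Mapping

# Assumption: these are the states in which a task will not change any more.
TERMINAL_STATES = {"completed", "failed", "canceled"}


def derive_group_state(counts: Mapping[str, int], group_canceled: bool = False) -> str:
    """Derive one overall state from per-state task counts (hypothetical helper)."""
    total = sum(counts.values())

    # Assumption: an explicit cancellation of the group overrides everything else.
    if group_canceled:
        return "canceled"
    # "running (if any tasks are running)"
    if counts.get("running", 0) > 0:
        return "running"
    # "waiting (if all tasks are waiting)"
    if total > 0 and counts.get("waiting", 0) == total:
        return "waiting"

    terminal = sum(n for state, n in counts.items() if state in TERMINAL_STATES)
    if total > 0 and terminal == total:
        # "failed (if all tasks are in a terminal state and any of them failed)"
        if counts.get("failed", 0) > 0:
            return "failed"
        # "completed (if all tasks are in a completed state)"
        if counts.get("completed", 0) == total:
            return "completed"

    # Cases the description does not cover (an empty group, or a mix of
    # waiting and completed tasks). "unknown" is a placeholder, not a proposed state.
    return "unknown"
```

A mix such as one waiting and one completed task falls through to the placeholder, which shows that a real implementation would need one more rule for groups that are partly done but have nothing running at the moment.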

The progress reports from all the tasks associated with the task group should also be present. Any progress reports that have the same 'code' should be aggregated into one progress report - with the total and done fields summed.
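A minimal sketch of that aggregation, assuming each progress report is available as a plain dict with 'code', 'total' and 'done' keys. The function name is hypothetical, and treating a missing or null total as 0 is an assumption as well.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, List


def aggregate_progress_reports(reports: Iterable[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Merge reports that share a 'code', summing their 'total' and 'done' fields."""
    sums: Dict[str, Dict[str, int]] = defaultdict(lambda: {"total": 0, "done": 0})
    for report in reports:
        entry = sums[report["code"]]
        # Assumption: a missing or None value counts as 0.
        entry["total"] += report.get("total") or 0
        entry["done"] += report.get("done") or 0
    # One aggregated report per code, in order of first appearance.
    return [{"code": code, **values} for code, values in sums.items()]
```

With this shape, the number of reports returned for a task group depends on the number of distinct codes rather than on the number of tasks.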

Actions #1

Updated by ipanova@redhat.com almost 4 years ago

  • Description updated
Actions #2

Updated by dkliban@redhat.com almost 4 years ago

The progress report aggregation would be useful in the following case:

The migration plugin dispatches 1000 migrate_repository tasks. Each migrate_repository task creates progress reports for "Create Repository", "Create Publication", "Create Distribution". Each of these progress reports has a "total" of 1. After the repository is created the "done" field for the related progress report is set to 1. The same follows publication and distribution.

The user looking at the task_group associated with these 1000 tasks does not want to see 3000 progress reports. The user wants to see 3 progress reports stating that 1000 out of 1000 items are done for each.

Actions #3

Updated by dkliban@redhat.com almost 4 years ago

The progress report aggregation would not be as useful if we implemented a new API that lets users sync many repositories at once.

If the task group contains many sync tasks then the aggregated progress reports would look something like this:

Downloading Metadata Files: 10 done of 10 total
Optimizing Sync: 5 done of 5 total
Parsed Modulemd: 56 done of 56 total
Parsed Modulemd-defaults: 56 done of 56 total 
Parsed Comps: 14 done of 14 total
Parsed Packages: 35751 done of 35751 total
Parsed Advisories: 523 done of 523 total 

It would be much better if the sync tasks could record progress in two types of progress reports: the one we have now, ProgressReport, and a GroupProgressReport. A sync task could then provide less granular progress for the TaskGroup API, e.g.: Syncing repository 'foo'. In that case the TaskGroup API could return something like this:

Syncing repository 'foo': 0 done of 1 total.
Syncing repository 'bar': 1 done of 1 total.
Syncing repository 'fedora': 0 done of 1 total.
Actions #4

Updated by dalley almost 4 years ago

  • Status changed from NEW to ASSIGNED
  • Assignee set to dalley
Actions #5

Updated by daviddavis almost 4 years ago

For your sync task example, were you imagining that these would be regular sync tasks? In other words, could these sync tasks run as independent tasks not part of a TaskGroup and also as part of a TaskGroup? If so, I guess we'd have to introduce conditional reporting logic?

Actions #6

Updated by dkliban@redhat.com almost 4 years ago

These would be regular sync tasks that can be run on their own or as part of a group. The sync task would always record ProgressReports. However, if a Group is passed in to the task, a GroupProgressReport would be recorded also. So yes, some conditional logic will be necessary.
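The conditional logic described here might look something like the sketch below. It uses plain dataclasses rather than real models, and every name in it (`sync`, `steps`, `group_reports`, the fields of `GroupProgressReport`) is made up for illustration; GroupProgressReport itself is only a proposal in this thread.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class ProgressReport:
    message: str
    total: int
    done: int = 0


@dataclass
class GroupProgressReport:  # proposed in this thread, not an existing model
    message: str
    total: int = 1
    done: int = 0


@dataclass
class TaskGroup:  # stand-in that only collects the coarse reports
    group_reports: List[GroupProgressReport] = field(default_factory=list)


def sync(repo_name: str, steps: List[Tuple[str, int]],
         group: Optional[TaskGroup] = None) -> List[ProgressReport]:
    """Pretend sync: always records detailed reports, plus a coarse one if grouped."""
    group_report = None
    if group is not None:
        # Only recorded when the task runs as part of a group.
        group_report = GroupProgressReport(message=f"Syncing repository '{repo_name}'")
        group.group_reports.append(group_report)

    reports = []
    for message, total in steps:
        report = ProgressReport(message=message, total=total)
        report.done = total  # stand-in for the real work of this step
        reports.append(report)

    if group_report is not None:
        group_report.done = 1
    return reports
```

Called without a group, the function behaves like an ordinary sync task; called with one, the group additionally ends up with a single "1 done of 1 total" entry per repository.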

Actions #7

Updated by dalley almost 4 years ago

I suppose where I'm at is the question: "If conditional logic is required anyway, is there a sufficient reason to have a separate reporting structure rather than simply changing what is reported at the task level?"

I can see that there might be reasons not to change the individual task-level reporting, but are we sure it is necessary enough to justify adding another parallel reporting mechanism? If we are, then that's fine.

Actions #8

Updated by dkliban@redhat.com almost 4 years ago

Users will want to have both detailed ProgressReports and GroupProgressReports for sync tasks. I imagine that when one of the sync tasks in the Group fails, the user will want to see the detailed version of its progress using the Tasks API.

Actions #9

Updated by dkliban@redhat.com almost 4 years ago

  • Status changed from ASSIGNED to CLOSED - WONTFIX

The task group API can already be used to determine the 'state' of the task group. Users should look at how many tasks are in 'waiting' and 'running' state. Once that number reaches 0, the task group is complete.

Once https://pulp.plan.io/issues/6590 is complete, users will also be able to know that all tasks for a task group have actually been dispatched. Once this parameter is True, and there are no tasks in 'waiting' or 'running' state the task group is considered complete.
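As a rough client-side sketch of that check, assuming the per-state counts have already been read from the task group API into a dict: the function name and the `all_dispatched` argument are hypothetical, the latter standing in for the parameter mentioned above, which is not named here.

```python
from typing import Mapping


def task_group_is_complete(counts: Mapping[str, int], all_dispatched: bool = True) -> bool:
    """Return True once nothing is waiting or running and all tasks were dispatched."""
    pending = counts.get("waiting", 0) + counts.get("running", 0)
    # `all_dispatched` is a placeholder for the not-yet-available parameter.
    return all_dispatched and pending == 0
```

Until that parameter exists, a caller can only rely on the counts, which is why a group whose tasks are still being dispatched could briefly look complete.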
