Issue #1448
closedGenerating repo content applicability doesn't return call report
Description
When Pulp is asked to generate content applicability for repositories, it returns an HTTP 202 (accepted) response. This response should be a [call report](http://pulp.readthedocs.org/en/latest/dev-guide/conventions/sync-v-async.html#call-report), but it's not.
Here's an example of an API call that produces the error:
POST https://pulp.example.com/pulp/api/v2/repositories/actions/content/regenerate_applicability/
{'repo_criteria': {}}
Here's an example of a (HTTP 202) response body:
{
'_href': '/pulp/api/v2/task_groups/568f4f73-b487-4944-b7a5-36bf00159e3c/',
'group_id': '568f4f73-b487-4944-b7a5-36bf00159e3c'
}
Related issues
Updated by mhrivnak over 7 years ago
- Status changed from NEW to ASSIGNED
- Assignee set to dkliban@redhat.com
- Platform Release set to 2.8.0
- Triaged changed from No to Yes
Dennis, is it possible and reasonable to have the API continue to return a call report as before, and make the new behavior opt-in?
Updated by dkliban@redhat.com over 7 years ago
I don't think it's reasonable to return a call report. A call report contains a list of spawned tasks and it does not contain a group id. Without the group id a user would need to check each task to see if it is done.
How would a user opt in to the new design? By providing an extra argument when making the API call?
Updated by bmbouter over 7 years ago
How many spawned tasks would the call report contain?
Would it be possible to add group_id into the call report?
From a backwards compatibility standpoint this changes the HTTP method type and structure of the response.
Updated by Ichimonji10 over 7 years ago
Without the group id a user would need to check each task to see if it is done.
What in the world is a group ID? It took me a while to figure out how to make my API handler poll task reports. The documentation doesn't mention task groups, except to state that "This API has been removed in 2.4, as there is no longer any concept of Task Groups."
Updated by dkliban@redhat.com over 7 years ago
We are currently dispatching 10 profiles per task. Based on what I've seen in production environments this could be upwards of 1000 tasks. We could add the group id to the call report.
The documentation for 2.8 is not published yet. Here[0-1] are the places where the new behavior is described.
[0] https://github.com/pulp/pulp/blob/master/docs/dev-guide/integration/rest-api/consumer/applicability.rst
[1] https://github.com/pulp/pulp/blob/master/docs/dev-guide/integration/rest-api/tasks.rst
Updated by Ichimonji10 over 7 years ago
The documentation for 2.8 is not published yet. Here[0-1] are the places where the new behavior is described.
Is this the new format for HTTP 202 responses across the board for Pulp 2.8?
Updated by dkliban@redhat.com over 7 years ago
This is the only API call in 2.8 that has a different response body for 202. This is mentioned in the release notes[0].
[0] https://github.com/pulp/pulp/blob/master/docs/user-guide/release-notes/master.rst#rest-api-changes
Updated by Ichimonji10 over 7 years ago
Ahaaa. Thanks for the info. I've updated the test suite accordingly. See this commit
I do dislike having two different kinds of response formats. It means that I, as an API user, have to design two different sets of code for dealing with long running tasks. It also means that I have to use out-of-band knowledge to create a special case in my code base for dealing with this one path. Basically, it's a PITA, even if you've read the release notes.
Updated by Ichimonji10 over 7 years ago
- Status changed from ASSIGNED to CLOSED - NOTABUG
Updated by rbarlow over 7 years ago
I am a little concerned, as it sounds like we are not maintaining backwards compatibility from 2.7 to 2.8. Is my understanding correct? If so, I would make a case that we should make the new functionality be either a new REST endpoint, or make it opt-in through a user flag so that integrators from 2.7 don't have to do anything to upgrade to 2.8 unless they want to take advantage of the new awesomeness.
Updated by dkliban@redhat.com over 7 years ago
I have no problem with creating a new endpoint, but I am concerned that it may be too late for this in the release cycle.
Updated by Ichimonji10 over 7 years ago
How about HTTP link headers? We can keep backward compatibility and add group task polling to as many endpoints as we want.
Updated by Ichimonji10 over 7 years ago
The biggest danger in using HTTP link headers is that we might do a bad job of polluting Pulp's link registry.
Updated by rbarlow over 7 years ago
- Status changed from CLOSED - NOTABUG to NEW
I don't think we should consider this issue closed. I raised concerns about this endpoint being backwards incompatible and was assured that it was done in a new endpoint, rather than removing support for the old endpoint. I think it's important that we stick to semantic versioning, especially since we document that we follow it. There are community users who do use this endpoint, as I know that I have personally guided two of them through writing code to interact with it.
Updated by Ichimonji10 over 7 years ago
+1 for ensuring backward compatibility in the 2.8 release.
By the way, if "HTTP link headers" or "link registry" is greek to anyone, check out sections 5.3 (When and How to Use Link Headers) and 5.4 (How to Assign Link Relation Types) from RESTful Web Services Cookbook
Updated by dkliban@redhat.com over 7 years ago
The change in behavior was needed for 2 reasons: 1) Parallelize the work and 2) prevent a possible mongo cursor time out.
The solution for 1 required spawning many tasks to handle the calculations in parallel. The number of tasks that are generated depends on the number of unique consumer profiles that are bound to a particular repository. When managing 10000 unique systems, 1000 tasks are spawned. If we return 1000 tasks in a call report, the client will then try to poll Pulp for 1000 tasks statuses. Depending on the polling interval, this would require Pulp to respond to thousands more extra requests. That is a huge overhead for Pulp. As a result it is necessary to return a resource that summarizes the status of all the tasks.
The solution for 1 also provides the solution for 2. Since each task has to only load profiles from the database once per task, we don't have to worry about a consequent call to the database getting a cursor timeout error.
Updated by Ichimonji10 over 7 years ago
I certainly understand that polling 1,000+ tasks can be extremely slow and fragile. With that in mind, a good question to ask is "is poor performance a good reason to ignore semver and break compatibility?" Either answer will produce angry users. We'll either end up with users who are frustrated with poor performance when polling 1,000+ tasks, or we'll end up with users who are frustrated that they have to add hacky fixes into their API clients to deal with breaking changes on a minor release. I'd opine that we should pick a solution that obeys semver and that can be applied to this entire category of issues.
Updated by bmbouter over 7 years ago
One option to consider is that the total count of group tasks and the completed count of group tasks could be added to an existing call report. That would be an additive change which would also provide the usability enhancement.
Updated by dkliban@redhat.com over 7 years ago
Would we still include the list of all the spawned tasks though?
Updated by Ichimonji10 over 7 years ago
Would we still include the list of all the spawned tasks though?
Yes. I don't see any other way of maintaining backward compatibility.
Updated by Ichimonji10 over 7 years ago
Would we still include the list of all the spawned tasks though?
Yes. I don't see any other way of maintaining backward compatibility.
I guess you could do something clever like make the list of spawned tasks include only one task, and have that one task reflect the state of the entire "regenerate content applicability" operation. I'm not sure if that would cause bugs of its own, though.
Updated by rbarlow over 7 years ago
I think we most likely need the API to work exactly like it did before and provide a way for the user to explicitly request the new behavior, rather than trying to make an additive change to the old behavior. This could either be a new API endpoint, or by having the user provide an optional parameter with their request that allows them to opt in. My preference is for the latter.
Updated by bmbouter over 7 years ago
+1 to an optional GET parameter to opt-in to the new functionality
Updated by rbarlow over 7 years ago
bmbouter wrote:
+1 to an optional GET parameter to opt-in to the new functionality
I think it's actually a POST, but thanks for the +1 ☺
Updated by dkliban@redhat.com over 7 years ago
- Status changed from NEW to ASSIGNED
Updated by ipanova@redhat.com over 7 years ago
- Status changed from ASSIGNED to POST
Updated by semyers over 7 years ago
- Blocks Issue #1552: applicable errata regeneration tasks not getting spawned added
Updated by semyers over 7 years ago
- Blocks deleted (Issue #1552: applicable errata regeneration tasks not getting spawned)
Updated by semyers over 7 years ago
- Related to Issue #1552: applicable errata regeneration tasks not getting spawned added
Added by dkliban@redhat.com over 7 years ago
Added by dkliban@redhat.com over 7 years ago
Adds back the API for generating applicability for a repo
This adds a parameter called 'parallel' to the API call for regenerating applicability for a repository. When specified and True, the work is done in parallel. When not specified or False, the work is done as one long running task.
Updated by dkliban@redhat.com over 7 years ago
- Status changed from POST to MODIFIED
- % Done changed from 0 to 100
Applied in changeset pulp|ac2c17a70327116e7621ae56f1ea308cb321ec4b.
Updated by Ichimonji10 over 7 years ago
I can verify that Pulp is acting correctly. I've done this verification with freshly provisioned local VMs. See: https://github.com/PulpQE/pulp-smash/commit/a58fa7ec96fd6ebc71d33ccc8dd3c92b385a0818
Updated by dkliban@redhat.com over 7 years ago
- Status changed from MODIFIED to 5
Updated by dkliban@redhat.com about 7 years ago
- Status changed from 5 to CLOSED - CURRENTRELEASE
Updated by bmbouter almost 3 years ago
- Category deleted (
14)
We are removing the 'API' category per open floor discussion June 16, 2020.
Adds back the API for generating applicability for a repo
This adds a parameter called 'parallel' to the API call for regenerating applicability for a repository. When specified and True, the work is done in parallel. When not specified or False, the work is done as one long running task.
https://pulp.plan.io/issues/1448 closes #1448