Project

Profile

Help

Issue #1448

closed

Generating repo content applicability doesn't return call report

Added by Ichimonji10 over 8 years ago. Updated almost 4 years ago.

Status:
CLOSED - CURRENTRELEASE
Priority:
Normal
Category:
-
Sprint/Milestone:
-
Start date:
Due date:
Estimated time:
Severity:
2. Medium
Version:
Master
Platform Release:
2.8.0
OS:
Triaged:
Yes
Groomed:
No
Sprint Candidate:
No
Tags:
Pulp 2
Sprint:
Quarter:

Description

When Pulp is asked to generate content applicability for repositories, it returns an HTTP 202 (accepted) response. This response should be a [call report](http://pulp.readthedocs.org/en/latest/dev-guide/conventions/sync-v-async.html#call-report), but it's not.

Here's an example of an API call that produces the error:

POST https://pulp.example.com/pulp/api/v2/repositories/actions/content/regenerate_applicability/
{'repo_criteria': {}}

Here's an example of a (HTTP 202) response body:

{
    '_href': '/pulp/api/v2/task_groups/568f4f73-b487-4944-b7a5-36bf00159e3c/',
    'group_id': '568f4f73-b487-4944-b7a5-36bf00159e3c'
}

Related issues

Related to RPM Support - Issue #1552: applicable errata regeneration tasks not getting spawnedCLOSED - CURRENTRELEASEsemyersActions
Actions #1

Updated by Ichimonji10 over 8 years ago

This issue does not apply to Pulp 2.7.

Actions #2

Updated by mhrivnak over 8 years ago

  • Status changed from NEW to ASSIGNED
  • Assignee set to dkliban@redhat.com
  • Platform Release set to 2.8.0
  • Triaged changed from No to Yes

Dennis, is it possible and reasonable to have the API continue to return a call report as before, and make the new behavior opt-in?

Actions #3

Updated by dkliban@redhat.com over 8 years ago

I don't think it's reasonable to return a call report. A call report contains a list of spawned tasks and it does not contain a group id. Without the group id a user would need to check each task to see if it is done.

How would a user opt in to the new design? By providing an extra argument when making the API call?

Actions #4

Updated by bmbouter over 8 years ago

How many spawned tasks would the call report contain?

Would it be possible to add group_id into the call report?

From a backwards compatibility standpoint this changes the HTTP method type and structure of the response.

Actions #5

Updated by Ichimonji10 over 8 years ago

Without the group id a user would need to check each task to see if it is done.

What in the world is a group ID? It took me a while to figure out how to make my API handler poll task reports. The documentation doesn't mention task groups, except to state that "This API has been removed in 2.4, as there is no longer any concept of Task Groups."

Actions #6

Updated by dkliban@redhat.com over 8 years ago

We are currently dispatching 10 profiles per task. Based on what I've seen in production environments this could be upwards of 1000 tasks. We could add the group id to the call report.

The documentation for 2.8 is not published yet. Here[0-1] are the places where the new behavior is described.

[0] https://github.com/pulp/pulp/blob/master/docs/dev-guide/integration/rest-api/consumer/applicability.rst
[1] https://github.com/pulp/pulp/blob/master/docs/dev-guide/integration/rest-api/tasks.rst

Actions #7

Updated by Ichimonji10 over 8 years ago

The documentation for 2.8 is not published yet. Here[0-1] are the places where the new behavior is described.

Is this the new format for HTTP 202 responses across the board for Pulp 2.8?

Actions #8

Updated by dkliban@redhat.com over 8 years ago

This is the only API call in 2.8 that has a different response body for 202. This is mentioned in the release notes[0].

[0] https://github.com/pulp/pulp/blob/master/docs/user-guide/release-notes/master.rst#rest-api-changes

Actions #9

Updated by Ichimonji10 over 8 years ago

Ahaaa. Thanks for the info. I've updated the test suite accordingly. See this commit

I do dislike having two different kinds of response formats. It means that I, as an API user, have to design two different sets of code for dealing with long running tasks. It also means that I have to use out-of-band knowledge to create a special case in my code base for dealing with this one path. Basically, it's a PITA, even if you've read the release notes.

Actions #10

Updated by Ichimonji10 over 8 years ago

  • Status changed from ASSIGNED to CLOSED - NOTABUG
Actions #11

Updated by rbarlow over 8 years ago

I am a little concerned, as it sounds like we are not maintaining backwards compatibility from 2.7 to 2.8. Is my understanding correct? If so, I would make a case that we should make the new functionality be either a new REST endpoint, or make it opt-in through a user flag so that integrators from 2.7 don't have to do anything to upgrade to 2.8 unless they want to take advantage of the new awesomeness.

Actions #12

Updated by dkliban@redhat.com over 8 years ago

I have no problem with creating a new endpoint, but I am concerned that it may be too late for this in the release cycle.

Actions #13

Updated by Ichimonji10 over 8 years ago

How about HTTP link headers? We can keep backward compatibility and add group task polling to as many endpoints as we want.

Actions #14

Updated by Ichimonji10 over 8 years ago

The biggest danger in using HTTP link headers is that we might do a bad job of polluting Pulp's link registry.

Actions #15

Updated by rbarlow over 8 years ago

  • Status changed from CLOSED - NOTABUG to NEW

I don't think we should consider this issue closed. I raised concerns about this endpoint being backwards incompatible and was assured that it was done in a new endpoint, rather than removing support for the old endpoint. I think it's important that we stick to semantic versioning, especially since we document that we follow it. There are community users who do use this endpoint, as I know that I have personally guided two of them through writing code to interact with it.

Actions #16

Updated by Ichimonji10 over 8 years ago

+1 for ensuring backward compatibility in the 2.8 release.

By the way, if "HTTP link headers" or "link registry" is greek to anyone, check out sections 5.3 (When and How to Use Link Headers) and 5.4 (How to Assign Link Relation Types) from RESTful Web Services Cookbook

Actions #17

Updated by dkliban@redhat.com over 8 years ago

The change in behavior was needed for 2 reasons: 1) Parallelize the work and 2) prevent a possible mongo cursor time out.

The solution for 1 required spawning many tasks to handle the calculations in parallel. The number of tasks that are generated depends on the number of unique consumer profiles that are bound to a particular repository. When managing 10000 unique systems, 1000 tasks are spawned. If we return 1000 tasks in a call report, the client will then try to poll Pulp for 1000 tasks statuses. Depending on the polling interval, this would require Pulp to respond to thousands more extra requests. That is a huge overhead for Pulp. As a result it is necessary to return a resource that summarizes the status of all the tasks.

The solution for 1 also provides the solution for 2. Since each task has to only load profiles from the database once per task, we don't have to worry about a consequent call to the database getting a cursor timeout error.

Actions #18

Updated by Ichimonji10 over 8 years ago

I certainly understand that polling 1,000+ tasks can be extremely slow and fragile. With that in mind, a good question to ask is "is poor performance a good reason to ignore semver and break compatibility?" Either answer will produce angry users. We'll either end up with users who are frustrated with poor performance when polling 1,000+ tasks, or we'll end up with users who are frustrated that they have to add hacky fixes into their API clients to deal with breaking changes on a minor release. I'd opine that we should pick a solution that obeys semver and that can be applied to this entire category of issues.

Actions #19

Updated by bmbouter over 8 years ago

One option to consider is that the total count of group tasks and the completed count of group tasks could be added to an existing call report. That would be an additive change which would also provide the usability enhancement.

Actions #20

Updated by dkliban@redhat.com over 8 years ago

Would we still include the list of all the spawned tasks though?

Actions #21

Updated by Ichimonji10 over 8 years ago

Would we still include the list of all the spawned tasks though?

Yes. I don't see any other way of maintaining backward compatibility.

Actions #22

Updated by Ichimonji10 over 8 years ago

Would we still include the list of all the spawned tasks though?

Yes. I don't see any other way of maintaining backward compatibility.

I guess you could do something clever like make the list of spawned tasks include only one task, and have that one task reflect the state of the entire "regenerate content applicability" operation. I'm not sure if that would cause bugs of its own, though.

Actions #23

Updated by rbarlow over 8 years ago

I think we most likely need the API to work exactly like it did before and provide a way for the user to explicitly request the new behavior, rather than trying to make an additive change to the old behavior. This could either be a new API endpoint, or by having the user provide an optional parameter with their request that allows them to opt in. My preference is for the latter.

Actions #24

Updated by bmbouter over 8 years ago

+1 to an optional GET parameter to opt-in to the new functionality

Actions #25

Updated by rbarlow over 8 years ago

bmbouter wrote:

+1 to an optional GET parameter to opt-in to the new functionality

I think it's actually a POST, but thanks for the +1 ☺

Actions #26

Updated by dkliban@redhat.com about 8 years ago

  • Status changed from NEW to ASSIGNED
Actions #27

Updated by ipanova@redhat.com about 8 years ago

  • Status changed from ASSIGNED to POST
Actions #28

Updated by semyers about 8 years ago

  • Blocks Issue #1552: applicable errata regeneration tasks not getting spawned added
Actions #29

Updated by semyers about 8 years ago

  • Blocks deleted (Issue #1552: applicable errata regeneration tasks not getting spawned)
Actions #30

Updated by semyers about 8 years ago

  • Related to Issue #1552: applicable errata regeneration tasks not getting spawned added

Added by dkliban@redhat.com about 8 years ago

Revision ac2c17a7 | View on GitHub

Adds back the API for generating applicability for a repo

This adds a parameter called 'parallel' to the API call for regenerating applicability for a repository. When specified and True, the work is done in parallel. When not specified or False, the work is done as one long running task.

https://pulp.plan.io/issues/1448 closes #1448

Added by dkliban@redhat.com about 8 years ago

Revision ac2c17a7 | View on GitHub

Adds back the API for generating applicability for a repo

This adds a parameter called 'parallel' to the API call for regenerating applicability for a repository. When specified and True, the work is done in parallel. When not specified or False, the work is done as one long running task.

https://pulp.plan.io/issues/1448 closes #1448

Actions #31

Updated by dkliban@redhat.com about 8 years ago

  • Status changed from POST to MODIFIED
  • % Done changed from 0 to 100
Actions #32

Updated by Ichimonji10 about 8 years ago

I can verify that Pulp is acting correctly. I've done this verification with freshly provisioned local VMs. See: https://github.com/PulpQE/pulp-smash/commit/a58fa7ec96fd6ebc71d33ccc8dd3c92b385a0818

Actions #33

Updated by dkliban@redhat.com about 8 years ago

  • Status changed from MODIFIED to 5
Actions #34

Updated by dkliban@redhat.com about 8 years ago

  • Status changed from 5 to CLOSED - CURRENTRELEASE
Actions #35

Updated by bmbouter about 5 years ago

  • Tags Pulp 2 added
Actions #36

Updated by bmbouter almost 4 years ago

  • Category deleted (14)

We are removing the 'API' category per open floor discussion June 16, 2020.

Also available in: Atom PDF