Project

Profile

Help

Story #4283

closed

tasks viewsets now require content viewsets/serializers (and significant slowdown)

Added by amacdona@redhat.com over 5 years ago. Updated over 4 years ago.

Status:
CLOSED - CURRENTRELEASE
Priority:
Normal
Assignee:
Category:
-
Sprint/Milestone:
Start date:
Due date:
% Done:

0%

Estimated time:
Platform Release:
Groomed:
No
Sprint Candidate:
No
Tags:
Sprint:
Sprint 49
Quarter:

Description

I'm not exactly sure what's going on here. After someone debugs, please feel free to clean up this issue to make more sense :)

After updating to latest pulpcore and pulpcore-plugin, the tasks/ and tasks/<int>/ endpoints no longer work if one of or more of the tasks are docker sync. This is the error message I saw:

 LookupError: Could not determine ViewSet base name for model <class 'pulp_docker.app.models.ManifestList'>

Assuming that the task viewset/serializer had changed, I assumed that the content viewsets/serializers were actually necessary, so I picked up https://pulp.plan.io/issues/3989. However, after an initial implementation of that task, I have seen no difference that seems like it should require a viewset/serializer for the tasks viewset.

(test) ~/d/pulp_docker ❯ http --auth admin:admin http://dev.pulp3:8000/pulp/api/v3/tasks/2/                                                                                                                      ⏎
HTTP/1.1 200 OK
Allow: GET, DELETE, HEAD, OPTIONS
Content-Length: 1026
Content-Type: application/json
Date: Thu, 20 Dec 2018 01:36:06 GMT
Server: WSGIServer/0.2 CPython/3.6.7
Vary: Accept, Cookie
X-Frame-Options: SAMEORIGIN

{
    "_href": "/pulp/api/v3/tasks/2/",
    "created": "2018-12-19T18:15:29.162610Z",
    "created_resources": [
        "/pulp/api/v3/repositories/3/versions/1/"
    ],
    "error": null,
    "finished_at": "2018-12-19T18:16:20.315454Z",
    "job_id": "75866cf5-5bed-411e-92b9-17288e57b7be",
    "non_fatal_errors": [],
    "parent": null,
    "progress_reports": [
        {
            "done": 435,
            "message": "Downloading Artifacts",
            "state": "completed",
            "suffix": "",
            "task": "/pulp/api/v3/tasks/2/",
            "total": 435
        },
        {
            "done": 720,
            "message": "Downloading Artifacts",
            "state": "completed",
            "suffix": "",
            "task": "/pulp/api/v3/tasks/2/",
            "total": 720
        },
        {
            "done": 1240,
            "message": "Associating Content",
            "state": "completed",
            "suffix": "",
            "task": "/pulp/api/v3/tasks/2/",
            "total": 1240
        },
        {
            "done": 0,
            "message": "Un-Associating Content",
            "state": "completed",
            "suffix": "",
            "task": "/pulp/api/v3/tasks/2/",
            "total": null
        },
        {
            "done": 115,
            "message": "Downloading Artifacts",
            "state": "completed",
            "suffix": "",
            "task": "/pulp/api/v3/tasks/2/",
            "total": 115
        }
    ],
    "spawned_tasks": [],
    "started_at": "2018-12-19T18:15:29.245467Z",
    "state": "completed",
    "worker": "/pulp/api/v3/workers/2/"
}

Additionally, the GET request for a docker sync task is much slower than it used to be, taking ~5 seconds. It used to be nearly instant.

All of this leads me to believe that the tasks `list` and `get` views are doing some slow db queries (joins?) and the information provided by those queries does not appear to be included in the information returned to the user. It may not be a bug at all, but seems like something worth investigation.

Actions #1

Updated by amacdona@redhat.com over 5 years ago

Interesting detail a REST query to a task[0] is fast when the task is running. As soon as it finishes, it becomes slow.

[0]: http --auth admin:admin http://dev.pulp3:8000/pulp/api/v3/tasks/1/

Added by dalley about 5 years ago

Revision 83f2367c | View on GitHub

Fix bad pause when serializing repository versions

And also tasks by association, via CreatedResource, which apparently serializes the RepositoryVersion also.

closes #4283 https://pulp.plan.io/issues/4283

Added by dalley about 5 years ago

Revision 83f2367c | View on GitHub

Fix bad pause when serializing repository versions

And also tasks by association, via CreatedResource, which apparently serializes the RepositoryVersion also.

closes #4283 https://pulp.plan.io/issues/4283

Actions #2

Updated by dalley about 5 years ago

  • Status changed from NEW to POST
  • Assignee set to dalley
  • Tags Pulp 3 added
Actions #3

Updated by dalley about 5 years ago

  • Status changed from POST to MODIFIED
Actions #4

Updated by dalley about 5 years ago

  • Status changed from MODIFIED to NEW

I'm reopening this. Because the root issue (rendering the repositoryversions in the background) wasn't really resolved there's still a pause but it doesn't become noticable until you reach large repository sizes. Once you get up to a couple hundred thousand units, the pause is extremely noticable.

Actions #5

Updated by dalley about 5 years ago

One possible solution is to de-normalize repository versions. They're immutable, so when creating one, you can just take a snapshot of the current counts of content, and possibly the URLs, and store that instead of having fields that calculate it anew every time it's serialized.

Actions #6

Updated by dalley about 5 years ago

There's 2 separate issues here - repository versions are extremely slow to render for large repositories due to the calculations required for each one, and that problem should be resolved by just doing the calculation at repository version creation time and saving them in the database.

The second issue is that this slowness bleeds into rendering tasks due to CreatedResource. Fixing the former will "fix" the latter but it's better that we also remove this dependency.

Actions #7

Updated by dalley about 5 years ago

  • Assignee deleted (dalley)
Actions #8

Updated by CodeHeeler about 5 years ago

  • Triaged changed from No to Yes
  • Sprint set to Sprint 48
  • Tags Pulp 3 RC Blocker added
Actions #9

Updated by jortel@redhat.com about 5 years ago

dalley wrote:

The second issue is that this slowness bleeds into rendering tasks due to CreatedResource.

Can you elaborate on this? I'm assuming that the created-resource for a sync is a RepositoryVersion and when serialized (in the task), is just the URL and not the entire serialized RepositoryVersion, right?

Actions #10

Updated by rchan about 5 years ago

  • Sprint changed from Sprint 48 to Sprint 49
Actions #11

Updated by bmbouter about 5 years ago

  • Status changed from NEW to ASSIGNED
  • Assignee set to bmbouter
Actions #12

Updated by bmbouter about 5 years ago

I'm not able to reproduce this. I installed the latest pulpcore, pulpcore-plugin, and pulp_rpm. I then lazy sync epel6 with commands like:

http POST http://localhost:8000/pulp/api/v3/repositories/ name=foo
export REPO_HREF=$(http :8000/pulp/api/v3/repositories/ | jq -r '.results[] | select(.name == "foo") | ._href')
http POST http://localhost:8000/pulp/api/v3/remotes/rpm/rpm/ name='bar' url='https://dl.fedoraproject.org/pub/epel/6Server/x86_64/' policy='on_demand'
export REMOTE_HREF=$(http :8000/pulp/api/v3/remotes/rpm/rpm/ | jq -r '.results[] | select(.name == "bar") | ._href')
http POST :8000${REMOTE_HREF}sync/ repository=$REPO_HREF

It completes successfully and downloads 16673, which should be enough to trigger the bug. However when I list the task it always completes in ~ 0.5 seconds. I can't reproduce the ~ 5 second slowdown.

(pulp) [vagrant@pulp3-source-fedora28 pulp_rpm]$ time http :/pulp/api/v3/tasks/1/
HTTP/1.1 200 OK
Allow: GET, DELETE, HEAD, OPTIONS
Connection: keep-alive
Content-Length: 967
Content-Type: application/json
Date: Fri, 22 Feb 2019 17:12:10 GMT
Server: nginx/1.12.1
Vary: Accept, Cookie
X-Frame-Options: SAMEORIGIN

{
    "_created": "2019-02-21T21:42:59.168700Z",
    "_href": "/pulp/api/v3/tasks/1/",
    "created_resources": [
        "/pulp/api/v3/repositories/1/versions/1/"
    ],
    "error": null,
    "finished_at": "2019-02-22T00:48:22.166977Z",
    "job_id": "7103bb64-a426-4710-bd5d-4a65b341c47b",
    "name": "pulp_rpm.app.tasks.synchronizing.synchronize",
    "non_fatal_errors": [],
    "parent": null,
    "progress_reports": [
        {
            "done": 5,
            "message": "Downloading and Parsing Metadata",
            "state": "completed",
            "suffix": "",
            "task": "/pulp/api/v3/tasks/1/",
            "total": 5
        },
        {
            "done": 0,
            "message": "Downloading Artifacts",
            "state": "completed",
            "suffix": "",
            "task": "/pulp/api/v3/tasks/1/",
            "total": null
        },
        {
            "done": 16673,
            "message": "Associating Content",
            "state": "completed",
            "suffix": "",
            "task": "/pulp/api/v3/tasks/1/",
            "total": 16673
        },
        {
            "done": 0,
            "message": "Un-Associating Content",
            "state": "completed",
            "suffix": "",
            "task": "/pulp/api/v3/tasks/1/",
            "total": null
        }
    ],
    "spawned_tasks": [],
    "started_at": "2019-02-21T21:42:59.266729Z",
    "state": "completed",
    "worker": "/pulp/api/v3/workers/2/"
}

real    0m0.589s
user    0m0.351s
sys     0m0.047s
Actions #13

Updated by dalley about 5 years ago

This is definitely still an issue and I can reproduce it easily. You simply need more tasks/repository versions, or much bigger tasks/repository versions. I expect that moderate production usage will encounter this but doing a simple sync of EPEL in an otherwise empty installation will not.

After 10 syncs totalling about 500,000 content units, viewing a single (1) task takes about 2 seconds, because the associated repository via CreatedResource is so large (and serializing it performs read-intensive ops to do the counts and so forth).

Serializing all 10 tasks takes 14.7 seconds.

Serializing 8 repository versions takes 10.7 seconds.

Serializing 1 repository version takes between 1.1 and 2.5 seconds, depending on which repository version it is.

So this is definitely a problem and fixing it later will require a data migration. Also, we might take this opportunity to also change the representation when serialized to be more in-sync with the new data layout underneath we adopt, which would be a backwards incompatible change.

Actions #14

Updated by bmbouter about 5 years ago

I see the slowdown now so I can fix this I believe. Thanks dalley for the alternate reproducer!

Actions #15

Updated by bmbouter about 5 years ago

  • Tracker changed from Issue to Story
  • Status changed from ASSIGNED to POST
  • % Done set to 0

PR available at: https://github.com/pulp/pulp/pull/3896

This is actually a story because of the design changes involved. I'm going to rewrite the API changes clearly on the top of the issue.

Added by bmbouter about 5 years ago

Revision fe521535 | View on GitHub

Adopts new content_summary layout

The performance issue causes us to introduce a new model named RepositoryVersionContentDetails. This is also a great opportunity to update the content_summary of the RepositoryVersion serializer to match the recent API changes.

Most of this code was inspired from a patch from dalley.

This includes smash updates that are included in a separate PR to preserve backwards compatability.

Required PR: https://github.com/PulpQE/pulp-smash/pull/1174

This also fixes travis script lines that weren't updated due to pulp/pulp being moved to pulp/pulpcore. This PR needs that small fix to pass also.

https://pulp.plan.io/issues/4283 closes #4283

Actions #17

Updated by dalley about 5 years ago

  • Status changed from POST to MODIFIED

Not sure why this didn't get state-changed.

Actions #18

Updated by daviddavis almost 5 years ago

  • Sprint/Milestone set to 3.0.0
Actions #19

Updated by bmbouter almost 5 years ago

  • Tags deleted (Pulp 3, Pulp 3 RC Blocker)
Actions #20

Updated by bmbouter over 4 years ago

  • Status changed from MODIFIED to CLOSED - CURRENTRELEASE

Also available in: Atom PDF