Project

Profile

Help

Story #4283

tasks viewsets now require content viewsets/serializers (and significant slowdown)

Added by amacdona@redhat.com 10 months ago. Updated 6 months ago.

Status:
MODIFIED
Priority:
Normal
Assignee:
Category:
-
Sprint/Milestone:
Start date:
Due date:
% Done:

0%

Platform Release:
Blocks Release:
Backwards Incompatible:
No
Groomed:
No
Sprint Candidate:
No
Tags:
QA Contact:
Complexity:
Smash Test:
Verified:
No
Verification Required:
No
Sprint:
Sprint 49

Description

I'm not exactly sure what's going on here. After someone debugs, please feel free to clean up this issue to make more sense :)

After updating to latest pulpcore and pulpcore-plugin, the tasks/ and tasks/<int>/ endpoints no longer work if one of or more of the tasks are docker sync. This is the error message I saw:

 LookupError: Could not determine ViewSet base name for model <class 'pulp_docker.app.models.ManifestList'>

Assuming that the task viewset/serializer had changed, I assumed that the content viewsets/serializers were actually necessary, so I picked up https://pulp.plan.io/issues/3989. However, after an initial implementation of that task, I have seen no difference that seems like it should require a viewset/serializer for the tasks viewset.

(test) ~/d/pulp_docker ❯ http --auth admin:admin http://dev.pulp3:8000/pulp/api/v3/tasks/2/                                                                                                                      ⏎
HTTP/1.1 200 OK
Allow: GET, DELETE, HEAD, OPTIONS
Content-Length: 1026
Content-Type: application/json
Date: Thu, 20 Dec 2018 01:36:06 GMT
Server: WSGIServer/0.2 CPython/3.6.7
Vary: Accept, Cookie
X-Frame-Options: SAMEORIGIN

{
    "_href": "/pulp/api/v3/tasks/2/",
    "created": "2018-12-19T18:15:29.162610Z",
    "created_resources": [
        "/pulp/api/v3/repositories/3/versions/1/" 
    ],
    "error": null,
    "finished_at": "2018-12-19T18:16:20.315454Z",
    "job_id": "75866cf5-5bed-411e-92b9-17288e57b7be",
    "non_fatal_errors": [],
    "parent": null,
    "progress_reports": [
        {
            "done": 435,
            "message": "Downloading Artifacts",
            "state": "completed",
            "suffix": "",
            "task": "/pulp/api/v3/tasks/2/",
            "total": 435
        },
        {
            "done": 720,
            "message": "Downloading Artifacts",
            "state": "completed",
            "suffix": "",
            "task": "/pulp/api/v3/tasks/2/",
            "total": 720
        },
        {
            "done": 1240,
            "message": "Associating Content",
            "state": "completed",
            "suffix": "",
            "task": "/pulp/api/v3/tasks/2/",
            "total": 1240
        },
        {
            "done": 0,
            "message": "Un-Associating Content",
            "state": "completed",
            "suffix": "",
            "task": "/pulp/api/v3/tasks/2/",
            "total": null
        },
        {
            "done": 115,
            "message": "Downloading Artifacts",
            "state": "completed",
            "suffix": "",
            "task": "/pulp/api/v3/tasks/2/",
            "total": 115
        }
    ],
    "spawned_tasks": [],
    "started_at": "2018-12-19T18:15:29.245467Z",
    "state": "completed",
    "worker": "/pulp/api/v3/workers/2/" 
}

Additionally, the GET request for a docker sync task is much slower than it used to be, taking ~5 seconds. It used to be nearly instant.

All of this leads me to believe that the tasks `list` and `get` views are doing some slow db queries (joins?) and the information provided by those queries does not appear to be included in the information returned to the user. It may not be a bug at all, but seems like something worth investigation.

Associated revisions

Revision 83f2367c View on GitHub
Added by dalley 10 months ago

Fix bad pause when serializing repository versions

And also tasks by association, via CreatedResource, which apparently
serializes the RepositoryVersion also.

closes #4283
https://pulp.plan.io/issues/4283

Revision 83f2367c View on GitHub
Added by dalley 10 months ago

Fix bad pause when serializing repository versions

And also tasks by association, via CreatedResource, which apparently
serializes the RepositoryVersion also.

closes #4283
https://pulp.plan.io/issues/4283

Revision fe521535 View on GitHub
Added by bmbouter 8 months ago

Adopts new content_summary layout

The performance issue causes us to introduce a new model named
RepositoryVersionContentDetails. This is also a great opportunity to
update the content_summary of the RepositoryVersion serializer to match
the recent API changes.

Most of this code was inspired from a patch from @dalley.

This includes smash updates that are included in a separate PR to
preserve backwards compatability.

Required PR: https://github.com/PulpQE/pulp-smash/pull/1174

This also fixes travis script lines that weren't updated due to
pulp/pulp being moved to pulp/pulpcore. This PR needs that small fix to
pass also.

https://pulp.plan.io/issues/4283
closes #4283

History

#1 Updated by amacdona@redhat.com 10 months ago

Interesting detail a REST query to a task0 is fast when the task is running. As soon as it finishes, it becomes slow.

[0]: http --auth admin:admin http://dev.pulp3:8000/pulp/api/v3/tasks/1/

#2 Updated by dalley 10 months ago

  • Status changed from NEW to POST
  • Assignee set to dalley
  • Tags Pulp 3 added

#3 Updated by dalley 10 months ago

  • Status changed from POST to MODIFIED

#4 Updated by dalley 8 months ago

  • Status changed from MODIFIED to NEW

I'm reopening this. Because the root issue (rendering the repositoryversions in the background) wasn't really resolved there's still a pause but it doesn't become noticable until you reach large repository sizes. Once you get up to a couple hundred thousand units, the pause is extremely noticable.

#5 Updated by dalley 8 months ago

One possible solution is to de-normalize repository versions. They're immutable, so when creating one, you can just take a snapshot of the current counts of content, and possibly the URLs, and store that instead of having fields that calculate it anew every time it's serialized.

#6 Updated by dalley 8 months ago

There's 2 separate issues here - repository versions are extremely slow to render for large repositories due to the calculations required for each one, and that problem should be resolved by just doing the calculation at repository version creation time and saving them in the database.

The second issue is that this slowness bleeds into rendering tasks due to CreatedResource. Fixing the former will "fix" the latter but it's better that we also remove this dependency.

#7 Updated by dalley 8 months ago

  • Assignee deleted (dalley)

#8 Updated by CodeHeeler 8 months ago

  • Triaged changed from No to Yes
  • Sprint set to Sprint 48
  • Tags Pulp 3 RC Blocker added

#9 Updated by jortel@redhat.com 8 months ago

dalley wrote:

The second issue is that this slowness bleeds into rendering tasks due to CreatedResource.

Can you elaborate on this? I'm assuming that the created-resource for a sync is a RepositoryVersion and when serialized (in the task), is just the URL and not the entire serialized RepositoryVersion, right?

#10 Updated by rchan 8 months ago

  • Sprint changed from Sprint 48 to Sprint 49

#11 Updated by bmbouter 8 months ago

  • Status changed from NEW to ASSIGNED
  • Assignee set to bmbouter

#12 Updated by bmbouter 8 months ago

I'm not able to reproduce this. I installed the latest pulpcore, pulpcore-plugin, and pulp_rpm. I then lazy sync epel6 with commands like:

http POST http://localhost:8000/pulp/api/v3/repositories/ name=foo
export REPO_HREF=$(http :8000/pulp/api/v3/repositories/ | jq -r '.results[] | select(.name == "foo") | ._href')
http POST http://localhost:8000/pulp/api/v3/remotes/rpm/rpm/ name='bar' url='https://dl.fedoraproject.org/pub/epel/6Server/x86_64/' policy='on_demand'
export REMOTE_HREF=$(http :8000/pulp/api/v3/remotes/rpm/rpm/ | jq -r '.results[] | select(.name == "bar") | ._href')
http POST :8000${REMOTE_HREF}sync/ repository=$REPO_HREF

It completes successfully and downloads 16673, which should be enough to trigger the bug. However when I list the task it always completes in ~ 0.5 seconds. I can't reproduce the ~ 5 second slowdown.

(pulp) [vagrant@pulp3-source-fedora28 pulp_rpm]$ time http :/pulp/api/v3/tasks/1/
HTTP/1.1 200 OK
Allow: GET, DELETE, HEAD, OPTIONS
Connection: keep-alive
Content-Length: 967
Content-Type: application/json
Date: Fri, 22 Feb 2019 17:12:10 GMT
Server: nginx/1.12.1
Vary: Accept, Cookie
X-Frame-Options: SAMEORIGIN

{
    "_created": "2019-02-21T21:42:59.168700Z",
    "_href": "/pulp/api/v3/tasks/1/",
    "created_resources": [
        "/pulp/api/v3/repositories/1/versions/1/" 
    ],
    "error": null,
    "finished_at": "2019-02-22T00:48:22.166977Z",
    "job_id": "7103bb64-a426-4710-bd5d-4a65b341c47b",
    "name": "pulp_rpm.app.tasks.synchronizing.synchronize",
    "non_fatal_errors": [],
    "parent": null,
    "progress_reports": [
        {
            "done": 5,
            "message": "Downloading and Parsing Metadata",
            "state": "completed",
            "suffix": "",
            "task": "/pulp/api/v3/tasks/1/",
            "total": 5
        },
        {
            "done": 0,
            "message": "Downloading Artifacts",
            "state": "completed",
            "suffix": "",
            "task": "/pulp/api/v3/tasks/1/",
            "total": null
        },
        {
            "done": 16673,
            "message": "Associating Content",
            "state": "completed",
            "suffix": "",
            "task": "/pulp/api/v3/tasks/1/",
            "total": 16673
        },
        {
            "done": 0,
            "message": "Un-Associating Content",
            "state": "completed",
            "suffix": "",
            "task": "/pulp/api/v3/tasks/1/",
            "total": null
        }
    ],
    "spawned_tasks": [],
    "started_at": "2019-02-21T21:42:59.266729Z",
    "state": "completed",
    "worker": "/pulp/api/v3/workers/2/" 
}

real    0m0.589s
user    0m0.351s
sys     0m0.047s

#13 Updated by dalley 8 months ago

This is definitely still an issue and I can reproduce it easily. You simply need more tasks/repository versions, or much bigger tasks/repository versions. I expect that moderate production usage will encounter this but doing a simple sync of EPEL in an otherwise empty installation will not.

After 10 syncs totalling about 500,000 content units, viewing a single (1) task takes about 2 seconds, because the associated repository via CreatedResource is so large (and serializing it performs read-intensive ops to do the counts and so forth).

Serializing all 10 tasks takes 14.7 seconds.

Serializing 8 repository versions takes 10.7 seconds.

Serializing 1 repository version takes between 1.1 and 2.5 seconds, depending on which repository version it is.

So this is definitely a problem and fixing it later will require a data migration. Also, we might take this opportunity to also change the representation when serialized to be more in-sync with the new data layout underneath we adopt, which would be a backwards incompatible change.

#14 Updated by bmbouter 8 months ago

I see the slowdown now so I can fix this I believe. Thanks @dalley for the alternate reproducer!

#15 Updated by bmbouter 8 months ago

  • Tracker changed from Issue to Story
  • Status changed from ASSIGNED to POST
  • % Done set to 0

PR available at: https://github.com/pulp/pulp/pull/3896

This is actually a story because of the design changes involved. I'm going to rewrite the API changes clearly on the top of the issue.

#17 Updated by dalley 8 months ago

  • Status changed from POST to MODIFIED

Not sure why this didn't get state-changed.

#18 Updated by daviddavis 6 months ago

  • Sprint/Milestone set to 3.0

#19 Updated by bmbouter 6 months ago

  • Tags deleted (Pulp 3, Pulp 3 RC Blocker)

Please register to edit this issue

Also available in: Atom PDF