Project

Profile

Help

Task #2440

closed

Make celery processes work with pulp.tasking

Added by mhrivnak over 7 years ago. Updated over 4 years ago.

Status:
CLOSED - CURRENTRELEASE
Priority:
Normal
Assignee:
Category:
-
Sprint/Milestone:
Start date:
Due date:
% Done:

0%

Estimated time:
Platform Release:
Groomed:
Yes
Sprint Candidate:
Yes
Tags:
Sprint:
Sprint 11
Quarter:

Description

@tteresch has also been working on this a lot with me. If there could be two assignees, she'd be one!

Pulp 3's pulp.tasking package does not work as-is with celery worker processes. Some re-arrangement has to happen to ensure that no pulp/django code gets used until the celery app has been instantiated and done it's "fixup" of django. It is also not allowed for pulp

Additionally, the pulp.tasking package itself could have a more intuitive organization. We are taking this opportunity to separate it into three categories of code:

  • code related to individual tasks
  • code related to running and managing services like workers and celerybeat
  • the minimum code necessary to instantiate and configure the celery app

The new layout will be:

pulp.tasking will contain celery_app and celery_instance
pulp.tasking.services for code that relates to workers, beat, etc.
pulp.tasking.tasks for Task base classes and the reservation tasks
pulp.tasking.util for get_current_task_id() and cancel()
pulp.app.tasks for pulp tasks that do stuff with pulp data

Actions #1

Updated by ttereshc over 7 years ago

Pulp 3's pulp.tasking package does not work as-is with celery worker processes. Some re-arrangement has to happen to ensure that no pulp/django code gets used until the celery app has been instantiated and done it's "fixup" of django.

I would also add that because celery's django "fixup" does django.setup() we could not use it in our code. (Because the first and easy thing one may want to do is to put django.setup into our code.)

If you think it is an implementation detail, I am ok to skip it. The idea is just to introduce more reasoning here.

Actions #2

Updated by mhrivnak over 7 years ago

  • Description updated (diff)

ttereshc wrote:

Pulp 3's pulp.tasking package does not work as-is with celery worker processes. Some re-arrangement has to happen to ensure that no pulp/django code gets used until the celery app has been instantiated and done it's "fixup" of django.

I would also add that because celery's django "fixup" does django.setup() we could not use it in our code. (Because the first and easy thing one may want to do is to put django.setup into our code.)

If you think it is an implementation detail, I am ok to skip it. The idea is just to introduce more reasoning here.

I think that might be too much of an implementation detail to put in the description, but it is a great point to capture in the comments for anyone interested in more context later.

Actions #3

Updated by bmbouter over 7 years ago

I think the current thinking is to move the task code itself moving to pulp.app.task. The task could be clarified a little bit on the new location. I imaging all tasks would go there including all three of these task types:

  • tasks user know about (descendants of PulpUserFacingTask)
  • tasks users care about directly but don't see (we may not have many of these in pulp3 but in 2.y these were things like reaper tasks. They are not descendants of PulpUserFacingTask)
  • tasks and supporting classes that they don't need to know about (_queue_reserve_resource, PulpUserFacingTask, etc)

If all ^ move to pulp.app.task Wouldn't pulp.tasking only be left with the things that are part of pulp.tasking.services? My hope is that we can just have pulp.tasking and not pulp.tasking.<anotherlayerhere>.

Actions #4

Updated by mhrivnak over 7 years ago

My understanding of the "current thinking" is a little different, but not a lot. :)

The base pulp tasks, like PulpTask and UserFacingTask, in addition to the reservation-related tasks, are somewhat coupled to the "services" code that manages workers, and the celery app itself. They use the celery controller, create and delete working directories, etc. Likewise, the "services" code needs the ability to cancel tasks, so it uses the task cancellation code.

More generally, I think there is a basic set of tooling that is necessary to provide the tasking features we need. That tooling includes the base tasks, the reservation logic, and some worker management code. All of that is related, and to some extent inter-dependent. I think it makes sense for it to be in one distinct place. It's also worth noting that there's nothing pulp-specific about this functionality.

Then there are actual pulp tasks that get stuff done with pulp data. They only use the above functionality. I think these tasks are logically different from the tooling described above.

Thus the current in-flight layout is:

pulp.tasking contains the general tasking features we need, including the celery app itself
pulp.tasking.services has service-management code.
pulp.tasking.tasks has the base task classes, reservation tasks, cancellation, etc.

pulp.app.tasks has pulp tasks that do stuff with pulp data

I don't think there is a circular import concern as long as our model layer doesn't start importing tasks. That said, one interesting idea would be to make pulp.tasking a full django app and move the related models into it. I'm not sure what the implications would be for the REST API. But this would be the fullest extent of keeping the task system self-contained and separate from the tasks that use it. I'm not sold on it, but it's an idea.

Actions #5

Updated by bmbouter over 7 years ago

To point out one quick variation on your last idea mhrivnak, pulp.tasking could become a Django app maybe named 'pulp_tasking' or something like that. pulp_app would require pulp_tasking to run and in all runtimes it would be present. This would allow those tasks to become @shared_tasks and we could then use the django+celery discovery to avoid importing pulp.tasking.tasks from pulp.app.tasks.

This is a subset of your idea since the proposal was to do all ^ but also move the models. Maybe they should move too (what you suggested).

Actions #6

Updated by bmbouter over 7 years ago

mhrivnak: would you want to take the "in-flight" plan from comment 4, revise it based on our latest discussion and post it to the main task description?

Actions #7

Updated by mhrivnak over 7 years ago

Actions #8

Updated by mhrivnak over 7 years ago

  • Description updated (diff)
Actions #9

Updated by bmbouter over 7 years ago

I think this is ready to be groomed.

Actions #10

Updated by bmbouter over 7 years ago

  • Sprint/Milestone set to 29
  • Groomed changed from No to Yes

Moving onto sprint per IRC convo in #pulp-dev

Actions #11

Updated by mhrivnak over 7 years ago

  • Status changed from NEW to ASSIGNED
  • Assignee set to mhrivnak
Actions #12

Updated by mhrivnak over 7 years ago

  • Description updated (diff)

Added by mhrivnak over 7 years ago

Revision 1ef8c359 | View on GitHub

Workers can run and execute tasks.

Thanks to ttereshc aka @goosemania who contributed some of this code.

There is a known issue that resource-reserving tasks still cannot run, but there is a fix in the works that will follow this PR.

Fixes bugs in pulp.tasking Rearranges pulp.tasking to put related code together Adds two tasks (importer and publisher delete) in pulp.app.tasks Fixes import order problems of django app vs. celery app

https://pulp.plan.io/issues/2440

re #2440 re #2400 re #2408

Added by mhrivnak over 7 years ago

Revision 1ef8c359 | View on GitHub

Workers can run and execute tasks.

Thanks to ttereshc aka @goosemania who contributed some of this code.

There is a known issue that resource-reserving tasks still cannot run, but there is a fix in the works that will follow this PR.

Fixes bugs in pulp.tasking Rearranges pulp.tasking to put related code together Adds two tasks (importer and publisher delete) in pulp.app.tasks Fixes import order problems of django app vs. celery app

https://pulp.plan.io/issues/2440

re #2440 re #2400 re #2408

Actions #13

Updated by mhrivnak over 7 years ago

Actions #14

Updated by mhrivnak over 7 years ago

  • Status changed from ASSIGNED to MODIFIED

Added by bmbouter over 7 years ago

Revision 3fe96dcb | View on GitHub

Fixes the Pulp tasking system

This PR introduces three fixes:

  • Updates the Celery dedicated queue name to be 'C.dq2'. This is a Celery backwards incompatible change so Pulp3 is only compatible with Celery 4.0.

  • Fixes a bug in the worker selection which was dispatching work to non-dedicated workers.

  • Resolves a traceback where a TaskStatus was being created twice during normal reserved task execution.

https://pulp.plan.io/issues/2440 re #2440

Added by bmbouter over 7 years ago

Revision 3fe96dcb | View on GitHub

Fixes the Pulp tasking system

This PR introduces three fixes:

  • Updates the Celery dedicated queue name to be 'C.dq2'. This is a Celery backwards incompatible change so Pulp3 is only compatible with Celery 4.0.

  • Fixes a bug in the worker selection which was dispatching work to non-dedicated workers.

  • Resolves a traceback where a TaskStatus was being created twice during normal reserved task execution.

https://pulp.plan.io/issues/2440 re #2440

Actions #15

Updated by bmbouter about 6 years ago

  • Sprint set to Sprint 11
Actions #16

Updated by bmbouter about 6 years ago

  • Sprint/Milestone deleted (29)
Actions #17

Updated by daviddavis almost 5 years ago

  • Sprint/Milestone set to 3.0.0
Actions #18

Updated by bmbouter almost 5 years ago

  • Tags deleted (Pulp 3)
Actions #19

Updated by bmbouter over 4 years ago

  • Status changed from MODIFIED to CLOSED - CURRENTRELEASE

Also available in: Atom PDF