Task #2440
closedMake celery processes work with pulp.tasking
0%
Description
@tteresch has also been working on this a lot with me. If there could be two assignees, she'd be one!
Pulp 3's pulp.tasking package does not work as-is with celery worker processes. Some re-arrangement has to happen to ensure that no pulp/django code gets used until the celery app has been instantiated and done it's "fixup" of django. It is also not allowed for pulp
Additionally, the pulp.tasking package itself could have a more intuitive organization. We are taking this opportunity to separate it into three categories of code:
- code related to individual tasks
- code related to running and managing services like workers and celerybeat
- the minimum code necessary to instantiate and configure the celery app
The new layout will be:
pulp.tasking will contain celery_app and celery_instance
pulp.tasking.services for code that relates to workers, beat, etc.
pulp.tasking.tasks for Task base classes and the reservation tasks
pulp.tasking.util for get_current_task_id() and cancel()
pulp.app.tasks for pulp tasks that do stuff with pulp data
Updated by ttereshc almost 8 years ago
Pulp 3's pulp.tasking package does not work as-is with celery worker processes. Some re-arrangement has to happen to ensure that no pulp/django code gets used until the celery app has been instantiated and done it's "fixup" of django.
I would also add that because celery's django "fixup" does django.setup() we could not use it in our code. (Because the first and easy thing one may want to do is to put django.setup into our code.)
If you think it is an implementation detail, I am ok to skip it. The idea is just to introduce more reasoning here.
Updated by mhrivnak almost 8 years ago
- Description updated (diff)
ttereshc wrote:
Pulp 3's pulp.tasking package does not work as-is with celery worker processes. Some re-arrangement has to happen to ensure that no pulp/django code gets used until the celery app has been instantiated and done it's "fixup" of django.
I would also add that because celery's django "fixup" does django.setup() we could not use it in our code. (Because the first and easy thing one may want to do is to put django.setup into our code.)
If you think it is an implementation detail, I am ok to skip it. The idea is just to introduce more reasoning here.
I think that might be too much of an implementation detail to put in the description, but it is a great point to capture in the comments for anyone interested in more context later.
Updated by bmbouter almost 8 years ago
I think the current thinking is to move the task code itself moving to pulp.app.task
. The task could be clarified a little bit on the new location. I imaging all tasks would go there including all three of these task types:
- tasks user know about (descendants of PulpUserFacingTask)
- tasks users care about directly but don't see (we may not have many of these in pulp3 but in 2.y these were things like reaper tasks. They are not descendants of PulpUserFacingTask)
- tasks and supporting classes that they don't need to know about (_queue_reserve_resource, PulpUserFacingTask, etc)
If all ^ move to pulp.app.task
Wouldn't pulp.tasking
only be left with the things that are part of pulp.tasking.services
? My hope is that we can just have pulp.tasking
and not pulp.tasking.<anotherlayerhere>
.
Updated by mhrivnak almost 8 years ago
My understanding of the "current thinking" is a little different, but not a lot. :)
The base pulp tasks, like PulpTask and UserFacingTask, in addition to the reservation-related tasks, are somewhat coupled to the "services" code that manages workers, and the celery app itself. They use the celery controller, create and delete working directories, etc. Likewise, the "services" code needs the ability to cancel tasks, so it uses the task cancellation code.
More generally, I think there is a basic set of tooling that is necessary to provide the tasking features we need. That tooling includes the base tasks, the reservation logic, and some worker management code. All of that is related, and to some extent inter-dependent. I think it makes sense for it to be in one distinct place. It's also worth noting that there's nothing pulp-specific about this functionality.
Then there are actual pulp tasks that get stuff done with pulp data. They only use the above functionality. I think these tasks are logically different from the tooling described above.
Thus the current in-flight layout is:
pulp.tasking contains the general tasking features we need, including the celery app itself
pulp.tasking.services has service-management code.
pulp.tasking.tasks has the base task classes, reservation tasks, cancellation, etc.
pulp.app.tasks has pulp tasks that do stuff with pulp data
I don't think there is a circular import concern as long as our model layer doesn't start importing tasks. That said, one interesting idea would be to make pulp.tasking a full django app and move the related models into it. I'm not sure what the implications would be for the REST API. But this would be the fullest extent of keeping the task system self-contained and separate from the tasks that use it. I'm not sold on it, but it's an idea.
Updated by bmbouter almost 8 years ago
To point out one quick variation on your last idea mhrivnak, pulp.tasking could become a Django app maybe named 'pulp_tasking' or something like that. pulp_app would require pulp_tasking to run and in all runtimes it would be present. This would allow those tasks to become @shared_tasks and we could then use the django+celery discovery to avoid importing pulp.tasking.tasks
from pulp.app.tasks
.
This is a subset of your idea since the proposal was to do all ^ but also move the models. Maybe they should move too (what you suggested).
Updated by bmbouter almost 8 years ago
mhrivnak: would you want to take the "in-flight" plan from comment 4, revise it based on our latest discussion and post it to the main task description?
Updated by bmbouter almost 8 years ago
- Sprint/Milestone set to 29
- Groomed changed from No to Yes
Moving onto sprint per IRC convo in #pulp-dev
Updated by mhrivnak almost 8 years ago
- Status changed from NEW to ASSIGNED
- Assignee set to mhrivnak
Added by mhrivnak almost 8 years ago
Added by mhrivnak almost 8 years ago
Revision 1ef8c359 | View on GitHub
Workers can run and execute tasks.
Thanks to ttereshc aka @goosemania who contributed some of this code.
There is a known issue that resource-reserving tasks still cannot run, but there is a fix in the works that will follow this PR.
Fixes bugs in pulp.tasking Rearranges pulp.tasking to put related code together Adds two tasks (importer and publisher delete) in pulp.app.tasks Fixes import order problems of django app vs. celery app
Updated by mhrivnak almost 8 years ago
- Status changed from ASSIGNED to MODIFIED
Added by bmbouter almost 8 years ago
Revision 3fe96dcb | View on GitHub
Fixes the Pulp tasking system
This PR introduces three fixes:
-
Updates the Celery dedicated queue name to be 'C.dq2'. This is a Celery backwards incompatible change so Pulp3 is only compatible with Celery 4.0.
-
Fixes a bug in the worker selection which was dispatching work to non-dedicated workers.
-
Resolves a traceback where a TaskStatus was being created twice during normal reserved task execution.
Added by bmbouter almost 8 years ago
Revision 3fe96dcb | View on GitHub
Fixes the Pulp tasking system
This PR introduces three fixes:
-
Updates the Celery dedicated queue name to be 'C.dq2'. This is a Celery backwards incompatible change so Pulp3 is only compatible with Celery 4.0.
-
Fixes a bug in the worker selection which was dispatching work to non-dedicated workers.
-
Resolves a traceback where a TaskStatus was being created twice during normal reserved task execution.
Updated by bmbouter almost 5 years ago
- Status changed from MODIFIED to CLOSED - CURRENTRELEASE
Workers can run and execute tasks.
Thanks to ttereshc aka @goosemania who contributed some of this code.
There is a known issue that resource-reserving tasks still cannot run, but there is a fix in the works that will follow this PR.
Fixes bugs in pulp.tasking Rearranges pulp.tasking to put related code together Adds two tasks (importer and publisher delete) in pulp.app.tasks Fixes import order problems of django app vs. celery app
https://pulp.plan.io/issues/2440
re #2440 re #2400 re #2408