Project

Profile

Help

Task #2440

Make celery processes work with pulp.tasking

Added by mhrivnak almost 3 years ago. Updated 6 months ago.

Status:
MODIFIED
Priority:
Normal
Assignee:
Category:
-
Sprint/Milestone:
Start date:
Due date:
% Done:

0%

Platform Release:
Blocks Release:
Backwards Incompatible:
No
Groomed:
Yes
Sprint Candidate:
Yes
Tags:
QA Contact:
Complexity:
Smash Test:
Verified:
No
Verification Required:
No
Sprint:
Sprint 11

Description

@tteresch has also been working on this a lot with me. If there could be two assignees, she'd be one!

Pulp 3's pulp.tasking package does not work as-is with celery worker processes. Some re-arrangement has to happen to ensure that no pulp/django code gets used until the celery app has been instantiated and done it's "fixup" of django. It is also not allowed for pulp

Additionally, the pulp.tasking package itself could have a more intuitive organization. We are taking this opportunity to separate it into three categories of code:

  • code related to individual tasks
  • code related to running and managing services like workers and celerybeat
  • the minimum code necessary to instantiate and configure the celery app

The new layout will be:

pulp.tasking will contain celery_app and celery_instance
pulp.tasking.services for code that relates to workers, beat, etc.
pulp.tasking.tasks for Task base classes and the reservation tasks
pulp.tasking.util for get_current_task_id() and cancel()
pulp.app.tasks for pulp tasks that do stuff with pulp data


Checklist

Associated revisions

Revision 1ef8c359 View on GitHub
Added by mhrivnak almost 3 years ago

Workers can run and execute tasks.

Thanks to ttereshc aka @goosemania who contributed some of this code.

There is a known issue that resource-reserving tasks still cannot run, but
there is a fix in the works that will follow this PR.

Fixes bugs in pulp.tasking
Rearranges pulp.tasking to put related code together
Adds two tasks (importer and publisher delete) in pulp.app.tasks
Fixes import order problems of django app vs. celery app

https://pulp.plan.io/issues/2440

re #2440
re #2400
re #2408

Revision 1ef8c359 View on GitHub
Added by mhrivnak almost 3 years ago

Workers can run and execute tasks.

Thanks to ttereshc aka @goosemania who contributed some of this code.

There is a known issue that resource-reserving tasks still cannot run, but
there is a fix in the works that will follow this PR.

Fixes bugs in pulp.tasking
Rearranges pulp.tasking to put related code together
Adds two tasks (importer and publisher delete) in pulp.app.tasks
Fixes import order problems of django app vs. celery app

https://pulp.plan.io/issues/2440

re #2440
re #2400
re #2408

Revision 1ef8c359 View on GitHub
Added by mhrivnak almost 3 years ago

Workers can run and execute tasks.

Thanks to ttereshc aka @goosemania who contributed some of this code.

There is a known issue that resource-reserving tasks still cannot run, but
there is a fix in the works that will follow this PR.

Fixes bugs in pulp.tasking
Rearranges pulp.tasking to put related code together
Adds two tasks (importer and publisher delete) in pulp.app.tasks
Fixes import order problems of django app vs. celery app

https://pulp.plan.io/issues/2440

re #2440
re #2400
re #2408

Revision 3fe96dcb View on GitHub
Added by bmbouter almost 3 years ago

Fixes the Pulp tasking system

This PR introduces three fixes:

- Updates the Celery dedicated queue name to be 'C.dq2'.
This is a Celery backwards incompatible change so
Pulp3 is only compatible with Celery 4.0.

- Fixes a bug in the worker selection which was dispatching
work to non-dedicated workers.

- Resolves a traceback where a TaskStatus was being created
twice during normal reserved task execution.

https://pulp.plan.io/issues/2440
re #2440

Revision 3fe96dcb View on GitHub
Added by bmbouter almost 3 years ago

Fixes the Pulp tasking system

This PR introduces three fixes:

- Updates the Celery dedicated queue name to be 'C.dq2'.
This is a Celery backwards incompatible change so
Pulp3 is only compatible with Celery 4.0.

- Fixes a bug in the worker selection which was dispatching
work to non-dedicated workers.

- Resolves a traceback where a TaskStatus was being created
twice during normal reserved task execution.

https://pulp.plan.io/issues/2440
re #2440

Revision 3fe96dcb View on GitHub
Added by bmbouter almost 3 years ago

Fixes the Pulp tasking system

This PR introduces three fixes:

- Updates the Celery dedicated queue name to be 'C.dq2'.
This is a Celery backwards incompatible change so
Pulp3 is only compatible with Celery 4.0.

- Fixes a bug in the worker selection which was dispatching
work to non-dedicated workers.

- Resolves a traceback where a TaskStatus was being created
twice during normal reserved task execution.

https://pulp.plan.io/issues/2440
re #2440

History

#1 Updated by ttereshc almost 3 years ago

Pulp 3's pulp.tasking package does not work as-is with celery worker processes. Some re-arrangement has to happen to ensure that no pulp/django code gets used until the celery app has been instantiated and done it's "fixup" of django.

I would also add that because celery's django "fixup" does django.setup() we could not use it in our code. (Because the first and easy thing one may want to do is to put django.setup into our code.)

If you think it is an implementation detail, I am ok to skip it. The idea is just to introduce more reasoning here.

#2 Updated by mhrivnak almost 3 years ago

  • Description updated (diff)

ttereshc wrote:

Pulp 3's pulp.tasking package does not work as-is with celery worker processes. Some re-arrangement has to happen to ensure that no pulp/django code gets used until the celery app has been instantiated and done it's "fixup" of django.

I would also add that because celery's django "fixup" does django.setup() we could not use it in our code. (Because the first and easy thing one may want to do is to put django.setup into our code.)

If you think it is an implementation detail, I am ok to skip it. The idea is just to introduce more reasoning here.

I think that might be too much of an implementation detail to put in the description, but it is a great point to capture in the comments for anyone interested in more context later.

#3 Updated by bmbouter almost 3 years ago

I think the current thinking is to move the task code itself moving to pulp.app.task. The task could be clarified a little bit on the new location. I imaging all tasks would go there including all three of these task types:

  • tasks user know about (descendants of PulpUserFacingTask)
  • tasks users care about directly but don't see (we may not have many of these in pulp3 but in 2.y these were things like reaper tasks. They are not descendants of PulpUserFacingTask)
  • tasks and supporting classes that they don't need to know about (_queue_reserve_resource, PulpUserFacingTask, etc)

If all ^ move to pulp.app.task Wouldn't pulp.tasking only be left with the things that are part of pulp.tasking.services? My hope is that we can just have pulp.tasking and not pulp.tasking.<anotherlayerhere>.

#4 Updated by mhrivnak almost 3 years ago

My understanding of the "current thinking" is a little different, but not a lot. :)

The base pulp tasks, like PulpTask and UserFacingTask, in addition to the reservation-related tasks, are somewhat coupled to the "services" code that manages workers, and the celery app itself. They use the celery controller, create and delete working directories, etc. Likewise, the "services" code needs the ability to cancel tasks, so it uses the task cancellation code.

More generally, I think there is a basic set of tooling that is necessary to provide the tasking features we need. That tooling includes the base tasks, the reservation logic, and some worker management code. All of that is related, and to some extent inter-dependent. I think it makes sense for it to be in one distinct place. It's also worth noting that there's nothing pulp-specific about this functionality.

Then there are actual pulp tasks that get stuff done with pulp data. They only use the above functionality. I think these tasks are logically different from the tooling described above.

Thus the current in-flight layout is:

pulp.tasking contains the general tasking features we need, including the celery app itself
pulp.tasking.services has service-management code.
pulp.tasking.tasks has the base task classes, reservation tasks, cancellation, etc.

pulp.app.tasks has pulp tasks that do stuff with pulp data

I don't think there is a circular import concern as long as our model layer doesn't start importing tasks. That said, one interesting idea would be to make pulp.tasking a full django app and move the related models into it. I'm not sure what the implications would be for the REST API. But this would be the fullest extent of keeping the task system self-contained and separate from the tasks that use it. I'm not sold on it, but it's an idea.

#5 Updated by bmbouter almost 3 years ago

To point out one quick variation on your last idea @mhrivnak, pulp.tasking could become a Django app maybe named 'pulp_tasking' or something like that. pulp_app would require pulp_tasking to run and in all runtimes it would be present. This would allow those tasks to become @shared_tasks and we could then use the django+celery discovery to avoid importing pulp.tasking.tasks from pulp.app.tasks.

This is a subset of your idea since the proposal was to do all ^ but also move the models. Maybe they should move too (what you suggested).

#6 Updated by bmbouter almost 3 years ago

@mhrivnak: would you want to take the "in-flight" plan from comment 4, revise it based on our latest discussion and post it to the main task description?

#7 Updated by mhrivnak almost 3 years ago

  • Checklist item fix all the import statements for pulp.tasking, even those in pulp 2 code added

#8 Updated by mhrivnak almost 3 years ago

  • Description updated (diff)

#9 Updated by bmbouter almost 3 years ago

I think this is ready to be groomed.

#10 Updated by bmbouter almost 3 years ago

  • Sprint/Milestone set to 29
  • Groomed changed from No to Yes

Moving onto sprint per IRC convo in #pulp-dev

#11 Updated by mhrivnak almost 3 years ago

  • Status changed from NEW to ASSIGNED
  • Assignee set to mhrivnak

#12 Updated by mhrivnak almost 3 years ago

  • Description updated (diff)

#13 Updated by mhrivnak almost 3 years ago

  • Checklist item Get workers and celerybeat running with pulp 3 code set to Done
  • Checklist item separate service vs. task code in pulp.tasking set to Done
  • Checklist item fix up systemd unit files to use new code set to Done
  • Checklist item define the home where pulp tasks should live set to Done
  • Checklist item fix all the import statements for pulp.tasking, even those in pulp 2 code set to Done

#14 Updated by mhrivnak almost 3 years ago

  • Status changed from ASSIGNED to MODIFIED

#15 Updated by bmbouter over 1 year ago

  • Sprint set to Sprint 11

#16 Updated by bmbouter over 1 year ago

  • Sprint/Milestone deleted (29)

#17 Updated by daviddavis 6 months ago

  • Sprint/Milestone set to 3.0

#18 Updated by bmbouter 6 months ago

  • Tags deleted (Pulp 3)

Please register to edit this issue

Also available in: Atom PDF