Task #3076
closed
Delete pulp_celerybeat
100%
Description
Pulp celerybeat can be removed from the Pulp3 architecture entirely if we can transition its two remaining software responsibilities to the workers themselves. Specifically:
- Celerybeat looks for missing workers and calls "mark_worker_offline()" on them. That call effectively cancels any reserved tasks assigned to the worker and removes the worker's records from the database.
- Celerybeat checks that at least one resource_manager process and at least one worker process are running, and logs loudly if they aren't.
These responsibilities should move to the worker heartbeat code. All workers should effectively run this cleanup whenever they see it needs to be done, which makes it a shared responsibility across all workers.
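A rough sketch of what the shared cleanup could look like when it runs inside each worker's heartbeat. This is not the actual Pulp code: the in-memory `registry`, the `WorkerRecord` class, `handle_heartbeat()`, the `resource_manager` name prefix, and the 30-second timeout are all assumptions made for illustration, and `mark_worker_offline()` here is only a stand-in for the real call.

```python
import logging
from dataclasses import dataclass
from datetime import datetime, timedelta

log = logging.getLogger(__name__)

# Assumed timeout; the ticket only states 30 seconds for the status API
# and resource manager filters.
WORKER_TIMEOUT = timedelta(seconds=30)
# Hypothetical naming convention used to tell resource managers from workers.
RESOURCE_MANAGER_PREFIX = "resource_manager"


@dataclass
class WorkerRecord:
    name: str
    last_heartbeat: datetime


def mark_worker_offline(registry, name):
    """Stand-in for the real call: only removes the worker's record.

    The real function also cancels tasks reserved for that worker.
    """
    registry.pop(name, None)


def handle_heartbeat(registry, name, now):
    """Record this worker's heartbeat, then run the shared cleanup."""
    registry[name] = WorkerRecord(name, now)

    # Responsibility 1: find workers that stopped checking in.
    missing = [
        record.name
        for record in registry.values()
        if now - record.last_heartbeat > WORKER_TIMEOUT
    ]
    for missing_name in missing:
        log.error("Worker %s is missing; marking it offline", missing_name)
        mark_worker_offline(registry, missing_name)

    # Responsibility 2: log loudly if a required process type is absent.
    names = list(registry)
    if not any(n.startswith(RESOURCE_MANAGER_PREFIX) for n in names):
        log.error("No resource_manager process is running")
    if not any(not n.startswith(RESOURCE_MANAGER_PREFIX) for n in names):
        log.error("No worker process is running")
    return missing
```

Because every worker runs the same check, several of them may log the same message at once; the sketch makes no attempt to coordinate that.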
The scheduler.py should be deleted along with any orphaned code that was exclusively used by scheduler.py.
This will require a few other updates too:
- searching and updating the documentation to remove celerybeat references
- updating the devel environment to not deal with celerybeat or its units
- updating the galaxy playbooks to not deal with celerybeat or its units.
Also, there are two points to verify to ensure that stale records won't cause correctness problems:
- Verify the status API filters out records older than 30 seconds
- Verify the resource manager filters out workers who haven't checked in within 30 seconds
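Both checks come down to the same filter over last-heartbeat timestamps. A minimal sketch, assuming heartbeats are available as a plain dict of worker name to timestamp; `online_workers()` and `ONLINE_WINDOW` are hypothetical names, not part of Pulp:

```python
from datetime import datetime, timedelta

# The 30-second window comes from the ticket; everything else is illustrative.
ONLINE_WINDOW = timedelta(seconds=30)


def online_workers(last_heartbeats, now):
    """Return the sorted names of workers seen within the window.

    Records older than the window are filtered out, so a stale row left
    in the database is never reported or assigned work.
    """
    cutoff = now - ONLINE_WINDOW
    return sorted(
        name for name, seen in last_heartbeats.items() if seen >= cutoff
    )
```

A verification test could then insert one fresh and one stale record and confirm that only the fresh one is returned.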
Updated by dalley about 7 years ago
- Description updated (diff)
- Groomed changed from No to Yes
This looks good to me. My one question would be - is it acceptable for all workers to be logging "XYZ offline" messages individually? And if not, we should find a way to avoid flooding the logs with those messages.
Updated by ipanova@redhat.com about 7 years ago
bmbouter, once we complete this task, do we want to close all issues related to celerybeat from our backlog?
Updated by bmbouter about 7 years ago
dalley, The worst-case logging situation with this change would be that there are many workers and 0 resource managers or many resource managers and 0 workers. I think it's ok for all of them to be logging loudly then since these situations should be relatively rare. I don't think an implementation that makes distributed logging more coordinated would be worth the effort and risk of creating that implementation.
@ipanova, I think we should wait until Pulp3 is GA and then close the celerybeat bugs. I think we can close any celerybeat issues that have the Pulp3 tag once this work is done, but I don't know of any of those.
Updated by daviddavis about 7 years ago
- Status changed from NEW to ASSIGNED
- Assignee set to daviddavis
Updated by daviddavis about 7 years ago
- Status changed from ASSIGNED to POST
Added by daviddavis about 7 years ago
Updated by daviddavis about 7 years ago
- Status changed from POST to MODIFIED
- % Done changed from 0 to 100
Applied in changeset pulp|0414f3a668cda91b4f46f31f351d75de960a4034.
Updated by bmbouter almost 5 years ago
- Status changed from MODIFIED to CLOSED - CURRENTRELEASE
Removing celerybeat code
fixes #3076 https://pulp.plan.io/issues/3076