Task #3076

Delete pulp_celerybeat

Added by bmbouter almost 2 years ago. Updated 6 months ago.

Status:
MODIFIED
Priority:
Normal
Assignee:
Category:
-
Sprint/Milestone:
Start date:
Due date:
% Done:

100%

Platform Release:
Blocks Release:
Backwards Incompatible:
No
Groomed:
Yes
Sprint Candidate:
Yes
Tags:
QA Contact:
Complexity:
Smash Test:
Verified:
No
Verification Required:
No
Sprint:
Sprint 28

Description

Pulp celerybeat can be removed from the Pulp3 architecture entirely if we can transition its two remaining software responsibilities to the workers themselves. Specifically:

  • Celerybeat looks for missing workers and calls "mark_worker_offline()" on each one. That call cancels any reserved tasks assigned to the worker and removes the worker's records from the database.
  • Celerybeat checks that at least one resource_manager process and at least one worker process are running, and logs loudly if either is missing.

These responsibilities should move to the worker heartbeat code. All workers should run this cleanup whenever they see it needs to be done, which makes it a shared responsibility across all workers.
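A minimal sketch of what a heartbeat that also performs the cleanup could look like. It is not the Pulp implementation: the in-memory dict stands in for the database, and WorkerRecord, WORKER_TTL, RESOURCE_MANAGER_PREFIX and heartbeat() are hypothetical names chosen for illustration. Only mark_worker_offline() is named in this issue, and its signature here is an assumption.

```python
import logging
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone

log = logging.getLogger(__name__)

# Assumption: a worker is considered missing after 30 seconds of silence.
WORKER_TTL = timedelta(seconds=30)
# Assumption: resource manager processes are identified by a name prefix.
RESOURCE_MANAGER_PREFIX = "resource_manager"


@dataclass
class WorkerRecord:
    """Hypothetical stand-in for a worker's database record."""
    name: str
    last_heartbeat: datetime
    reserved_tasks: list = field(default_factory=list)


def mark_worker_offline(workers, name):
    """Cancel the worker's reserved tasks and remove its record.

    Safe to call from several workers at once: if another worker already
    removed the record, this is a no-op.
    """
    record = workers.pop(name, None)
    if record is None:
        return []
    canceled = list(record.reserved_tasks)
    record.reserved_tasks.clear()
    log.error("Worker %s is offline; canceled %d reserved task(s)",
              name, len(canceled))
    return canceled


def heartbeat(workers, my_name, now=None):
    """Record this worker's heartbeat, then run the shared cleanup."""
    now = now or datetime.now(timezone.utc)

    # 1. Record our own heartbeat.
    if my_name in workers:
        workers[my_name].last_heartbeat = now
    else:
        workers[my_name] = WorkerRecord(my_name, now)

    # 2. Find workers that stopped checking in and mark them offline.
    missing = [name for name, record in workers.items()
               if now - record.last_heartbeat > WORKER_TTL]
    for name in missing:
        mark_worker_offline(workers, name)

    # 3. Log loudly if no resource manager or no regular worker remains.
    managers = [n for n in workers if n.startswith(RESOURCE_MANAGER_PREFIX)]
    regular = [n for n in workers if not n.startswith(RESOURCE_MANAGER_PREFIX)]
    if not managers:
        log.error("No resource_manager process is running")
    if not regular:
        log.error("No worker process is running")
    return missing
```

Because every worker runs the same check, the cleanup has to tolerate another worker having already removed a record, which is why mark_worker_offline() is written as a no-op in that case.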

The scheduler.py should be deleted along with any orphaned code that was exclusively used by scheduler.py.

This will require a few other updates too:
  • searching and updating the documentation to remove celerybeat references
  • updating the devel environment to not deal with celerybeat or its units
  • updating the galaxy playbooks to not deal with celerybeat or its units.

Also, there are two points to verify so that old records won't cause correctness problems:
  • Verify the status API filters out records older than 30 seconds
  • Verify the resource manager filters out workers who haven't checked in within 30 seconds
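The two checks above amount to the same filter: ignore any worker record whose last heartbeat is older than 30 seconds. A rough sketch of that filter follows, assuming heartbeats are available as a mapping of worker name to timestamp; online_workers() and HEARTBEAT_WINDOW are hypothetical names, not Pulp APIs.

```python
from datetime import datetime, timedelta, timezone

# The 30-second window comes from the two verification points above.
HEARTBEAT_WINDOW = timedelta(seconds=30)


def online_workers(last_heartbeats, now=None):
    """Return the names of workers seen within the heartbeat window.

    `last_heartbeats` maps a worker name to its last check-in time
    (an assumed shape for illustration only).
    """
    now = now or datetime.now(timezone.utc)
    return sorted(name for name, seen in last_heartbeats.items()
                  if now - seen <= HEARTBEAT_WINDOW)
```

If both the status API and the resource manager apply a filter like this when reading worker records, a stale record left behind by a crashed worker is never reported as online or handed work, even before any worker gets around to cleaning it up.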

History

#1 Updated by dalley almost 2 years ago

  • Description updated (diff)
  • Groomed changed from No to Yes

This looks good to me. My one question would be - is it acceptable for all workers to be logging "XYZ offline" messages individually? And if not, we should find a way to avoid flooding the logs with those messages.

#2 Updated by ipanova@redhat.com almost 2 years ago

@bmbouter once we complete this task, do we want to close all issues related to celerybeat from our backlog?

#3 Updated by bmbouter almost 2 years ago

@dalley, The worst-case logging situation with this change would be that there are many workers and 0 resource managers or many resource managers and 0 workers. I think it's ok for all of them to be logging loudly then since these situations should be relatively rare. I don't think an implementation that makes distributed logging more coordinated would be worth the effort and risk of creating that implementation.

@ipanova, I think we should wait until Pulp3 is GA and then close the celerybeat bugs. I think we can close any celerybeat issues that have the Pulp3 tag once this work is done, but I don't know of any of those.

#4 Updated by jortel@redhat.com almost 2 years ago

  • Sprint/Milestone set to 46

#5 Updated by daviddavis almost 2 years ago

  • Status changed from NEW to ASSIGNED
  • Assignee set to daviddavis

#6 Updated by mhrivnak almost 2 years ago

  • Sprint/Milestone changed from 46 to 47

#8 Updated by daviddavis almost 2 years ago

  • Status changed from POST to MODIFIED
  • % Done changed from 0 to 100

#9 Updated by bmbouter over 1 year ago

  • Sprint set to Sprint 28

#10 Updated by bmbouter over 1 year ago

  • Sprint/Milestone deleted (47)

#11 Updated by daviddavis 6 months ago

  • Sprint/Milestone set to 3.0

#12 Updated by bmbouter 6 months ago

  • Tags deleted (Pulp 3)
