Story #1060
closed
As a user I can run Pulp with multiple celeryBeat instances
Status:
CLOSED - CURRENTRELEASE
Description
pulp_celerybeat can currently run as only one process per deployment, which makes it one of the single points of failure in Pulp. Multiple instances of pulp_celerybeat should be able to run concurrently.
Deliverables:
1) Modify Pulp to be able to run multiple pulp_celerybeat processes concurrently
2) Ensure that Pulp scheduled calls occur only once even with multiple pulp_celerybeat processes running
3) Ensure that scheduled calls still dispatch only once even if pulp_celerybeat processes start being killed
4) Document changes
5) Add a release note
6) Verify that the failure watcher code is safe to run concurrently
7) Verify that the EventMonitor code is safe to run concurrently
8) Verify that the WorkerTimeoutMonitor code is safe to run concurrently
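One way deliverables 2 and 3 could be approached is a shared lock with an expiring heartbeat: only the pulp_celerybeat process holding the lock dispatches scheduled calls, and a standby takes over once the holder's heartbeat goes stale. The following is a minimal sketch under that assumption, not Pulp's actual implementation. The names `BeatLock`, `try_acquire`, and `tick` are hypothetical, and the in-memory object stands in for a record in a shared store, where acquisition would have to be an atomic compare-and-set.

```python
class BeatLock:
    """Hypothetical in-memory stand-in for a lock record in a shared store."""

    def __init__(self, timeout_seconds):
        self.timeout_seconds = timeout_seconds
        self.owner = None
        self.last_heartbeat = None

    def try_acquire(self, name, now):
        """Return True if `name` holds the lock after this call.

        In a real deployment this check-and-update must be a single atomic
        operation on the shared store; here it is plain Python for clarity.
        """
        expired = (
            self.last_heartbeat is not None
            and now - self.last_heartbeat > self.timeout_seconds
        )
        if self.owner is None or self.owner == name or expired:
            self.owner = name
            self.last_heartbeat = now  # refresh the heartbeat
            return True
        return False


def tick(lock, name, now, dispatched):
    """One scheduler iteration: dispatch only while holding the lock."""
    if lock.try_acquire(name, now):
        dispatched.append((now, name))
        return True
    return False


if __name__ == "__main__":
    lock = BeatLock(timeout_seconds=10)
    dispatched = []
    tick(lock, "beat-1", 0, dispatched)   # beat-1 acquires and dispatches
    tick(lock, "beat-2", 0, dispatched)   # beat-2 stands by
    # beat-1 is killed here and stops refreshing its heartbeat
    tick(lock, "beat-2", 5, dispatched)   # heartbeat still fresh, stand by
    tick(lock, "beat-2", 11, dispatched)  # heartbeat stale, beat-2 takes over
    print(dispatched)
```

The timeout in this sketch is the trade-off to note: a shorter value gives faster failover after a process is killed, while a longer value reduces the risk of two processes dispatching at once when the holder is merely slow.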
sbhawsin wrote:
Celerybeat is one of the single points of failure in Pulp. We should be able to make Pulp highly available by making celerybeat highly available.
Deliverables:
1) Able to run multiple celerybeat processes
2) Killing one of the processes should not cause Pulp to stop working
Design Doc: https://fedorahosted.org/pulp/wiki/PulpHACelerybeat
- Subject changed from To be able to run pulp with multiple celeryBeat instances and make it HA to As a user I can run Pulp with multiple celeryBeat instances
- Description updated (diff)
- Description updated (diff)
- Description updated (diff)
- Description updated (diff)
- Groomed changed from No to Yes
- Sprint Candidate changed from No to Yes
- Status changed from ASSIGNED to POST
- Blocks Issue #1113: If an instance of pulp_celerybeat dies unexpectedly, Pulp incorrectly tries to "cancel all tasks in its queue" added
- Status changed from POST to MODIFIED
- % Done changed from 0 to 100
QE, you should use a clustered setup to run pulp_celerybeat on two machines. Then configure some scheduled operations and ensure that the expected number of operations occurs. If you kill -9 one of the pulp_celerybeat processes and leave the other running, everything should still work, including manual sync+publish operations and scheduled operations. Note: after a pulp_celerybeat is killed, a sync may take up to 270 seconds to fail over, but it is still expected to recover.
- Platform Release changed from master to 2.8.0
- Status changed from MODIFIED to 5
- Status changed from 5 to CLOSED - CURRENTRELEASE
Story 1060: Pulp with multiple celerybeat instances
https://pulp.plan.io/issues/1060
closes #1060