Issue #2958: Ensure that queued tasks are not lost by enabling task_reject_on_worker_lost

This is a clone of #2954 for pulp 3. 

 From #2954: 

 The resource_manager queue loses a currently running _queue_reserved_task if the resource manager is restarted with <code>sudo systemctl restart pulp_resource_manager</code>. 

 The task is removed from the queue, but its TaskStatus record still incorrectly shows it as waiting, so it will never run. 

 Note that if you run <code>sudo pkill -9 -f resource_manager</code> and then <code>sudo systemctl start pulp_resource_manager</code>, the task is not lost. 

 <pre> 
 sudo systemctl stop pulp_workers 
 pulp-admin rpm repo sync run --repo-id zoo 
 qpid-stat -q                          <<-- observe that the queue depth of the resource_manager queue is 1 
 sudo systemctl restart pulp_resource_manager 
 qpid-stat -q                          <<-- observe that the queue depth of the resource_manager queue is 0 
 pulp-admin tasks list -s waiting      <<-- observe that the task which is gone is listed as 'waiting', but it will never run because it is gone 
 </pre> 

 We need to make sure that this doesn't happen in Celery 4. There's a config setting that should prevent this: 

 http://docs.celeryproject.org/en/latest/userguide/configuration.html#task-reject-on-worker-lost
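A minimal sketch of how the setting might be enabled, assuming Celery 4's lowercase setting names. The helper name `reliability_settings` is hypothetical and not part of Pulp. Per the Celery documentation, `task_reject_on_worker_lost` only has an effect together with `task_acks_late`, so both are set here. Whether this covers the <code>systemctl restart</code> case described above still needs to be verified.

```python
def reliability_settings():
    """Return Celery settings intended to keep in-flight tasks from being lost.

    Hypothetical helper for illustration; the keys are Celery 4 setting names.
    """
    return {
        # Acknowledge a message only after the task finishes, not when it starts.
        "task_acks_late": True,
        # Requeue the message if the worker process running it exits abruptly.
        "task_reject_on_worker_lost": True,
    }


# With a Celery application object `app` (assumed to exist), this could be
# applied as: app.conf.update(reliability_settings())
```

Note that with late acknowledgement a task may be executed more than once, so the affected tasks should be safe to re-run.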
