Issue #2664

closed

Workers Canceling Tasks on Startup Fail if the Broker is Down

Added by bmbouter about 7 years ago. Updated over 3 years ago.

Status: CLOSED - WONTFIX
Priority: Normal
Assignee: -
Category: -
Sprint/Milestone: -
Start date:
Due date:
Estimated time:
Severity: 2. Medium
Version:
Platform Release:
OS:
Triaged: Yes
Groomed: Yes
Sprint Candidate: No
Tags: Pulp 2
Sprint:
Quarter:

Description

To reproduce:

Create a 'zoo' repo that syncs from the fixture repo (the command below uses the repo ID 'tester'):

pulp-admin rpm repo create --repo-id tester --feed https://repos.fedorapeople.org/repos/pulp/pulp/fixtures/rpm/

Sync 'zoo' manually once to confirm it works. Then sync it again and, just as the task begins, run:

sudo pkill -9 -f reserved_resource_worker

Then stop qpidd:

sudo systemctl stop qpidd

Then restart the pulp_workers:

sudo systemctl restart pulp_workers

Observe the traceback:

Mar 24 17:25:50 dev pulp[10046]: pulp.server.async.tasks:INFO: Cleaning up shutdown worker 'reserved_resource_worker-0@dev'.
Mar 24 17:25:50 dev celery[10046]: Traceback (most recent call last):
Mar 24 17:25:50 dev celery[10046]:   File "/usr/lib64/python2.7/multiprocessing/util.py", line 274, in _run_finalizers
Mar 24 17:25:50 dev celery[10046]:     finalizer()
Mar 24 17:25:50 dev celery[10046]:   File "/usr/lib64/python2.7/multiprocessing/util.py", line 207, in __call__
Mar 24 17:25:50 dev celery[10046]:     res = self._callback(*self._args, **self._kwargs)
Mar 24 17:25:50 dev celery[10046]:   File "/usr/lib/python2.7/site-packages/celery/worker/__init__.py", line 201, in _send_worker_shutdown
Mar 24 17:25:50 dev celery[10046]:     signals.worker_shutdown.send(sender=self)
Mar 24 17:25:50 dev celery[10046]:   File "/usr/lib/python2.7/site-packages/celery/utils/dispatch/signal.py", line 166, in send
Mar 24 17:25:50 dev celery[10046]:     response = receiver(signal=self, sender=sender, **named)
Mar 24 17:25:50 dev celery[10046]:   File "/home/vagrant/devel/pulp/server/pulp/server/async/app.py", line 194, in shutdown_worker
Mar 24 17:25:50 dev celery[10046]:     tasks._delete_worker(sender.hostname, normal_shutdown=True)
Mar 24 17:25:50 dev celery[10046]:   File "/home/vagrant/devel/pulp/server/pulp/server/async/tasks.py", line 273, in _delete_worker
Mar 24 17:25:50 dev celery[10046]:     cancel(task_status['task_id'])
Mar 24 17:25:50 dev celery[10046]:   File "/home/vagrant/devel/pulp/server/pulp/server/async/tasks.py", line 642, in cancel
Mar 24 17:25:50 dev celery[10046]:     controller.revoke(task_id, terminate=True)
Mar 24 17:25:50 dev celery[10046]:   File "/usr/lib/python2.7/site-packages/celery/app/control.py", line 172, in revoke
Mar 24 17:25:50 dev celery[10046]:     'signal': signal}, **kwargs)
Mar 24 17:25:50 dev celery[10046]:   File "/usr/lib/python2.7/site-packages/celery/app/control.py", line 316, in broadcast
Mar 24 17:25:50 dev celery[10046]:     limit, callback, channel=channel,
Mar 24 17:25:50 dev celery[10046]:   File "/usr/lib/python2.7/site-packages/kombu/pidbox.py", line 283, in _broadcast
Mar 24 17:25:50 dev celery[10046]:     chan = channel or self.connection.default_channel
Mar 24 17:25:50 dev celery[10046]:   File "/usr/lib/python2.7/site-packages/kombu/connection.py", line 756, in default_channel
Mar 24 17:25:50 dev celery[10046]:     self.connection
Mar 24 17:25:50 dev celery[10046]:   File "/usr/lib/python2.7/site-packages/kombu/connection.py", line 741, in connection
Mar 24 17:25:50 dev celery[10046]:     self._connection = self._establish_connection()
Mar 24 17:25:50 dev celery[10046]:   File "/usr/lib/python2.7/site-packages/kombu/connection.py", line 696, in _establish_connection
Mar 24 17:25:50 dev celery[10046]:     conn = self.transport.establish_connection()
Mar 24 17:25:50 dev celery[10046]:   File "/usr/lib/python2.7/site-packages/kombu/transport/qpid.py", line 1600, in establish_connection
Mar 24 17:25:50 dev celery[10046]:     conn = self.Connection(**opts)
Mar 24 17:25:50 dev celery[10046]:   File "/usr/lib/python2.7/site-packages/kombu/transport/qpid.py", line 1261, in __init__
Mar 24 17:25:50 dev celery[10046]:     self._qpid_conn = establish(**self.connection_options)
Mar 24 17:25:50 dev celery[10046]:   File "/usr/lib/python2.7/site-packages/qpid/messaging/endpoints.py", line 112, in establish
Mar 24 17:25:50 dev celery[10046]:     conn.open(timeout=timeout)
Mar 24 17:25:50 dev celery[10046]:   File "<string>", line 6, in open
Mar 24 17:25:50 dev celery[10046]:   File "/usr/lib/python2.7/site-packages/qpid/messaging/endpoints.py", line 323, in open
Mar 24 17:25:50 dev celery[10046]:     self.attach(timeout=timeout)
Mar 24 17:25:50 dev celery[10046]:   File "<string>", line 6, in attach
Mar 24 17:25:50 dev celery[10046]:   File "/usr/lib/python2.7/site-packages/qpid/messaging/endpoints.py", line 341, in attach
Mar 24 17:25:50 dev celery[10046]:     if not self._ewait(lambda: self._transport_connected and not self._unlinked(), timeout=timeout):
Mar 24 17:25:50 dev celery[10046]:   File "/usr/lib/python2.7/site-packages/qpid/messaging/endpoints.py", line 274, in _ewait
Mar 24 17:25:50 dev celery[10046]:     self.check_error()
Mar 24 17:25:50 dev celery[10046]:   File "/usr/lib/python2.7/site-packages/qpid/messaging/endpoints.py", line 267, in check_error
Mar 24 17:25:50 dev celery[10046]:     raise e
Mar 24 17:25:50 dev celery[10046]: ConnectError: [Errno 111] Connection refused

Then start qpidd again:

sudo systemctl start qpidd

Note that the pulp_workers never start, even once qpidd is back up. They emit the traceback and shut back down.
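
The traceback shows the ConnectError raised inside controller.revoke propagating up through cancel and _delete_worker with nothing catching it. As an illustration only (this issue was closed WONTFIX, and this is not Pulp's code), here is a minimal sketch of a cleanup loop that tolerates an unreachable broker. The names cancel_all, BrokerUnavailable, and the revoke callable are hypothetical stand-ins, not Pulp or Celery APIs.

```python
import logging

_logger = logging.getLogger(__name__)


class BrokerUnavailable(Exception):
    """Hypothetical stand-in for the ConnectError seen in the traceback."""


def cancel_all(task_ids, revoke, connect_errors=(BrokerUnavailable,)):
    """Try to revoke every task, and keep going if the broker is unreachable.

    `revoke` is any callable taking a task id (an assumption for this sketch).
    Returns the list of task ids whose revoke call failed, so the caller
    can decide whether to retry once the broker is back.
    """
    failed = []
    for task_id in task_ids:
        try:
            revoke(task_id)
        except connect_errors as exc:
            # Log and continue instead of letting the error abort the worker.
            _logger.warning("Could not revoke task %s: %s", task_id, exc)
            failed.append(task_id)
    return failed
```

In this sketch a failed revoke is recorded rather than raised, so the caller could retry the returned ids after the broker comes back. Whether skipping the revoke is safe in Pulp's case is not something this issue establishes.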
