Issue #686
closed
Pulp task hangs when pulp services were restarted (upstart)
Description
Description of problem:
The orphan remove task hangs on RHEL 6 if the pulp services were restarted.
Version-Release number of selected component (if applicable):
2.5-release
How reproducible:
Always
Steps to Reproduce:
I created a script to make this easy:
$ cat all.sh
sudo service qpidd $1
sudo service pulp_celerybeat $1
sudo service pulp_resource_manager $1
sudo service pulp_workers $1
sudo service httpd $1
1. $ ./all.sh restart
2. $ pulp-admin orphan remove --all
Actual results:
Waiting to begin... (hangs here)
Expected results:
Task Succeeded
Additional info:
The services need to be stopped and then started, rather than restarted. From the hung state:
1. $ ./all.sh stop
2. $ ./all.sh start
3. $ pulp-admin orphan remove --all
Task Succeeded
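The stop-then-start workaround above could be wrapped in a single helper. This is only a sketch: the service list is taken from all.sh, while the `RUNNER` variable and the `stop_then_start` function name are made up for illustration. `RUNNER` defaults to `echo` so the script just prints the commands it would run; setting `RUNNER=sudo` would actually invoke them.

```shell
#!/bin/sh
# Hypothetical helper: fully stop every service, then start them all,
# instead of issuing "restart" to each one in turn.

# Same services, in the same order, as all.sh.
SERVICES="qpidd pulp_celerybeat pulp_resource_manager pulp_workers httpd"

# Assumption for illustration: RUNNER is the command prefix.
# "echo" gives a dry run; "sudo" would run the real service commands.
RUNNER="${RUNNER:-echo}"

stop_then_start() {
    # First pass: stop everything.
    for svc in $SERVICES; do
        $RUNNER service "$svc" stop
    done
    # Second pass: start everything from a clean state.
    for svc in $SERVICES; do
        $RUNNER service "$svc" start
    done
}

stop_then_start
```

Run as-is it prints the ten `service ... stop` / `service ... start` commands; `RUNNER=sudo ./stop_then_start.sh` (file name also hypothetical) would execute them.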
+ This bug was cloned from Bugzilla Bug #1188755 +
Updated by amacdona@redhat.com about 8 years ago
I was able to reproduce this on 2.6-testing also.
+ This comment was cloned from Bugzilla #1188755 comment 1 +
Updated by cduryee about 8 years ago
This bz caused me a lot of confusion. Can this be put on 2.6.0?
+ This comment was cloned from Bugzilla #1188755 comment 2 +
Added by rbarlow about 8 years ago
Upstart use stop/start instead of restart_workers.
This commit updates our Upstart init scripts to call stop_workers and then start_workers, instead of restart_workers(). There seems to be an issue with Celery multi, restarting, and Pulp. This commit works around that issue by fully stopping the workers instead of issuing a restart to them.
https://pulp.plan.io/issues/686
closes #686
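For readers unfamiliar with the init scripts, the shape of the change described in the commit message might look roughly like the sketch below. The real Upstart scripts are not shown in this issue, so `handle_action` is a hypothetical dispatcher and the `stop_workers` / `start_workers` bodies are echo-only stand-ins for the real helpers named in the commit.

```shell
#!/bin/sh
# Stand-ins for the init script's real helpers (assumption: the real
# functions drive "celery multi"; here they only print what they would do).
stop_workers()  { echo "stopping workers"; }
start_workers() { echo "starting workers"; }

# Hypothetical dispatcher illustrating the workaround.
handle_action() {
    case "$1" in
        start) start_workers ;;
        stop)  stop_workers ;;
        restart)
            # Workaround: a full stop followed by a fresh start,
            # instead of calling restart_workers.
            stop_workers
            start_workers
            ;;
        *)
            echo "Usage: $0 {start|stop|restart}" >&2
            return 1
            ;;
    esac
}

handle_action restart
```

The point of the sketch is only the `restart` branch: the workers are brought all the way down before new ones are started, which matches the manual stop/start sequence that made the task succeed.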
Updated by rbarlow about 8 years ago
- Status changed from ASSIGNED to POST
- Tags Easy Fix added
Updated by rbarlow about 8 years ago
- Status changed from POST to MODIFIED
- % Done changed from 0 to 100
Applied in changeset pulp|1fb36484f37d68afb578f95d3fd726c12c00d769.
Updated by pthomas@redhat.com almost 8 years ago
- Status changed from 5 to 6
verified
[root@cloud-qe-12 ~]# rpm -qa pulp-server
pulp-server-2.6.1-0.2.beta.el6.noarch
[root@cloud-qe-12 ~]#
[root@cloud-qe-12 ~]#
[root@cloud-qe-12 ~]# ./all.sh restart
Stopping Qpid AMQP daemon: [ OK ]
Starting Qpid AMQP daemon: [ OK ]
celery init v10.0.
Using configuration: /etc/default/pulp_workers, /etc/default/pulp_celerybeat
Restarting celery periodic task scheduler
Stopping pulp_celerybeat... OK
Starting pulp_celerybeat...
celery init v10.0.
Using config script: /etc/default/pulp_resource_manager
celery multi v3.1.11 (Cipater)
> Stopping nodes...
> resource_manager@cloud-qe-12.idmqe.lab.eng.bos.redhat.com: QUIT -> 18130
> Waiting for 1 node -> 18130.....
> resource_manager@cloud-qe-12.idmqe.lab.eng.bos.redhat.com: OK
celery multi v3.1.11 (Cipater)
> Starting nodes...
> resource_manager@cloud-qe-12.idmqe.lab.eng.bos.redhat.com: OK
celery init v10.0.
Using config script: /etc/default/pulp_workers
celery multi v3.1.11 (Cipater)
> Stopping nodes...
> reserved_resource_worker-2@cloud-qe-12.idmqe.lab.eng.bos.redhat.com: QUIT -> 18344
> reserved_resource_worker-1@cloud-qe-12.idmqe.lab.eng.bos.redhat.com: QUIT -> 18313
> reserved_resource_worker-0@cloud-qe-12.idmqe.lab.eng.bos.redhat.com: QUIT -> 18284
> reserved_resource_worker-3@cloud-qe-12.idmqe.lab.eng.bos.redhat.com: QUIT -> 18377
> Waiting for 4 nodes -> 18344, 18313, 18284, 18377........
> reserved_resource_worker-2@cloud-qe-12.idmqe.lab.eng.bos.redhat.com: OK
> Waiting for 3 nodes -> 18313, 18284, 18377....
> reserved_resource_worker-1@cloud-qe-12.idmqe.lab.eng.bos.redhat.com: OK
> Waiting for 2 nodes -> 18284, 18377....
> reserved_resource_worker-0@cloud-qe-12.idmqe.lab.eng.bos.redhat.com: OK
> Waiting for 1 node -> 18377....
> reserved_resource_worker-3@cloud-qe-12.idmqe.lab.eng.bos.redhat.com: OK
celery multi v3.1.11 (Cipater)
> Starting nodes...
> reserved_resource_worker-0@cloud-qe-12.idmqe.lab.eng.bos.redhat.com: OK
> reserved_resource_worker-1@cloud-qe-12.idmqe.lab.eng.bos.redhat.com: OK
> reserved_resource_worker-2@cloud-qe-12.idmqe.lab.eng.bos.redhat.com: OK
> reserved_resource_worker-3@cloud-qe-12.idmqe.lab.eng.bos.redhat.com: OK
Stopping httpd: [ OK ]
Starting httpd: [ OK ]
[root@cloud-qe-12 ~]# pulp-admin orphan remove --all
This command may be exited via ctrl+c without affecting the request.
[-]
Running...
Task Succeeded
Updated by dkliban@redhat.com almost 8 years ago
- Status changed from 6 to CLOSED - CURRENTRELEASE