Issue #1838

Updated by bmbouter about 6 years ago

I now have 3 tasks that are stuck in "Waiting". 

 We run 2 hosts as an HA cluster, with corosync providing the heartbeat. Celery runs on both hosts, so both should process tasks. The resource manager runs on only one host, and is moved to the other if corosync determines that the primary is dead. 

 Here is some debug output: 

 <pre> 
 2016-04-12 09:22:45,763 - DEBUG - sending GET request to /pulp/api/v2/tasks/622041ac-e9e4-4a15-bd7c-7c98a17782e0/ 
 2016-04-12 09:22:46,023 - INFO - GET request to /pulp/api/v2/tasks/622041ac-e9e4-4a15-bd7c-7c98a17782e0/ with parameters None 
 2016-04-12 09:22:46,023 - INFO - Response status : 200 
 2016-04-12 09:22:46,023 - INFO - Response body : 
  { 
   "exception": null,  
   "task_type": "pulp.server.managers.repo.publish.publish",  
   "_href": "/pulp/api/v2/tasks/622041ac-e9e4-4a15-bd7c-7c98a17782e0/",  
   "task_id": "622041ac-e9e4-4a15-bd7c-7c98a17782e0",  
   "tags": [ 
     "pulp:repository:thirdparty-snapshot-rpm-latest",  
     "pulp:action:publish" 
   ],  
   "finish_time": null,  
   "_ns": "task_status",  
   "start_time": null,  
   "traceback": null,  
   "spawned_tasks": [],  
   "progress_report": {},  
   "queue": "None.dq",  
   "state": "waiting",  
   "worker_name": null,  
   "result": null,  
   "error": null,  
   "_id": { 
     "$oid": "5705bd46cbdef6e14906bf98" 
   },  
   "id": "5705bd46cbdef6e14906bf98" 
 } 

 Operations:         publish 
 Resources:          thirdparty-snapshot-rpm-latest (repository) 
 State:              Waiting 
 Start Time:         Unstarted 
 Finish Time:        Incomplete 
 Result:             Incomplete 
 Task Id:            622041ac-e9e4-4a15-bd7c-7c98a17782e0 
 Progress Report:   
 </pre> 
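
 For anyone triaging similar reports, the fields that stand out in the response above are `state`, `start_time` and `worker_name`. The following is a minimal sketch, not part of Pulp, that flags a task status document with this pattern. The helper name `looks_undispatched` is hypothetical, and reading a null `worker_name` as "no worker was ever assigned" is an assumption based only on the output shown here.

```python
import json


def looks_undispatched(task):
    """Return True if a task status document is waiting and has no
    start time and no worker recorded (hypothetical triage helper)."""
    return (
        task.get("state") == "waiting"
        and task.get("start_time") is None
        and task.get("worker_name") is None
    )


# Trimmed copy of the response body from the debug output above.
BODY = """
{
  "task_id": "622041ac-e9e4-4a15-bd7c-7c98a17782e0",
  "queue": "None.dq",
  "state": "waiting",
  "start_time": null,
  "worker_name": null
}
"""

task = json.loads(BODY)
print(task["task_id"], looks_undispatched(task))
```

 The same check could be run against each of the 3 stuck tasks to confirm that they all share this pattern.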

 Output of ps afuxw | grep celery: 

 On host1: 

 <pre> 
 root        2921    0.0    0.0 112640     960 pts/2      S+     09:31     0:00    |                         \_ grep --color=auto celery 
 apache     21996    0.1    0.0 519060 62080 ?          Ssl    Apr06    10:43 /usr/bin/python /usr/bin/celery worker -n reserved_resource_worker-0@%h -A pulp.server.async.app -c 1 --events --umask 18 --pidfile=/var/run/pulp/reserved_resource_worker-0.pid --heartbeat-interval=30 
 apache     22119    2.6    0.1 654736 193452 ?         Rl     Apr06 220:36    \_ /usr/bin/python /usr/bin/celery worker -n reserved_resource_worker-0@%h -A pulp.server.async.app -c 1 --events --umask 18 --pidfile=/var/run/pulp/reserved_resource_worker-0.pid --heartbeat-interval=30 
 apache     21998    0.1    0.0 518364 61656 ?          Ssl    Apr06    10:12 /usr/bin/python /usr/bin/celery worker -n reserved_resource_worker-1@%h -A pulp.server.async.app -c 1 --events --umask 18 --pidfile=/var/run/pulp/reserved_resource_worker-1.pid --heartbeat-interval=30 
 apache     22124    0.3    0.0 544160 80196 ?          Sl     Apr06    25:32    \_ /usr/bin/python /usr/bin/celery worker -n reserved_resource_worker-1@%h -A pulp.server.async.app -c 1 --events --umask 18 --pidfile=/var/run/pulp/reserved_resource_worker-1.pid --heartbeat-interval=30 
 apache     22000    0.1    0.0 519052 61984 ?          Ssl    Apr06    10:56 /usr/bin/python /usr/bin/celery worker -n reserved_resource_worker-2@%h -A pulp.server.async.app -c 1 --events --umask 18 --pidfile=/var/run/pulp/reserved_resource_worker-2.pid --heartbeat-interval=30 
 apache     22129    2.3    0.2 669752 208464 ?         Dl     Apr06 198:42    \_ /usr/bin/python /usr/bin/celery worker -n reserved_resource_worker-2@%h -A pulp.server.async.app -c 1 --events --umask 18 --pidfile=/var/run/pulp/reserved_resource_worker-2.pid --heartbeat-interval=30 
 apache     22002    0.1    0.0 518980 62028 ?          Ssl    Apr06    10:50 /usr/bin/python /usr/bin/celery worker -n reserved_resource_worker-3@%h -A pulp.server.async.app -c 1 --events --umask 18 --pidfile=/var/run/pulp/reserved_resource_worker-3.pid --heartbeat-interval=30 
 apache     22126    2.5    0.4 867344 405440 ?         Dl     Apr06 217:02    \_ /usr/bin/python /usr/bin/celery worker -n reserved_resource_worker-3@%h -A pulp.server.async.app -c 1 --events --umask 18 --pidfile=/var/run/pulp/reserved_resource_worker-3.pid --heartbeat-interval=30 
 apache     22004    0.1    0.0 518972 62176 ?          Ssl    Apr06    10:41 /usr/bin/python /usr/bin/celery worker -n reserved_resource_worker-4@%h -A pulp.server.async.app -c 1 --events --umask 18 --pidfile=/var/run/pulp/reserved_resource_worker-4.pid --heartbeat-interval=30 
 apache     22128    2.3    0.2 681192 219840 ?         Dl     Apr06 196:41    \_ /usr/bin/python /usr/bin/celery worker -n reserved_resource_worker-4@%h -A pulp.server.async.app -c 1 --events --umask 18 --pidfile=/var/run/pulp/reserved_resource_worker-4.pid --heartbeat-interval=30 
 apache     22006    0.1    0.0 518500 61580 ?          Ssl    Apr06    10:17 /usr/bin/python /usr/bin/celery worker -n reserved_resource_worker-5@%h -A pulp.server.async.app -c 1 --events --umask 18 --pidfile=/var/run/pulp/reserved_resource_worker-5.pid --heartbeat-interval=30 
 apache     22132    0.0    0.0 518960 54696 ?          Sl     Apr06     7:16    \_ /usr/bin/python /usr/bin/celery worker -n reserved_resource_worker-5@%h -A pulp.server.async.app -c 1 --events --umask 18 --pidfile=/var/run/pulp/reserved_resource_worker-5.pid --heartbeat-interval=30 
 apache     22008    0.1    0.0 518364 61624 ?          Ssl    Apr06    10:20 /usr/bin/python /usr/bin/celery worker -n reserved_resource_worker-6@%h -A pulp.server.async.app -c 1 --events --umask 18 --pidfile=/var/run/pulp/reserved_resource_worker-6.pid --heartbeat-interval=30 
 apache     22120    0.3    0.0 519700 57868 ?          Dl     Apr06    31:11    \_ /usr/bin/python /usr/bin/celery worker -n reserved_resource_worker-6@%h -A pulp.server.async.app -c 1 --events --umask 18 --pidfile=/var/run/pulp/reserved_resource_worker-6.pid --heartbeat-interval=30 
 apache     22010    0.1    0.0 518700 61616 ?          Ssl    Apr06    10:24 /usr/bin/python /usr/bin/celery worker -n reserved_resource_worker-7@%h -A pulp.server.async.app -c 1 --events --umask 18 --pidfile=/var/run/pulp/reserved_resource_worker-7.pid --heartbeat-interval=30 
 apache     22121    1.6    0.2 671912 210604 ?         Rl     Apr06 138:42    \_ /usr/bin/python /usr/bin/celery worker -n reserved_resource_worker-7@%h -A pulp.server.async.app -c 1 --events --umask 18 --pidfile=/var/run/pulp/reserved_resource_worker-7.pid --heartbeat-interval=30 
 apache     21270    0.3    0.0 487004 27936 ?          Ssl    Apr11     2:41 /usr/bin/python /usr/bin/celery beat --app=pulp.server.async.celery_instance.celery --scheduler=pulp.server.async.scheduler.Scheduler 
 apache     17185    0.5    0.0 522104 65144 ?          Ssl    08:59     0:10 /usr/bin/python /usr/bin/celery worker -A pulp.server.async.app -n resource_manager@%h -Q resource_manager -c 1 --events --umask 18 --pidfile=/var/run/pulp/resource_manager.pid --heartbeat-interval=30 
 apache     17289    5.9    0.0 518356 54268 ?          Sl     08:59     1:55    \_ /usr/bin/python /usr/bin/celery worker -A pulp.server.async.app -n resource_manager@%h -Q resource_manager -c 1 --events --umask 18 --pidfile=/var/run/pulp/resource_manager.pid --heartbeat-interval=30 
 </pre> 

 On host2: 

 <pre> 
 root        4431    0.0    0.0 112640     960 pts/0      S+     09:32     0:00    |                         \_ grep --color=auto celery 
 apache     14669    0.1    0.0 520664 63784 ?          Ssl    Apr06    12:17 /usr/bin/python /usr/bin/celery worker -n reserved_resource_worker-0@%h -A pulp.server.async.app -c 1 --events --umask 18 --pidfile=/var/run/pulp/reserved_resource_worker-0.pid --heartbeat-interval=30 
 apache     15042    1.9    0.1 652572 190552 ?         Dl     Apr06 166:59    \_ /usr/bin/python /usr/bin/celery worker -n reserved_resource_worker-0@%h -A pulp.server.async.app -c 1 --events --umask 18 --pidfile=/var/run/pulp/reserved_resource_worker-0.pid --heartbeat-interval=30 
 apache     14671    0.1    0.0 520672 63668 ?          Ssl    Apr06    12:24 /usr/bin/python /usr/bin/celery worker -n reserved_resource_worker-1@%h -A pulp.server.async.app -c 1 --events --umask 18 --pidfile=/var/run/pulp/reserved_resource_worker-1.pid --heartbeat-interval=30 
 apache     15046    2.4    0.1 618272 153048 ?         Sl     Apr06 205:57    \_ /usr/bin/python /usr/bin/celery worker -n reserved_resource_worker-1@%h -A pulp.server.async.app -c 1 --events --umask 18 --pidfile=/var/run/pulp/reserved_resource_worker-1.pid --heartbeat-interval=30 
 apache     14674    0.1    0.0 520168 63324 ?          Ssl    Apr06    12:07 /usr/bin/python /usr/bin/celery worker -n reserved_resource_worker-2@%h -A pulp.server.async.app -c 1 --events --umask 18 --pidfile=/var/run/pulp/reserved_resource_worker-2.pid --heartbeat-interval=30 
 apache     15044    2.7    0.1 645860 184516 ?         Rl     Apr06 234:59    \_ /usr/bin/python /usr/bin/celery worker -n reserved_resource_worker-2@%h -A pulp.server.async.app -c 1 --events --umask 18 --pidfile=/var/run/pulp/reserved_resource_worker-2.pid --heartbeat-interval=30 
 apache     14676    0.1    0.0 520672 63816 ?          Ssl    Apr06    12:12 /usr/bin/python /usr/bin/celery worker -n reserved_resource_worker-3@%h -A pulp.server.async.app -c 1 --events --umask 18 --pidfile=/var/run/pulp/reserved_resource_worker-3.pid --heartbeat-interval=30 
 apache     15048    2.7    0.2 665080 203128 ?         Dl     Apr06 230:19    \_ /usr/bin/python /usr/bin/celery worker -n reserved_resource_worker-3@%h -A pulp.server.async.app -c 1 --events --umask 18 --pidfile=/var/run/pulp/reserved_resource_worker-3.pid --heartbeat-interval=30 
 apache     14678    0.1    0.0 520664 63724 ?          Ssl    Apr06    12:18 /usr/bin/python /usr/bin/celery worker -n reserved_resource_worker-4@%h -A pulp.server.async.app -c 1 --events --umask 18 --pidfile=/var/run/pulp/reserved_resource_worker-4.pid --heartbeat-interval=30 
 apache     15045    2.3    0.2 680920 219648 ?         Rl     Apr06 201:53    \_ /usr/bin/python /usr/bin/celery worker -n reserved_resource_worker-4@%h -A pulp.server.async.app -c 1 --events --umask 18 --pidfile=/var/run/pulp/reserved_resource_worker-4.pid --heartbeat-interval=30 
 apache     14681    0.1    0.0 520680 63792 ?          Ssl    Apr06    12:07 /usr/bin/python /usr/bin/celery worker -n reserved_resource_worker-5@%h -A pulp.server.async.app -c 1 --events --umask 18 --pidfile=/var/run/pulp/reserved_resource_worker-5.pid --heartbeat-interval=30 
 apache     15041    2.6    0.2 666260 204232 ?         Dl     Apr06 223:23    \_ /usr/bin/python /usr/bin/celery worker -n reserved_resource_worker-5@%h -A pulp.server.async.app -c 1 --events --umask 18 --pidfile=/var/run/pulp/reserved_resource_worker-5.pid --heartbeat-interval=30 
 apache     14684    0.1    0.0 520168 63304 ?          Ssl    Apr06    11:44 /usr/bin/python /usr/bin/celery worker -n reserved_resource_worker-6@%h -A pulp.server.async.app -c 1 --events --umask 18 --pidfile=/var/run/pulp/reserved_resource_worker-6.pid --heartbeat-interval=30 
 apache     15043    0.1    0.0 534632 71388 ?          Sl     Apr06    13:16    \_ /usr/bin/python /usr/bin/celery worker -n reserved_resource_worker-6@%h -A pulp.server.async.app -c 1 --events --umask 18 --pidfile=/var/run/pulp/reserved_resource_worker-6.pid --heartbeat-interval=30 
 apache     14693    0.1    0.0 520940 64036 ?          Ssl    Apr06    13:41 /usr/bin/python /usr/bin/celery worker -n reserved_resource_worker-7@%h -A pulp.server.async.app -c 1 --events --umask 18 --pidfile=/var/run/pulp/reserved_resource_worker-7.pid --heartbeat-interval=30 
 apache     15047    2.8    0.2 667648 205668 ?         Rl     Apr06 240:37    \_ /usr/bin/python /usr/bin/celery worker -n reserved_resource_worker-7@%h -A pulp.server.async.app -c 1 --events --umask 18 --pidfile=/var/run/pulp/reserved_resource_worker-7.pid --heartbeat-interval=30 
 apache      1909    0.4    0.0 521980 64864 ?          Ssl    08:57     0:09 /usr/bin/python /usr/bin/celery worker -A pulp.server.async.app -n resource_manager@%h -Q resource_manager -c 1 --events --umask 18 --pidfile=/var/run/pulp/resource_manager.pid --heartbeat-interval=30 
 apache      2020    5.4    0.0 518348 54256 ?          Sl     08:57     1:51    \_ /usr/bin/python /usr/bin/celery worker -A pulp.server.async.app -n resource_manager@%h -Q resource_manager -c 1 --events --umask 18 --pidfile=/var/run/pulp/resource_manager.pid --heartbeat-interval=30 
 </pre>
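
 In the two listings above, a `resource_manager` worker appears on both hosts (started 08:59 on host1 and 08:57 on host2), although the setup described earlier expects it on only one. Whether that is related to the stuck tasks is not established here. The sketch below is one way to check this from `ps` output; the function name `count_parent_workers` is hypothetical, and the sample lines are trimmed excerpts of the listings above with most columns and flags dropped.

```python
def count_parent_workers(ps_output, worker_name):
    """Count top-level celery workers whose -n name is worker_name@<host>."""
    count = 0
    for line in ps_output.splitlines():
        if "celery worker" not in line:
            continue  # skips the grep line and celery beat
        if r"\_" in line:
            continue  # forked child shown in the ps process tree
        if "-n " + worker_name + "@" in line:
            count += 1
    return count


# Trimmed excerpts of the ps listings above (assumption: same layout).
HOST1 = r"""
root    2921  S+   09:31  |   \_ grep --color=auto celery
apache 21996  Ssl  Apr06  /usr/bin/python /usr/bin/celery worker -n reserved_resource_worker-0@%h -A pulp.server.async.app -c 1
apache 22119  Rl   Apr06   \_ /usr/bin/python /usr/bin/celery worker -n reserved_resource_worker-0@%h -A pulp.server.async.app -c 1
apache 17185  Ssl  08:59  /usr/bin/python /usr/bin/celery worker -A pulp.server.async.app -n resource_manager@%h -Q resource_manager -c 1
apache 17289  Sl   08:59   \_ /usr/bin/python /usr/bin/celery worker -A pulp.server.async.app -n resource_manager@%h -Q resource_manager -c 1
"""

HOST2 = r"""
root    4431  S+   09:32  |   \_ grep --color=auto celery
apache  1909  Ssl  08:57  /usr/bin/python /usr/bin/celery worker -A pulp.server.async.app -n resource_manager@%h -Q resource_manager -c 1
apache  2020  Sl   08:57   \_ /usr/bin/python /usr/bin/celery worker -A pulp.server.async.app -n resource_manager@%h -Q resource_manager -c 1
"""

n1 = count_parent_workers(HOST1, "resource_manager")
n2 = count_parent_workers(HOST2, "resource_manager")
print("host1:", n1, "host2:", n2)
if n1 + n2 > 1:
    print("resource_manager is running on more than one host")
```

 Counting only parent processes avoids double-counting the forked child that each celery worker shows in the process tree.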
