Issue #2045

closed

Task stuck at waiting if child process segfaults

Added by bmbouter almost 8 years ago. Updated over 2 years ago.

Status: CLOSED - DUPLICATE
Priority: Normal
Assignee: -
Category: -
Sprint/Milestone: -
Start date:
Due date:
Estimated time:
Severity: 3. High
Version:
Platform Release:
OS:
Triaged: Yes
Groomed: No
Sprint Candidate: No
Tags: Pulp 2
Sprint:
Quarter:

Description

1. Start a sync or publish
2. While the sync or publish is running, have the child celery task segfault
3. Observe a traceback like this one in the logs:

Jun 28 10:42:01 hp-dl380pgen8-02-vm-9 celery: (process:557): GLib-GIO-CRITICAL **: g_simple_async_result_run_in_thread: assertion 'G_IS_SIMPLE_ASYNC_RESULT (simple)' failed
Jun 28 10:42:01 hp-dl380pgen8-02-vm-9 celery: (process:557): GLib-GIO-CRITICAL **: g_simple_async_result_new: assertion '!source_object || G_IS_OBJECT (source_object)' failed
Jun 28 10:42:01 hp-dl380pgen8-02-vm-9 celery: (process:557): GLib-GIO-CRITICAL **: g_simple_async_result_set_op_res_gpointer: assertion 'G_IS_SIMPLE_ASYNC_RESULT (simple)' failed
Jun 28 10:42:01 hp-dl380pgen8-02-vm-9 celery: (process:557): GLib-GIO-CRITICAL **: g_simple_async_result_run_in_thread: assertion 'G_IS_SIMPLE_ASYNC_RESULT (simple)' failed
Jun 28 10:42:01 hp-dl380pgen8-02-vm-9 pulp: celery.worker.job:ERROR: (32700-91808) Task pulp.server.managers.repo.sync.sync[252ad894-d037-4bdb-bcd6-cc1b623fcc5e] raised unexpected: WorkerLostError('Worker exited prematurely: signal 11 (SIGSEGV).',)
Jun 28 10:42:01 hp-dl380pgen8-02-vm-9 pulp: celery.worker.job:ERROR: (32700-91808) Traceback (most recent call last):
Jun 28 10:42:01 hp-dl380pgen8-02-vm-9 pulp: celery.worker.job:ERROR: (32700-91808)   File "/usr/lib64/python2.7/site-packages/billiard/pool.py", line 1169, in mark_as_worker_lost
Jun 28 10:42:01 hp-dl380pgen8-02-vm-9 pulp: celery.worker.job:ERROR: (32700-91808)     human_status(exitcode)),
Jun 28 10:42:01 hp-dl380pgen8-02-vm-9 pulp: celery.worker.job:ERROR: (32700-91808) WorkerLostError: Worker exited prematurely: signal 11 (SIGSEGV).
Jun 28 10:42:01 hp-dl380pgen8-02-vm-9 celery: reserved_resource_worker-1@hp-dl380pgen8-02-vm-9.lab.bos.redhat.com ready.

4. Observe that the parent celery process spawns an additional worker and begins processing additional tasks as normal
5. Observe that the task which was running when the segfault occurred never leaves the RUNNING state, and it never will
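
For anyone trying to reproduce step 2 outside of Pulp, the following is a minimal sketch of how a parent process can see that a child died from SIGSEGV. It uses only the Python standard library on a Unix-like system. It is not Pulp, celery, or billiard code, and the function name run_child_that_segfaults is made up for illustration. It only shows the kind of exit status that leads to a report such as "signal 11 (SIGSEGV)" in the log above.

```python
import os
import signal


def run_child_that_segfaults():
    """Fork a child that kills itself with SIGSEGV.

    Returns the number of the signal that terminated the child,
    or None if the child exited normally. Hypothetical helper,
    Unix-only (relies on os.fork).
    """
    pid = os.fork()
    if pid == 0:
        # Child: make sure the default action (terminate) is in place,
        # then deliver SIGSEGV to ourselves to simulate a segfault.
        signal.signal(signal.SIGSEGV, signal.SIG_DFL)
        os.kill(os.getpid(), signal.SIGSEGV)
        os._exit(0)  # only reached if the signal did not terminate us

    # Parent: wait for the child and inspect how it ended.
    _, status = os.waitpid(pid, 0)
    if os.WIFSIGNALED(status):
        return os.WTERMSIG(status)
    return None


if __name__ == "__main__":
    sig = run_child_that_segfaults()
    if sig is not None:
        print("child terminated by signal %d (%s)" % (sig, signal.Signals(sig).name))
    else:
        print("child exited normally")
```

The parent learns about the crash only from the child's wait status. Any bookkeeping that the child was supposed to do on completion, such as updating a task's state, never runs, which is consistent with the stuck task described in step 5.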


Related issues

Related to Pulp - Issue #1673: Pulp's worker watcher does not notice workers that got killed by OOM killer and their tasks stay "running" forever (CLOSED - CURRENTRELEASE, bmbouter)