Issue #5502

closed

worker fails to mark a task as failed when critical conditions are encountered

Added by mihai.ibanescu@gmail.com about 5 years ago. Updated over 3 years ago.

Status:
CLOSED - CURRENTRELEASE
Priority:
Normal
Assignee:
Category:
-
Sprint/Milestone:
-
Start date:
Due date:
Estimated time:
Severity:
2. Medium
Version:
2.15.3
Platform Release:
2.21.5
OS:
Triaged:
Yes
Groomed:
No
Sprint Candidate:
No
Tags:
Pulp 2
Sprint:
Sprint 87
Quarter:

Description

In my case, I was syncing a docker repo and ran out of space on the partition that holds /var/cache/pulp/. The worker logged the following:

Sep 25 15:54:31 pulptest11.unx.sas.com pulp[13366]: pulp.server.async.tasks:INFO: [a121ac15] Task failed : [fa40464f-ad88-4f7a-913a-1cbd498e298e] : Worker terminated abnormally while processing task fa40464f-ad88-4f7a-913a-1cbd498e298e.  Check the logs for details
Sep 25 15:54:31 pulptest11.unx.sas.com pulp[13366]: celery.app.trace:ERROR: [a121ac15] (13366-14816) Task pulp.server.async.tasks._release_resource[a121ac15-1698-4628-8322-fa2e3543bdda] raised unexpected: AttributeError("'NoneType' object has no attribute 'top'",)
Sep 25 15:54:31 pulptest11.unx.sas.com pulp[13366]: celery.app.trace:ERROR: [a121ac15] (13366-14816) Traceback (most recent call last):
Sep 25 15:54:31 pulptest11.unx.sas.com pulp[13366]: celery.app.trace:ERROR: [a121ac15] (13366-14816)   File "/usr/lib/python2.7/site-packages/celery/app/trace.py", line 367, in trace_task
Sep 25 15:54:31 pulptest11.unx.sas.com pulp[13366]: celery.app.trace:ERROR: [a121ac15] (13366-14816)     R = retval = fun(*args, **kwargs)
Sep 25 15:54:31 pulptest11.unx.sas.com pulp[13366]: celery.app.trace:ERROR: [a121ac15] (13366-14816)   File "/usr/lib/python2.7/site-packages/pulp/server/async/tasks.py", line 107, in __call__
Sep 25 15:54:31 pulptest11.unx.sas.com pulp[13366]: celery.app.trace:ERROR: [a121ac15] (13366-14816)     return super(PulpTask, self).__call__(*args, **kwargs)
Sep 25 15:54:31 pulptest11.unx.sas.com pulp[13366]: celery.app.trace:ERROR: [a121ac15] (13366-14816)   File "/usr/lib/python2.7/site-packages/celery/app/trace.py", line 622, in __protected_call__
Sep 25 15:54:31 pulptest11.unx.sas.com pulp[13366]: celery.app.trace:ERROR: [a121ac15] (13366-14816)     return self.run(*args, **kwargs)
Sep 25 15:54:31 pulptest11.unx.sas.com pulp[13366]: celery.app.trace:ERROR: [a121ac15] (13366-14816)   File "/usr/lib/python2.7/site-packages/pulp/server/async/tasks.py", line 296, in _release_resource
Sep 25 15:54:31 pulptest11.unx.sas.com pulp[13366]: celery.app.trace:ERROR: [a121ac15] (13366-14816)     new_task.on_failure(exception, task_id, (), {}, MyEinfo)
Sep 25 15:54:31 pulptest11.unx.sas.com pulp[13366]: celery.app.trace:ERROR: [a121ac15] (13366-14816)   File "/usr/lib/python2.7/site-packages/pulp/server/async/tasks.py", line 602, in on_failure
Sep 25 15:54:31 pulptest11.unx.sas.com pulp[13366]: celery.app.trace:ERROR: [a121ac15] (13366-14816)     if not self.request.called_directly:
Sep 25 15:54:31 pulptest11.unx.sas.com pulp[13366]: celery.app.trace:ERROR: [a121ac15] (13366-14816)   File "/usr/lib/python2.7/site-packages/celery/app/task.py", line 978, in _get_request
Sep 25 15:54:31 pulptest11.unx.sas.com pulp[13366]: celery.app.trace:ERROR: [a121ac15] (13366-14816)     req = self.request_stack.top
Sep 25 15:54:31 pulptest11.unx.sas.com pulp[13366]: celery.app.trace:ERROR: [a121ac15] (13366-14816) AttributeError: 'NoneType' object has no attribute 'top'

This looks similar to https://pulp.plan.io/issues/2849 but I've applied that fix already.

I believe the cause of the task not being marked as failed (and of the traceback above) is at https://github.com/pulp/pulp/blob/2-master/server/pulp/server/async/tasks.py#L376. I'm on 2.15.3, but the bug still seems to be present in the 2-master branch.

new_task was never bound to an app, so its request_stack was never initialized.
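
To illustrate the failure mode, here is a minimal stand-alone sketch. It does not use Celery or Pulp; MiniTask, MiniStack, MiniContext and bind() are hypothetical stand-ins that only model the behaviour suggested by the traceback (a class-level request_stack that stays None until the task class is bound). The real Celery internals may differ in detail.

```python
class MiniStack(object):
    """Hypothetical stand-in for the per-task request stack."""

    def __init__(self):
        self._items = []

    def push(self, item):
        self._items.append(item)

    @property
    def top(self):
        # Return the most recent request, or None if the stack is empty.
        return self._items[-1] if self._items else None


class MiniContext(object):
    """Hypothetical default request used when nothing is on the stack."""
    called_directly = True


class MiniTask(object):
    # Assumption: the stack only exists once the task class has been bound.
    request_stack = None

    @classmethod
    def bind(cls):
        cls.request_stack = MiniStack()

    @property
    def request(self):
        req = self.request_stack.top  # fails while request_stack is None
        return MiniContext() if req is None else req

    def on_failure(self):
        if not self.request.called_directly:
            return "recorded failure"
        return "called directly"


class UnboundTask(MiniTask):
    pass


class BoundTask(MiniTask):
    pass


def try_on_failure(task):
    """Call on_failure and report an AttributeError instead of raising."""
    try:
        return task.on_failure()
    except AttributeError as exc:
        return "AttributeError: %s" % exc


unbound_result = try_on_failure(UnboundTask())  # never bound
BoundTask.bind()
bound_result = try_on_failure(BoundTask())      # bound before use
```

In this model, the unbound task fails inside on_failure before it can record anything, which matches the symptom in the log: the failure handler itself raises, so the task is never marked as failed.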

If that is indeed the case, maybe at the same time we can also move the inner class at line 373 higher, so it doesn't have to be regenerated on every call (although if you're in this code path, things are kind of bad anyway).
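
A rough sketch of what moving the class would change, assuming the inner class is a small placeholder like the MyEinfo seen in the traceback. The function names and the traceback attribute here are illustrative, not the actual Pulp code.

```python
def release_with_inner_class():
    # A class statement inside a function runs on every call,
    # so each call builds a brand-new class object.
    class MyEinfo(object):
        traceback = None

    return MyEinfo


class ModuleLevelEinfo(object):
    """Same placeholder, defined once at import time."""
    traceback = None


def release_with_hoisted_class():
    # Reuses the single class object created at module load.
    return ModuleLevelEinfo


inner_is_shared = release_with_inner_class() is release_with_inner_class()
hoisted_is_shared = release_with_hoisted_class() is release_with_hoisted_class()
```

The saving is small and only matters on the failure path, so this is a cleanup rather than part of the fix.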
