Project

Profile

Help

Issue #5502

worker fails to mark a task as failed when critical conditions are encountered

Added by mihai.ibanescu@gmail.com about 1 year ago. Updated 12 days ago.

Status:
NEW
Priority:
Normal
Assignee:
-
Category:
-
Sprint/Milestone:
-
Start date:
Due date:
Estimated time:
Severity:
2. Medium
Version:
2.15.3
Platform Release:
OS:
Triaged:
Yes
Groomed:
No
Sprint Candidate:
No
Tags:
Pulp 2
Sprint:
Quarter:

Description

In my case, I was syncing a docker repo and I ran out of space on the partition that had /var/cache/pulp/

Sep 25 15:54:31 pulptest11.unx.sas.com pulp[13366]: pulp.server.async.tasks:INFO: [a121ac15] Task failed : [fa40464f-ad88-4f7a-913a-1cbd498e298e] : Worker terminated abnormally while processing task fa40464f-ad88-4f7a-913a-1cbd498e298e.  Check the logs for details
Sep 25 15:54:31 pulptest11.unx.sas.com pulp[13366]: celery.app.trace:ERROR: [a121ac15] (13366-14816) Task pulp.server.async.tasks._release_resource[a121ac15-1698-4628-8322-fa2e3543bdda] raised unexpected: AttributeError("'NoneType' object has no attribute 'top'",)
Sep 25 15:54:31 pulptest11.unx.sas.com pulp[13366]: celery.app.trace:ERROR: [a121ac15] (13366-14816) Traceback (most recent call last):
Sep 25 15:54:31 pulptest11.unx.sas.com pulp[13366]: celery.app.trace:ERROR: [a121ac15] (13366-14816)   File "/usr/lib/python2.7/site-packages/celery/app/trace.py", line 367, in trace_task
Sep 25 15:54:31 pulptest11.unx.sas.com pulp[13366]: celery.app.trace:ERROR: [a121ac15] (13366-14816)     R = retval = fun(*args, **kwargs)
Sep 25 15:54:31 pulptest11.unx.sas.com pulp[13366]: celery.app.trace:ERROR: [a121ac15] (13366-14816)   File "/usr/lib/python2.7/site-packages/pulp/server/async/tasks.py", line 107, in __call__
Sep 25 15:54:31 pulptest11.unx.sas.com pulp[13366]: celery.app.trace:ERROR: [a121ac15] (13366-14816)     return super(PulpTask, self).__call__(*args, **kwargs)
Sep 25 15:54:31 pulptest11.unx.sas.com pulp[13366]: celery.app.trace:ERROR: [a121ac15] (13366-14816)   File "/usr/lib/python2.7/site-packages/celery/app/trace.py", line 622, in __protected_call__
Sep 25 15:54:31 pulptest11.unx.sas.com pulp[13366]: celery.app.trace:ERROR: [a121ac15] (13366-14816)     return self.run(*args, **kwargs)
Sep 25 15:54:31 pulptest11.unx.sas.com pulp[13366]: celery.app.trace:ERROR: [a121ac15] (13366-14816)   File "/usr/lib/python2.7/site-packages/pulp/server/async/tasks.py", line 296, in _release_resource
Sep 25 15:54:31 pulptest11.unx.sas.com pulp[13366]: celery.app.trace:ERROR: [a121ac15] (13366-14816)     new_task.on_failure(exception, task_id, (), {}, MyEinfo)
Sep 25 15:54:31 pulptest11.unx.sas.com pulp[13366]: celery.app.trace:ERROR: [a121ac15] (13366-14816)   File "/usr/lib/python2.7/site-packages/pulp/server/async/tasks.py", line 602, in on_failure
Sep 25 15:54:31 pulptest11.unx.sas.com pulp[13366]: celery.app.trace:ERROR: [a121ac15] (13366-14816)     if not self.request.called_directly:
Sep 25 15:54:31 pulptest11.unx.sas.com pulp[13366]: celery.app.trace:ERROR: [a121ac15] (13366-14816)   File "/usr/lib/python2.7/site-packages/celery/app/task.py", line 978, in _get_request
Sep 25 15:54:31 pulptest11.unx.sas.com pulp[13366]: celery.app.trace:ERROR: [a121ac15] (13366-14816)     req = self.request_stack.top
Sep 25 15:54:31 pulptest11.unx.sas.com pulp[13366]: celery.app.trace:ERROR: [a121ac15] (13366-14816) AttributeError: 'NoneType' object has no attribute 'top'

This looks similar to https://pulp.plan.io/issues/2849 but I've applied that fix already.

I believe the problem with not canceling the whole task (and generating the traceback above) is at https://github.com/pulp/pulp/blob/2-master/server/pulp/server/async/tasks.py#L376 (while I'm on 2.15.3, the bug seems to still be in the 2-master branch).

new_task was never bound to an app, so its request_stack was never initialized.

If that is indeed the case, maybe at the same time we can also make the inner class at line 373 be moved higher so it doesn't have to be re-generated with every call - although in this case, if you're in this code path, things are kind of bad anyway).

History

#1 Updated by fao89 about 1 year ago

  • Sprint/Milestone set to 2.20.0
  • Triaged changed from No to Yes

#2 Updated by dalley about 1 year ago

I'm pretty sure this broke after we moved to Celery 4.0.2: https://pulp.plan.io/issues/2990

I think my judgement at the time about whether that issue was likely to present a problem was flawed. It should probably be fixed. Would you be willing to look into it, Mihai?

#3 Updated by dkliban@redhat.com about 2 months ago

  • Status changed from NEW to CLOSED - WONTFIX
  • Tags Pulp 2 added

Pulp 2 is in maintenance mode.

#4 Updated by bmbouter 12 days ago

  • Status changed from CLOSED - WONTFIX to NEW
  • Sprint/Milestone deleted (2.20.0)

Please register to edit this issue

Also available in: Atom PDF