Issue #1114

If an instance of pulp_resource_manager dies unexpectedly, Pulp incorrectly tries to "cancel all tasks in its queue"

Added by bmbouter over 6 years ago. Updated almost 3 years ago.

Status: CLOSED - WONTFIX
Priority: Normal
Assignee: -
Category: -
Sprint/Milestone: -
Start date:
Due date:
Estimated time:
Severity: 1. Low
Version: Master
Platform Release:
OS:
Triaged: Yes
Groomed: No
Sprint Candidate: No
Tags: Easy Fix, Pulp 2
Sprint:
Quarter:

Description

1. Start a clustered Pulp installation with two machines in the cluster, here called boxA and boxB. Start all pulp_* and httpd services on boxA.
2. Start a second pulp_resource_manager instance on boxB.
3. Use the /status/ API to verify that you can see both entries. They should show as 'resource_manager@boxA' and 'resource_manager@boxB'.
4. kill -9 the pulp_resource_manager service on boxB.
5. Wait for 6 or 7 minutes.
6. Observe error messages similar to the following in the logs of boxA.

pulp.server.async.scheduler:ERROR: Workers 'resource_manager@boxB' has gone missing, removing from list of workers
pulp.server.async.tasks:ERROR: The worker named resource_manager@boxB is missing. Canceling the tasks in its queue.

Two things are wrong with this, and both of them are located in this section of code.

(1) The code should never call _delete_worker(worker.name), which attempts to cancel tasks, log, and clean up reservations; none of these make sense for pulp_resource_manager. Instead, it should delete the worker record synchronously and continue.

(2) The error message is misleading. I suggest it read something like:

resource_manager@boxB has gone missing.
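
The two changes proposed above can be sketched roughly as follows. This is a minimal illustration, not Pulp's actual code: `handle_missing_worker`, `is_resource_manager`, and the `delete_record` and `delete_worker` callables are hypothetical names introduced here, and matching on the `resource_manager@` prefix is an assumption based on the worker names shown in the steps above.

```python
import logging

logger = logging.getLogger(__name__)

# Assumption: resource manager entries are named 'resource_manager@<host>',
# as in the /status/ output described in this report.
RESOURCE_MANAGER_PREFIX = "resource_manager@"


def is_resource_manager(worker_name):
    """Return True if the worker name looks like a pulp_resource_manager entry."""
    return worker_name.startswith(RESOURCE_MANAGER_PREFIX)


def handle_missing_worker(worker_name, delete_record, delete_worker):
    """Clean up after a worker that has gone missing.

    Both callables are hypothetical stand-ins supplied by the caller:
      delete_record(name) -- synchronously removes only the worker record
      delete_worker(name) -- full cleanup (cancel tasks, release reservations)

    Returns a short string naming the path that was taken.
    """
    if is_resource_manager(worker_name):
        # Suggested message from point (2): no mention of canceling tasks.
        logger.error("%s has gone missing.", worker_name)
        # Point (1): delete the record only, then continue.
        delete_record(worker_name)
        return "record_deleted"
    # Every other worker keeps the existing full cleanup path.
    delete_worker(worker_name)
    return "full_cleanup"
```

Passing the two cleanup routines in as arguments keeps the sketch independent of Pulp's internals; in the real code base the branch would presumably sit wherever _delete_worker(worker.name) is called today.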

Related issues

Related to Pulp - Issue #1113: If an instance of pulp_celerybeat dies unexpectedly, Pulp incorrectly tries to "cancel all tasks in its queue" (CLOSED - CURRENTRELEASE)
Blocked by Pulp - Story #898: As a user I can run multiple pulp_resource_managers concurrently with all of them actively participating (CLOSED - CURRENTRELEASE)

History

#1 Updated by bmbouter over 6 years ago

  • Related to Issue #1113: If an instance of pulp_celerybeat dies unexpectedly, Pulp incorrectly tries to "cancel all tasks in its queue" added

#2 Updated by bmbouter over 6 years ago

  • Description updated (diff)

#3 Updated by bmbouter over 6 years ago

  • Blocked by Story #898: As a user I can run multiple pulp_resource_managers concurrently with all of them actively participating added

#4 Updated by bmbouter over 6 years ago

  • Triaged changed from No to Yes

#5 Updated by bmbouter almost 3 years ago

  • Status changed from NEW to CLOSED - WONTFIX

#6 Updated by bmbouter almost 3 years ago

Pulp 2 is approaching maintenance mode, and this Pulp 2 ticket is not being actively worked on. As such, it is being closed as WONTFIX. Pulp 2 is still accepting contributions though, so if you want to contribute a fix for this ticket, please reopen or comment on it. If you don't have permissions to reopen this ticket, or you want to discuss an issue, please reach out via the developer mailing list.

#7 Updated by bmbouter almost 3 years ago

  • Tags Pulp 2 added