Issue #7119

Tasks stay in waiting state if the worker that held the resource reservation is gone

Added by osapryki about 1 month ago. Updated about 1 month ago.

Status: CLOSED - NOTABUG
Priority: High
Assignee: -
Category: -
Start date:
Due date:
Estimated time:
Severity: 3. High
Version:
Platform Release:
OS:
Triaged: Yes
Groomed: No
Sprint Candidate: No
Tags:
Sprint: Sprint 77

Description

The problem appears in the pulp_ansible plugin, but the root cause is related to how Pulp schedules tasks.

Steps to reproduce:

  1. Spawn a worker named worker-1.
  2. Trigger an import task that uses a resource reservation.
  3. Delete the worker.
  4. Spawn a worker named worker-2.
  5. Trigger another import task for the same Pulp repository.

Expected behavior:

The task is assigned to worker-2.

Actual behavior:

Pulp tries to assign the task to worker-1, which is gone, so the task stays in the waiting state forever.
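
To make the reported behavior concrete, here is a minimal toy model of reservation-based routing. It is not Pulp's code: ToyScheduler, dispatch, reproduce, and the prune_stale flag are all hypothetical names invented for this sketch, and the model only assumes that a resource stays pinned to the worker name that first reserved it.

```python
from dataclasses import dataclass, field


@dataclass
class ToyScheduler:
    """Toy model of reservation-based routing (hypothetical, not Pulp's code)."""

    online_workers: set = field(default_factory=set)
    reservations: dict = field(default_factory=dict)  # resource -> worker name

    def dispatch(self, resource, prune_stale=False):
        """Return the worker name the task would be routed to, or None."""
        owner = self.reservations.get(resource)
        # Optionally drop a reservation whose worker is no longer online.
        if prune_stale and owner not in self.online_workers:
            self.reservations.pop(resource, None)
            owner = None
        if owner is None:
            if not self.online_workers:
                return None
            # Pick any online worker; sorted() only makes the choice deterministic.
            owner = sorted(self.online_workers)[0]
            self.reservations[resource] = owner
        return owner


def reproduce(prune_stale):
    """Replay the five reproduction steps against the toy model."""
    scheduler = ToyScheduler()
    scheduler.online_workers.add("worker-1")      # step 1
    scheduler.dispatch("repo-a")                  # step 2
    scheduler.online_workers.discard("worker-1")  # step 3
    scheduler.online_workers.add("worker-2")      # step 4
    return scheduler.dispatch("repo-a", prune_stale=prune_stale)  # step 5


print(reproduce(prune_stale=False))  # stale reservation still points at the old worker
print(reproduce(prune_stale=True))   # stale reservation dropped before routing
```

In this model, the second task follows the stale reservation unless reservations held by missing workers are cleared before routing, which is the difference between the actual and expected behavior described above.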

Note 1: This behavior is critical when running Pulp in a containerized environment such as Kubernetes, where containers are created and destroyed periodically. Worker instance names are based on the container hostname, which is randomly generated and unique for each container.

Workaround: To avoid this situation, a worker can be run with a predictable name. However, this prevents Pulp workers from scaling: only a single worker, or a limited set of workers with hardcoded names, can run at a time.

Note 2: Pulp does not seem to have a mechanism to cancel jobs that stay in the waiting state past a timeout.
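
As an illustration of what such a timeout could look like, the sketch below selects tasks that have been waiting longer than a limit. Everything here is an assumption made for the example: the function name find_expired_waiting, the dict-based task records, and the "state" and "created" keys are hypothetical and do not reflect Pulp's task model or API.

```python
from datetime import datetime, timedelta, timezone


def find_expired_waiting(tasks, now, max_wait):
    """Return ids of tasks that have been in the waiting state longer than max_wait.

    `tasks` is a list of plain dicts with hypothetical keys
    "id", "state", and "created" (a timezone-aware datetime).
    """
    return [
        task["id"]
        for task in tasks
        if task["state"] == "waiting" and now - task["created"] > max_wait
    ]


now = datetime(2020, 7, 1, 12, 0, tzinfo=timezone.utc)
tasks = [
    {"id": "a", "state": "waiting", "created": now - timedelta(hours=2)},
    {"id": "b", "state": "waiting", "created": now - timedelta(minutes=5)},
    {"id": "c", "state": "running", "created": now - timedelta(hours=3)},
]
expired = find_expired_waiting(tasks, now, max_wait=timedelta(hours=1))
print(expired)
```

A periodic job could pass the returned ids to whatever cancellation mechanism the deployment uses; how cancellation is actually performed is outside the scope of this sketch.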


Related issues

Related to Pulp - Issue #6449: Tasks stuck in Waiting state (POST)

History

#1 Updated by fao89 about 1 month ago

  • Priority changed from Normal to High
  • Triaged changed from No to Yes
  • Sprint set to Sprint 77

#2 Updated by osapryki about 1 month ago

  • Status changed from NEW to CLOSED - NOTABUG

Closing. Cannot reproduce.

#3 Updated by dalley 29 days ago

  • Related to Issue #6449: Tasks stuck in Waiting state added
