Issue #5673

closed

Ansible Plugin - Story #5517: [EPIC] Automation Hub Release Blockers

Resource reservations are not cleaned up if worker is killed

Added by osapryki about 5 years ago. Updated almost 5 years ago.

Status: CLOSED - CURRENTRELEASE
Priority: Normal
Assignee:
Category: -
Sprint/Milestone:
Start date:
Due date:
Estimated time:
Severity: 4. Urgent
Version:
Platform Release:
OS:
Triaged: Yes
Groomed: No
Sprint Candidate: No
Tags:
Sprint: Sprint 62
Quarter:

Description

If a Pulp worker is killed while executing a task that has reserved resources, those resources are not cleaned up.
All subsequent tasks that use any of the reserved resources are assigned to the same worker (which is dead).

Steps to reproduce:

1. Spawn an import_collection task from pulp_ansible (T1).
2. While the task is running, kill the worker (W1).
3. Start another worker (W2).
4. Spawn another import_collection task from pulp_ansible (T2).

Expected behavior:

Task T2 is assigned to worker W2, or is cancelled if it was assigned to W1 before cleanup is performed.

Actual behavior:

Task T2 is assigned to worker W1 and remains in the waiting state forever.
Resources are not cleaned up.
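
The failure mode can be illustrated with a toy model. This is only a sketch under stated assumptions, not pulpcore's tasking code: the `Dispatcher` class, its fields, and the resource name `collection-repo` are all hypothetical and exist only to show how a stale reservation pins new tasks to a dead worker.

```python
# Toy model of resource reservations; NOT pulpcore's actual code.
# Dispatcher, its fields, and "collection-repo" are hypothetical names.
from dataclasses import dataclass, field


@dataclass
class Dispatcher:
    online: set = field(default_factory=set)          # names of live workers
    reservations: dict = field(default_factory=dict)  # resource -> worker name

    def dispatch(self, resource):
        """Return the name of the worker a task needing `resource` is routed to."""
        # An existing reservation pins the task to the reserving worker,
        # with no check that this worker is still alive.
        if resource in self.reservations:
            return self.reservations[resource]
        worker = sorted(self.online)[0]
        self.reservations[resource] = worker
        return worker


d = Dispatcher(online={"W1"})
t1_worker = d.dispatch("collection-repo")  # T1 reserves the resource on W1
d.online.discard("W1")                     # W1 is killed; its reservation stays behind
d.online.add("W2")                         # W2 starts under a different name
t2_worker = d.dispatch("collection-repo")  # T2 follows the stale reservation
stuck = t2_worker not in d.online          # T2 waits on a worker that no longer exists
```

In this model T2 is routed to W1 even though only W2 is online, which mirrors the "waiting forever" state described above.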

Environment:

pulpcore + pulp_ansible.
1 worker

Actions #1

Updated by bmbouter about 5 years ago

  • Parent issue set to #5517

Adding to the blockers list for automation hub

Actions #2

Updated by osapryki about 5 years ago

Note: To reproduce this issue, workers W1 and W2 must have different names.

In a containerized environment, the container hostname is used as a component of the worker name, and container hostnames are unique and not persistent.

Actions #3

Updated by daviddavis about 5 years ago

  • Sprint/Milestone set to 3.0.0
  • Triaged changed from No to Yes

Actions #4

Updated by bmbouter about 5 years ago

+1 to using different worker names.

To do that easily in pulplift, don't start the workers with systemd; start them in the foreground using the same arguments systemd does. The ps output shows all of the options. The worker name could also be randomized with bash in the systemd unit file itself.

https://github.com/pulp/ansible-pulp/blob/master/roles/pulp-workers/templates/pulpcore-worker%40.service.j2

Actions #5

Updated by bmbouter almost 5 years ago

  • Status changed from NEW to ASSIGNED
  • Assignee set to bmbouter

Actions #6

Updated by bmbouter almost 5 years ago

  • Sprint set to Sprint 62

Adding to sprint as an Automation Hub blocker. Also it's a core bug and we need to fix it.

Added by bmbouter almost 5 years ago

Revision a1f60dd6 | View on GitHub

Release resources when worker is cleaned up

https://pulp.plan.io/issues/5673 closes #5673
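
The contents of the commit are not shown on this page. As a rough illustration of the idea in its title, the sketch below releases every reservation held by a worker when that worker is cleaned up. The `Reservations` class, its method names, and the resource name are hypothetical and are not taken from pulpcore.

```python
# Hypothetical sketch of "release resources when worker is cleaned up".
# Reservations, reserve(), and cleanup_worker() are illustrative names only.
from collections import defaultdict


class Reservations:
    def __init__(self):
        self.by_resource = {}              # resource -> worker name
        self.by_worker = defaultdict(set)  # worker name -> reserved resources

    def reserve(self, resource, worker):
        """Reserve `resource` for `worker` unless it is already held; return the holder."""
        holder = self.by_resource.setdefault(resource, worker)
        self.by_worker[holder].add(resource)
        return holder

    def cleanup_worker(self, worker):
        """Drop every reservation held by a worker that is being cleaned up."""
        released = self.by_worker.pop(worker, set())
        for resource in released:
            del self.by_resource[resource]
        return released


r = Reservations()
r.reserve("collection-repo", "W1")              # T1 runs on W1
released = r.cleanup_worker("W1")               # W1 is detected as dead and cleaned up
t2_worker = r.reserve("collection-repo", "W2")  # T2 can now be routed to W2
```

Tying the release to worker cleanup means no reservation can outlive its worker record, so a later task is never routed to a worker that has already been removed.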

Actions #7

Updated by bmbouter almost 5 years ago

  • Status changed from ASSIGNED to POST

Actions #8

Updated by bmbouter almost 5 years ago

  • Status changed from POST to MODIFIED

Added by bmbouter almost 5 years ago

Revision ce358f93 | View on GitHub

Release resources when worker is cleaned up

https://pulp.plan.io/issues/5673 closes #5673

(cherry picked from commit a1f60dd6c753d87862eb218ea9117c321b2208e4)

Actions #10

Updated by bmbouter almost 5 years ago

  • Status changed from MODIFIED to CLOSED - CURRENTRELEASE
