Issue #5673

Ansible Plugin - Story #5517: [EPIC] Automation Hub Release Blockers

Resource reservations are not cleaned up if worker is killed

Added by osapryki 3 months ago. Updated about 1 month ago.

Status: CLOSED - CURRENTRELEASE
Priority: Normal
Assignee:
Category: -
Sprint/Milestone:
Start date:
Due date:
Severity: 4. Urgent
Version:
Platform Release:
Blocks Release:
OS:
Backwards Incompatible: No
Triaged: Yes
Groomed: No
Sprint Candidate: No
Tags:
QA Contact:
Complexity:
Smash Test:
Verified: No
Verification Required: No
Sprint: Sprint 62

Description

If a Pulp worker is killed while executing a task that has reserved resources, those resources are not cleaned up.
All subsequent tasks that use any of the reserved resources are assigned to the same worker (which is dead).

Steps to reproduce:

1. Spawn an import_collection task from pulp_ansible (T1).
2. While the task is running, kill the worker (W1).
3. Start another worker (W2).
4. Spawn another import_collection task from pulp_ansible (T2).

Expected behavior:

Task T2 is assigned to worker W2, or is canceled if it was assigned to W1 before cleanup was performed.

Actual behavior:

Task T2 is assigned to worker W1 and remains in the waiting state forever.
Resources are not cleaned up.
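
The failure mode can be illustrated with a small in-memory model. This is a hypothetical sketch, not pulpcore's actual tasking code: the `Dispatcher` class, its `reservations` mapping, and the resource string are all assumptions made for illustration. It assumes that a task touching an already-reserved resource is routed to whichever worker holds that reservation.

```python
# Hypothetical model of reservation-based routing; none of these names are pulpcore APIs.
class Dispatcher:
    def __init__(self):
        self.online = set()       # names of workers that are alive
        self.reservations = {}    # resource -> name of the worker holding it

    def dispatch(self, resource):
        """Return the name of the worker a task needing `resource` is routed to."""
        holder = self.reservations.get(resource)
        if holder is not None:
            # Resource already reserved: queue behind the holder, alive or not.
            return holder
        worker = sorted(self.online)[0]  # simplistic choice of an available worker
        self.reservations[resource] = worker
        return worker

    def kill(self, worker):
        # Bug being modeled: the worker goes away but its reservations stay.
        self.online.discard(worker)


d = Dispatcher()
d.online.add("W1")
t1_worker = d.dispatch("collection-import")  # T1 runs on W1
d.kill("W1")                                 # W1 dies mid-task
d.online.add("W2")
t2_worker = d.dispatch("collection-import")  # T2 follows the stale reservation
print(t2_worker, t2_worker in d.online)
```

In this model T2 is routed to W1 even though only W2 is online, which matches the "waiting forever" behavior described above.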

Environment:

pulpcore + pulp_ansible.
1 worker

Associated revisions

Revision a1f60dd6
Added by bmbouter 2 months ago

Release resources when worker is cleaned up

https://pulp.plan.io/issues/5673
closes #5673
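
A minimal sketch of the idea in the commit title, assuming that reservations are tracked per worker and that some cleanup routine runs for workers that have gone missing. The names `Reservations`, `reserve`, `holder`, and `cleanup_worker` are hypothetical and are not taken from the pulpcore source.

```python
# Hypothetical sketch: release a worker's reservations as part of its cleanup.
from collections import defaultdict


class Reservations:
    def __init__(self):
        self._by_worker = defaultdict(set)  # worker name -> reserved resources

    def reserve(self, worker, resource):
        self._by_worker[worker].add(resource)

    def holder(self, resource):
        """Return the worker holding `resource`, or None if it is free."""
        for worker, resources in self._by_worker.items():
            if resource in resources:
                return worker
        return None

    def cleanup_worker(self, worker):
        """Drop every reservation held by a dead worker; return what was released."""
        return self._by_worker.pop(worker, set())


r = Reservations()
r.reserve("W1", "collection-import")
released = r.cleanup_worker("W1")      # W1 was killed and is now cleaned up
free_after_cleanup = r.holder("collection-import") is None
r.reserve("W2", "collection-import")   # T2 can now be given to W2
print(released, free_after_cleanup, r.holder("collection-import"))
```

Tying the release to worker cleanup means a later task is no longer routed to a worker that no longer exists, which is the expected behavior stated in the description.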

Revision ce358f93
Added by bmbouter 2 months ago

Release resources when worker is cleaned up

https://pulp.plan.io/issues/5673
closes #5673

(cherry picked from commit a1f60dd6c753d87862eb218ea9117c321b2208e4)

History

#1 Updated by bmbouter 3 months ago

  • Parent task set to #5517

Adding to the blockers list for automation hub

#2 Updated by osapryki 3 months ago

Note: To reproduce this issue, workers W1 and W2 must have different names.

In a containerized environment, the container hostname, which is used as a component of the worker name, is unique and not persistent.

#3 Updated by daviddavis 3 months ago

  • Sprint/Milestone set to 3.0.0
  • Triaged changed from No to Yes

#4 Updated by bmbouter 2 months ago

+1 to using different worker names.

To do that easily in pulplift, don't start the workers with systemd; start them in the foreground using the same args systemd does. The ps output shows all the options. Alternatively, the worker name could be randomized with bash in the systemd unit file itself.

https://github.com/pulp/ansible-pulp/blob/master/roles/pulp-workers/templates/pulpcore-worker%40.service.j2

#5 Updated by bmbouter 2 months ago

  • Status changed from NEW to ASSIGNED
  • Assignee set to bmbouter

#6 Updated by bmbouter 2 months ago

  • Sprint set to Sprint 62

Adding to sprint as an Automation Hub blocker. Also it's a core bug and we need to fix it.

#7 Updated by bmbouter 2 months ago

  • Status changed from ASSIGNED to POST

#8 Updated by bmbouter 2 months ago

  • Status changed from POST to MODIFIED

#10 Updated by bmbouter about 1 month ago

  • Status changed from MODIFIED to CLOSED - CURRENTRELEASE
