Issue #8988

closed

`pulpcore-worker` startup should remove old worker records

Added by bmbouter about 1 year ago. Updated 10 months ago.

Status: CLOSED - CURRENTRELEASE
Priority: Low
Assignee:
Category: -
Sprint/Milestone:
Start date:
Due date:
Estimated time:
Severity: 2. Medium
Version:
Platform Release:
OS:
Triaged: Yes
Groomed: No
Sprint Candidate: No
Tags:
Sprint: Sprint 109
Quarter:

Description

To reproduce

  1. Start a pulpcore-worker against an empty database
  2. Observe that the status API shows that worker
  3. kill -9 your worker
  4. After 30 seconds, observe that the status API no longer shows your worker
  5. Start your pulpcore-worker again
  6. Go into shell_plus and observe that this query shows 2 workers present: Worker.objects.count()

Expected result

In [1]: Worker.objects.count()
Out[1]: 1

Solution

Have each worker, at startup, run a query that deletes any worker records that have not issued a heartbeat in, say, 7 days. This does not need to be configurable. The 7-day window keeps records from accumulating over the long term while leaving them in place long enough for someone inspecting the database post-mortem to still see them.
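
The proposed cleanup could look roughly like the sketch below. It is not pulpcore's actual code: it uses the standard-library sqlite3 module with a hypothetical `worker` table (`name`, `last_heartbeat` stored as epoch seconds) so that it runs standalone, and the names `WORKER_RETENTION` and `delete_stale_workers` are made up for illustration. The point is that one DELETE with a cutoff timestamp removes all stale rows at once, instead of looping over records.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

# Hypothetical constant: the "say 7 days" window suggested in this issue.
WORKER_RETENTION = timedelta(days=7)


def delete_stale_workers(conn, now, retention=WORKER_RETENTION):
    """Delete every worker row whose last heartbeat is older than `retention`.

    Issues a single DELETE statement (a bulk operation, not a per-row loop)
    and returns the number of rows removed.
    """
    cutoff = (now - retention).timestamp()
    cur = conn.execute("DELETE FROM worker WHERE last_heartbeat < ?", (cutoff,))
    conn.commit()
    return cur.rowcount


def demo():
    # In-memory stand-in for the real database; the schema is an assumption.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE worker (name TEXT PRIMARY KEY, last_heartbeat REAL NOT NULL)"
    )
    now = datetime(2021, 7, 1, 12, 0, tzinfo=timezone.utc)
    rows = [
        ("worker-old", (now - timedelta(days=8)).timestamp()),           # stale
        ("worker-recent-crash", (now - timedelta(days=1)).timestamp()),  # kept for post-mortem
        ("worker-live", now.timestamp()),                                # still alive
    ]
    conn.executemany("INSERT INTO worker VALUES (?, ?)", rows)

    removed = delete_stale_workers(conn, now)
    remaining = [r[0] for r in conn.execute("SELECT name FROM worker ORDER BY name")]
    return removed, remaining
```

Calling `delete_stale_workers` once at worker startup would match the behavior described above: a record for a worker that crashed yesterday stays available for investigation, while one that last reported eight days ago is removed.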

Actions #1

Updated by dkliban@redhat.com about 1 year ago

  • Priority changed from Normal to Low
  • Triaged changed from No to Yes
  • Sprint set to Sprint 100
Actions #2

Updated by rchan about 1 year ago

  • Sprint changed from Sprint 100 to Sprint 101
Actions #3

Updated by ipanova@redhat.com about 1 year ago

  • Sprint changed from Sprint 101 to Sprint 102
Actions #4

Updated by rchan about 1 year ago

  • Sprint changed from Sprint 102 to Sprint 103
Actions #5

Updated by rchan about 1 year ago

  • Sprint changed from Sprint 103 to Sprint 104
Actions #6

Updated by rchan about 1 year ago

  • Sprint changed from Sprint 104 to Sprint 105
Actions #7

Updated by rchan about 1 year ago

  • Sprint changed from Sprint 105 to Sprint 106
Actions #8

Updated by rchan 12 months ago

  • Sprint changed from Sprint 106 to Sprint 107
Actions #9

Updated by rchan 11 months ago

  • Sprint changed from Sprint 107 to Sprint 108
Actions #10

Updated by rchan 11 months ago

  • Sprint changed from Sprint 108 to Sprint 109
Actions #11

Updated by bmbouter 11 months ago

  • Status changed from NEW to ASSIGNED
  • Assignee set to bmbouter
Actions #12

Updated by pulpbot 11 months ago

  • Status changed from ASSIGNED to POST

Added by bmbouter 11 months ago

Revision f23c3180

Cleanup missing worker entries after seven days

Missing worker entries are now kept in the database for seven days before being cleaned up. This gives time for post-mortem analysis.

It also switches the cleanup of those records to be a bulk operation.

closes #8988

Actions #13

Updated by bmbouter 11 months ago

  • Status changed from POST to MODIFIED
Actions #14

Updated by pulpbot 10 months ago

  • Sprint/Milestone set to 3.17.0
Actions #15

Updated by pulpbot 10 months ago

  • Status changed from MODIFIED to CLOSED - CURRENTRELEASE
