Issue #8988
closed
`pulpcore-worker` startup should remove old worker records
Start date:
Due date:
Estimated time:
Severity: 2. Medium
Version:
Platform Release:
OS:
Triaged: Yes
Groomed: No
Sprint Candidate: No
Tags:
Sprint: Sprint 109
Description
To reproduce
- Start a pulpcore-worker against an empty database
- Observe that the status API shows that worker
- Kill your worker with `kill -9`
- Observe that, after 30 seconds, the status API no longer shows your worker
- Start your pulpcore-worker again
- Go into shell_plus and observe that this query shows 2 workers present:
Worker.objects.count()
Expected result
In [1]: Worker.objects.count()
Out[1]: 1
Solution
Have each worker, at startup, run a query that deletes any worker records that haven't issued a heartbeat in, say, 7 days. This does not need to be configurable. The 7-day window is meant to keep records from accumulating over the long term while still leaving them in the database long enough for someone to inspect them during a post-mortem investigation.
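The proposed startup cleanup could be sketched roughly as follows. This is a minimal, standalone illustration that uses an in-memory SQLite table rather than pulpcore's actual models; the table name `worker`, the column `last_heartbeat`, and the helpers `create_worker_table`, `record_heartbeat`, and `cleanup_stale_workers` are all hypothetical names chosen for the example.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

# Retention window from the proposal above (assumed fixed, not configurable).
STALE_WORKER_AGE = timedelta(days=7)


def create_worker_table(conn):
    """Create a hypothetical worker table; not pulpcore's real schema."""
    conn.execute(
        "CREATE TABLE worker (name TEXT PRIMARY KEY, last_heartbeat REAL NOT NULL)"
    )


def record_heartbeat(conn, name, when):
    """Insert or refresh a worker record with the given heartbeat time."""
    conn.execute(
        "INSERT OR REPLACE INTO worker (name, last_heartbeat) VALUES (?, ?)",
        (name, when.timestamp()),
    )


def cleanup_stale_workers(conn, now, max_age=STALE_WORKER_AGE):
    """Delete, in one statement, every worker whose last heartbeat is older
    than max_age. Returns the number of rows removed."""
    cutoff = (now - max_age).timestamp()
    cursor = conn.execute("DELETE FROM worker WHERE last_heartbeat < ?", (cutoff,))
    conn.commit()
    return cursor.rowcount


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    create_worker_table(conn)
    now = datetime.now(timezone.utc)
    record_heartbeat(conn, "worker-old", now - timedelta(days=8))
    record_heartbeat(conn, "worker-recent", now - timedelta(hours=1))
    # Would run once when a worker starts up.
    print("removed:", cleanup_stale_workers(conn, now))
```

A single `DELETE ... WHERE` statement removes all stale rows at once instead of loading and deleting them one by one. In a Django project the same idea would typically be written as a filtered queryset followed by `.delete()`.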
Cleanup missing worker entries after seven days
Missing worker entries are now kept in the database for seven days before being cleaned up. This gives time for post-mortem analysis.
It also switches the cleanup of those records to be a bulk operation.
closes #8988