Issue #7540
closed
workers and resource-manager go missing during large migration
Start date:
Due date:
Estimated time:
Severity:
2. Medium
Platform Release:
OS:
Triaged:
Yes
Groomed:
No
Sprint Candidate:
No
Tags:
Katello
Sprint:
Sprint 85
Quarter:
Description
When doing a large migration with ~300K RPMs, my workers and resource-manager went missing. Upon further investigation, it appeared that PostgreSQL was stuck in a long I/O wait for ~10-15 minutes while trying to commit a large transaction.
My guess is that there is a very large transaction that needs to be broken up into smaller ones, probably around saving artifacts (although this is just a guess).
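To illustrate what "breaking up into smaller ones" could look like, here is a minimal sketch that commits rows in fixed-size batches rather than in a single transaction. It is not the migration plugin's code: the `artifact` table, the `save_artifacts` and `batched` helpers, and the batch size are all hypothetical, and SQLite from the standard library stands in for PostgreSQL so the example is self-contained.

```python
import sqlite3
from itertools import islice


def batched(iterable, size):
    """Yield lists of at most `size` items from `iterable`."""
    it = iter(iterable)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def save_artifacts(conn, rows, batch_size=1000):
    """Insert rows in several small transactions instead of one huge one.

    Returns the number of transactions committed.
    (Hypothetical helper; the table and column names are assumptions.)
    """
    commits = 0
    for chunk in batched(rows, batch_size):
        # `with conn` commits on success and rolls back on error,
        # so each chunk is its own transaction.
        with conn:
            conn.executemany(
                "INSERT INTO artifact (sha256, size) VALUES (?, ?)", chunk
            )
        commits += 1
    return commits


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE artifact (sha256 TEXT PRIMARY KEY, size INTEGER)")

# 2500 fake rows, generated lazily so they are never all held in memory.
rows = ((f"{i:064x}", i) for i in range(2500))
commits = save_artifacts(conn, rows, batch_size=1000)
count = conn.execute("SELECT COUNT(*) FROM artifact").fetchone()[0]
```

The trade-off is that a failure partway through leaves the earlier batches committed, so a migration written this way would need to be safe to resume or re-run.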
Updated by ttereshc over 2 years ago
- Triaged changed from No to Yes
- Sprint set to Sprint 82
Updated by jsherril@redhat.com over 2 years ago
- Priority changed from Normal to High
Updated by ttereshc over 2 years ago
- Status changed from NEW to ASSIGNED
- Assignee set to ttereshc
Updated by ttereshc over 2 years ago
- Status changed from ASSIGNED to CLOSED - CURRENTRELEASE
Resolved by multiple fixes released in 0.5.0 and 0.5.1.
The main problem was a memory leak in createrepo_c, which caused the system to use swap and slowed everything down. Workers were going missing because heartbeat updates were far too slow. createrepo_c fixes (dalley++):