Project

Profile

Help

Task #3121

closed

Simplify tracking of worker 'online' state

Added by dalley over 6 years ago. Updated over 4 years ago.

Status:
CLOSED - CURRENTRELEASE
Priority:
Normal
Assignee:
Category:
-
Sprint/Milestone:
Start date:
Due date:
% Done:

100%

Estimated time:
Platform Release:
Groomed:
Yes
Sprint Candidate:
Yes
Tags:
Sprint:
Sprint 34
Quarter:

Description

The "online" state of worker is currently dependent on two factors. The 'online' field, present on the Worker model, and also whether the last recorded heartbeat is within the timeout interval (30 seconds) of the present time. These two conditions are set in multiple different places, and the online status of the worker based upon those two values is evaluated in multiple different places.

I propose a few different changes to DRY this up.

The canonical reference for whether a worker is online (based on the multiple different criteria) should be a property on the worker model named "online". The field currently named "online" should be renamed to "gracefully_stopped". The "online" property should then check both the current-ness of the timestamp as well as the "gracefully_stopped" field The worker serializer should be updated to include both of these. Currently there exist several parts of the code which set or reference the "online" field, and these should be removed.

The "gracefully_stopped" field should be True only when the worker is shutdown via normal procedures. During operation, and after a hard-kill (e.g. OOM), this value should be false. The default value of this field should be "True", and it should be updated to "False" whenever "save_heartbeat()" is called for that worker, so that the value will be set to "False" during normal operation. It should be set to True as part of the shutdown sequence for the worker.


Related issues

Related to Pulp - Story #3143: As an authenticated user, I can filter workers.CLOSED - CURRENTRELEASEdalley

Actions

Also available in: Atom PDF