Task #3121

Updated by dalley over 6 years ago

The "online" state of a worker currently depends on two factors: the 'online' field on the Worker model, and whether the last recorded heartbeat is within the timeout interval (30 seconds) of the present time. These two conditions are set in multiple places, and the worker's online status is evaluated from them in multiple places as well.

I propose a few changes to DRY this up.

First, the canonical reference for whether a worker is online (combining both criteria) should be a property on the worker model named "is_online". The worker serializer should expose this value as the worker's state, instead of the "online" field it currently exposes directly from the model.
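A minimal sketch of what such a property could look like, written against a plain-Python stand-in rather than the real model. The `last_heartbeat` field name, the `online_at()` helper, and `serialize_worker()` are assumptions made for illustration; only "online", "is_online", and the 30-second timeout come from this task.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional

# 30 seconds is the timeout interval named in this task.
HEARTBEAT_TIMEOUT = timedelta(seconds=30)


@dataclass
class Worker:
    # Stand-in for the Worker model; `last_heartbeat` is an assumed field name.
    name: str
    online: bool = False
    last_heartbeat: Optional[datetime] = None

    def online_at(self, now: datetime) -> bool:
        """Hypothetical helper: apply both conditions at a given time."""
        if not self.online or self.last_heartbeat is None:
            return False
        return now - self.last_heartbeat < HEARTBEAT_TIMEOUT

    @property
    def is_online(self) -> bool:
        # Single place where the two conditions are combined.
        return self.online_at(datetime.now(timezone.utc))


def serialize_worker(worker: Worker) -> dict:
    # Hypothetical serializer: exposes is_online, not the raw `online` field.
    return {"name": worker.name, "is_online": worker.is_online}
```

With this shape, callers never compare timestamps themselves; they only read `worker.is_online`.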

Second, "online=True" should not be set anywhere except in the save_heartbeat() method on the worker model. If a worker is heartbeating, it is online. If a worker was offline and has been started, but has not yet sent a heartbeat, it shouldn't be considered online. There are currently a few places where "online=True" is set externally, but there is no need for that, and removing them would make the code less error-prone.
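As a rough illustration of that rule, here is a sketch in which save_heartbeat() is the only code path that sets the flag. The class is a plain-Python stand-in for the model, and the `last_heartbeat` attribute and optional `now` parameter are assumptions, not the actual signature.

```python
from datetime import datetime, timezone
from typing import Optional


class Worker:
    """Plain-Python stand-in for the Worker model (not the real ORM class)."""

    def __init__(self, name: str) -> None:
        self.name = name
        # A worker that was just started has not sent a heartbeat yet,
        # so it starts out as not online.
        self.online = False
        self.last_heartbeat: Optional[datetime] = None

    def save_heartbeat(self, now: Optional[datetime] = None) -> None:
        # The only place where online=True is set.
        self.last_heartbeat = now or datetime.now(timezone.utc)
        self.online = True
```

Startup code would then create or look up the worker record without touching `online`, and the first heartbeat would flip it.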
 

Lastly, an issue already exists to create a new method on the worker manager for determining offline workers [0]. This task would remove the logic from that method and instead filter by the 'is_online' property, which performs the same check.
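A sketch of the idea over in-memory objects, assuming a hypothetical `offline_workers()` helper and a `last_heartbeat` field. Note that a Python property is not visible to a database-level queryset filter, so a real manager method would either filter in Python like this or express the same two conditions as a query.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Iterable, List, Optional

HEARTBEAT_TIMEOUT = timedelta(seconds=30)  # timeout interval from this task


@dataclass
class Worker:
    # Stand-in for the Worker model; `last_heartbeat` is an assumed field name.
    name: str
    online: bool = False
    last_heartbeat: Optional[datetime] = None

    @property
    def is_online(self) -> bool:
        if not self.online or self.last_heartbeat is None:
            return False
        age = datetime.now(timezone.utc) - self.last_heartbeat
        return age < HEARTBEAT_TIMEOUT


def offline_workers(workers: Iterable[Worker]) -> List[Worker]:
    # Hypothetical manager helper: no timestamp logic of its own,
    # it relies entirely on the is_online property.
    return [w for w in workers if not w.is_online]
```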



 [0] https://pulp.plan.io/issues/2659
