Project

Profile

Help

Task #3121

Simplify tracking of worker 'online' state

Added by dalley about 3 years ago. Updated about 1 year ago.

Status:
CLOSED - CURRENTRELEASE
Priority:
Normal
Assignee:
Category:
-
Sprint/Milestone:
Start date:
Due date:
% Done:

100%

Estimated time:
Platform Release:
Groomed:
Yes
Sprint Candidate:
Yes
Tags:
Sprint:
Sprint 34
Quarter:

Description

The "online" state of worker is currently dependent on two factors. The 'online' field, present on the Worker model, and also whether the last recorded heartbeat is within the timeout interval (30 seconds) of the present time. These two conditions are set in multiple different places, and the online status of the worker based upon those two values is evaluated in multiple different places.

I propose a few different changes to DRY this up.

The canonical reference for whether a worker is online (based on the multiple different criteria) should be a property on the worker model named "online". The field currently named "online" should be renamed to "gracefully_stopped". The "online" property should then check both the current-ness of the timestamp as well as the "gracefully_stopped" field The worker serializer should be updated to include both of these. Currently there exist several parts of the code which set or reference the "online" field, and these should be removed.

The "gracefully_stopped" field should be True only when the worker is shutdown via normal procedures. During operation, and after a hard-kill (e.g. OOM), this value should be false. The default value of this field should be "True", and it should be updated to "False" whenever "save_heartbeat()" is called for that worker, so that the value will be set to "False" during normal operation. It should be set to True as part of the shutdown sequence for the worker.


Checklist


Related issues

Related to Pulp - Story #3143: As an authenticated user, I can filter workers.CLOSED - CURRENTRELEASE

<a title="Actions" class="icon-only icon-actions js-contextmenu" href="#">Actions</a>

Associated revisions

Revision f852da10 View on GitHub
Added by dalley almost 3 years ago

Simplify worker 'online' state tracking

closes #3121 https://pulp.plan.io/issues/3121

Revision f852da10 View on GitHub
Added by dalley almost 3 years ago

Simplify worker 'online' state tracking

closes #3121 https://pulp.plan.io/issues/3121

Revision 501cdd8a View on GitHub
Added by dalley almost 3 years ago

Additional cleanup and refactoring

re: #3121 https://pulp.plan.io/issues/3121

Revision 501cdd8a View on GitHub
Added by dalley almost 3 years ago

Additional cleanup and refactoring

re: #3121 https://pulp.plan.io/issues/3121

History

#1 Updated by dalley about 3 years ago

  • Tags Pulp 3 added

#2 Updated by dalley about 3 years ago

  • Description updated (diff)

#3 Updated by bmbouter about 3 years ago

This story is well written, and the 'is_online' property changes all make perfect sense.

In terms of keeping the 'online' boolean itself that is really just for the graceful shutdown case right? If we got rid of it entirely the only issue users would have is that when shutting down a worker gracefully is_online would show True for an additional 30 seconds right?

#4 Updated by dalley about 3 years ago

That's my understanding, yes

#5 Updated by dalley about 3 years ago

Actually, there is another scenario. When a worker first goes offline we probably want to log some message about it. If this happens in a non-gentle way, we need some way to distinguish the workers which are supposed to be online (but not heartbeating) apart from the others so that we can print a message once, and only once, since we are keeping the old worker records around instead of deleting them. This would be accomplished by setting online=False after logging the message.

#6 Updated by bmbouter about 3 years ago

Having the 'is_online' and 'online' will be confusing for developers. What about renaming 'online' to be 'gracefully_shutdown' which we could record to True during graceful shutdown and set to False in all other places? It would also be included in the viewset and would be user read-only. I like that better because it creates more value (knowledge) for the user and solves the user case.

What do you think about ^?

#7 Updated by dalley about 3 years ago

That sounds good to me

#8 Updated by dalley about 3 years ago

  • Description updated (diff)

#9 Updated by dkliban@redhat.com about 3 years ago

bmbouter wrote:

Having the 'is_online' and 'online' will be confusing for developers. What about renaming 'online' to be 'gracefully_shutdown' which we could record to True during graceful shutdown and set to False in all other places? It would also be included in the viewset and would be user read-only. I like that better because it creates more value (knowledge) for the user and solves the user case.

What do you think about ^?

I would prefer 'online' and 'gracefully_stopped'.

#10 Updated by bmbouter about 3 years ago

  • Description updated (diff)
  • Groomed changed from No to Yes
  • Sprint Candidate changed from No to Yes

I'm +1 on 'gracefully_stopped' also because it more clearly past tense and seems like simpler language.

I think this story is good to go, so I'm grooming.

#11 Updated by mhrivnak about 3 years ago

  • Sprint/Milestone set to 47

#12 Updated by dalley about 3 years ago

  • Status changed from NEW to ASSIGNED
  • Assignee set to dalley

#13 Updated by daviddavis about 3 years ago

  • Related to Story #3143: As an authenticated user, I can filter workers. added

#14 Updated by rchan about 3 years ago

  • Sprint/Milestone changed from 47 to 48

#15 Updated by daviddavis about 3 years ago

  • Tags Pulp 3 MVP added

#16 Updated by rchan about 3 years ago

  • Sprint/Milestone changed from 48 to 52

#17 Updated by dalley about 3 years ago

  • Status changed from ASSIGNED to NEW

I still plan to do this but it's been moved to the very end of my list of priorities by other work.

#18 Updated by dalley about 3 years ago

  • Assignee deleted (dalley)

#19 Updated by rchan about 3 years ago

  • Sprint/Milestone changed from 52 to 53

#20 Updated by dalley almost 3 years ago

  • Status changed from NEW to ASSIGNED

#21 Updated by dalley almost 3 years ago

  • Assignee set to dalley

#22 Updated by jortel@redhat.com almost 3 years ago

  • Sprint/Milestone changed from 53 to 54

#23 Updated by rchan almost 3 years ago

  • Sprint/Milestone changed from 54 to 56

#24 Updated by dalley almost 3 years ago

  • Assignee deleted (dalley)
  • Sprint/Milestone deleted (56)

#25 Updated by dalley almost 3 years ago

  • Sprint/Milestone set to 56

#26 Updated by dalley almost 3 years ago

  • Assignee set to dalley

#27 Updated by daviddavis almost 3 years ago

Added a checklist item to create a testcase in pulp-smash ;)

#28 Updated by bmbouter almost 3 years ago

  • Sprint set to Sprint 33

#29 Updated by bmbouter almost 3 years ago

  • Sprint/Milestone deleted (56)

#30 Updated by jortel@redhat.com almost 3 years ago

  • Sprint changed from Sprint 33 to Sprint 34

#31 Updated by dalley almost 3 years ago

  • Status changed from ASSIGNED to POST

#32 Updated by dalley almost 3 years ago

  • Status changed from POST to MODIFIED
  • % Done changed from 0 to 100

#33 Updated by dkliban@redhat.com almost 3 years ago

  • Sprint/Milestone set to 3.0.0

#34 Updated by bmbouter over 1 year ago

  • Tags deleted (Pulp 3, Pulp 3 MVP)

#35 Updated by bmbouter about 1 year ago

  • Status changed from MODIFIED to CLOSED - CURRENTRELEASE

Please register to edit this issue

Also available in: Atom PDF