Project

Profile

Help

Issue #2903

Process Recycling feature causes HTTP event notifier to be unreliable

Added by bmbouter over 2 years ago. Updated 9 months ago.

Status:
CLOSED - CURRENTRELEASE
Priority:
High
Assignee:
Category:
-
Sprint/Milestone:
-
Start date:
Due date:
Severity:
3. High
Version:
Platform Release:
2.14.0
Blocks Release:
OS:
Backwards Incompatible:
No
Triaged:
Yes
Groomed:
No
Sprint Candidate:
No
Tags:
Pulp 2
QA Contact:
Complexity:
Smash Test:
Verified:
Yes
Verification Required:
No
Sprint:
Sprint 22

Description

For some non-negligible probability, the HTTP event notifier will fail to send its notification when process recycling is enabled.

To reproduce:

1. Enable Process Recycling
2. Configure the event notifier
3. Ensure (1) and (2) are both working as expected
4. Dispatch lots of tasks
5. Observe that not all events are notified

History

#1 Updated by bmbouter over 2 years ago

  • Subject changed from HTTP event notifier leaks one thread with every message sent to Process Recycling feature causes HTTP event notifier to be unreliable
  • Description updated (diff)
  • Priority changed from Low to High
  • Severity changed from 2. Medium to 3. High

#2 Updated by bmbouter over 2 years ago

The root cause of this is the use of Daemon threads in the http event notifier. Here is the problematic workflow:

1. a pulp task starts running
2. an http notifier is requested with a call to event.http.handle_event
3. It spawns a thread "here:https://github.com/pulp/pulp/blob/a6f8f83514fb7255d540a61028a842860a8fb422/server/pulp/server/event/http.py#L33
4. The celery task completes process recycling occurs since it's enabled which kills the daemon thread since daemon threads won't hold a process open like non-daemon thread will.
5. The _send_post task may or may not complete because steps 2-4 occur faster than the other thread can reliably execute.
6. The message is never sent

#4 Updated by bmbouter over 2 years ago

To fix this I recommend we stop using threading at all here. I believe using threading here made sense when Pulp ran entirely in httpd in Pulp 2.3-. With the tasking system it makes no sense to try to send the notification asynchronously. We should combine all of the send logic into the handle_event() call and remove threading entirely from the http event notifier module.

#5 Updated by cduryee over 2 years ago

that sounds good to me as well. It's already done in the task anyway so there's no harm in not threading.

#6 Updated by ttereshc over 2 years ago

  • Sprint/Milestone set to 41
  • Triaged changed from No to Yes

#7 Updated by mhrivnak over 2 years ago

The reason threading was kept is so that an unresponsive web server could not stall Pulp tasks. Consider a web server that takes a long time to respond, or is in poor health and takes indefinitely. We need some way to make sure that does not cause Pulp tasks to grind to a halt.

Another way to do this, at least with the http notifier, is to make sure there is a reasonably short timeout on the request. I think that could be a reasonable alternative to the threading approach.

#8 Updated by pcreech over 2 years ago

  • Status changed from NEW to POST
  • Assignee set to pcreech

#9 Updated by pcreech over 2 years ago

  • Status changed from POST to MODIFIED

#10 Updated by pthomas@redhat.com over 2 years ago

  • Smash Test set to 714

#11 Updated by pcreech over 2 years ago

  • Platform Release set to 2.14.0

#12 Updated by pcreech over 2 years ago

  • Status changed from MODIFIED to ON_QA

#14 Updated by pcreech over 2 years ago

  • Status changed from ON_QA to CLOSED - CURRENTRELEASE

#15 Updated by kersom about 2 years ago

  • Verified changed from No to Yes

This issue was manually verified.

See: Smash 714

#16 Updated by bmbouter almost 2 years ago

  • Sprint set to Sprint 22

#17 Updated by bmbouter almost 2 years ago

  • Sprint/Milestone deleted (41)

#18 Updated by bmbouter 9 months ago

  • Tags Pulp 2 added

Please register to edit this issue

Also available in: Atom PDF