
Issue #4296

Stages API could deadlock when "discovering" content due to minsize

Added by bmbouter 10 months ago. Updated 6 months ago.

Status: MODIFIED
Priority: Normal
Assignee:
Category: -
Sprint/Milestone:
Start date:
Due date:
Severity: 2. Medium
Version:
Platform Release:
Blocks Release:
OS:
Backwards Incompatible: No
Triaged: Yes
Groomed: No
Sprint Candidate: No
Tags:
QA Contact:
Complexity:
Smash Test:
Verified: No
Verification Required: No
Sprint:

Description

This was originally described by @mdellweg in this comment. What he is describing is a pipeline where content is "discovered" as it goes, similar to what issue 4294 will support.

In general, there are situations where the pipeline holds content in a batch while that very content is needed, in turn, to produce the items that would fill the batch to minsize.

The Solution

We add a field `does_batch` to `DeclarativeContent` that defaults to `True`.
This value can be overridden by an argument to `__init__`, and it is cleared when a future is requested.
When `does_batch` is `False`, the batching mechanism does not wait for further content that is not immediately available, which prevents blocking and thereby deadlocking.
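The idea can be sketched as follows. This is a minimal illustration only: the real pulpcore-plugin `DeclarativeContent` carries more state, and the method name used here is an assumption.

```python
import asyncio


class DeclarativeContent:
    """Illustrative sketch of the proposed `does_batch` flag;
    not the actual pulpcore-plugin class."""

    def __init__(self, content, does_batch=True):
        self.content = content
        self.does_batch = does_batch
        self._future = None

    def get_or_create_future(self):
        # Requesting a future means another stage will wait on this unit,
        # so it must never be held back in a half-filled batch.
        self.does_batch = False
        if self._future is None:
            self._future = asyncio.get_event_loop().create_future()
        return self._future
```

A batching stage would then check `does_batch` on held units and emit the current batch immediately when any of them is flagged, instead of waiting for minsize.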


Related issues

Blocks Docker Support - Refactor #4173: Change the multilayered design to use Futures to handle nested content (MODIFIED)

Associated revisions

Revision 42f17e72
Added by mdellweg 8 months ago

Prevent content with a future from deadlocking

DeclarativeContent with an attached future is meant to result in more
DeclarativeContent from the first stage. Therefore, blocking such content
while waiting for batches to fill can result in deadlocks.

closes: #4296
https://pulp.plan.io/issues/4296

History

#1 Updated by bmbouter 10 months ago

  • Subject changed from Stages API could deadlock when "discovering" content to Stages API could deadlock when "discovering" content due to minsize
  • Description updated (diff)

#2 Updated by bmbouter 10 months ago

@mdellweg had an idea that the DeclarativeContent could be marked somehow for it to not be batched. This solves the issue by disabling batching without having to pass options to all of the stages.

Another option is to introduce a timer to Stage.batches. Overall I think that would be an important safety feature for avoiding deadlock. For example, we could add a max_timer=1000, where max_timer is the maximum number of milliseconds to hold > 0 items before returning the batch even if minsize is not met. Maybe plugin writers would also need to override this option?
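The timer idea can be sketched as a batching loop that flushes a non-empty batch once the timer expires. The `max_timer` parameter and the shape of the loop are hypothetical here, not an existing `Stage.batches` option; `None` is assumed to mark the end of the stream, as in the Stages API.

```python
import asyncio


async def batches(in_q, minsize=4, max_timer=1000):
    """Illustrative batching loop (not the real Stage.batches): collect
    items until `minsize` is reached, but never hold a non-empty batch
    longer than `max_timer` milliseconds; `None` ends the stream."""
    batch = []
    shutdown = False
    while not shutdown:
        try:
            item = await asyncio.wait_for(in_q.get(), timeout=max_timer / 1000)
        except asyncio.TimeoutError:
            if batch:  # timer expired with items held: flush early
                yield batch
                batch = []
            continue
        if item is None:
            shutdown = True
        else:
            batch.append(item)
        if len(batch) >= minsize or (shutdown and batch):
            yield batch
            batch = []
```

With this shape, a deadlock turns into at most a `max_timer` delay, at the cost of occasionally emitting undersized batches.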

#3 Updated by mdellweg 9 months ago

In my last version of a proposed solution, I used the tagged DeclarativeContent objects to stop waiting for the current batch to fill. So they would not circumvent batching completely, just cut the batches short.

And there is another solution i can think of:
We could introduce, analogous to `None` as the pipeline-end marker, a `flush` marker that thaws all pending batches along the way but does not finish the loop. This would, however, require that the DCs are not reordered from stage to stage.
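The flush-marker idea can be sketched like this. `FLUSH` and the loop below are purely illustrative; in a real stage the marker would also have to be forwarded downstream so every later stage thaws its batch as well.

```python
import asyncio

FLUSH = object()  # hypothetical sentinel, analogous to the `None` end marker


async def batches_with_flush(in_q, minsize=4):
    """Sketch of the proposed `flush` marker: it thaws the pending batch
    at this stage but does not end the loop; `None` still ends it."""
    batch = []
    while True:
        item = await in_q.get()
        if item is None:  # end-of-pipeline marker
            if batch:
                yield batch
            return
        if item is FLUSH:  # flush marker: emit the held batch early
            if batch:
                yield batch
                batch = []
            continue  # a real stage would also put FLUSH on its out queue
        batch.append(item)
        if len(batch) >= minsize:
            yield batch
            batch = []
```

Note how this sketch also makes the ordering requirement visible: if a DC were reordered past the `FLUSH` marker, it could again be held in a batch the marker was meant to thaw.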

#4 Updated by bmbouter 9 months ago

mdellweg wrote:

In my last version of a proposed solution, I used the tagged DeclarativeContent objects to stop waiting for the current batch to fill. So they would not circumvent batching completely, just cut the batches short.

I imagined tagging all DeclarativeContent, but you're saying you would tag only some of them. Can you describe that in more detail? How do you know which DeclarativeContent items could cause deadlock?

And there is another solution i can think of:
We could introduce, analogous to `None` as the pipeline-end marker, a `flush` marker that thaws all pending batches along the way but does not finish the loop. This would, however, require that the DCs are not reordered from stage to stage.

We have some stages that do reordering now, so the Stages API already assumes reordering can happen.

#5 Updated by CodeHeeler 9 months ago

  • Triaged changed from No to Yes

#6 Updated by mdellweg 9 months ago

bmbouter wrote:

I imagined tagging all DeclarativeContent, but you're saying you would tag only some of them. Can you describe that in more detail? How do you know which DeclarativeContent items could cause deadlock?

I think it is as simple as: "Can this content unit give rise to more content units? Then make it prioritized."
In the case of Debian repositories, the discovery chain has three levels: Release -> PackageIndex -> Package. So I give the priority flag to all but the last level. Since the overwhelming majority of content units are leaves in this picture, this is not a big performance concern.

If you only have one content type and never know in advance whether a unit can produce more, you are out of luck here. Then I could think of a kind of barrier (flush marker) that gets inserted whenever the first stage has nothing more to add, and you must make sure no content unit gets overtaken by that barrier (it might help performance a little if the barrier were semipermeable).
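For the three-level Debian case, the tagging rule can be sketched like this. The `does_batch` flag, the helper, and all names are illustrative stand-ins, not the real plugin API.

```python
class DeclarativeContent:
    """Minimal stand-in for the real class: only the batching flag."""

    def __init__(self, content, does_batch=True):
        self.content = content
        self.does_batch = does_batch


def batch_ready(batch, minsize):
    # A batch may be emitted early when it holds any unit that can spawn
    # further content (does_batch=False): waiting on it risks deadlock.
    return len(batch) >= minsize or any(not dc.does_batch for dc in batch)


# Non-leaf levels are flagged; the many leaf Packages batch normally.
release = DeclarativeContent("Release", does_batch=False)
index = DeclarativeContent("PackageIndex", does_batch=False)
packages = [DeclarativeContent(f"pkg-{i}") for i in range(3)]
```

Because only the two non-leaf levels are flagged and almost all units are leaf Packages, the early-emit path stays rare, which is why the performance impact is small.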

#7 Updated by amacdona@redhat.com 9 months ago

  • Blocks Refactor #4173: Change the multilayered design to use Futures to handle nested content added

#8 Updated by mdellweg 9 months ago

Before changing the flow control of the pipeline, I would like to change the plugin API in a way that abstracts the interface to the queues away from plugin writers.

RFC: https://github.com/mdellweg/pulpcore-plugin/tree/refactor_stages

#9 Updated by bherring 9 months ago

#10 Updated by bherring 9 months ago

#11 Updated by bmbouter 8 months ago

  • Status changed from NEW to POST
  • Assignee set to mdellweg

#12 Updated by mdellweg 8 months ago

  • Description updated (diff)

#13 Updated by mdellweg 8 months ago

  • Status changed from POST to MODIFIED

#14 Updated by daviddavis 6 months ago

  • Sprint/Milestone set to 3.0

#15 Updated by bmbouter 6 months ago

  • Tags deleted (Pulp 3)
