Story #4016: As a plugin writer I have the "batches" interface for the Stages API - Pulp

Actions

Send by e-mail Copy link

Story #4016

closed

As a plugin writer I have the "batches" interface for the Stages API

Added by bmbouter over 6 years ago. Updated about 5 years ago.

Status:

CLOSED - CURRENTRELEASE

Priority:

Normal

Assignee:

gmbnomis

Category:

Sprint/Milestone:

3.0.0

Start date:

Due date:

% Done:

100%

Estimated time:

Platform Release:

Groomed:

Sprint Candidate:

Tags:

Sprint:

Quarter:

Description

This story should:

1. Add the "batches() interface" to the plugin API
2. Refactor the existing Stages API stages to use it

As quoted from the email link below:

https://www.redhat.com/archives/pulp-dev/2018-September/msg00013.html

The trigger for the discussion was to get rid of boilerplate code like
this [0], [1] to handle batches in the stages API. This becomes a
single line [2] when using an asynchronous generator [3]. Adding the
`batches()` async generator to Pulp core would simplify existing
stages and ease implementation of stages in plugins.

[0] https://github.com/pulp/pulp/blob/631031e38270c5c7c2b2289ff4ab87a058447c5e/plugin/pulpcore/plugin/stages/content_unit_stages.py#L47-L59
[1] https://github.com/pulp/pulp/blob/631031e38270c5c7c2b2289ff4ab87a058447c5e/plugin/pulpcore/plugin/stages/artifact_stages.py#L48-L60
[2] https://github.com/gmbnomis/pulp_cookbook/blob/ca4882cecab16995c5713d27131da8112a5f5a0c/pulp_cookbook/app/tasks/synchronizing.py#L98
[3] https://github.com/gmbnomis/pulp_cookbook/blob/d44ed593925b78c046e1b568810b15acbdad5ac4/pulp_cookbook/app/tasks/synchronizing.py#L26

Actions

Copy link

Updated by bmbouter over 6 years ago

Subject changed from As a plugin writer I have the "batcher interface" fr to As a plugin writer I have the "batches" interface for the Stages API

Actions

Copy link

Updated by gmbnomis over 6 years ago

I could not find a way to assign this to myself(?)

I can work on this topic.

One question on ContentUnitSaver._pre_save and ContentUnitSaver._post_save: The documentation mentions that "... one item in batch is None indicating it is the end of the pipeline". Is this just a side-effect of the current implementation or is this "None" needed (for finalization)?

Actions

Copy link

Updated by bmbouter over 6 years ago

Status changed from NEW to ASSIGNED
Assignee set to gmbnomis

I updated your user permission so you can do a lot more in Redmine now (including take issues as assigned).

It's a side-effect of the Stages API finalizing the pipeline with None. As batches are passed into the hooks, this None value is not filtered out. It doesn't benefit the hook in my usage in pulp_rpm [0], but if None is filtered out there isn't a way for a batch to know it's the last one. I have no use case for that need though.

I'm open to changes here. How could this be better?

[0]: https://github.com/pulp/pulp_rpm/blob/master/pulp_rpm/app/tasks/synchronizing.py#L275-L314

Actions

Copy link

Updated by gmbnomis over 6 years ago

bmbouter wrote:

I updated your user permission so you can do a lot more in Redmine now (including take issues as assigned).

Thanks a lot!

It's a side-effect of the Stages API finalizing the pipeline with None. As batches are passed into the hooks, this None value is not filtered out. It doesn't benefit the hook in my usage in pulp_rpm [0], but if None is filtered out there isn't a way for a batch to know it's the last one. I have no use case for that need though.

I'm open to changes here. How could this be better?

Currently, the None is not necessarily part of the last batch with declarative content. It may also happen that the None is the only element of the last batch. A subclass overriding these functions which needs some sort of "finalizing" must cope with this situation as well.

If we ever need such a functionality, we could mimic this and implement a _finalize() hook that is called when all batches have been processed. Currently, I would like to just drop the "None" part from the documentation of _pre_save and _post_save.

Actions

Copy link

Updated by bmbouter over 6 years ago

gmbnomis wrote:

Currently, the None is not necessarily part of the last batch with declarative content. It may also happen that the None is the only element of the last batch. A subclass overriding these functions which needs some sort of "finalizing" must cope with this situation as well.

Agreed

If we ever need such a functionality, we could mimic this and implement a _finalize() hook that is called when all batches have been processed. Currently, I would like to just drop the "None" part from the documentation of _pre_save and _post_save.

Agreed and +1 that sounds fine.

Actions

Copy link