https://pulp.plan.io/https://pulp.plan.io/favicon.ico2014-12-18T21:20:44ZPulpRPM Support - Story #20: As a user, my applicability data is calculated in parallelhttps://pulp.plan.io/issues/20?journal_id=822014-12-18T21:20:44Zrbarlow
<ul><li><strong>Project</strong> changed from <i>22</i> to <i>RPM Support</i></li></ul> RPM Support - Story #20: As a user, my applicability data is calculated in parallelhttps://pulp.plan.io/issues/20?journal_id=46732015-06-04T14:58:20Zrbarlow
<ul><li><strong>Groomed</strong> set to <i>No</i></li><li><strong>Sprint Candidate</strong> set to <i>Yes</i></li></ul> RPM Support - Story #20: As a user, my applicability data is calculated in parallelhttps://pulp.plan.io/issues/20?journal_id=46742015-06-04T15:00:20Zrbarlow
<ul></ul><p>It might be worth thinking about whether we can make a patch that will apply cleanly against 2.4 since there are users who are having problems with DB cursor timeouts. Patching against 2.6 might also be fine if we are comfortable requiring users to upgrade to a newer Pulp to fix this.</p> RPM Support - Story #20: As a user, my applicability data is calculated in parallelhttps://pulp.plan.io/issues/20?journal_id=46852015-06-04T18:05:31Zrbarlow
<ul></ul><p>On 06/04/2015 11:00 AM, Pulp wrote:</p>
<blockquote>
<p>It might be worth thinking about whether we can make a patch that will<br>
apply cleanly against 2.4 since there are users who are having problems<br>
with DB cursor timeouts. Patching against 2.6 might also be fine if we<br>
are comfortable requiring users to upgrade to a newer Pulp to fix this.</p>
</blockquote>
<p>On second thought, this might have to be done with "spawned tasks" which<br>
would change the API to the task. One way to keep this change<br>
backwards-compatible would be to add an optional boolean to the API<br>
call that lets the user state whether they want to do the calculation in<br>
parallel or not, and if the bool isn't provided we default to the<br>
current behavior. Then, with Pulp 3.0 we can just change to always doing<br>
it in parallel and drop the boolean.</p>
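<p>A minimal sketch of the opt-in flag idea, not Pulp's actual API: the function name, the <code>parallel</code> parameter, and the <code>calculate</code> callable are all hypothetical, and a thread pool stands in for the "spawned tasks" mentioned above. Callers that omit the flag keep the current serial behavior.</p>

```python
from concurrent.futures import ThreadPoolExecutor


def regenerate_applicability(profiles, calculate, parallel=False):
    """Run `calculate` on every profile.

    `parallel` defaults to False so existing callers are unaffected.
    All names here are illustrative assumptions, not Pulp identifiers.
    """
    if not parallel:
        # Current behavior: one profile after another.
        return [calculate(profile) for profile in profiles]
    # Opt-in behavior: a thread pool stands in for spawned tasks.
    with ThreadPoolExecutor() as pool:
        # pool.map yields results in input order.
        return list(pool.map(calculate, profiles))
```

<p>Once the parallel path becomes the only one, the flag and the serial branch could simply be removed.</p>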
<p>--<br>
Randy Barlow</p> RPM Support - Story #20: As a user, my applicability data is calculated in parallelhttps://pulp.plan.io/issues/20?journal_id=47362015-06-08T12:16:20Zmhrivnakmhrivnak@redhat.com
<ul><li><strong>Priority</strong> changed from <i>Normal</i> to <i>High</i></li></ul> RPM Support - Story #20: As a user, my applicability data is calculated in parallelhttps://pulp.plan.io/issues/20?journal_id=47392015-06-08T12:38:50Zdkliban@redhat.com
<ul></ul><p>Here is a possible implementation:</p>
<p>Define TaskMonitorTask as a regular Celery task that takes two parameters: 'parent_task_id' and 'tasks'. 'tasks' is a list of task IDs for the tasks that need to be monitored. The task checks the status of every task in the list and then updates the status of the parent task. If not all of the tasks are in a final state, the task dispatches itself again with the list of remaining tasks and the same parent task ID. Each dispatch is delayed by 5 minutes or another configurable value.</p>
<p>Define RepoProfileApplicabilityCalculation as a Celery task that takes an existing repo profile applicability and performs the work shown here [0].</p>
<p>Create a new RepoApplicabilityCalculationTask as a Pulp Task that dispatches one RepoProfileApplicabilityCalculation task for each repo applicability profile that needs to be updated. It then dispatches TaskMonitorTask, passing it the list of RepoProfileApplicabilityCalculation tasks that were dispatched and its own ID (that of the RepoApplicabilityCalculationTask).</p>
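<p>One pass of the proposed monitor could look roughly like the sketch below. It is an illustration under stated assumptions, not the implementation: the state names, the in-memory <code>statuses</code> dict, and the <code>dispatch</code> callable are hypothetical stand-ins for Pulp's task status storage and for a delayed Celery dispatch. Error aggregation for the parent is left out.</p>

```python
# Assumed set of final state names; the real values may differ.
FINAL_STATES = {"finished", "error", "canceled"}


def monitor_tasks(parent_task_id, task_ids, statuses, dispatch,
                  delay_seconds=300):
    """Check the child tasks once and update the parent's status.

    statuses: dict mapping task id -> state (hypothetical storage).
    dispatch: callable that re-queues this monitor after a delay
              (hypothetical; a Celery task would re-dispatch itself).
    Returns the list of task ids that are not yet in a final state.
    """
    remaining = [t for t in task_ids if statuses.get(t) not in FINAL_STATES]
    if remaining:
        # Some children are still working: keep the parent open and
        # look again later, passing only the unfinished task ids.
        statuses[parent_task_id] = "running"
        dispatch(parent_task_id, remaining, delay_seconds)
    else:
        # Every monitored child reached a final state.
        statuses[parent_task_id] = "finished"
    return remaining
```

<p>Because each pass forwards only the unfinished IDs, the list shrinks over time and the monitor stops re-dispatching itself once it is empty.</p>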
<p>[0] <a href="https://github.com/pulp/pulp/blob/2.6-dev/server/pulp/server/managers/consumer/applicability.py#L141:L158" class="external">https://github.com/pulp/pulp/blob/2.6-dev/server/pulp/server/managers/consumer/applicability.py#L141:L158</a></p> RPM Support - Story #20: As a user, my applicability data is calculated in parallelhttps://pulp.plan.io/issues/20?journal_id=47432015-06-08T13:24:39Zmhrivnakmhrivnak@redhat.com
<ul><li><strong>Groomed</strong> changed from <i>No</i> to <i>Yes</i></li></ul><p>Please review the final plan with the team before implementing.</p>
<p>This diff must apply cleanly on 2.6, but may have to be released with 2.7.</p> RPM Support - Story #20: As a user, my applicability data is calculated in parallelhttps://pulp.plan.io/issues/20?journal_id=47732015-06-08T21:07:44Zbmbouterbmbouter@redhat.com
<ul></ul><p>@dkliban When you say RepoApplicabilityCalculationTask is a Pulp task, do you mean it inherits from <a href="https://github.com/pulp/pulp/blob/432d068e3f5ffb8884c8d96bdd55b90d2ca59cd2/server/pulp/server/async/tasks.py#L327" class="external">Pulp's base Task</a>? If so, it will be auto-marked as completed as soon as it finishes, because of the <a href="https://github.com/pulp/pulp/blob/432d068e3f5ffb8884c8d96bdd55b90d2ca59cd2/server/pulp/server/async/tasks.py#L395" class="external">on_success</a> or <a href="https://github.com/pulp/pulp/blob/432d068e3f5ffb8884c8d96bdd55b90d2ca59cd2/server/pulp/server/async/tasks.py#L439" class="external">on_failure</a> handlers that the base class provides.</p>
<p>That would need to be somehow disabled and the final call to TaskMonitorTask would need to set it specifically. What can we do in that area?</p>
<p>Also one other important point to consider is using apply_async versus apply_async_with_reservation. Does the task RepoApplicabilityCalculationTask need a reservation to ensure a repo operation doesn't happen underneath it? What do you think?</p> RPM Support - Story #20: As a user, my applicability data is calculated in parallelhttps://pulp.plan.io/issues/20?journal_id=56232015-07-21T18:41:20Zdkliban@redhat.com
<ul><li><strong>Status</strong> changed from <i>NEW</i> to <i>ASSIGNED</i></li><li><strong>Assignee</strong> set to <i>dkliban@redhat.com</i></li></ul><p>I looked into using Celery chords to do this work, but discovered that they rely on the results backend [0]. Since we are trying to move away from depending on the results backend, Brian and I have come up with a plan to introduce an implementation of ParallelTasks that uses the TaskStatus in the database. I'll update this story once I have the plan fully written out.</p>
<p>[0] <a href="http://blog.untrod.com/2015/03/how-celery-chord-synchronization-works.html" class="external">http://blog.untrod.com/2015/03/how-celery-chord-synchronization-works.html</a></p> RPM Support - Story #20: As a user, my applicability data is calculated in parallelhttps://pulp.plan.io/issues/20?journal_id=56272015-07-21T21:57:11Zrbarlow
<ul></ul><p>On 07/21/2015 02:41 PM, Pulp wrote:</p>
<blockquote>
<p>Since we are trying to move away from depending on the results backend</p>
</blockquote>
<p>IMO, it's OK to use the broker as a results backend for this purpose.<br>
Have you considered that since it may be easier?</p> RPM Support - Story #20: As a user, my applicability data is calculated in parallelhttps://pulp.plan.io/issues/20?journal_id=58722015-08-21T19:55:56Zdkliban@redhat.com
<ul><li><strong>Blocked by</strong> <i><a class="issue tracker-3 status-9 priority-6 priority-default closed" href="/issues/1205">Story #1205</a>: As a developer I can dispatch a task that can dispatch a group of tasks</i> added</li></ul> RPM Support - Story #20: As a user, my applicability data is calculated in parallelhttps://pulp.plan.io/issues/20?journal_id=58732015-08-21T19:56:15Zdkliban@redhat.com
<ul><li><strong>Blocks</strong> <i><a class="issue tracker-3 status-11 priority-6 priority-default closed" href="/issues/1206">Story #1206</a>: As an API user, I can get summary status for a task group</i> added</li></ul> RPM Support - Story #20: As a user, my applicability data is calculated in parallelhttps://pulp.plan.io/issues/20?journal_id=58752015-08-21T19:56:53Zdkliban@redhat.com
<ul><li><strong>Blocks</strong> deleted (<i><a class="issue tracker-3 status-11 priority-6 priority-default closed" href="/issues/1206">Story #1206</a>: As an API user, I can get summary status for a task group</i>)</li></ul> RPM Support - Story #20: As a user, my applicability data is calculated in parallelhttps://pulp.plan.io/issues/20?journal_id=58782015-08-21T19:57:06Zdkliban@redhat.com
<ul><li><strong>Blocked by</strong> <i><a class="issue tracker-3 status-11 priority-6 priority-default closed" href="/issues/1206">Story #1206</a>: As an API user, I can get summary status for a task group</i> added</li></ul> RPM Support - Story #20: As a user, my applicability data is calculated in parallelhttps://pulp.plan.io/issues/20?journal_id=61122015-09-14T12:44:21Zdkliban@redhat.com
<ul><li><strong>Status</strong> changed from <i>ASSIGNED</i> to <i>NEW</i></li></ul> RPM Support - Story #20: As a user, my applicability data is calculated in parallelhttps://pulp.plan.io/issues/20?journal_id=64062015-10-01T17:48:17Zmhrivnakmhrivnak@redhat.com
<ul><li><strong>Assignee</strong> deleted (<del><i>dkliban@redhat.com</i></del>)</li><li><strong>Platform Release</strong> set to <i>2.8.0</i></li></ul> RPM Support - Story #20: As a user, my applicability data is calculated in parallelhttps://pulp.plan.io/issues/20?journal_id=72372015-12-02T20:05:17Zmhrivnakmhrivnak@redhat.com
<ul><li><strong>Status</strong> changed from <i>NEW</i> to <i>ASSIGNED</i></li><li><strong>Assignee</strong> set to <i>dkliban@redhat.com</i></li></ul> RPM Support - Story #20: As a user, my applicability data is calculated in parallelhttps://pulp.plan.io/issues/20?journal_id=74522015-12-16T18:48:41Zdkliban@redhat.com
<ul><li><strong>Status</strong> changed from <i>ASSIGNED</i> to <i>POST</i></li></ul><p><a href="https://github.com/pulp/pulp/pull/2235" class="external">https://github.com/pulp/pulp/pull/2235</a></p> RPM Support - Story #20: As a user, my applicability data is calculated in parallelhttps://pulp.plan.io/issues/20?journal_id=74582015-12-16T19:51:23Zdkliban@redhat.com
<ul><li><strong>Status</strong> changed from <i>POST</i> to <i>MODIFIED</i></li><li><strong>% Done</strong> changed from <i>0</i> to <i>100</i></li></ul><p>Applied in changeset <a class="changeset" title="Parallelizes applicability regeneration for updated repository This patch provides a new Celery ..." href="https://pulp.plan.io/projects/pulp/repository/pulp/revisions/0ecc2dfdb9a2d5e1af2ed39c71ba387b2a2565b4">pulp:pulp|0ecc2dfdb9a2d5e1af2ed39c71ba387b2a2565b4</a>.</p> RPM Support - Story #20: As a user, my applicability data is calculated in parallelhttps://pulp.plan.io/issues/20?journal_id=75522015-12-22T17:25:56Zdkliban@redhat.com
<ul><li><strong>Blocked by</strong> deleted (<i><a class="issue tracker-3 status-9 priority-6 priority-default closed" href="/issues/1205">Story #1205</a>: As a developer I can dispatch a task that can dispatch a group of tasks</i>)</li></ul> RPM Support - Story #20: As a user, my applicability data is calculated in parallelhttps://pulp.plan.io/issues/20?journal_id=88782016-02-11T21:25:50Zrbarlow
<ul><li><strong>Status</strong> changed from <i>MODIFIED</i> to <i>5</i></li></ul> RPM Support - Story #20: As a user, my applicability data is calculated in parallelhttps://pulp.plan.io/issues/20?journal_id=101192016-03-23T18:47:44Zdkliban@redhat.com
<ul><li><strong>Status</strong> changed from <i>5</i> to <i>CLOSED - CURRENTRELEASE</i></li></ul> RPM Support - Story #20: As a user, my applicability data is calculated in parallelhttps://pulp.plan.io/issues/20?journal_id=407922019-04-15T21:22:45Zbmbouterbmbouter@redhat.com
<ul><li><strong>Tags</strong> <i>Pulp 2</i> added</li></ul>