Project

Profile

Help

Issue #8637

Issue #8912: [EPIC] Issues with the traditional tasking system

Possible race condition where task's reservations get deleted

Added by daviddavis 5 months ago. Updated 3 months ago.

Status:
CLOSED - CURRENTRELEASE
Priority:
High
Assignee:
Category:
-
Sprint/Milestone:
Start date:
Due date:
Estimated time:
Severity:
3. High
Version:
Platform Release:
OS:
Triaged:
Yes
Groomed:
No
Sprint Candidate:
No
Tags:
Sprint:
Sprint 99
Quarter:

Description

This was something I noticed while working on #8603. I haven't actually reproduced it so it's purely theoretical.

The TaskReservedResource has a cascade delete on its ReservedResource FK, which could cause a race condition:

  1. Task 1 confirms that the reserved resource has no other tasks
  2. Task 2 reserves the resource
  3. Task 1 calls delete which cascades and deletes Task 2's reservations of the resource

Related issues

Related to Pulp - Issue #8603: possible tasking race condition: update or delete on table "core_reservedresource" violates foreign key constraint "core_taskreservedres_resource_id_ee0b7c62_fk_core_rese" on table "core_taskreservedresource" CLOSED - CURRENTRELEASE<a title="Actions" class="icon-only icon-actions js-contextmenu" href="#">Actions</a>
Related to Pulp - Issue #8568: Overall task status remains in 'running' state while all the reports are in the final stateCLOSED - WONTFIX<a title="Actions" class="icon-only icon-actions js-contextmenu" href="#">Actions</a>

Associated revisions

Revision cf2cde7e View on GitHub
Added by daviddavis 3 months ago

Fix race condition during reserved resource cleanup

fixes #8637

History

#1 Updated by daviddavis 5 months ago

  • Related to Issue #8603: possible tasking race condition: update or delete on table "core_reservedresource" violates foreign key constraint "core_taskreservedres_resource_id_ee0b7c62_fk_core_rese" on table "core_taskreservedresource" added

#2 Updated by daviddavis 5 months ago

  • Description updated (diff)

#3 Updated by dalley 5 months ago

  • Related to Issue #8568: Overall task status remains in 'running' state while all the reports are in the final state added

#4 Updated by fao89 5 months ago

  • Triaged changed from No to Yes
  • Sprint set to Sprint 95

#5 Updated by rchan 5 months ago

  • Sprint changed from Sprint 95 to Sprint 96

#6 Updated by daviddavis 5 months ago

  • Category set to 29

#7 Updated by rchan 4 months ago

  • Sprint changed from Sprint 96 to Sprint 97

#8 Updated by dalley 4 months ago

  • Priority changed from Normal to High

#9 Updated by dalley 4 months ago

  • Severity changed from 2. Medium to 3. High

#10 Updated by rchan 4 months ago

  • Sprint changed from Sprint 97 to Sprint 98

#11 Updated by mdellweg 3 months ago

Is the cascaded delete needed or even useful here? Isn't the cleanup process, that the association (TaskReservedResource) is deleted first and the then the ReservedResource is cleaned up if no longer needed? If we change the cascade to protect here, it should be fine, right? And if another task is assigned to the Resource in the meantime, we get a ConsistencyException we can simply ignore.

#12 Updated by daviddavis 3 months ago

Yup, that's my thinking as well.

#13 Updated by bmbouter 3 months ago

I want to confirm the thinking here. When you say "the cleanup process, that the association (TaskReservedResource) is deleted first and the then the ReservedResource is cleaned up " you mean here right?.

If that's the case then I agree the cascade delete isn't needed. So say we got rid of the cascade delete ... couldn't this race condition still happen just between these two lines?.

#14 Updated by daviddavis 3 months ago

bmbouter wrote:

I want to confirm the thinking here. When you say "the cleanup process, that the association (TaskReservedResource) is deleted first and the then the ReservedResource is cleaned up " you mean here right?.

If that's the case then I agree the cascade delete isn't needed. So say we got rid of the cascade delete ... couldn't this race condition still happen just between these two lines?.

If you go back to my example, then when task 1 goes to clean up the ReservedResource, then a cascade protect from task 2 will prevent the ReservedResource from being deleted. This would avoid the race condition.

#15 Updated by rchan 3 months ago

  • Sprint changed from Sprint 98 to Sprint 99

#16 Updated by bmbouter 3 months ago

Ah this does make sense @david. +1 to this plan, let's change this line to PROTECT.

#17 Updated by daviddavis 3 months ago

  • Status changed from NEW to ASSIGNED
  • Assignee set to daviddavis

#18 Updated by pulpbot 3 months ago

  • Status changed from ASSIGNED to POST

#19 Updated by bmbouter 3 months ago

  • Parent task set to #8912

#20 Updated by daviddavis 3 months ago

  • Status changed from POST to MODIFIED

#21 Updated by pulpbot 3 months ago

  • Sprint/Milestone set to 3.14.0

#22 Updated by pulpbot 3 months ago

  • Status changed from MODIFIED to CLOSED - CURRENTRELEASE

Please register to edit this issue

Also available in: Atom PDF