Task #7778

Ensure a migration can be interrupted within reasonable time at any stage

Added by ttereshc 12 months ago. Updated 7 months ago.

Start date:
Due date:
% Done:


Estimated time:
Platform Release:
Sprint Candidate:
Sprint 92



Users who have Pulp 2 and Pulp 3 on the same machine need a way to control when a migration is running. The migration plugin inevitably puts a certain load on the system and uses resources. Users might want to choose windows in which to run a migration, e.g. every Saturday for 4 hours at most.

If the system is large, migrating it can take hours, so users need a way to cancel the task and free all the related resources within a reasonable timeframe.

Potential issues

When there is a very slow db query and the task which triggered that query is cancelled, how quickly will the db query be aborted as well?


#1 Updated by ttereshc 11 months ago

  • Priority changed from Normal to High
  • Tags Katello added

#2 Updated by dalley 9 months ago

  • Sprint set to Sprint 88

#3 Updated by ttereshc 9 months ago

  • Sprint/Milestone set to 0.9.0

#4 Updated by rchan 9 months ago

  • Sprint changed from Sprint 88 to Sprint 89

#5 Updated by rchan 8 months ago

  • Sprint changed from Sprint 89 to Sprint 90

#7 Updated by rchan 8 months ago

  • Sprint changed from Sprint 90 to Sprint 91

#8 Updated by rchan 7 months ago

  • Sprint changed from Sprint 91 to Sprint 92

#9 Updated by ttereshc 7 months ago

  • Status changed from NEW to ASSIGNED
  • Assignee set to dalley

#10 Updated by dalley 7 months ago

  • Status changed from ASSIGNED to CLOSED - COMPLETE

 <knbk> dalley: I guess it largely depends on when the DB notices the connection is lost. If it's streaming the response for a server-side cursor it should get a write error reasonably quickly and probably kill the query. If there's a big upfront cost where it's not interacting with the connection, that's more likely to continue for a while
 <knbk> and of course it depends on the exact implementation details of the db

We use .iterator() in most places, and the places where we don't tend to be very small queries. I'm still investigating mongoengine but I expect it's roughly the same story.

#11 Updated by dalley 7 months ago

For MongoDB:

<d4rkp1r4t3> dalley, if the client dies mid-query, similar to a socket timeout, the cursor would remain open in mongodb until it tries to return the resultset back to the client. you can verify this by checking db.currentOp() when this happens. for time sensitive queries use the maxTimeMS setting in the client instead of the socket timeout setting to ensure the cursor is killed after a specific time has passed
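As a rough illustration of the maxTimeMS suggestion, here is a minimal sketch using pymongo-style cursor calls (`find()`, `max_time_ms()`, `batch_size()`). The helper name `find_with_time_limit` and the default values are hypothetical and not taken from the migration plugin, and it assumes `collection` behaves like a pymongo `Collection`.

```python
from typing import Any, Iterator, Mapping


def find_with_time_limit(
    collection: Any,
    query: Mapping[str, Any],
    max_time_ms: int = 5000,   # hypothetical default, tune per query
    batch_size: int = 1000,    # hypothetical default, tune per query
) -> Iterator[Mapping[str, Any]]:
    """Yield documents matching `query` with a server-side time limit.

    Assumes `collection` is a pymongo-style collection. With pymongo,
    exceeding the limit raises pymongo.errors.ExecutionTimeout, which
    the caller is expected to handle.
    """
    cursor = (
        collection.find(query)
        .max_time_ms(max_time_ms)  # ask the server to abort slow operations
        .batch_size(batch_size)    # keep each round trip reasonably small
    )
    for doc in cursor:
        yield doc
```

The point of the sketch is that the limit is enforced by the server, so it still applies if the client process is killed mid-query.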

The general guideline here seems to be: just make sure that we're using reasonable batch sizes that don't take too long to calculate, for both PostgreSQL and MongoDB.
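To make the batch-size guideline concrete, here is a minimal sketch of a loop that processes items in bounded batches and checks a cancellation flag between them. It is not the plugin's actual cancellation mechanism. The names `batched`, `migrate_in_batches`, `handle_batch` and the `threading.Event` flag are all hypothetical stand-ins.

```python
import threading
from itertools import islice
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batched(items: Iterable[T], size: int) -> Iterator[List[T]]:
    """Split any iterable into lists of at most `size` items."""
    it = iter(items)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def migrate_in_batches(
    items: Iterable[T],
    handle_batch: Callable[[List[T]], None],
    cancel: threading.Event,
    batch_size: int = 1000,  # hypothetical default
) -> int:
    """Process items batch by batch and return how many were handled.

    The cancel flag is only checked between batches, so the time to
    react to a cancellation is bounded by the cost of one batch.
    """
    done = 0
    for batch in batched(items, batch_size):
        if cancel.is_set():
            break
        handle_batch(batch)
        done += len(batch)
    return done
```

With this shape, a smaller `batch_size` gives a faster reaction to cancellation at the cost of more round trips.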
