Task #157

closed

Update the failure and recovery scenarios in our user docs.

Added by bmbouter about 9 years ago. Updated about 5 years ago.

Status:
CLOSED - CURRENTRELEASE
Priority:
Normal
Category:
-
Sprint/Milestone:
-
Start date:
Due date:
% Done:

100%

Estimated time:
Platform Release:
Groomed:
Yes
Sprint Candidate:
Yes
Tags:
Documentation, Pulp 2
Sprint:
March 2015
Quarter:

Description

The failure and recovery section of the user docs is somewhat vague. The notes below were written recently and should be added to that section or otherwise incorporated into it.

======== NOTES ==========

For a recap of the Pulp components, see [0].

If a pulp_worker dies, the task it was working on, and possibly a small number of related tasks, will not be processed. These tasks will be marked as cancelled after 5 minutes or when the worker restarts, whichever comes first. Until then, they will continue to show the state they were in when the failure occurred. Cancellation after 5 minutes depends on pulp_celerybeat running.
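
The timing rule above can be sketched as a small function. This only illustrates the rule as stated in these notes and is not Pulp's implementation; the function name, its parameters, and the choice to return None when neither trigger can fire are assumptions made for the example.

```python
from datetime import datetime, timedelta
from typing import Optional

# Timeout taken from the notes above.
CANCEL_TIMEOUT = timedelta(minutes=5)


def expected_cancel_time(
    failed_at: datetime,
    worker_restarted_at: Optional[datetime] = None,
    celerybeat_running: bool = True,
) -> Optional[datetime]:
    """Estimate when a dead worker's tasks would be marked as cancelled.

    Hypothetical helper. Returns None when neither trigger can fire,
    i.e. pulp_celerybeat is down and the worker has not restarted.
    """
    candidates = []
    if celerybeat_running:
        # The 5-minute cancellation depends on pulp_celerybeat running.
        candidates.append(failed_at + CANCEL_TIMEOUT)
    if worker_restarted_at is not None:
        # A worker restart also triggers cancellation.
        candidates.append(worker_restarted_at)
    # "Whichever comes first."
    return min(candidates) if candidates else None
```

For example, a worker that dies at 12:00 and restarts at 12:02 would have its tasks cancelled at 12:02. If it stays down, they would be cancelled at 12:05, provided pulp_celerybeat is running.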

If pulp_celerybeat dies, workers that start afterward will not be given work, and Pulp will continue to assign work to existing workers even after they stop. Once restarted, pulp_celerybeat will synchronize with the current state of all workers. Scheduled tasks will not run while pulp_celerybeat is down; they will run once pulp_celerybeat is restarted.

If pulp_resource_manager dies, the Pulp tasking system will halt. Once restarted it will resume.

If the webserver dies, the API will be unavailable until the webserver is restored.

==Important new Related Features==

  • In Pulp 2.6.0, the /status/ URL shows the health of all Pulp components. See [1] for details, including sample response output.
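
As a rough sketch of how the /status/ output could be used to spot the failures described in these notes, the function below inspects an already-parsed status document and lists the components that appear to be missing. The field names (database_connection, messaging_connection, known_workers, _id) and the worker-name prefixes are assumptions about the response layout and should be checked against the sample output in [1]. The sample document is hand-written for illustration and is not real server output.

```python
import json

# Assumed worker-name prefixes for each Pulp process; verify against [1].
EXPECTED_PREFIXES = {
    "pulp_celerybeat": "scheduler@",
    "pulp_resource_manager": "resource_manager@",
    "pulp_worker": "reserved_resource_worker",
}


def missing_components(status: dict) -> list:
    """Return names of components the status document does not report."""
    missing = []
    for key in ("database_connection", "messaging_connection"):
        if not status.get(key, {}).get("connected", False):
            missing.append(key)
    # Worker records are assumed to carry their name under "_id" or "name".
    names = [
        w.get("_id") or w.get("name") or ""
        for w in status.get("known_workers", [])
    ]
    for component, prefix in EXPECTED_PREFIXES.items():
        if not any(name.startswith(prefix) for name in names):
            missing.append(component)
    return missing


# Hand-written sample in which no resource manager is listed.
sample = json.loads("""
{
  "database_connection": {"connected": true},
  "messaging_connection": {"connected": true},
  "known_workers": [
    {"_id": "scheduler@example.com"},
    {"_id": "reserved_resource_worker-0@example.com"}
  ]
}
""")

print(missing_components(sample))  # ['pulp_resource_manager']
```

The sketch only checks whether a worker of each kind is listed and ignores heartbeat timestamps. If the webserver itself is down, the /status/ request will fail before any document is returned, so a caller would need to treat a failed request as a separate case.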

[0]: http://pulp.readthedocs.org/en/latest/user-guide/server.html#components
[1]: http://pulp.readthedocs.org/en/latest/dev-guide/integration/rest-api/status.html

=========== END NOTES =============
