Story #4021
closed
As a plugin writer or user, I have a pipeline performance data collector
100%
Description
Tuning the performance of a multi-stage queueing network benefits greatly from instrumentation that measures the traditional queueing statistics. This ticket creates a feature that gathers that data.
Users can send the collected data to developers. It could also be used to performance-test the pipeline nightly and report on its performance over time when running in a resource-controlled environment.
Data Collected
For each item at each stage we'll record the waiting time and the service time. Also upon entry to each queue we'll record the queue length. Finally the inter-arrival time to each queue will be recorded. Formal definitions of these are below:
- waiting time - The number of seconds an item was waiting in a specific Queue
- service time - The number of seconds an item was being handled by a stage
- queue_length - The number of waiting items in the queue, as measured upon ingress of a new item
- interarrival_time - The number of seconds since the previous arrival to this Queue
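The four definitions above can be sketched as a small collector. This is only an illustration, not the code from the PR: the class `QueueProfile` and its method names are hypothetical, and timestamps are passed in explicitly so the arithmetic is easy to follow.

```python
from dataclasses import dataclass, field
from typing import Dict, Hashable, List, Optional


@dataclass
class QueueProfile:
    """Hypothetical collector for one queue and the stage that drains it."""

    waiting_times: List[float] = field(default_factory=list)
    service_times: List[float] = field(default_factory=list)
    queue_lengths: List[int] = field(default_factory=list)
    interarrival_times: List[float] = field(default_factory=list)
    _last_arrival: Optional[float] = None
    _arrived_at: Dict[Hashable, float] = field(default_factory=dict)
    _started_at: Dict[Hashable, float] = field(default_factory=dict)

    def on_arrival(self, item, now, queue_length):
        # queue_length: items already waiting when this item enters the queue.
        self.queue_lengths.append(queue_length)
        # The first arrival has no predecessor, so it yields no interarrival time.
        if self._last_arrival is not None:
            self.interarrival_times.append(now - self._last_arrival)
        self._last_arrival = now
        self._arrived_at[item] = now

    def on_service_start(self, item, now):
        # waiting time: seconds between entering the queue and being picked up.
        self.waiting_times.append(now - self._arrived_at.pop(item))
        self._started_at[item] = now

    def on_service_end(self, item, now):
        # service time: seconds the stage spent handling the item.
        self.service_times.append(now - self._started_at.pop(item))


# Two items passing through a single stage, with made-up timestamps in seconds.
profile = QueueProfile()
profile.on_arrival("a", now=0.0, queue_length=0)
profile.on_arrival("b", now=1.5, queue_length=1)
profile.on_service_start("a", now=2.0)
profile.on_service_end("a", now=5.0)
profile.on_service_start("b", now=5.0)
profile.on_service_end("b", now=6.0)
```

In a real pipeline the timestamps would presumably come from a monotonic clock such as `time.monotonic()`, so that wall-clock adjustments do not distort the measurements.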
The data should be written to a sqlite3 database in /var/lib/pulp/debug/, with the filename being the UUID of the task it is running inside of. This will create many sqlite3 databases, but it will allow them to be sent around and uploaded easily.
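A minimal sketch of the one-database-per-task idea follows. The db layout is still TBD in this ticket, so the `stage_stats` table and the helpers `open_profile_db` and `record_item` are assumptions made up for illustration. The example writes to a temporary directory rather than /var/lib/pulp/debug/ so it can be run anywhere.

```python
import sqlite3
import tempfile
import uuid
from pathlib import Path


def open_profile_db(debug_dir, task_id):
    """Open (creating if needed) a sqlite3 db named after the task UUID."""
    debug_dir = Path(debug_dir)
    debug_dir.mkdir(parents=True, exist_ok=True)
    conn = sqlite3.connect(str(debug_dir / str(task_id)))
    # Hypothetical layout: one row per item per stage.
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS stage_stats (
            stage_index INTEGER NOT NULL,
            waiting_time REAL,
            service_time REAL,
            queue_length INTEGER,
            interarrival_time REAL
        )
        """
    )
    return conn


def record_item(conn, stage_index, waiting_time, service_time,
                queue_length, interarrival_time):
    """Store the statistics measured for one item at one stage."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO stage_stats VALUES (?, ?, ?, ?, ?)",
            (stage_index, waiting_time, service_time,
             queue_length, interarrival_time),
        )


with tempfile.TemporaryDirectory() as tmp:
    task_id = uuid.uuid4()  # stands in for the real task's UUID
    conn = open_profile_db(tmp, task_id)
    # The first arrival has no interarrival time, so it is stored as NULL.
    record_item(conn, 0, 2.0, 3.0, 0, None)
    rows = conn.execute("SELECT * FROM stage_stats").fetchall()
    db_exists = (Path(tmp) / str(task_id)).is_file()
    conn.close()
```

Because each task gets its own file, a single db can be attached to a bug report without exposing data from unrelated tasks.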
We also need to record the order and types of the stages in use so that the data for each queue and stage can be interpreted. This should be recorded when the pipeline is assembled with create_pipeline(). It also needs to be saved into the db somehow.
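One possible way to save the pipeline layout is a small table mapping each stage's position to its type. This is a sketch under assumptions: the `stages` table, the `record_pipeline_layout` helper, and the two stage classes are hypothetical, and the signature of create_pipeline() is not shown here, so the helper simply takes a list of stage instances.

```python
import sqlite3


class QueryStage:
    """Hypothetical stand-in for a real stage class."""


class SaveStage:
    """Hypothetical stand-in for a real stage class."""


def record_pipeline_layout(conn, stages):
    """Save the position and class name of every stage in pipeline order."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "CREATE TABLE IF NOT EXISTS stages ("
            "stage_index INTEGER PRIMARY KEY, stage_type TEXT NOT NULL)"
        )
        conn.executemany(
            "INSERT INTO stages VALUES (?, ?)",
            [(i, type(stage).__name__) for i, stage in enumerate(stages)],
        )


# An in-memory db keeps the example self-contained.
conn = sqlite3.connect(":memory:")
record_pipeline_layout(conn, [QueryStage(), SaveStage()])
layout = conn.execute(
    "SELECT stage_index, stage_type FROM stages ORDER BY stage_index"
).fetchall()
conn.close()
```

With the layout stored next to the measurements, a per-stage row can be joined back to the stage type that produced it.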
If any tooling is developed it could be cool to add it as a pulp-manager command here: https://github.com/pulp/pulp/tree/master/pulpcore/pulpcore/app/management/commands
Enabling the Feature
This feature can be enabled with PROFILE_STAGES_API = True. It is disabled by default.
Sqlite db-layout
TBD??
Updated by bmbouter about 6 years ago
- Status changed from NEW to ASSIGNED
- Assignee set to bmbouter
Updated by bmbouter about 6 years ago
- Status changed from ASSIGNED to POST
PR available at: https://github.com/pulp/pulp/pull/3669
Added by bmbouter about 6 years ago
Revision 85eae21c
Adds a performance profiler for the Stages API
- adds docs for it
- adds the PROFILE_STAGES_API setting
- adds a management command to summarize the statistics
- adds an extra_data field to DeclarativeContent and DeclarativeArtifact
Updated by bmbouter about 6 years ago
- Status changed from POST to MODIFIED
- % Done changed from 0 to 100
Applied in changeset pulp|85eae21c9c0c99e7b62e6c8dcfd7b3de8d522226.
Updated by bmbouter almost 5 years ago
- Status changed from MODIFIED to CLOSED - CURRENTRELEASE
Adds a performance profiler for the Stages API
https://pulp.plan.io/issues/4021 closes #4021