Story #4021

As a plugin writer or user, I have a pipeline performance data collector

Added by bmbouter about 1 year ago. Updated 6 months ago.

Status:
MODIFIED
Priority:
Normal
Assignee:
Category:
-
Sprint/Milestone:
Start date:
Due date:
% Done:

100%

Platform Release:
Blocks Release:
Backwards Incompatible:
No
Groomed:
No
Sprint Candidate:
No
Tags:
QA Contact:
Complexity:
Smash Test:
Verified:
No
Verification Required:
No
Sprint:

Description

A multi-stage queueing network is much easier to tune when it is instrumented to measure the traditional queueing statistics. This ticket adds a feature that gathers that data.

Users can send the collected data to developers. It could also be used to performance test the pipeline nightly and report on its performance over time when running in a resource-controlled environment.

Data Collected

For each item at each stage, we'll record the waiting time and the service time. Upon entry to each queue, we'll also record the queue length and the inter-arrival time. Formal definitions of these are below:

waiting time - The number of seconds an item was waiting in a specific Queue
service time - The number of seconds an item was being handled by a stage
queue_length - The number of waiting items in the queue, as measured upon ingress of a new item
interarrival_time - The number of seconds since the previous arrival to this Queue
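
The four definitions above can be illustrated with a minimal sketch. The names ItemTimings and ArrivalRecorder are hypothetical and are not taken from the Pulp codebase; timestamps are passed in explicitly (in seconds) so that the arithmetic is easy to follow.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class ItemTimings:
    """Timestamps, in seconds, for one item passing through one stage (hypothetical helper)."""

    enqueued_at: float  # item was put on the stage's input queue
    started_at: float   # stage took the item off the queue
    finished_at: float  # stage finished handling the item

    @property
    def waiting_time(self) -> float:
        # Seconds the item spent waiting in the queue.
        return self.started_at - self.enqueued_at

    @property
    def service_time(self) -> float:
        # Seconds the item spent being handled by the stage.
        return self.finished_at - self.started_at


class ArrivalRecorder:
    """Tracks queue_length and interarrival_time at ingress to one queue (hypothetical helper)."""

    def __init__(self) -> None:
        self._last_arrival: Optional[float] = None

    def on_arrival(self, now: float, waiting_items: int) -> Tuple[int, Optional[float]]:
        """Return (queue_length, interarrival_time) for an item arriving at `now`.

        interarrival_time is None for the first arrival, since no previous arrival exists.
        """
        interarrival = None if self._last_arrival is None else now - self._last_arrival
        self._last_arrival = now
        return waiting_items, interarrival
```

For example, an item enqueued at t=10.0, picked up at t=12.5, and finished at t=13.0 has a waiting time of 2.5 seconds and a service time of 0.5 seconds.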

The data should be written to a sqlite3 database in the /var/lib/pulp/debug/ directory, with the filename being the UUID of the task it is running inside of. This will create many sqlite3 databases, but it will allow them to be sent around and uploaded easily.
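
A rough sketch of the one-database-per-task idea follows. The table name, column names, and the helpers open_profile_db and record_measurement are assumptions made for illustration only; the actual db layout is still undecided in this ticket.

```python
import sqlite3
import uuid
from pathlib import Path

DEBUG_DIR = Path("/var/lib/pulp/debug")  # directory named in the ticket


def open_profile_db(task_id: uuid.UUID, debug_dir: Path = DEBUG_DIR) -> sqlite3.Connection:
    """Open (or create) the sqlite3 file named after the task UUID."""
    debug_dir.mkdir(parents=True, exist_ok=True)
    conn = sqlite3.connect(str(debug_dir / str(task_id)))
    # Hypothetical schema: one row per item per stage.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS measurements ("
        " stage_index INTEGER NOT NULL,"
        " waiting_time REAL,"
        " service_time REAL,"
        " queue_length INTEGER,"
        " interarrival_time REAL)"
    )
    return conn


def record_measurement(conn, stage_index, waiting_time, service_time,
                       queue_length, interarrival_time):
    """Insert one measurement row and commit it."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO measurements VALUES (?, ?, ?, ?, ?)",
            (stage_index, waiting_time, service_time, queue_length, interarrival_time),
        )
```

Because each task writes to its own file, a single file can be copied or uploaded without touching data from other tasks.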

We also need to record which types of stages are used and in what order, so that the data for each queue and stage can be interpreted. This should be recorded when the pipeline is assembled with create_pipeline(), and it also needs to be saved into the db somehow.
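
One possible way to save the stage order, sketched under assumptions: the helper save_stage_layout and the stages table are hypothetical, and the sketch only presumes that the pipeline is assembled from an ordered list of stage objects.

```python
import sqlite3


def save_stage_layout(conn: sqlite3.Connection, stages) -> list:
    """Store the position and class name of each stage, in pipeline order."""
    rows = [(index, type(stage).__name__) for index, stage in enumerate(stages)]
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "CREATE TABLE IF NOT EXISTS stages ("
            " stage_index INTEGER PRIMARY KEY,"
            " stage_type TEXT NOT NULL)"
        )
        conn.executemany("INSERT INTO stages VALUES (?, ?)", rows)
    return rows
```

With a table like this, a measurement tagged with a stage index can be joined back to the type of stage that produced it.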

If any tooling is developed, it could be added as a pulp-manager command here: https://github.com/pulp/pulp/tree/master/pulpcore/pulpcore/app/management/commands

Enabling the Feature

This feature can be enabled with PROFILE_STAGES_API = True. It is disabled by default.
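
A minimal sketch of how code might guard on this setting so that profiling stays off unless explicitly enabled. The helper profiling_enabled is hypothetical, and SimpleNamespace stands in for the real settings object.

```python
from types import SimpleNamespace


def profiling_enabled(settings) -> bool:
    """Return True only if PROFILE_STAGES_API is set and truthy; default is disabled."""
    return bool(getattr(settings, "PROFILE_STAGES_API", False))


# Stand-ins for the real settings object (assumption for illustration).
default_settings = SimpleNamespace()
enabled_settings = SimpleNamespace(PROFILE_STAGES_API=True)
```

Here profiling_enabled(default_settings) is False and profiling_enabled(enabled_settings) is True.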

Sqlite db-layout

TBD??

Associated revisions

Revision 85eae21c
Added by bmbouter about 1 year ago

Adds a performance profiler for the Stages API

- adds docs for it
- adds the PROFILE_STAGES_API setting
- adds a management command to summarize the statistics
- adds an extra_data field to DeclarativeContent and DeclarativeArtifact

https://pulp.plan.io/issues/4021
closes #4021

History

#1 Updated by bmbouter about 1 year ago

  • Status changed from NEW to ASSIGNED
  • Assignee set to bmbouter

#2 Updated by bmbouter about 1 year ago

  • Status changed from ASSIGNED to POST

#3 Updated by bmbouter about 1 year ago

  • Status changed from POST to MODIFIED
  • % Done changed from 0 to 100

#4 Updated by daviddavis 6 months ago

  • Sprint/Milestone set to 3.0

#5 Updated by bmbouter 6 months ago

  • Tags deleted (Pulp 3)
