Story #4021

closed

As a plugin writer or user, I have a pipeline performance data collector

Added by bmbouter over 5 years ago. Updated over 4 years ago.

Status:
CLOSED - CURRENTRELEASE
Priority:
Normal
Assignee:
Category:
-
Sprint/Milestone:
Start date:
Due date:
% Done:

100%

Estimated time:
Platform Release:
Groomed:
No
Sprint Candidate:
No
Tags:
Sprint:
Quarter:

Description

Tuning the performance of a multi-stage queueing network is much easier with instrumentation that measures the traditional queueing statistics. This ticket adds a feature that gathers that data.

Users can send the collected data to developers. It could also be used to performance test the pipeline nightly and report on its performance over time when running in a resource-controlled environment.

Data Collected

For each item at each stage we'll record the waiting time and the service time. Upon an item's entry to each queue we'll also record the queue length and the interarrival time. Formal definitions of these are below:

waiting_time - The number of seconds an item spent waiting in a specific queue
service_time - The number of seconds an item was being handled by a stage
queue_length - The number of items waiting in the queue, as measured upon ingress of a new item
interarrival_time - The number of seconds since the previous arrival to this queue
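
A minimal sketch of how the four statistics defined above could be measured around a single FIFO queue. ProfiledQueue and its methods are hypothetical names used for illustration only, not part of the Stages API. The clock is injectable so the numbers can be checked deterministically.

```python
import time
from collections import deque


class ProfiledQueue:
    """Hypothetical FIFO queue that records the four statistics per item."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._items = deque()
        self._last_arrival = None
        self.records = []  # one finished record per serviced item

    def put(self, item):
        now = self._clock()
        # interarrival_time: seconds since the previous arrival (None for the first item)
        interarrival = None if self._last_arrival is None else now - self._last_arrival
        self._last_arrival = now
        record = {
            # queue_length: items already waiting, measured on ingress
            "queue_length": len(self._items),
            "interarrival_time": interarrival,
            "waiting_time": None,
            "service_time": None,
        }
        self._items.append((item, now, record))

    def get(self):
        item, enqueued_at, record = self._items.popleft()
        started = self._clock()
        # waiting_time: seconds the item sat in the queue
        record["waiting_time"] = started - enqueued_at
        return item, started, record

    def done(self, started, record):
        # service_time: seconds the stage spent handling the item
        record["service_time"] = self._clock() - started
        self.records.append(record)
```

A stage would call get() when it picks up an item and done() when it finishes with it. Whether the real collector wraps the queue or the stage is a design choice this sketch does not settle.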

The data should be written to a sqlite3 database in the /var/lib/pulp/debug/ directory, with the filename being the UUID of the task it is running inside of. This will create many sqlite3 dbs, one per task, but it will allow them to be sent around and uploaded easily.
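
A rough sketch of the one-db-per-task idea, using only the Python standard library. The db layout is still undecided in this ticket, so the measurements table and its columns are assumptions, as are the helper names open_profile_db and record_measurement.

```python
import sqlite3
import uuid
from pathlib import Path


def open_profile_db(task_id, debug_dir="/var/lib/pulp/debug/"):
    """Open (or create) the sqlite3 db named after the task UUID."""
    directory = Path(debug_dir)
    directory.mkdir(parents=True, exist_ok=True)
    conn = sqlite3.connect(str(directory / str(task_id)))
    # Hypothetical layout: one row per item per stage.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS measurements ("
        " stage_index INTEGER NOT NULL,"
        " waiting_time REAL,"
        " service_time REAL,"
        " queue_length INTEGER,"
        " interarrival_time REAL)"
    )
    return conn


def record_measurement(conn, stage_index, waiting_time, service_time,
                       queue_length, interarrival_time):
    """Insert one measurement row; the caller decides when to commit."""
    conn.execute(
        "INSERT INTO measurements VALUES (?, ?, ?, ?, ?)",
        (stage_index, waiting_time, service_time, queue_length, interarrival_time),
    )
```

For example, open_profile_db(uuid.uuid4()) would create a file named after that UUID. In a real task the id would come from the running task rather than being generated.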

We also need to record which types of stages are being used, and in what order, so that the data for each queue and stage can be interpreted. This should be recorded when the pipeline is assembled with create_pipeline(), and it also needs to be saved into the db somehow.

If any tooling is developed, it could be added as a pulp-manager command here: https://github.com/pulp/pulp/tree/master/pulpcore/pulpcore/app/management/commands

Enabling the Feature

This feature can be enabled with PROFILE_STAGES_API = True. It is disabled by default.

Sqlite db-layout

TBD??

Actions #1

Updated by bmbouter over 5 years ago

  • Status changed from NEW to ASSIGNED
  • Assignee set to bmbouter
Actions #2

Updated by bmbouter over 5 years ago

  • Status changed from ASSIGNED to POST

Added by bmbouter over 5 years ago

Revision 85eae21c | View on GitHub

Adds a performance profiler for the Stages API

  • adds docs for it
  • adds the PROFILE_STAGES_API setting
  • adds a management command to summarize the statistics
  • adds an extra_data field to DeclarativeContent and DeclarativeArtifact

https://pulp.plan.io/issues/4021 closes #4021

Actions #3

Updated by bmbouter over 5 years ago

  • Status changed from POST to MODIFIED
  • % Done changed from 0 to 100
Actions #4

Updated by daviddavis almost 5 years ago

  • Sprint/Milestone set to 3.0.0
Actions #5

Updated by bmbouter almost 5 years ago

  • Tags deleted (Pulp 3)
Actions #6

Updated by bmbouter over 4 years ago

  • Status changed from MODIFIED to CLOSED - CURRENTRELEASE
