Project

Profile

Help

Story #2324

As a user, I can see the first 8 characters of a task id in every log statement emitted from a running task

Added by mhrivnak about 3 years ago. Updated 7 months ago.

Status:
CLOSED - CURRENTRELEASE
Priority:
Normal
Assignee:
Category:
-
Sprint/Milestone:
-
Start date:
Due date:
% Done:

100%

Platform Release:
2.13.0
Blocks Release:
Backwards Incompatible:
No
Groomed:
Yes
Sprint Candidate:
Yes
Tags:
Pulp 2
QA Contact:
Complexity:
Smash Test:
Verified:
No
Verification Required:
No
Sprint:
Sprint 14

Description

Problem

It can be challenging for a user to analyze Pulp logs. A user should be able to answer these questions:

1. What task did this log statement come from?
2. What are all the log statements emitted by a specific task?

Solution

To remedy this we will add the first 8 characters of the task ID within brackets to each log line emitted from within a running task. Note that celery logs the entire task id before the task is run, so even with us logging a limited portion of the task-id you can easily determine the entire task-id.

Taking an example from Comment 3, the contents of the log line would be:

[0ca4399c] Starting new HTTPS connection (1):

Checklist

Associated revisions

Revision 431b5d2c View on GitHub
Added by daviddavis almost 3 years ago

Appending 8 character task id to log output

Appending a 8 character task id to log output if there is a relevant
task id.

fixes #2324
https://pulp.plan.io/issues/2324

Revision 431b5d2c View on GitHub
Added by daviddavis almost 3 years ago

Appending 8 character task id to log output

Appending a 8 character task id to log output if there is a relevant
task id.

fixes #2324
https://pulp.plan.io/issues/2324

Revision 431b5d2c View on GitHub
Added by daviddavis almost 3 years ago

Appending 8 character task id to log output

Appending a 8 character task id to log output if there is a relevant
task id.

fixes #2324
https://pulp.plan.io/issues/2324

History

#1 Updated by bmbouter almost 3 years ago

It would make the logs statements long, but we should have every Pulp log statement running inside a task did include the Task id. Pids are not as useful because the task id currently logged is the parent's pid and the rest of the log messages are the child's different pid. Also having task ids would be better for clustered installs where pids could collide. We could figure out how to implement this I think.

Regarding the pid's, if we did that we would publish a syslog config which would include the pids. But we should just add the task ids.

Instead of Dec 10 19:56:55 dev pulp[32137]: requests.packages.urllib3.connectionpool:INFO: Starting new HTTPS connection (1): repos.fedorapeople.org

It could be: Dec 10 19:56:55 dev pulp[32137]: requests.packages.urllib3.connectionpool:INFO: [0ca4399c-51eb-47c8-9a07-145e61e389a5] Starting new HTTPS connection (1): repos.fedorapeople.org

Or it could be: Dec 10 19:56:55 dev pulp[32137]: [0ca4399c-51eb-47c8-9a07-145e61e389a5] requests.packages.urllib3.connectionpool:INFO: Starting new HTTPS connection (1): repos.fedorapeople.org

I prefer the former I think so that all log statements have the position of the python package and log level in the same place when running inside or outside of a task. Log statements running outside of tasks will not get the UUID.

What do others think?

#2 Updated by semyers almost 3 years ago

  • Groomed changed from No to Yes
  • Sprint Candidate changed from No to Yes

I consider the groomed, provided we take one of Brian's suggested solutions in comment 1 (pids aren't so useful in log messages, but task IDs are very useful).

#3 Updated by mhrivnak almost 3 years ago

Just an idea: what about only logging the first 8 characters of the task ID? That should be plenty of uniqueness to avoid collision while providing a good reference.

Dec 10 19:56:55 dev pulp[32137]: requests.packages.urllib3.connectionpool:INFO: [0ca4399c-51eb-47c8-9a07-145e61e389a5] Starting new HTTPS connection (1): 

vs

Dec 10 19:56:55 dev pulp[32137]: requests.packages.urllib3.connectionpool:INFO: [0ca4399c] Starting new HTTPS connection (1): 

#4 Updated by bmbouter almost 3 years ago

@mhrivnak +1 to that idea. Retaining 8 characters will still keep collision probabilities low. 1 in 36^8 = 2,821,109,907,456 by my count.

#5 Updated by bmbouter almost 3 years ago

  • Checklist item release note for this change added
  • Checklist item documentation updates in the user-guide/troubleshooting page added
  • Checklist item two exapmles in the user-guide/troubleshooting page including an example line emitted inside a running task and one without added
  • Checklist item update the Pulp logger to emit the task id if running inside a task added

#6 Updated by bmbouter almost 3 years ago

  • Smash Test set to 483

#7 Updated by bmbouter almost 3 years ago

  • Subject changed from As a user, I can correlate log statements to a task ID to As a user, I can see the first 8 characters of a task id in every log statement emitted from a running task

#8 Updated by bmbouter almost 3 years ago

  • Description updated (diff)

#9 Updated by mhrivnak almost 3 years ago

  • Sprint/Milestone set to 31

#10 Updated by mhrivnak almost 3 years ago

  • Sprint/Milestone changed from 31 to 32

#11 Updated by daviddavis almost 3 years ago

  • Status changed from NEW to ASSIGNED
  • Assignee set to daviddavis

#12 Updated by daviddavis almost 3 years ago

My initial thought (and please tell me if this is a bad idea) was to write a Filter to set a task id on the LogRecord and then have the logger output that using a custom Formatter. However, that would require some global way to be able to get the task id but I didn't see any code for doing that. Is there a way to do that?

#13 Updated by bmbouter almost 3 years ago

Here is a global way to get the task id [0]. I can't speak to the implementation approach, but whatever it does it needs to work well with our existing logger [1].

[0]: https://github.com/bmbouter/pulp/blob/f9355d21559fa880fd7654134d8f34dcf6b85acf/server/pulp/server/async/tasks.py#L643-L643
[1]: https://github.com/pulp/pulp/blob/master/server/pulp/server/logs.py

#14 Updated by daviddavis almost 3 years ago

  • Status changed from ASSIGNED to POST

#15 Updated by daviddavis almost 3 years ago

  • Status changed from POST to MODIFIED
  • % Done changed from 0 to 100

#16 Updated by semyers over 2 years ago

  • Platform Release set to 2.13.0

#17 Updated by pcreech over 2 years ago

  • Status changed from MODIFIED to ON_QA

#18 Updated by pcreech over 2 years ago

  • Status changed from ON_QA to CLOSED - CURRENTRELEASE

#19 Updated by bmbouter over 1 year ago

  • Sprint set to Sprint 16

#20 Updated by bmbouter over 1 year ago

  • Sprint changed from Sprint 16 to Sprint 14

#21 Updated by bmbouter over 1 year ago

  • Sprint/Milestone deleted (32)

#22 Updated by bmbouter 7 months ago

  • Tags Pulp 2 added

Please register to edit this issue

Also available in: Atom PDF