Story #2324
closedAs a user, I can see the first 8 characters of a task id in every log statement emitted from a running task
100%
Description
Problem¶
It can be challenging for a user to analyze Pulp logs. A user should be able to answer these questions:
1. What task did this log statement come from?
2. What are all the log statements emitted by a specific task?
Solution¶
To remedy this we will add the first 8 characters of the task ID within brackets to each log line emitted from within a running task. Note that celery logs the entire task id before the task is run, so even with us logging a limited portion of the task-id you can easily determine the entire task-id.
Taking an example from Comment 3, the contents of the log line would be:
[0ca4399c] Starting new HTTPS connection (1):
Updated by bmbouter about 8 years ago
It would make the logs statements long, but we should have every Pulp log statement running inside a task did include the Task id. Pids are not as useful because the task id currently logged is the parent's pid and the rest of the log messages are the child's different pid. Also having task ids would be better for clustered installs where pids could collide. We could figure out how to implement this I think.
Regarding the pid's, if we did that we would publish a syslog config which would include the pids. But we should just add the task ids.
Instead of Dec 10 19:56:55 dev pulp[32137]: requests.packages.urllib3.connectionpool:INFO: Starting new HTTPS connection (1): repos.fedorapeople.org
It could be: Dec 10 19:56:55 dev pulp[32137]: requests.packages.urllib3.connectionpool:INFO: [0ca4399c-51eb-47c8-9a07-145e61e389a5] Starting new HTTPS connection (1): repos.fedorapeople.org
Or it could be: Dec 10 19:56:55 dev pulp[32137]: [0ca4399c-51eb-47c8-9a07-145e61e389a5] requests.packages.urllib3.connectionpool:INFO: Starting new HTTPS connection (1): repos.fedorapeople.org
I prefer the former I think so that all log statements have the position of the python package and log level in the same place when running inside or outside of a task. Log statements running outside of tasks will not get the UUID.
What do others think?
Updated by semyers almost 8 years ago
- Groomed changed from No to Yes
- Sprint Candidate changed from No to Yes
I consider the groomed, provided we take one of Brian's suggested solutions in comment 1 (pids aren't so useful in log messages, but task IDs are very useful).
Updated by mhrivnak almost 8 years ago
Just an idea: what about only logging the first 8 characters of the task ID? That should be plenty of uniqueness to avoid collision while providing a good reference.
Dec 10 19:56:55 dev pulp[32137]: requests.packages.urllib3.connectionpool:INFO: [0ca4399c-51eb-47c8-9a07-145e61e389a5] Starting new HTTPS connection (1):
vs
Dec 10 19:56:55 dev pulp[32137]: requests.packages.urllib3.connectionpool:INFO: [0ca4399c] Starting new HTTPS connection (1):
Updated by bmbouter almost 8 years ago
mhrivnak +1 to that idea. Retaining 8 characters will still keep collision probabilities low. 1 in 36^8 = 2,821,109,907,456 by my count.
Updated by bmbouter almost 8 years ago
- Subject changed from As a user, I can correlate log statements to a task ID to As a user, I can see the first 8 characters of a task id in every log statement emitted from a running task
Updated by daviddavis almost 8 years ago
- Status changed from NEW to ASSIGNED
- Assignee set to daviddavis
Updated by daviddavis almost 8 years ago
My initial thought (and please tell me if this is a bad idea) was to write a Filter to set a task id on the LogRecord and then have the logger output that using a custom Formatter. However, that would require some global way to be able to get the task id but I didn't see any code for doing that. Is there a way to do that?
Updated by bmbouter almost 8 years ago
Here is a global way to get the task id [0]. I can't speak to the implementation approach, but whatever it does it needs to work well with our existing logger [1].
[0]: https://github.com/bmbouter/pulp/blob/f9355d21559fa880fd7654134d8f34dcf6b85acf/server/pulp/server/async/tasks.py#L643-L643
[1]: https://github.com/pulp/pulp/blob/master/server/pulp/server/logs.py
Updated by daviddavis almost 8 years ago
- Status changed from ASSIGNED to POST
Added by daviddavis almost 8 years ago
Added by daviddavis almost 8 years ago
Revision 431b5d2c | View on GitHub
Appending 8 character task id to log output
Appending a 8 character task id to log output if there is a relevant task id.
Updated by daviddavis almost 8 years ago
- Status changed from POST to MODIFIED
- % Done changed from 0 to 100
Applied in changeset pulp|431b5d2c832ad50c33daccb3f852be31583a4b9c.
Updated by pcreech over 7 years ago
- Status changed from 5 to CLOSED - CURRENTRELEASE
Updated by bmbouter almost 7 years ago
- Sprint changed from Sprint 16 to Sprint 14
Appending 8 character task id to log output
Appending a 8 character task id to log output if there is a relevant task id.
fixes #2324 https://pulp.plan.io/issues/2324