Issue #1669

Memory leak in Pulp Celery processes

Added by bmbouter about 8 years ago. Updated about 5 years ago.

Status:
CLOSED - WORKSFORME
Priority:
Normal
Assignee:
-
Category:
-
Sprint/Milestone:
-
Start date:
Due date:
Estimated time:
Severity:
2. Medium
Version:
Platform Release:
2.8.1
OS:
Triaged:
Yes
Groomed:
No
Sprint Candidate:
No
Tags:
Pulp 2
Sprint:
Quarter:

Description

Via IRC I received a report that, on at least two occasions, the Pulp workers of a large Pulp installation showed signs of a memory leak. The leak consumed so much memory that at a certain point all subsequent tasks failed with "Cannot allocate memory".

When they are in this situation, there are no tasks waiting or running in Pulp, as is evident from the output they provided:

No running or waiting tasks in Pulp:

pulp:PRIMARY> db.task_status.find({"state":{"$in":["running", "waiting"]}})
pulp:PRIMARY> 

You can also clearly see that the offending processes are Pulp Celery workers, which are idle but consuming large amounts of memory:

bash-4.1# ps u 16012
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
apache   16012  0.2 15.1 3016808 2475812 ?     Sl   Feb10   4:17 /usr/bin/python -m celery.__main__ worker -c 1 -n reserved_resource_worker-1@pulp04.example.com --events --a
bash-4.1# ps u 16121
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
apache   16121  0.3 13.6 2767072 2224064 ?     Sl   Feb10   5:24 /usr/bin/python -m celery.__main__ worker -c 1 -n reserved_resource_worker-5@pulp04.example.com --events --a
bash-4.1# ps u 16206
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
apache   16206  0.2 12.8 3092120 2105596 ?     Sl   Feb10   3:34 /usr/bin/python -m celery.__main__ worker -c 1 -n reserved_resource_worker-7@pulp04.example.com --events --a
bash-4.1# ps u 16141
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
apache   16141  0.3 12.5 2621912 2056276 ?     Sl   Feb10   4:31 /usr/bin/python -m celery.__main__ worker -c 1 -n reserved_resource_worker-6@pulp04.example.com --events --a
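
One way to see whether the RSS climbs gradually or jumps all at once is to sample each worker's memory on an interval rather than spot-checking with ps. A minimal monitoring sketch, assuming the third-party psutil package (not part of Pulp) is installed on the host; the reserved_resource_worker match string comes from the ps output above:

import time

import psutil


def iter_worker_rss():
    """Yield (pid, RSS in MiB) for each Pulp reserved-resource worker."""
    # attrs= needs psutil >= 5.3; fields that cannot be read come back as None
    for proc in psutil.process_iter(attrs=["pid", "cmdline", "memory_info"]):
        cmdline = " ".join(proc.info["cmdline"] or [])
        mem = proc.info["memory_info"]
        if mem and "celery" in cmdline and "reserved_resource_worker" in cmdline:
            yield proc.info["pid"], mem.rss / float(2 ** 20)


while True:
    stamp = time.strftime("%H:%M:%S")
    for pid, rss_mib in iter_worker_rss():
        print("%s pid=%d rss=%.1f MiB" % (stamp, pid, rss_mib))
    time.sleep(60)

Logging this alongside the task history would also show whether any jumps line up with a specific task type.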

It's not yet clear whether this happens gradually over time or all at once, or whether it is tied to a specific task type. It has been suggested that the diagnostic approach in [0] may be useful here. We should also consider that this could be an upstream Celery issue; the Celery tracker has several open memory-leak bug reports[1].

[0]: https://dzone.com/articles/diagnosing-memory-leaks-python
[1]: https://github.com/celery/celery/issues?utf8=%E2%9C%93&q=is%3Aissue+is%3Aopen+leak
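
Below is a minimal sketch of the kind of heap inspection [0] describes: counting live objects by type with the stdlib gc module, so it also runs under the Python 2 interpreter these workers use. This is a generic diagnostic idea, not an existing Pulp or Celery hook, and the function names are made up for illustration:

import gc
from collections import defaultdict


def type_counts():
    """Map type name -> number of live gc-tracked instances."""
    gc.collect()  # drop collectable garbage so only reachable objects remain
    counts = defaultdict(int)
    for obj in gc.get_objects():
        counts[type(obj).__name__] += 1
    return counts


def print_growth(before, after, top=10):
    """Print the types whose live-instance count grew the most."""
    deltas = [(after[name] - before.get(name, 0), name) for name in after]
    deltas.sort(reverse=True)
    for delta, name in deltas[:top]:
        if delta > 0:
            print("%-30s +%d" % (name, delta))

Taking one snapshot while a worker is idle, another after it has run a suspect task, and diffing the two should show which object types are accumulating.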
