Issue #1669 (closed)

Memory leak in Pulp celery processes

Added by bmbouter about 8 years ago. Updated about 5 years ago.

Status:
CLOSED - WORKSFORME
Priority:
Normal
Assignee:
-
Category:
-
Sprint/Milestone:
-
Start date:
Due date:
Estimated time:
Severity:
2. Medium
Version:
Platform Release:
2.8.1
OS:
Triaged:
Yes
Groomed:
No
Sprint Candidate:
No
Tags:
Pulp 2
Sprint:
Quarter:

Description

Via IRC I received a report that, on at least two occasions, Pulp workers in a large Pulp installation showed signs of a memory leak. The leak consumed so much memory that, at a certain point, all subsequent tasks failed with "Cannot allocate memory".

When they are in this situation, there are no tasks waiting or running in Pulp, as is evident from the output they provided:

No running or waiting tasks in Pulp:

pulp:PRIMARY> db.task_status.find({"state":{"$in":["running", "waiting"]}})
pulp:PRIMARY> 
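
For completeness, the same check can be scripted rather than run from the mongo shell. Below is a minimal sketch using pymongo; the localhost connection and the database name "pulp_database" are assumptions about a default Pulp 2 deployment, so adjust them to match your server.conf.

# Minimal sketch: look for running or waiting Pulp tasks from Python.
# Assumes pymongo is installed, mongod is reachable on localhost:27017,
# and the Pulp database is named "pulp_database" (adjust as needed).
from pymongo import MongoClient

client = MongoClient("localhost", 27017)
tasks = client["pulp_database"]["task_status"].find(
    {"state": {"$in": ["running", "waiting"]}}
)
for task in tasks:
    print("{0} {1}".format(task.get("task_id"), task.get("state")))
# An empty listing matches the shell output above: no running or waiting tasks.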

You can also clearly see that the offending processes are Pulp celery processes, which are idle but consuming large amounts of memory.

bash-4.1# ps u 16012
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
apache   16012  0.2 15.1 3016808 2475812 ?     Sl   Feb10   4:17 /usr/bin/python -m celery.__main__ worker -c 1 -n reserved_resource_worker-1@pulp04.example.com --events --a
bash-4.1# ps u 16121
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
apache   16121  0.3 13.6 2767072 2224064 ?     Sl   Feb10   5:24 /usr/bin/python -m celery.__main__ worker -c 1 -n reserved_resource_worker-5@pulp04.example.com --events --a
bash-4.1# ps u 16206
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
apache   16206  0.2 12.8 3092120 2105596 ?     Sl   Feb10   3:34 /usr/bin/python -m celery.__main__ worker -c 1 -n reserved_resource_worker-7@pulp04.example.com --events --a
bash-4.1# ps u 16141
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
apache   16141  0.3 12.5 2621912 2056276 ?     Sl   Feb10   4:31 /usr/bin/python -m celery.__main__ worker -c 1 -n reserved_resource_worker-6@pulp04.example.com --events --a
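
To narrow down whether memory climbs gradually or jumps suddenly (see the following paragraph), one option is to sample the workers' RSS periodically and log it. Below is a minimal sketch using psutil; matching workers by the string "reserved_resource_worker" in their command line and the 60-second interval are assumptions for illustration, not part of the original report.

# Minimal sketch: log RSS of Pulp celery workers over time to see whether
# memory grows gradually or spikes. Assumes psutil is installed and that the
# worker processes can be identified by "reserved_resource_worker" in their
# command line, as in the ps output above.
import time
import psutil

def worker_rss():
    for proc in psutil.process_iter(["pid", "cmdline", "memory_info"]):
        cmdline = " ".join(proc.info["cmdline"] or [])
        if "reserved_resource_worker" in cmdline:
            yield proc.info["pid"], proc.info["memory_info"].rss

while True:
    stamp = time.strftime("%Y-%m-%d %H:%M:%S")
    for pid, rss in worker_rss():
        print("{0} pid={1} rss={2:.0f} MiB".format(stamp, pid, rss / 1024.0 / 1024.0))
    time.sleep(60)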

It is not yet clear whether the memory grows gradually over time or spikes instantaneously, or whether it is tied to a specific task type. It has been suggested that this guide to diagnosing memory leaks in Python may be relevant[0]. We should also consider that this could be an upstream Celery issue; Celery has several open memory-leak-related bug reports[1].

[0]: https://dzone.com/articles/diagnosing-memory-leaks-python
[1]: https://github.com/celery/celery/issues?utf8=%E2%9C%93&q=is%3Aissue+is%3Aopen+leak
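
If this recurs and can be reproduced in a test environment, one generic way to see which allocations are growing is to snapshot the heap before and after a suspect piece of code. Below is a minimal sketch using the standard-library tracemalloc module; note that tracemalloc ships with Python 3 (the Python 2 workers shown above would need the pytracemalloc backport), and run_suspect_task() is a hypothetical placeholder for whatever code is under investigation. This is a generic diagnostic approach, not a claim about what the linked article recommends.

# Minimal sketch: compare heap snapshots taken before and after a suspect
# task to see which source lines account for memory growth. tracemalloc is
# part of the Python 3 standard library.
import tracemalloc

def run_suspect_task():
    # Hypothetical placeholder for the task code under investigation;
    # here it simply allocates objects so the diff below shows growth.
    return [object() for _ in range(100000)]

tracemalloc.start()
before = tracemalloc.take_snapshot()

leftover = run_suspect_task()

after = tracemalloc.take_snapshot()
for stat in after.compare_to(before, "lineno")[:10]:
    print(stat)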

Actions #1

Updated by bmbouter about 8 years ago

  • Platform Release set to 2.8.1
  • Triaged changed from No to Yes
Actions #2

Updated by bmbouter about 8 years ago

  • Status changed from NEW to CLOSED - WORKSFORME

I was contacted by the original reporter, and they determined that the root cause was not in Pulp code. They had made some modifications of their own to the code which introduced a memory issue. They are not exactly sure of the root cause, but after modifying their implementation they confirmed the problem was resolved.

If other users observe this, please re-open.

Actions #3

Updated by bmbouter about 5 years ago

  • Tags Pulp 2 added
