Issue #1885 (closed)
Pulp uploads are failing with "too many open files" error
Description
We are seeing uploads to Pulp fail with errors like:
Apr 30 08:35:48 pulp07 pulp: pulp.server.webservices.middleware.exception:ERROR: (12507-64512) [Errno 24] Too many open files: '/var/lib/pulp/uploads/a1fce30f-f731-4109-ab2d-9cc15464727d'
Apr 30 08:35:48 pulp07 pulp: pulp.server.webservices.middleware.exception:ERROR: (12507-64512) Traceback (most recent call last):
Apr 30 08:35:48 pulp07 pulp: pulp.server.webservices.middleware.exception:ERROR: (12507-64512) File "/usr/lib/python2.6/site-packages/django/core/handlers/base.py", line 109, in get_response
Apr 30 08:35:48 pulp07 pulp: pulp.server.webservices.middleware.exception:ERROR: (12507-64512) File "/usr/lib/python2.6/site-packages/django/views/generic/base.py", line 48, in view
Apr 30 08:35:48 pulp07 pulp: pulp.server.webservices.middleware.exception:ERROR: (12507-64512) File "/usr/lib/python2.6/site-packages/django/views/generic/base.py", line 69, in dispatch
Apr 30 08:35:48 pulp07 pulp: pulp.server.webservices.middleware.exception:ERROR: (12507-64512) File "/usr/lib/python2.6/site-packages/pulp/server/webservices/views/decorators.py", line 237, in _auth_decorator
Apr 30 08:35:48 pulp07 pulp: pulp.server.webservices.middleware.exception:ERROR: (12507-64512) File "/usr/lib/python2.6/site-packages/pulp/server/webservices/views/decorators.py", line 191, in _verify_auth
Apr 30 08:35:48 pulp07 pulp: pulp.server.webservices.middleware.exception:ERROR: (12507-64512) File "/usr/lib/python2.6/site-packages/pulp/server/webservices/views/util.py", line 111, in wrapper
Apr 30 08:35:48 pulp07 pulp: pulp.server.webservices.middleware.exception:ERROR: (12507-64512) File "/usr/lib/python2.6/site-packages/pulp/server/webservices/views/content.py", line 433, in post
Apr 30 08:35:48 pulp07 pulp: pulp.server.webservices.middleware.exception:ERROR: (12507-64512) File "/usr/lib/python2.6/site-packages/pulp/server/managers/content/upload.py", line 48, in initialize_upload
Apr 30 08:35:48 pulp07 pulp: pulp.server.webservices.middleware.exception:ERROR: (12507-64512) IOError: [Errno 24] Too many open files: '/var/lib/pulp/uploads/a1fce30f-f731-4109-ab2d-9cc15464727d'
The open-file limit for the apache user is the default of 1024, and we have 8 Pulp processes. I don't know whether the problem is leaking file descriptors or whether 1024 just isn't enough for Pulp.
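One way to tell a leak from a limit that is simply too low is to sample a process's open descriptor count over time and compare it with its limit. A minimal Python sketch, assuming a Linux /proc filesystem; the helper names count_open_fds and fd_headroom are illustrative and not part of Pulp:

```python
import os
import resource


def count_open_fds(pid="self"):
    """Return the number of open file descriptors for a process (Linux only).

    Listing /proc/<pid>/fd briefly opens one descriptor of its own when
    pid is "self", so the count for the current process is one higher
    than the steady-state value.
    """
    return len(os.listdir("/proc/%s/fd" % pid))


def fd_headroom():
    """Return (open_fds, soft_limit) for the current process."""
    soft_limit, _hard_limit = resource.getrlimit(resource.RLIMIT_NOFILE)
    return count_open_fds(), soft_limit


if __name__ == "__main__":
    open_fds, soft_limit = fd_headroom()
    print("open fds: %d, soft limit: %d" % (open_fds, soft_limit))
```

Passing the PID of an httpd or Pulp worker to count_open_fds (as root or as the owning user) and logging the result periodically would show whether the count climbs steadily, which would suggest a leak, or stays flat near the limit.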
Updated by bmbouter over 8 years ago
@dgregor, Can we make this a public issue?
We should try to reproduce with pulp-admin called in a loop to upload a huge amount of fake rpms. We need a fake rpm generator.
Updated by mhrivnak over 8 years ago
bmbouter wrote:
We should try to reproduce with pulp-admin called in a loop to upload a huge amount of fake rpms. We need a fake rpm generator.
Updated by bmbouter over 8 years ago
This is likely a scalability limit of Pulp, which could be better documented in the scalability or troubleshooting guide. It could also be a software defect within Pulp, such as a file descriptor leak.
@dgregor, jluza: during triage, the question of a clear reproducer came up. Could reproducer steps be added to this issue? At triage it was decided not to triage the issue until there are reproducer steps.
Updated by jluza over 8 years ago
We don't know what causes it, so we can't reproduce it. We tried setting "apache hard nofile 63536" in limits.conf, but even when lsof showed about 8000 records, Pulp failed with the message above.
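A possible explanation, not confirmed anywhere in this thread, is that the new limit never reached the running httpd processes: limits.conf is applied by pam_limits to PAM sessions, and a "hard" entry raises only the hard limit. The limits actually in effect for a process can be read from /proc/<pid>/limits. A minimal sketch, assuming Linux; the function names are illustrative:

```python
def _to_int(value):
    """Convert a limit field to int, or None for 'unlimited'."""
    return None if value == "unlimited" else int(value)


def parse_max_open_files(limits_text):
    """Return (soft, hard) from the text of /proc/<pid>/limits.

    Returns None if the 'Max open files' row is not present.
    """
    for line in limits_text.splitlines():
        if line.startswith("Max open files"):
            # Row layout: Max open files <soft> <hard> files
            fields = line.split()
            return _to_int(fields[3]), _to_int(fields[4])
    return None


def read_max_open_files(pid="self"):
    """Read the effective open-file limits of a running process."""
    with open("/proc/%s/limits" % pid) as handle:
        return parse_max_open_files(handle.read())
```

Calling read_max_open_files with the PID of one of the failing httpd workers would show whether its soft limit is still 1024 despite the limits.conf change.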
Updated by mhrivnak over 8 years ago
I created 6000 fake RPMs and uploaded them to Pulp, 3 at a time, using pulp-admin with xargs. I did not encounter any problems.
I used 2.8.3-0.2.beta on CentOS 7.
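The exact commands used for this test are not recorded in the thread. A rough Python 3 equivalent of "3 at a time with xargs" might look like the sketch below. The pulp-admin arguments in build_upload_command are an assumption based on the Pulp 2 RPM upload command and should be checked against the installed version; the repo ID and function names are hypothetical.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def build_upload_command(repo_id, rpm_path):
    """Build one upload command (assumed pulp-admin syntax, verify locally)."""
    return ["pulp-admin", "rpm", "repo", "uploads", "rpm",
            "--repo-id", repo_id, "--file", rpm_path]


def upload_all(rpm_paths, repo_id, parallelism=3, runner=subprocess.call):
    """Run one upload per RPM, at most `parallelism` at a time.

    Returns the runner's result (the exit code, for subprocess.call)
    for each path, in input order.
    """
    commands = [build_upload_command(repo_id, path) for path in rpm_paths]
    with ThreadPoolExecutor(max_workers=parallelism) as pool:
        return list(pool.map(runner, commands))
```

The runner parameter allows a dry run, for example upload_all(paths, "test-repo", runner=print), before pointing the loop at a real server.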
Updated by mhrivnak over 8 years ago
I tried the same thing with Pulp 2.7.1 on CentOS 7 and also did not see any problems. I also didn't see any sign of processes holding on to open files.
Updated by bmbouter over 8 years ago
- Status changed from NEW to CLOSED - WORKSFORME
- Triaged changed from No to Yes
Per the comments above, we've tried to reproduce this but have not been able to. Without a way to reproduce the issue, we are closing it for now. If you experience this again, please reopen it and post reproducer steps.