Issue #6045
Closed
Pulp content app loses database connection
Description
After pulp-content-app has been running for a long time, all requests begin failing with the following error message:
Traceback (most recent call last):
  File "/venv/lib64/python3.6/site-packages/aiohttp/web_protocol.py", line 418, in start
    resp = await task
  File "/venv/lib64/python3.6/site-packages/aiohttp/web_app.py", line 458, in _handle
    resp = await handler(request)
  File "/venv/lib64/python3.6/site-packages/pulpcore/content/handler.py", line 117, in stream_content
    return await self._match_and_stream(path, request)
  File "/venv/lib64/python3.6/site-packages/pulpcore/content/handler.py", line 309, in _match_and_stream
    distro = self._match_distribution(path)
  File "/venv/lib64/python3.6/site-packages/pulpcore/content/handler.py", line 158, in _match_distribution
    return BaseDistribution.objects.get(base_path__in=base_paths).cast()
  File "/venv/lib64/python3.6/site-packages/django/db/models/manager.py", line 82, in manager_method
    return getattr(self.get_queryset(), name)(*args, **kwargs)
  File "/venv/lib64/python3.6/site-packages/django/db/models/query.py", line 402, in get
    num = len(clone)
  File "/venv/lib64/python3.6/site-packages/django/db/models/query.py", line 256, in __len__
    self._fetch_all()
  File "/venv/lib64/python3.6/site-packages/django/db/models/query.py", line 1242, in _fetch_all
    self._result_cache = list(self._iterable_class(self))
  File "/venv/lib64/python3.6/site-packages/django/db/models/query.py", line 55, in __iter__
    results = compiler.execute_sql(chunked_fetch=self.chunked_fetch, chunk_size=self.chunk_size)
  File "/venv/lib64/python3.6/site-packages/django/db/models/sql/compiler.py", line 1131, in execute_sql
    cursor = self.connection.cursor()
  File "/venv/lib64/python3.6/site-packages/django/db/backends/base/base.py", line 256, in cursor
    return self._cursor()
  File "/venv/lib64/python3.6/site-packages/django/db/backends/base/base.py", line 235, in _cursor
    return self._prepare_cursor(self.create_cursor(name))
  File "/venv/lib64/python3.6/site-packages/django/db/utils.py", line 89, in __exit__
    raise dj_exc_value.with_traceback(traceback) from exc_value
  File "/venv/lib64/python3.6/site-packages/django/db/backends/base/base.py", line 235, in _cursor
    return self._prepare_cursor(self.create_cursor(name))
  File "/venv/lib64/python3.6/site-packages/django/db/backends/postgresql/base.py", line 223, in create_cursor
    cursor = self.connection.cursor()
django.db.utils.InterfaceError: connection already closed
A possible reason is that pulp-content-app doesn't re-establish the database connection after it has been closed.
Related issues
Updated by bmbouter almost 5 years ago
Thank you for reporting this. I expect Django to re-establish the connection; that is what I read in its management docs.
Also, how can I reproduce this issue? Any pointers for me?
Updated by osapryki almost 5 years ago
I think this discussion might be related:
https://stackoverflow.com/questions/31504591/interfaceerror-connection-already-closed-using-django-celery-scrapy
Updated by bmbouter almost 5 years ago
This error suggests the connection is being closed from the PostgreSQL side. I chatted about the issue in their channel, and they indicated that PostgreSQL is not closing the connection. They suggested that a firewall in the middle (OpenShift perhaps?) is dropping the idle connection.
Was there anything in the PostgreSQL logs that indicates it is closing the connection, or that Django (or perhaps the firewall) closed it?
Updated by bmbouter almost 5 years ago
I had two ideas.
One: we could check the PostgreSQL connection during the pre_request event http://docs.gunicorn.org/en/latest/settings.html#pre-request but that would add a cost to every request-response cycle.
Two: I looked for some sort of process recycling in gunicorn (which runs the content app) but I didn't find any.
Updated by ironfroggy almost 5 years ago
bmbouter, maybe you want gunicorn's max_requests setting, which will restart workers after N number of requests handled?
https://docs.gunicorn.org/en/stable/settings.html#max-requests
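For reference, a minimal sketch of a gunicorn config file that uses this setting. max_requests and max_requests_jitter are real gunicorn settings; the file name gunicorn.conf.py and the values below are only assumptions for illustration.

```python
# gunicorn.conf.py (illustrative values, not a recommendation)

# Restart a worker after it has handled this many requests.
# The default of 0 disables automatic restarts.
max_requests = 1000

# Add a random 0..max_requests_jitter to each worker's limit so that
# the workers do not all restart at the same moment.
max_requests_jitter = 50
```

Such a file is passed to gunicorn with the -c option.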
Updated by osapryki almost 5 years ago
- Description updated (diff)
@ironfroggy Limiting the number of requests per worker won't help, because the connection can terminate at any time within this limit.
RCA: Django manages connections implicitly. It sets up signal handlers that deal with dead or expired connections before and after each request [1]. The handler calls the close_if_unusable_or_obsolete method [2], which closes the connection if its age exceeds CONN_MAX_AGE or if the is_usable() [3] check fails.
[1] https://github.com/django/django/blob/stable/2.2.x/django/db/__init__.py#L60
[2] https://github.com/django/django/blob/stable/2.2.x/django/db/backends/base/base.py#L492
[3] https://github.com/django/django/blob/stable/2.2.x/django/db/backends/postgresql/base.py#L249
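To make the check concrete, here is a simplified stand-alone model of the decision described above. It is a sketch and not Django's code: the ModelConnection class, its fields, and the probe callable are hypothetical stand-ins for Django's connection wrapper and its is_usable() check.

```python
import time


class ModelConnection:
    """Hypothetical stand-in for a Django database connection wrapper."""

    def __init__(self, max_age, probe, clock=time.monotonic):
        self.max_age = max_age  # seconds; plays the role of CONN_MAX_AGE
        self.probe = probe      # callable; True if the link still works
        self.clock = clock
        self.opened_at = clock()
        self.is_open = True

    def close(self):
        self.is_open = False

    def close_if_unusable_or_obsolete(self):
        """Close the connection when it is too old or fails the probe."""
        if not self.is_open:
            return
        too_old = self.max_age is not None and (
            self.clock() - self.opened_at >= self.max_age
        )
        if too_old or not self.probe():
            self.close()


# A young connection that passes the probe stays open.
healthy = ModelConnection(max_age=600, probe=lambda: True)
healthy.close_if_unusable_or_obsolete()
print(healthy.is_open)

# A connection that fails the probe (for example, dropped by the
# network) is closed by the check.
dropped = ModelConnection(max_age=600, probe=lambda: False)
dropped.close_if_unusable_or_obsolete()
print(dropped.is_open)
```

The point of the model is that the check only helps if something calls it; in Django that is the job of the signal handlers in [1].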
Since content-app is an asyncio application and uses the Django connection, it shares a single connection between coroutines. A possible solution is to close the connection when the request handler returns a response, or after a significant batch of database queries.
from django.db import connection
# Call this after a significant block of database work, or at the beginning or the end of the request handler.
connection.close()
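Rather than repeating that call by hand in each handler, the reset could be applied at the handler boundary with a decorator. The sketch below is only an assumption about how that might look, not the change that was merged: reset_connection_around and FakeConnection are hypothetical names, and in a real deployment the reset argument would be something like connection.close or Django's django.db.close_old_connections.

```python
import asyncio
import functools


def reset_connection_around(reset):
    """Build a decorator that calls reset() before and after a handler."""
    def decorator(handler):
        @functools.wraps(handler)
        async def wrapper(*args, **kwargs):
            reset()  # reset before the handler touches the database
            try:
                return await handler(*args, **kwargs)
            finally:
                reset()  # also runs when the handler raises
        return wrapper
    return decorator


class FakeConnection:
    """Hypothetical stand-in that only counts how often it is closed."""

    def __init__(self):
        self.close_calls = 0

    def close(self):
        self.close_calls += 1


fake = FakeConnection()


@reset_connection_around(fake.close)
async def stream_content(path):
    # A real handler would query the database here.
    return f"content for {path}"


# asyncio.run requires Python 3.7 or newer.
result = asyncio.run(stream_content("/pulp/content/foo/"))
print(result, fake.close_calls)
```

Passing the reset callable in, instead of importing it inside the decorator, keeps the sketch runnable without a configured Django project.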
Also, since the connection is shared between coroutines, you should be extremely careful to keep the connection state consistent across coroutine context switches. For example, make sure the context is never switched within a transaction.
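This hazard can be reproduced without a database. In the sketch below, SharedConnection is a hypothetical stand-in that only tracks whether a transaction is open. Two coroutines share it, and an await inside the transaction lets the second coroutine run its query inside the first one's transaction.

```python
import asyncio


class SharedConnection:
    """Hypothetical stand-in for one connection shared by all coroutines."""

    def __init__(self):
        self.in_transaction = False
        self.log = []

    def begin(self):
        self.in_transaction = True

    def commit(self):
        self.in_transaction = False

    def query(self, who):
        # Record whether the query ran while a transaction was open.
        self.log.append((who, self.in_transaction))


async def writer(conn):
    conn.begin()
    conn.query("writer")
    await asyncio.sleep(0)  # context switch while the transaction is open
    conn.commit()


async def reader(conn):
    conn.query("reader")  # meant to run outside any transaction


async def main():
    conn = SharedConnection()
    await asyncio.gather(writer(conn), reader(conn))
    return conn.log


log = asyncio.run(main())
print(log)  # [('writer', True), ('reader', True)]
```

Removing the await between begin() and commit() lets the writer finish its transaction before the reader runs, and the reader's entry becomes ('reader', False).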
Updated by daviddavis almost 5 years ago
- Triaged changed from No to Yes
- Sprint set to Sprint 66
Updated by daviddavis almost 5 years ago
- Status changed from NEW to POST
Added by daviddavis almost 5 years ago
Updated by daviddavis almost 5 years ago
- Status changed from POST to MODIFIED
Applied in changeset pulpcore|5aabb202dd59cbe2d30ef5ad91f01932c9ca041b.
Added by bmbouter almost 5 years ago
Revision 6fe74ee2 | View on GitHub
Add bugfix changelog entry for 6045
Added by daviddavis almost 5 years ago
Revision 3da5c467 | View on GitHub
Fix "connection already closed" error in content app
fixes #6045 https://pulp.plan.io/issues/6045
(cherry picked from commit 5aabb202dd59cbe2d30ef5ad91f01932c9ca041b)
Added by bmbouter almost 5 years ago
Revision 3293134e | View on GitHub
Add bugfix changelog entry for 6045
https://pulp.plan.io/issues/6045 re #6045
(cherry picked from commit 6fe74ee2978732c72dcfdfa8a6eeed5760a4dee7)
Updated by daviddavis almost 5 years ago
Applied in changeset pulpcore|3da5c4672a6b047e5eee9b58d6a9400676ea26b9.
Updated by bmbouter almost 5 years ago
- Status changed from MODIFIED to CLOSED - CURRENTRELEASE
Updated by ttereshc over 3 years ago
- Related to Issue #9276: Content app can have unusable/closed db connections in pulpcore 3.15/3.16 added