Actions
Issue #7051
closedpostgres sometimes fails to start
Start date:
Due date:
Estimated time:
Severity:
2. Medium
Version:
Platform Release:
OS:
Triaged:
Yes
Groomed:
No
Sprint Candidate:
No
Tags:
Single Container
Sprint:
Sprint 75
Quarter:
Description
This issue is a clone from Github issues: https://github.com/pulp/pulp-oci-images/issues/19
For some reason, I am regularly seeing that postgres fails to start on boot.
This does not happen all the time, but while obtaining information for this issue, out of 5 attempts, this failure occurred on the 5th run.
Roughly my process is to:
docker create --name pulp docker.io/pulp/pulp-fedora31:latest docker cp settings.py pulp:/etc/settings/settings.py docker cp admin_pass.prep pulp:/etc/cont-init.d/111-postgres docker start pulp
Here are the logs for such a failure:
[cont-init.d] 000-postgres: started service [312] postgres [cont-init.d] 000-postgres: wait for services startup... 2020-06-25 16:49:53.548 UTC [312] LOG: listening on IPv4 address "127.0.0.1", port 5432 2020-06-25 16:49:53.548 UTC [312] LOG: could not bind IPv6 address "::1": Cannot assign requested address 2020-06-25 16:49:53.548 UTC [312] HINT: Is another postmaster already running on port 5432? If not, wait a few seconds and retry. psql: could not connect to server: No such file or directory Is the server running locally and accepting connections on Unix domain socket "/var/run/postgresql/.s.PGSQL.5432"? [cont-init.d] 000-postgres: createuser pulp createuser: could not connect to database postgres: could not connect to server: No such file or directory Is the server running locally and accepting connections on Unix domain socket "/var/run/postgresql/.s.PGSQL.5432"? [cont-init.d] 000-postgres: createdb --encoding=utf-8 -O pulp pulp 2020-06-25 16:49:54.305 UTC [312] LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432" 2020-06-25 16:49:54.481 UTC [312] LOG: listening on Unix socket "/tmp/.s.PGSQL.5432" 2020-06-25 16:49:54.605 UTC [312] LOG: redirecting log output to logging collector process 2020-06-25 16:49:54.605 UTC [312] HINT: Future log output will appear in directory "log". [cont-init.d] 000-postgres: exited 0. [cont-init.d] 111-postgres: executing... [cont-init.d] 111-postgres: exited 0. [cont-init.d] zzz-postgres: executing... [cont-init.d] zzz-postgres: shutdown service [${PG_PID}] postgres [cont-init.d] zzz-postgres: exited 0. [cont-init.d] done. [services.d] starting services 2020-06-25 16:50:03.691 UTC [397] LOG: listening on IPv4 address "127.0.0.1", port 5432 2020-06-25 16:50:03.692 UTC [397] LOG: could not bind IPv6 address "::1": Cannot assign requested address 2020-06-25 16:50:03.692 UTC [397] HINT: Is another postmaster already running on port 5432? If not, wait a few seconds and retry. [services.d] done. 2020-06-25 16:50:03.768 UTC [397] LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432" 2020-06-25 16:50:03.823 UTC [397] LOG: listening on Unix socket "/tmp/.s.PGSQL.5432" 2020-06-25 16:50:03.858 UTC [397] LOG: redirecting log output to logging collector process 2020-06-25 16:50:03.858 UTC [397] HINT: Future log output will appear in directory "log". [2020-06-25 16:50:04 +0000] [399] [INFO] Starting gunicorn 20.0.4 [2020-06-25 16:50:04 +0000] [399] [INFO] Listening at: http://127.0.0.1:24816 (399) [2020-06-25 16:50:04 +0000] [399] [INFO] Using worker: aiohttp.GunicornWebWorker [2020-06-25 16:50:04 +0000] [431] [INFO] Booting worker with pid: 431 [2020-06-25 16:50:04 +0000] [432] [INFO] Booting worker with pid: 432 pulp: pulpcore.tasking.services.worker_watcher:INFO: Cleaning up shutdown worker '396@8c2dde412bd8'. Traceback (most recent call last): File "/usr/local/lib/python3.7/site-packages/django/db/backends/base/base.py", line 217, in ensure_connection self.connect() File "/usr/local/lib/python3.7/site-packages/django/db/backends/base/base.py", line 195, in connect self.connection = self.get_new_connection(conn_params) File "/usr/local/lib/python3.7/site-packages/django/db/backends/postgresql/base.py", line 178, in get_new_connection connection = Database.connect(**conn_params) File "/usr/lib64/python3.7/site-packages/psycopg2/__init__.py", line 130, in connect conn = _connect(dsn, connection_factory=connection_factory, **kwasync) psycopg2.OperationalError: FATAL: role "pulp" does not exist Removing --quiet from pg_isready results in the following output:
/var/run/postgresql:5432 - rejecting connections
or
/var/run/postgresql:5432 - no response
As such, pg_isready doesn't seem to be causing the loopwhilex to actually loop until postgres is ready.
[cont-init.d] 000-postgres: started service [312] postgres [cont-init.d] 000-postgres: wait for services startup... 2020-06-25 17:26:42.434 UTC [312] LOG: listening on IPv4 address "127.0.0.1", port 5432 2020-06-25 17:26:42.434 UTC [312] LOG: could not bind IPv6 address "::1": Cannot assign requested address 2020-06-25 17:26:42.434 UTC [312] HINT: Is another postmaster already running on port 5432? If not, wait a few seconds and retry. /var/run/postgresql:5432 - no response 2020-06-25 17:26:43.061 UTC [312] LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432" 2020-06-25 17:26:43.291 UTC [312] LOG: listening on Unix socket "/tmp/.s.PGSQL.5432" 2020-06-25 17:26:43.382 UTC [312] LOG: redirecting log output to logging collector process 2020-06-25 17:26:43.382 UTC [312] HINT: Future log output will appear in directory "log". psql: FATAL: the database system is starting up [cont-init.d] 000-postgres: createuser pulp createuser: could not connect to database postgres: FATAL: the database system is starting up
Updated by dkliban@redhat.com almost 4 years ago
- Status changed from NEW to CLOSED - CURRENTRELEASE
- Assignee set to sivel
- Triaged changed from No to Yes
- Sprint set to Sprint 75
Actions