Project

Profile

Help

Issue #7051

closed

postgres sometimes fails to start

Added by dkliban@redhat.com almost 4 years ago. Updated almost 3 years ago.

Status:
CLOSED - CURRENTRELEASE
Priority:
Normal
Assignee:
Category:
-
Sprint/Milestone:
-
Start date:
Due date:
Estimated time:
Severity:
2. Medium
Version:
Platform Release:
OS:
Triaged:
Yes
Groomed:
No
Sprint Candidate:
No
Tags:
Single Container
Sprint:
Sprint 75
Quarter:

Description

This issue is a clone from Github issues: https://github.com/pulp/pulp-oci-images/issues/19

For some reason, I am regularly seeing that postgres fails to start on boot.

This does not happen all the time, but while obtaining information for this issue, out of 5 attempts, this failure occurred on the 5th run.

Roughly my process is to:

docker create --name pulp docker.io/pulp/pulp-fedora31:latest
docker cp settings.py pulp:/etc/settings/settings.py
docker cp admin_pass.prep pulp:/etc/cont-init.d/111-postgres
docker start pulp

Here are the logs for such a failure:

[cont-init.d] 000-postgres: started service [312] postgres
[cont-init.d] 000-postgres: wait for services startup...
2020-06-25 16:49:53.548 UTC [312] LOG:  listening on IPv4 address "127.0.0.1", port 5432
2020-06-25 16:49:53.548 UTC [312] LOG:  could not bind IPv6 address "::1": Cannot assign requested address
2020-06-25 16:49:53.548 UTC [312] HINT:  Is another postmaster already running on port 5432? If not, wait a few seconds and retry.
psql: could not connect to server: No such file or directory
	Is the server running locally and accepting
	connections on Unix domain socket "/var/run/postgresql/.s.PGSQL.5432"?
[cont-init.d] 000-postgres: createuser pulp
createuser: could not connect to database postgres: could not connect to server: No such file or directory
	Is the server running locally and accepting
	connections on Unix domain socket "/var/run/postgresql/.s.PGSQL.5432"?
[cont-init.d] 000-postgres: createdb --encoding=utf-8 -O pulp pulp
2020-06-25 16:49:54.305 UTC [312] LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2020-06-25 16:49:54.481 UTC [312] LOG:  listening on Unix socket "/tmp/.s.PGSQL.5432"
2020-06-25 16:49:54.605 UTC [312] LOG:  redirecting log output to logging collector process
2020-06-25 16:49:54.605 UTC [312] HINT:  Future log output will appear in directory "log".
[cont-init.d] 000-postgres: exited 0.
[cont-init.d] 111-postgres: executing...
[cont-init.d] 111-postgres: exited 0.
[cont-init.d] zzz-postgres: executing...
[cont-init.d] zzz-postgres: shutdown service [${PG_PID}] postgres
[cont-init.d] zzz-postgres: exited 0.
[cont-init.d] done.
[services.d] starting services
2020-06-25 16:50:03.691 UTC [397] LOG:  listening on IPv4 address "127.0.0.1", port 5432
2020-06-25 16:50:03.692 UTC [397] LOG:  could not bind IPv6 address "::1": Cannot assign requested address
2020-06-25 16:50:03.692 UTC [397] HINT:  Is another postmaster already running on port 5432? If not, wait a few seconds and retry.
[services.d] done.
2020-06-25 16:50:03.768 UTC [397] LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2020-06-25 16:50:03.823 UTC [397] LOG:  listening on Unix socket "/tmp/.s.PGSQL.5432"
2020-06-25 16:50:03.858 UTC [397] LOG:  redirecting log output to logging collector process
2020-06-25 16:50:03.858 UTC [397] HINT:  Future log output will appear in directory "log".
[2020-06-25 16:50:04 +0000] [399] [INFO] Starting gunicorn 20.0.4
[2020-06-25 16:50:04 +0000] [399] [INFO] Listening at: http://127.0.0.1:24816 (399)
[2020-06-25 16:50:04 +0000] [399] [INFO] Using worker: aiohttp.GunicornWebWorker
[2020-06-25 16:50:04 +0000] [431] [INFO] Booting worker with pid: 431
[2020-06-25 16:50:04 +0000] [432] [INFO] Booting worker with pid: 432
pulp: pulpcore.tasking.services.worker_watcher:INFO: Cleaning up shutdown worker '396@8c2dde412bd8'.
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/django/db/backends/base/base.py", line 217, in ensure_connection
    self.connect()
  File "/usr/local/lib/python3.7/site-packages/django/db/backends/base/base.py", line 195, in connect
    self.connection = self.get_new_connection(conn_params)
  File "/usr/local/lib/python3.7/site-packages/django/db/backends/postgresql/base.py", line 178, in get_new_connection
    connection = Database.connect(**conn_params)
  File "/usr/lib64/python3.7/site-packages/psycopg2/__init__.py", line 130, in connect
    conn = _connect(dsn, connection_factory=connection_factory, **kwasync)
psycopg2.OperationalError: FATAL:  role "pulp" does not exist

Removing --quiet from pg_isready results in the following output:
/var/run/postgresql:5432 - rejecting connections

or

/var/run/postgresql:5432 - no response

As such, pg_isready doesn't seem to be causing the loopwhilex to actually loop until postgres is ready.

[cont-init.d] 000-postgres: started service [312] postgres
[cont-init.d] 000-postgres: wait for services startup...
2020-06-25 17:26:42.434 UTC [312] LOG:  listening on IPv4 address "127.0.0.1", port 5432
2020-06-25 17:26:42.434 UTC [312] LOG:  could not bind IPv6 address "::1": Cannot assign requested address
2020-06-25 17:26:42.434 UTC [312] HINT:  Is another postmaster already running on port 5432? If not, wait a few seconds and retry.
/var/run/postgresql:5432 - no response
2020-06-25 17:26:43.061 UTC [312] LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2020-06-25 17:26:43.291 UTC [312] LOG:  listening on Unix socket "/tmp/.s.PGSQL.5432"
2020-06-25 17:26:43.382 UTC [312] LOG:  redirecting log output to logging collector process
2020-06-25 17:26:43.382 UTC [312] HINT:  Future log output will appear in directory "log".
psql: FATAL:  the database system is starting up
[cont-init.d] 000-postgres: createuser pulp
createuser: could not connect to database postgres: FATAL:  the database system is starting up
Actions #1

Updated by dkliban@redhat.com almost 4 years ago

  • Category set to 27
Actions #2

Updated by dkliban@redhat.com almost 4 years ago

  • Status changed from NEW to CLOSED - CURRENTRELEASE
  • Assignee set to sivel
  • Triaged changed from No to Yes
  • Sprint set to Sprint 75
Actions #3

Updated by daviddavis almost 3 years ago

  • Tags Single Container added

Also available in: Atom PDF