Issue #8750
Closed: Deadlock on rpm repository pulp2pulp sync
Severity: 3. High
Triaged: Yes
Groomed: No
Sprint Candidate: No
Tags: Katello
Sprint: Sprint 106
Description
When running a repository sync, I hit the deadlock below.
The content is a mirror of "http://linux.dell.com/repo/hardware/dsu/os_independent/"
{'child_tasks': [],
'created_resources': [],
'error': {'description': 'deadlock detected\n'
'DETAIL: Process 31559 waits for ShareLock on '
'transaction 4972396; blocked by process 31883.\n'
'Process 31883 waits for ShareLock on transaction '
'4972398; blocked by process 31559.\n'
'HINT: See server log for query details.\n'
'CONTEXT: while inserting index tuple (49588,3) in '
'relation "rpm_package_pkgId_key"\n',
'traceback': ' File '
'"/opt/bats/lib/python3.8/site-packages/rq/worker.py", '
'line 1008, in perform_job\n'
' rv = job.perform()\n'
' File '
'"/opt/bats/lib/python3.8/site-packages/rq/job.py", '
'line 706, in perform\n'
' self._result = self._execute()\n'
' File '
'"/opt/bats/lib/python3.8/site-packages/rq/job.py", '
'line 729, in _execute\n'
' result = self.func(*self.args, **self.kwargs)\n'
' File '
'"/opt/bats/lib/python3.8/site-packages/pulp_rpm/app/tasks/synchronizing.py", '
'line 269, in synchronize\n'
' dv.create()\n'
' File '
'"/opt/bats/lib/python3.8/site-packages/pulpcore/plugin/stages/declarative_version.py", '
'line 149, in create\n'
' loop.run_until_complete(pipeline)\n'
' File '
'"/opt/bats/lib/python3.8/asyncio/base_events.py", '
'line 616, in run_until_complete\n'
' return future.result()\n'
' File '
'"/opt/bats/lib/python3.8/site-packages/pulpcore/plugin/stages/api.py", '
'line 225, in create_pipeline\n'
' await asyncio.gather(*futures)\n'
' File '
'"/opt/bats/lib/python3.8/site-packages/pulpcore/plugin/stages/api.py", '
'line 43, in __call__\n'
' await self.run()\n'
' File '
'"/opt/bats/lib/python3.8/site-packages/pulpcore/plugin/stages/content_stages.py", '
'line 96, in run\n'
' d_content.content.save()\n'
' File '
'"/opt/bats/lib/python3.8/site-packages/pulpcore/app/models/base.py", '
'line 149, in save\n'
' return super().save(*args, **kwargs)\n'
' File '
'"/opt/bats/lib/python3.8/site-packages/django_lifecycle/mixins.py", '
'line 134, in save\n'
' save(*args, **kwargs)\n'
' File '
'"/opt/bats/lib/python3.8/site-packages/django/db/models/base.py", '
'line 743, in save\n'
' self.save_base(using=using, '
'force_insert=force_insert,\n'
' File '
'"/opt/bats/lib/python3.8/site-packages/django/db/models/base.py", '
'line 780, in save_base\n'
' updated = self._save_table(\n'
' File '
'"/opt/bats/lib/python3.8/site-packages/django/db/models/base.py", '
'line 873, in _save_table\n'
' result = self._do_insert(cls._base_manager, '
'using, fields, update_pk, raw)\n'
' File '
'"/opt/bats/lib/python3.8/site-packages/django/db/models/base.py", '
'line 910, in _do_insert\n'
' return manager._insert([self], fields=fields, '
'return_id=update_pk,\n'
' File '
'"/opt/bats/lib/python3.8/site-packages/django/db/models/manager.py", '
'line 82, in manager_method\n'
' return getattr(self.get_queryset(), name)(*args, '
'**kwargs)\n'
' File '
'"/opt/bats/lib/python3.8/site-packages/django/db/models/query.py", '
'line 1186, in _insert\n'
' return '
'query.get_compiler(using=using).execute_sql(return_id)\n'
' File '
'"/opt/bats/lib/python3.8/site-packages/django/db/models/sql/compiler.py", '
'line 1377, in execute_sql\n'
' cursor.execute(sql, params)\n'
' File '
'"/opt/bats/lib/python3.8/site-packages/django/db/backends/utils.py", '
'line 67, in execute\n'
' return self._execute_with_wrappers(sql, params, '
'many=False, executor=self._execute)\n'
' File '
'"/opt/bats/lib/python3.8/site-packages/django/db/backends/utils.py", '
'line 76, in _execute_with_wrappers\n'
' return executor(sql, params, many, context)\n'
' File '
'"/opt/bats/lib/python3.8/site-packages/django/db/backends/utils.py", '
'line 84, in _execute\n'
' return self.cursor.execute(sql, params)\n'
' File '
'"/opt/bats/lib/python3.8/site-packages/django/db/utils.py", '
'line 89, in __exit__\n'
' raise dj_exc_value.with_traceback(traceback) from '
'exc_value\n'
' File '
'"/opt/bats/lib/python3.8/site-packages/django/db/backends/utils.py", '
'line 84, in _execute\n'
' return self.cursor.execute(sql, params)\n'},
'finished_at': datetime.datetime(2021, 5, 13, 16, 4, 16, 230996, tzinfo=tzutc()),
'logging_cid': '01b00ffb6c684f1a83bd46878184fe5b',
'name': 'pulp_rpm.app.tasks.synchronizing.synchronize',
'parent_task': None,
'progress_reports': [{'code': 'downloading.metadata',
'done': 5,
'message': 'Downloading Metadata Files',
'state': 'canceled',
'suffix': None,
'total': None},
{'code': 'sync.downloading.artifacts',
'done': 5501,
'message': 'Downloading Artifacts',
'state': 'canceled',
'suffix': None,
'total': None},
{'code': 'associating.content',
'done': 5501,
'message': 'Associating Content',
'state': 'canceled',
'suffix': None,
'total': None},
{'code': 'parsing.advisories',
'done': 0,
'message': 'Parsed Advisories',
'state': 'completed',
'suffix': None,
'total': 0},
{'code': 'parsing.packages',
'done': 9010,
'message': 'Parsed Packages',
'state': 'canceled',
'suffix': None,
'total': 34644}],
'pulp_created': datetime.datetime(2021, 5, 13, 15, 8, 17, 924670, tzinfo=tzutc()),
'pulp_href': '/pulp/api/v3/tasks/c4ed6290-a764-4ec3-96a4-1b6ce55597f3/',
'reserved_resources_record': ['/pulp/api/v3/repositories/rpm/rpm/f8bfd15c-a831-4e63-a8f7-2fd3156c97af/',
'/pulp/api/v3/remotes/rpm/rpm/d12fcb7e-9c0f-4226-bf0a-42381f4d9b51/'],
'started_at': datetime.datetime(2021, 5, 13, 15, 8, 18, 18915, tzinfo=tzutc()),
'state': 'failed',
'task_group': None,
'worker': '/pulp/api/v3/workers/b63013da-896e-4882-884a-6f9b9b6410b0/'}
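For triage, the DETAIL part of the error description above can be turned into a small wait-for graph. The following is a minimal sketch, not part of Pulp: the helper names `parse_deadlock_detail` and `has_cycle` are hypothetical, and the regular expression assumes the DETAIL wording shown in this task's error (other PostgreSQL lock types may be phrased differently).

```python
import re

# Matches one "Process N waits for <mode> on transaction T; blocked by process M"
# entry, as it appears in the error description of the failed task above.
WAIT_RE = re.compile(
    r"Process (\d+) waits for (\w+) on transaction (\d+); blocked by process (\d+)"
)

# The description string from the task, with the adjacent string literals joined.
DESCRIPTION = (
    "deadlock detected\n"
    "DETAIL: Process 31559 waits for ShareLock on transaction 4972396; "
    "blocked by process 31883.\n"
    "Process 31883 waits for ShareLock on transaction 4972398; "
    "blocked by process 31559.\n"
    "HINT: See server log for query details.\n"
    'CONTEXT: while inserting index tuple (49588,3) in relation "rpm_package_pkgId_key"\n'
)


def parse_deadlock_detail(description):
    """Return (waiter_pid, lock_mode, txid, blocker_pid) tuples found in the text."""
    return [
        (int(waiter), mode, int(txid), int(blocker))
        for waiter, mode, txid, blocker in WAIT_RE.findall(description)
    ]


def has_cycle(edges):
    """True if following waiter -> blocker links from some process returns to it."""
    waits_for = {waiter: blocker for waiter, _, _, blocker in edges}
    for start in waits_for:
        seen = set()
        pid = start
        while pid in waits_for and pid not in seen:
            seen.add(pid)
            pid = waits_for[pid]
        if pid == start:
            return True
    return False


edges = parse_deadlock_detail(DESCRIPTION)
for waiter, mode, txid, blocker in edges:
    print(f"process {waiter} waits ({mode}, transaction {txid}) on process {blocker}")
print("cycle:", has_cycle(edges))
```

Here the two processes each wait on a transaction held by the other, which is the two-way cycle PostgreSQL reports as a deadlock.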
A little context.
I have 3 secondary Pulp3 servers syncing against a primary Pulp3 server. I kick off a sync of ~130 repositories per secondary in one go.
I've done this 3 times and have hit this deadlock on three occasions, so far only on this Dell DSU repository.
Let me know if there is any more information that I can provide.
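One common way this kind of deadlock arises is two concurrent transactions inserting overlapping rows into a unique index (here `rpm_package_pkgId_key`) in different orders. The sketch below is a simplified model of that situation and of the usual mitigation, a consistent insert order. It is an illustration only: the function `can_deadlock` and the package keys are hypothetical, and nothing here is taken from Pulp's code or claims to be the fix applied for this issue.

```python
from itertools import combinations


def can_deadlock(order_a, order_b):
    """Return True if two transactions inserting these keys can block each other.

    Simplified model: each insert holds its unique key until commit, and a
    second insert of the same key waits for the first transaction to finish.
    Keys are assumed to be unique within each order.
    """
    position_in_b = {key: index for index, key in enumerate(order_b)}
    # Keys both transactions insert, listed in transaction A's order.
    shared = [key for key in order_a if key in position_in_b]
    # combinations() yields (x, y) with x before y in A's order; a cycle is
    # possible when B takes that same pair in the opposite order.
    return any(position_in_b[x] > position_in_b[y] for x, y in combinations(shared, 2))


# Hypothetical package keys saved by two workers syncing similar content.
worker_1 = ["pkg-c", "pkg-a", "pkg-b"]
worker_2 = ["pkg-a", "pkg-b", "pkg-c"]

print(can_deadlock(worker_1, worker_2))                  # orders differ
print(can_deadlock(sorted(worker_1), sorted(worker_2)))  # both sorted the same way
```

When both workers insert in the same sorted order, one of them may still wait for the other, but the waits can no longer form a cycle.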
Fix occasional deadlocks when doing multiple similar syncs concurrently.
Forcing deadlocks takes a lot of time and a lot of running pulpcore-workers, so there is no specific CI test for this. A reproducer script that forces deadlocks to happen (and shows that they're fixed) is available here:
https://github.com/ggainey/pulp_startup/blob/main/8750_deadlocks/file_repro.sh
fixes #8750. [nocoverage]