Issue #7828
closed
Syncing two repos with the same content at the same time results in an error
Description
A Katello user reported the following error:
File "/usr/lib/python3.6/site-packages/pulpcore/app/models/base.py", line 115, in save
return super().save(*args, **kwargs)
File "/usr/lib/python3.6/site-packages/django_lifecycle/mixins.py", line 128, in save
save(*args, **kwargs)
File "/usr/lib/python3.6/site-packages/django/db/models/base.py", line 744, in save
force_update=force_update, update_fields=update_fields)
File "/usr/lib/python3.6/site-packages/django/db/models/base.py", line 782, in save_base
force_update, using, update_fields,
File "/usr/lib/python3.6/site-packages/django/db/models/base.py", line 873, in _save_table
result = self._do_insert(cls._base_manager, using, fields, update_pk, raw)
File "/usr/lib/python3.6/site-packages/django/db/models/base.py", line 911, in _do_insert
using=using, raw=raw)
File "/usr/lib/python3.6/site-packages/django/db/models/manager.py", line 82, in manager_method
return getattr(self.get_queryset(), name)(*args, **kwargs)
File "/usr/lib/python3.6/site-packages/django/db/models/query.py", line 1186, in _insert
return query.get_compiler(using=using).execute_sql(return_id)
File "/usr/lib/python3.6/site-packages/django/db/models/sql/compiler.py", line 1377, in execute_sql
cursor.execute(sql, params)
File "/usr/lib/python3.6/site-packages/django/db/backends/utils.py", line 67, in execute
return self._execute_with_wrappers(sql, params, many=False, executor=self._execute)
File "/usr/lib/python3.6/site-packages/django/db/backends/utils.py", line 76, in _execute_with_wrappers
return executor(sql, params, many, context)
File "/usr/lib/python3.6/site-packages/django/db/backends/utils.py", line 84, in _execute
return self.cursor.execute(sql, params)
File "/usr/lib/python3.6/site-packages/django/db/utils.py", line 89, in __exit__
raise dj_exc_value.with_traceback(traceback) from exc_value
File "/usr/lib/python3.6/site-packages/django/db/backends/utils.py", line 84, in _execute
return self.cursor.execute(sql, params)
django.db.utils.IntegrityError: ERROR: duplicate key value violates unique constraint "rpm_package_pkgId_key"
DETAIL: Key ("pkgId")=(6573346016ae0cbf54cfcfc2e59552128fdafebe) already exists.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/lib/python3.6/site-packages/rq/worker.py", line 936, in perform_job
rv = job.perform()
File "/usr/lib/python3.6/site-packages/rq/job.py", line 684, in perform
self._result = self._execute()
File "/usr/lib/python3.6/site-packages/rq/job.py", line 690, in _execute
return self.func(*self.args, **self.kwargs)
File "/usr/lib/python3.6/site-packages/pulp_rpm/app/tasks/synchronizing.py", line 266, in synchronize
dv.create()
File "/usr/lib/python3.6/site-packages/pulpcore/plugin/stages/declarative_version.py", line 148, in create
loop.run_until_complete(pipeline)
File "/usr/lib64/python3.6/asyncio/base_events.py", line 484, in run_until_complete
return future.result()
File "/usr/lib/python3.6/site-packages/pulpcore/plugin/stages/api.py", line 225, in create_pipeline
await asyncio.gather(*futures)
File "/usr/lib/python3.6/site-packages/pulpcore/plugin/stages/api.py", line 43, in __call__
await self.run()
File "/usr/lib/python3.6/site-packages/pulpcore/plugin/stages/content_stages.py", line 105, in run
d_content.content.q()
File "/usr/lib/python3.6/site-packages/django/db/models/manager.py", line 82, in manager_method
return getattr(self.get_queryset(), name)(*args, **kwargs)
File "/usr/lib/python3.6/site-packages/django/db/models/query.py", line 408, in get
self.model._meta.object_name
pulp_rpm.app.models.package.Package.DoesNotExist: Package matching query does not exist.
The error occurs when trying to sync two repos with the same content at the same time. The user's report:
When syncing the Oracle Linux repositories "Oracle Linux 7 (x86_64) Latest" (http://yum.oracle.com/repo/OracleLinux/OL7/latest/x86_64) and "Oracle Linux 7 (x86_64) Optional Latest" (http://yum.oracle.com/repo/OracleLinux/OL7/optional/latest/x86_64), the synchronization of whichever repo is synced second fails with the error "Package matching query does not exist".
Looking at the pulp worker logs, I can see that the database throws a duplicate key error because some packages are present in both repositories.
Examples: c-ares-devel-1.10.0-3.el7.i686.rpm, c-ares-devel-1.10.0-3.el7.x86_64.rpm, cdparanoia-10.2-17.el7.x86_64.rpm
I found this behaviour in both Katello 3.16 and Katello 3.17RC2.
Since I cannot change the upstream repositories, it would be nice to have one of two options: either a way to filter out packages for specific repositories, as in Content Views, so that they are not synced, or error handling that detects the duplicate and links the existing package to the other repository as well (no duplicate in the file system, but an additional reference in the database).
I attached a PDF with the error in the Foreman interface, the pulp worker log with the error and the corresponding entry from the database.
For more info, see: https://projects.theforeman.org/issues/31254
Related issues