Issue #8633

PulpImport with overlapping content can fail with unique-constraint violation

Added by ggainey 5 months ago. Updated about 2 months ago.

Status:
CLOSED - CURRENTRELEASE
Priority:
Normal
Assignee:
Category:
-
Sprint/Milestone:
Start date:
Due date:
Estimated time:
Severity:
2. Medium
Version:
Platform Release:
OS:
Triaged:
Yes
Groomed:
No
Sprint Candidate:
No
Tags:
Sprint:
Sprint 97
Quarter:

Description

With the fix for #7904 in place for pulp_rpm, importing repos with overlapping content can fail with a unique-constraint violation like the following:

    "error": {
      "traceback": "  File \"/usr/local/lib/pulp/lib64/python3.9/site-packages/rq/worker.py\", line 1008, in perform_job
    rv = job.perform()
  File \"/usr/local/lib/pulp/lib64/python3.9/site-packages/rq/job.py\", line 706, in perform
    self._result = self._execute()
  File \"/usr/local/lib/pulp/lib64/python3.9/site-packages/rq/job.py\", line 729, in _execute
    result = self.func(*self.args, **self.kwargs)
  File \"/home/vagrant/devel/pulpcore/pulpcore/app/tasks/importer.py\", line 161, in import_repository_version
    a_result = _import_file(os.path.join(rv_path, filename), res_class)
  File \"/home/vagrant/devel/pulpcore/pulpcore/app/tasks/importer.py\", line 65, in _import_file
    return resource.import_data(data, raise_errors=True)
  File \"/usr/local/lib/pulp/lib64/python3.9/site-packages/import_export/resources.py\", line 741, in import_data
    return self.import_data_inner(dataset, dry_run, raise_errors, using_transactions, collect_failed_rows, **kwargs)
  File \"/usr/local/lib/pulp/lib64/python3.9/site-packages/import_export/resources.py\", line 788, in import_data_inner
    raise row_result.errors[-1].error
  File \"/usr/local/lib/pulp/lib64/python3.9/site-packages/import_export/resources.py\", line 668, in import_row
    self.save_instance(instance, using_transactions, dry_run)
  File \"/usr/local/lib/pulp/lib64/python3.9/site-packages/import_export/resources.py\", line 446, in save_instance
    instance.save()
  File \"/home/vagrant/devel/pulpcore/pulpcore/app/models/base.py\", line 149, in save
    return super().save(*args, **kwargs)
  File \"/usr/local/lib/pulp/lib64/python3.9/site-packages/django_lifecycle/mixins.py\", line 134, in save
    save(*args, **kwargs)
  File \"/usr/local/lib/pulp/lib64/python3.9/site-packages/django/db/models/base.py\", line 743, in save
    self.save_base(using=using, force_insert=force_insert,
  File \"/usr/local/lib/pulp/lib64/python3.9/site-packages/django/db/models/base.py\", line 780, in save_base
    updated = self._save_table(
  File \"/usr/local/lib/pulp/lib64/python3.9/site-packages/django/db/models/base.py\", line 873, in _save_table
    result = self._do_insert(cls._base_manager, using, fields, update_pk, raw)
  File \"/usr/local/lib/pulp/lib64/python3.9/site-packages/django/db/models/base.py\", line 910, in _do_insert
    return manager._insert([self], fields=fields, return_id=update_pk,
  File \"/usr/local/lib/pulp/lib64/python3.9/site-packages/django/db/models/manager.py\", line 82, in manager_method
    return getattr(self.get_queryset(), name)(*args, **kwargs)
  File \"/usr/local/lib/pulp/lib64/python3.9/site-packages/django/db/models/query.py\", line 1186, in _insert
    return query.get_compiler(using=using).execute_sql(return_id)
  File \"/usr/local/lib/pulp/lib64/python3.9/site-packages/django/db/models/sql/compiler.py\", line 1377, in execute_sql
    cursor.execute(sql, params)
  File \"/usr/local/lib/pulp/lib64/python3.9/site-packages/django/db/backends/utils.py\", line 67, in execute
    return self._execute_with_wrappers(sql, params, many=False, executor=self._execute)
  File \"/usr/local/lib/pulp/lib64/python3.9/site-packages/django/db/backends/utils.py\", line 76, in _execute_with_wrappers
    return executor(sql, params, many, context)
  File \"/usr/local/lib/pulp/lib64/python3.9/site-packages/django/db/backends/utils.py\", line 84, in _execute
    return self.cursor.execute(sql, params)
  File \"/usr/local/lib/pulp/lib64/python3.9/site-packages/django/db/utils.py\", line 89, in __exit__
    raise dj_exc_value.with_traceback(traceback) from exc_value
  File \"/usr/local/lib/pulp/lib64/python3.9/site-packages/django/db/backends/utils.py\", line 84, in _execute
    return self.cursor.execute(sql, params)
",
      "description": "duplicate key value violates unique constraint \"rpm_package_pkgId_key\"
DETAIL:  Key (\"pkgId\")=(4c61ae66f89a55a74dabec2dfe408df3befbf6baadc9c03f70d496b9eab34d3e) already exists.
"
    },

This happens because a) repo-versions are imported in parallel, which b) exercises a race condition in django-import-export's get_or_init_instance() code.
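
To illustrate the kind of race described here, the following is a minimal sketch of a check-then-insert pattern run by two concurrent workers. It does not use django-import-export or a real database: `FakeTable`, `UniqueViolation`, and `racy_get_or_init` are hypothetical stand-ins, and the barrier exists only to force the unlucky interleaving deterministically.

```python
import threading


class UniqueViolation(Exception):
    """Hypothetical stand-in for the database's unique-constraint error."""


class FakeTable:
    """In-memory stand-in for a table with a unique key (an assumption, not Pulp code)."""

    def __init__(self):
        self._rows = {}
        self._lock = threading.Lock()

    def exists(self, key):
        with self._lock:
            return key in self._rows

    def insert(self, key, value):
        with self._lock:
            if key in self._rows:
                raise UniqueViolation(f"duplicate key {key!r}")
            self._rows[key] = value


def racy_get_or_init(table, key, value, barrier, errors):
    # Step 1: look the row up. Both workers run this before either inserts.
    found = table.exists(key)
    # Force the bad interleaving: wait until both workers finished the lookup.
    barrier.wait()
    # Step 2: insert because the lookup said "not there"; one worker loses.
    if not found:
        try:
            table.insert(key, value)
        except UniqueViolation as exc:
            errors.append(exc)


def demo():
    table = FakeTable()
    barrier = threading.Barrier(2)
    errors = []
    workers = [
        threading.Thread(
            target=racy_get_or_init,
            args=(table, "pkg-a", n, barrier, errors),
        )
        for n in range(2)
    ]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    # Number of workers that hit the duplicate-key error.
    return len(errors)
```

In this sketch both workers see the key as missing, so exactly one insert succeeds and the other raises the duplicate-key error, which mirrors the failure shown in the traceback above.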

This has to be addressed for the fix to #7904 to be useful.


Related issues

Related to RPM Support - Issue #7904: PulpImport can deadlock when importing Centos*-base and app-stream in one import file (CLOSED - CURRENTRELEASE)
Related to Pulp - Issue #8967: "duplicate key value violates unique constraint" when syncing two repositories with identical content in parallel (NEW)

Associated revisions

Revision d621ebff View on GitHub
Added by ggainey 4 months ago

Taught PulpImporter to retry (once) on _import_file() failure.

There's a race condition in django-import-export's get_or_init_instance() that is exercised by importing repo-versions concurrently. We attempt an import and check for errors, retrying ONCE if encountered. On a second error, fail the attempt.

The test added for pulp_rpm #7904 covers this case.

fixes #8633 [nocoverage]
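
The retry-once approach described in this commit message can be sketched roughly as below. This is not the code from the revision: `import_with_one_retry`, `DuplicateKeyError`, and `make_flaky_import` are hypothetical names used only for illustration, and the real change may catch a different exception type.

```python
class DuplicateKeyError(Exception):
    """Hypothetical stand-in for the unique-constraint error raised on import."""


def import_with_one_retry(import_fn, *args, **kwargs):
    """Call import_fn; on a duplicate-key error, try exactly one more time."""
    try:
        return import_fn(*args, **kwargs)
    except DuplicateKeyError:
        # Second and final attempt: an error raised here propagates to the caller.
        return import_fn(*args, **kwargs)


def make_flaky_import(failures):
    """Build a fake import function that fails the first `failures` calls."""
    state = {"calls": 0}

    def flaky_import(path):
        state["calls"] += 1
        if state["calls"] <= failures:
            raise DuplicateKeyError(path)
        return f"imported {path}"

    return flaky_import, state
```

A single retry plausibly suffices for this race because, by the second attempt, the row inserted by the competing worker already exists, so the lookup finds it instead of attempting another insert.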

History

#1 Updated by ggainey 5 months ago

  • Related to Issue #7904: PulpImport can deadlock when importing Centos*-base and app-stream in one import file added

#2 Updated by pulpbot 5 months ago

  • Status changed from ASSIGNED to POST

#3 Updated by fao89 5 months ago

  • Triaged changed from No to Yes
  • Sprint set to Sprint 95

#4 Updated by rchan 5 months ago

  • Sprint changed from Sprint 95 to Sprint 96

#5 Updated by rchan 4 months ago

  • Sprint changed from Sprint 96 to Sprint 97

#6 Updated by ggainey 4 months ago

  • Status changed from POST to MODIFIED

#7 Updated by dalley 4 months ago

  • Sprint/Milestone set to 3.13.0

#8 Updated by pulpbot 4 months ago

  • Status changed from MODIFIED to CLOSED - CURRENTRELEASE

#9 Updated by dalley 3 months ago

  • Related to Issue #8967: "duplicate key value violates unique constraint" when syncing two repositories with identical content in parallel added
