Issue #4060

closed

QueryExistingArtifacts stage does not prevent duplicates within a stream

Added by amacdona@redhat.com over 5 years ago. Updated over 4 years ago.

Status: CLOSED - CURRENTRELEASE
Priority: Normal
Category: -
Sprint/Milestone:
Start date:
Due date:
Estimated time:
Severity: 2. Medium
Version:
Platform Release:
OS:
Triaged: Yes
Groomed: No
Sprint Candidate: No
Tags:
Sprint: Sprint 45
Quarter:

Description

Note: This problem occurs because content_2 (below) is passed through the QueryExisting stage twice, and the failure then surfaces in the save stage. As a result, 2 or more dupes are in the queue for the save stage at the same time. Whether the dupes are saved in separate batches or together in 1 batch, the save fails with an integrity error.

I've encountered this while implementing Docker sync, so there isn't a simple reproducer yet. This issue occurs when a sync has multiple metadata files that refer to the same content, and therefore to the same artifact.

metadata_1.artifacts = [content_1, content_2]
metadata_2.artifacts = [content_2, content_3]

content_2 and its artifact are able to pass through the pipeline twice, resulting in a duplicate key.

django.db.utils.IntegrityError: duplicate key value violates unique constraint "pulp_app_artifact_sha256_key"
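A minimal sketch of how the duplicate enters the stream, assuming a simplified model in which each metadata file just emits the names of its artifacts. The names `emit_stream` and `drop_seen` are hypothetical and are not part of the Pulp stages API; the seen-set filter is one possible way to deduplicate, not necessarily the fix that was applied.

```python
from typing import Dict, Iterable, Iterator, List

# Hypothetical stand-ins for the two metadata files from the report.
metadata: Dict[str, List[str]] = {
    "metadata_1": ["content_1", "content_2"],
    "metadata_2": ["content_2", "content_3"],
}


def emit_stream(files: Dict[str, List[str]]) -> Iterator[str]:
    """Emit every artifact referenced by every metadata file, in order."""
    for artifacts in files.values():
        yield from artifacts


def drop_seen(stream: Iterable[str]) -> Iterator[str]:
    """Pass each item through only the first time it appears in the stream."""
    seen = set()
    for item in stream:
        if item in seen:
            continue  # already emitted earlier in this stream
        seen.add(item)
        yield item


raw = list(emit_stream(metadata))   # content_2 appears twice here
unique = list(drop_seen(raw))       # each item appears exactly once
```

A check against the database alone cannot catch this case, because neither copy of content_2 has been saved yet when both are queried.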

The actual error occurs in the ArtifactSaver stage during a bulk save (bulk_create), but AFAICT this stage needs to be able to rely on receiving unique units in order to save in bulk.
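The failure mode can be reproduced outside Pulp. The sketch below uses the standard-library `sqlite3` module as a stand-in for the real database, with a hypothetical one-column `artifact` table and made-up digests; it is an illustration of a bulk insert hitting a unique constraint, not the ArtifactSaver code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: only the unique sha256 column matters for this sketch.
conn.execute("CREATE TABLE artifact (sha256 TEXT UNIQUE)")

# A batch in which the same digest appears twice (made-up values).
batch = [("aaa",), ("bbb",), ("bbb",)]


def bulk_insert(connection, rows):
    """Insert all rows in one transaction; roll back if any row fails."""
    with connection:
        connection.executemany("INSERT INTO artifact (sha256) VALUES (?)", rows)


try:
    bulk_insert(conn, batch)
    failed = False
except sqlite3.IntegrityError:
    # The duplicate digest violates the unique constraint and the batch is rolled back.
    failed = True

# Dropping duplicates (order preserved) before the insert avoids the error.
deduped = list(dict.fromkeys(batch))
bulk_insert(conn, deduped)
count = conn.execute("SELECT COUNT(*) FROM artifact").fetchone()[0]
```

Deduplicating a single batch only covers the case where both dupes land in the same batch. If they arrive in separate batches, the second insert still conflicts with the row saved by the first, so uniqueness has to be enforced across the whole stream.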

Full traceback:

Traceback (most recent call last):
  File "/home/vagrant/.virtualenvs/pulp/lib64/python3.6/site-packages/rq/worker.py", line 793, in perform_job
    rv = job.perform()
  File "/home/vagrant/.virtualenvs/pulp/lib64/python3.6/site-packages/rq/job.py", line 599, in perform
    self._result = self._execute()
  File "/home/vagrant/.virtualenvs/pulp/lib64/python3.6/site-packages/rq/job.py", line 605, in _execute
    return self.func(*self.args, **self.kwargs)
  File "/home/vagrant/devel/pulp_docker/pulp_docker/app/tasks/synchronizing.py", line 44, in synchronize
    DeclarativeVersion(first_stage, repository).create()
  File "/home/vagrant/devel/pulp/plugin/pulpcore/plugin/stages/declarative_version.py", line 125, in create
    loop.run_until_complete(pipeline)
  File "/usr/lib64/python3.6/asyncio/base_events.py", line 468, in run_until_complete
    return future.result()
  File "/home/vagrant/devel/pulp/plugin/pulpcore/plugin/stages/api.py", line 128, in create_pipeline
    await asyncio.gather(*futures)
  File "/home/vagrant/devel/pulp/plugin/pulpcore/plugin/stages/artifact_stages.py", line 323, in __call__
    Artifact.objects.bulk_create(artifacts_to_save)
  File "/home/vagrant/.virtualenvs/pulp/lib64/python3.6/site-packages/django/db/models/manager.py", line 82, in manager_method
    return getattr(self.get_queryset(), name)(*args, **kwargs)
  File "/home/vagrant/.virtualenvs/pulp/lib64/python3.6/site-packages/django/db/models/query.py", line 465, in bulk_create
    ids = self._batched_insert(objs_without_pk, fields, batch_size)
  File "/home/vagrant/.virtualenvs/pulp/lib64/python3.6/site-packages/django/db/models/query.py", line 1149, in _batched_insert
    inserted_id = self._insert(item, fields=fields, using=self.db, return_id=True)
  File "/home/vagrant/.virtualenvs/pulp/lib64/python3.6/site-packages/django/db/models/query.py", line 1136, in _insert
    return query.get_compiler(using=using).execute_sql(return_id)
  File "/home/vagrant/.virtualenvs/pulp/lib64/python3.6/site-packages/django/db/models/sql/compiler.py", line 1289, in execute_sql
    cursor.execute(sql, params)
  File "/home/vagrant/.virtualenvs/pulp/lib64/python3.6/site-packages/django/db/backends/utils.py", line 100, in execute
    return super().execute(sql, params)
  File "/home/vagrant/.virtualenvs/pulp/lib64/python3.6/site-packages/django/db/backends/utils.py", line 68, in execute
    return self._execute_with_wrappers(sql, params, many=False, executor=self._execute)
  File "/home/vagrant/.virtualenvs/pulp/lib64/python3.6/site-packages/django/db/backends/utils.py", line 77, in _execute_with_wrappers
    return executor(sql, params, many, context)
  File "/home/vagrant/.virtualenvs/pulp/lib64/python3.6/site-packages/django/db/backends/utils.py", line 85, in _execute
    return self.cursor.execute(sql, params)
  File "/home/vagrant/.virtualenvs/pulp/lib64/python3.6/site-packages/django/db/utils.py", line 89, in __exit__
    raise dj_exc_value.with_traceback(traceback) from exc_value
  File "/home/vagrant/.virtualenvs/pulp/lib64/python3.6/site-packages/django/db/backends/utils.py", line 85, in _execute
    return self.cursor.execute(sql, params)
django.db.utils.IntegrityError: duplicate key value violates unique constraint "pulp_app_artifact_sha256_key"
DETAIL:  Key (sha256)=(8ddc19f16526912237dd8af81971d5e4dd0587907234be2b83e249518d5b673f) already exists.

Related issues

Related to Pulp - Issue #4085: ContentUnitSaver stage is vulnerable to race conditions. (CLOSED - CURRENTRELEASE, dkliban@redhat.com)
Related to Container Support - Refactor #4177: Update sync to use ArtifactSaver Stage (CLOSED - CURRENTRELEASE, ipanova@redhat.com)
Has duplicate Pulp - Issue #4086: ArtifactSaver stage is vulnerable to race conditions. (CLOSED - DUPLICATE)
