Issue #8028

closed

Cannot sync Ubuntu repositories

Added by nhavens over 3 years ago. Updated over 2 years ago.

Status: CLOSED - WORKSFORME
Priority: Low
Assignee: -
Sprint/Milestone: -
Start date:
Due date:
Estimated time:
Severity: 2. Medium
Version - Debian:
Platform Release:
Target Release - Debian:
OS:
Triaged: Yes
Groomed: No
Sprint Candidate: No
Tags: Katello
Sprint:
Quarter:

Description

With the following remote configuration, I'm unable to sync to a Pulp repository:

{
  "pulp_href": "/pulp/api/v3/remotes/deb/apt/d5143b62-3f97-4f10-a466-734dcd4d5146/",
  "pulp_created": "2020-12-29T02:49:24.689377Z",
  "name": "ubuntu",
  "url": "http://mirror.us.leaseweb.net/ubuntu",
  "ca_cert": null,
  "client_cert": null,
  "client_key": null,
  "tls_validation": true,
  "proxy_url": null,
  "username": null,
  "password": null,
  "pulp_last_updated": "2020-12-29T03:06:33.546708Z",
  "download_concurrency": 10,
  "policy": "immediate",
  "total_timeout": null,
  "connect_timeout": null,
  "sock_connect_timeout": null,
  "sock_read_timeout": null,
  "distributions": "bionic bionic-backports bionic-proposed bionic-security bionic-updates",
  "components": null,
  "architectures": "amd64 i386",
  "sync_sources": false,
  "sync_udebs": true,
  "sync_installer": true,
  "gpgkey": null,
  "ignore_missing_package_indices": false
}

I get the following error message:

Dec 28 21:32:42 pulp-server rq[48697]: psycopg2.errors.UniqueViolation: duplicate key value violates unique constraint "core_artifact_sha256_key"    
Dec 28 21:32:42 pulp-server rq[48697]: DETAIL:  Key (sha256)=(e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855) already exists.

If I subsequently patch the remote to set ignore_missing_package_indices=true, the sync also fails, but with a different error message:

Dec 28 21:36:19 pulp-server rq[48741]: psycopg2.errors.UniqueViolation: duplicate key value violates unique constraint "core_contentartifact_content_id_relative_path_d2ef8231_uniq"
Dec 28 21:36:19 pulp-server rq[48741]: DETAIL:  Key (content_id, relative_path)=(6bfcee15-6b12-4c36-831d-9b34d1dc9cf8, dists/bionic-backports) already exists.

I'd be happy to provide full stack traces if that would be helpful.


Related issues

Related to Debian Support - Issue #6920: Pulp 3 - pulp-deb : Issue synchronizing bullseye-security repo (CLOSED - WORKSFORME)

Actions #1

Updated by quba42 over 3 years ago

The full trace for the error would be useful.

The only thing that stands out is that, with all of Ubuntu bionic and two architectures, this will be a large sync. (I don't think I have a test system with a large enough disk to reproduce it.) Of course, a large sync should nevertheless work!

The error sounds like there is some artifact that pulp_deb is trying to save more than once; perhaps the traceback can give us some clues as to what that artifact is. (We have the checksum, so searching through the package indices might also work...)
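
One way to run such a search, as a minimal sketch: the hypothetical helper below (not part of pulp_deb) scans the text of a Debian Packages index and returns the names of the stanzas whose SHA256 field matches a given digest. It assumes the index has already been downloaded and decompressed, and it only finds .deb artifacts; an artifact that is itself an index file would not show up this way.

```python
def find_packages_by_sha256(packages_text, digest):
    """Return names of package stanzas whose SHA256 field equals digest.

    Hypothetical helper for illustration. packages_text is the
    decompressed content of a Debian Packages index.
    """
    matches = []
    # Stanzas in a Packages index are separated by blank lines.
    for stanza in packages_text.split("\n\n"):
        fields = {}
        for line in stanza.splitlines():
            # Continuation lines start with whitespace; skip them.
            if line and not line[0].isspace() and ":" in line:
                key, _, value = line.partition(":")
                fields[key.strip()] = value.strip()
        if fields.get("SHA256", "").lower() == digest.lower():
            matches.append(fields.get("Package", "<unknown>"))
    return matches
```

An empty result for every index in the remote would suggest that the duplicate artifact is not a package file at all.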

Actions #2

Updated by quba42 over 3 years ago

The error for the sync with "ignore_missing_package_indices=true" looks odd, since it suggests that Pulp is trying to create an artifact with a relative path of dists/bionic-backports. But that path is a folder and not a file, and should never be used for any artifact... I will need to investigate.

Actions #3

Updated by nhavens over 3 years ago

Traceback for the duplicate key error with ignore_missing_package_indices=false:

  File \"/site/apps/pulp3/pulpenv/lib64/python3.6/site-packages/rq/worker.py\", line 975, in perform_job
    rv = job.perform()
  File \"/site/apps/pulp3/pulpenv/lib64/python3.6/site-packages/rq/job.py\", line 696, in perform
    self._result = self._execute()
  File \"/site/apps/pulp3/pulpenv/lib64/python3.6/site-packages/rq/job.py\", line 719, in _execute
    return self.func(*self.args, **self.kwargs)
  File \"/site/apps/pulp3/pulpenv/lib64/python3.6/site-packages/pulp_deb/app/tasks/synchronizing.py\", line 106, in synchronize
    DebDeclarativeVersion(first_stage, repository, mirror=mirror).create()
  File \"/site/apps/pulp3/pulpenv/lib64/python3.6/site-packages/pulpcore/plugin/stages/declarative_version.py\", line 148, in create
    loop.run_until_complete(pipeline)
  File \"/usr/lib64/python3.6/asyncio/base_events.py\", line 484, in run_until_complete
    return future.result()
  File \"/site/apps/pulp3/pulpenv/lib64/python3.6/site-packages/pulpcore/plugin/stages/api.py\", line 225, in create_pipeline
    await asyncio.gather(*futures)
  File \"/site/apps/pulp3/pulpenv/lib64/python3.6/site-packages/pulpcore/plugin/stages/api.py\", line 43, in __call__
    await self.run()
  File \"/site/apps/pulp3/pulpenv/lib64/python3.6/site-packages/pulp_deb/app/tasks/synchronizing.py\", line 304, in run 
    da.artifact.save()
  File \"/site/apps/pulp3/pulpenv/lib64/python3.6/site-packages/pulpcore/app/models/content.py\", line 126, in save
    super().save(*args, **kwargs)
  File \"/site/apps/pulp3/pulpenv/lib64/python3.6/site-packages/django_lifecycle/mixins.py\", line 129, in save
    save(*args, **kwargs)
  File \"/site/apps/pulp3/pulpenv/lib64/python3.6/site-packages/django/db/models/base.py\", line 744, in save
    force_update=force_update, update_fields=update_fields)
  File \"/site/apps/pulp3/pulpenv/lib64/python3.6/site-packages/django/db/models/base.py\", line 782, in save_base
    force_update, using, update_fields,
  File \"/site/apps/pulp3/pulpenv/lib64/python3.6/site-packages/django/db/models/base.py\", line 873, in _save_table
    result = self._do_insert(cls._base_manager, using, fields, update_pk, raw)
  File \"/site/apps/pulp3/pulpenv/lib64/python3.6/site-packages/django/db/models/base.py\", line 911, in _do_insert
    using=using, raw=raw)
  File \"/site/apps/pulp3/pulpenv/lib64/python3.6/site-packages/django/db/models/manager.py\", line 82, in manager_method
    return getattr(self.get_queryset(), name)(*args, **kwargs)
  File \"/site/apps/pulp3/pulpenv/lib64/python3.6/site-packages/django/db/models/query.py\", line 1186, in _insert
    return query.get_compiler(using=using).execute_sql(return_id)
  File \"/site/apps/pulp3/pulpenv/lib64/python3.6/site-packages/django/db/models/sql/compiler.py\", line 1377, in execute_sql
    cursor.execute(sql, params)
  File \"/site/apps/pulp3/pulpenv/lib64/python3.6/site-packages/django/db/backends/utils.py\", line 67, in execute
    return self._execute_with_wrappers(sql, params, many=False, executor=self._execute)
  File \"/site/apps/pulp3/pulpenv/lib64/python3.6/site-packages/django/db/backends/utils.py\", line 76, in _execute_with_wrappers
    return executor(sql, params, many, context)
  File \"/site/apps/pulp3/pulpenv/lib64/python3.6/site-packages/django/db/backends/utils.py\", line 84, in _execute
    return self.cursor.execute(sql, params)
  File \"/site/apps/pulp3/pulpenv/lib64/python3.6/site-packages/django/db/utils.py\", line 89, in __exit__
    raise dj_exc_value.with_traceback(traceback) from exc_value
  File \"/site/apps/pulp3/pulpenv/lib64/python3.6/site-packages/django/db/backends/utils.py\", line 84, in _execute
    return self.cursor.execute(sql, params)

Traceback for the "ContentArtifact matching query does not exist" error with ignore_missing_package_indices=true:

  File \"/site/apps/pulp3/pulpenv/lib64/python3.6/site-packages/rq/worker.py\", line 975, in perform_job
    rv = job.perform()
  File \"/site/apps/pulp3/pulpenv/lib64/python3.6/site-packages/rq/job.py\", line 696, in perform
    self._result = self._execute()
  File \"/site/apps/pulp3/pulpenv/lib64/python3.6/site-packages/rq/job.py\", line 719, in _execute
    return self.func(*self.args, **self.kwargs) 
  File \"/site/apps/pulp3/pulpenv/lib64/python3.6/site-packages/pulp_deb/app/tasks/synchronizing.py\", line 106, in synchronize
    DebDeclarativeVersion(first_stage, repository, mirror=mirror).create()
  File \"/site/apps/pulp3/pulpenv/lib64/python3.6/site-packages/pulpcore/plugin/stages/declarative_version.py\", line 148, in create
    loop.run_until_complete(pipeline)
  File \"/usr/lib64/python3.6/asyncio/base_events.py\", line 484, in run_until_complete
    return future.result()
  File \"/site/apps/pulp3/pulpenv/lib64/python3.6/site-packages/pulpcore/plugin/stages/api.py\", line 225, in create_pipeline
    await asyncio.gather(*futures)
  File \"/site/apps/pulp3/pulpenv/lib64/python3.6/site-packages/pulpcore/plugin/stages/api.py\", line 43, in __call__
    await self.run()
  File \"/site/apps/pulp3/pulpenv/lib64/python3.6/site-packages/pulpcore/plugin/stages/content_stages.py\", line 113, in run
    ContentArtifact.objects.bulk_get_or_create(content_artifact_bulk)
  File \"/site/apps/pulp3/pulpenv/lib64/python3.6/site-packages/pulpcore/app/models/content.py\", line 89, in bulk_get_or_create
    objs[i] = objs[i].__class__.objects.get(objs[i].q())
  File \"/site/apps/pulp3/pulpenv/lib64/python3.6/site-packages/django/db/models/manager.py\", line 82, in manager_method
    return getattr(self.get_queryset(), name)(*args, **kwargs)
  File \"/site/apps/pulp3/pulpenv/lib64/python3.6/site-packages/django/db/models/query.py\", line 408, in get
    self.model._meta.object_name

Actions #4

Updated by nhavens over 3 years ago

In case it's helpful, if I patch the remote to only include bionic bionic-security bionic-updates, the sync succeeds.

Actions #5

Updated by quba42 over 3 years ago

The minimal remote configuration with which I was able to reproduce this is as follows:

# Assumes BASE_ADDR is already set to the base URL of the Pulp API host,
# and that httpie (http) and jq are installed.
ENTITIES_NAME='bionic_backports'
UPSTREAM_URL='http://mirror.us.leaseweb.net/ubuntu'
ARCHITECTURES='amd64'
COMPONENTS='multiverse restricted'
DISTRIBUTIONS='bionic-backports'

# Create the repository and the remote, then start a sync task.
REPO_HREF=$(http post "${BASE_ADDR}"/pulp/api/v3/repositories/deb/apt/ name="${ENTITIES_NAME}" | jq -r '.pulp_href')
REMOTE_HREF=$(http post "${BASE_ADDR}"/pulp/api/v3/remotes/deb/apt/ name="${ENTITIES_NAME}" distributions="${DISTRIBUTIONS}" architectures="${ARCHITECTURES}" components="${COMPONENTS}" url="${UPSTREAM_URL}" | jq -r '.pulp_href')
SYNC_TASK_HREF=$(http post "${BASE_ADDR}${REPO_HREF}"sync/ remote="${REMOTE_HREF}" | jq -r '.task')
# Print the current state of the sync task (repeat until the task has finished).
http get "${BASE_ADDR}${SYNC_TASK_HREF}" | jq '.state'

The two components, multiverse and restricted, can each be synced individually, but not both within the same Pulp instance. It looks like something from those two components clashes within pulp_deb.

Actions #6

Updated by quba42 over 3 years ago

  • Triaged changed from No to Yes

Actions #7

Updated by quba42 over 3 years ago

Update: It looks like the problematic components are both empty (contain no packages), and Pulp is tripping over the fact that both Packages file artifacts being created are empty and hence identical. However, it is very strange that we are tripping over this, since we have been able to sync empty repositories (which should suffer from the same problem) and indeed we have test coverage for empty repositories.

See also: https://pulp.plan.io/issues/7344
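
The digest in the first error message of this issue is consistent with the empty-file explanation above: it is the SHA-256 of zero bytes of input. The sketch below checks that with Python's standard hashlib. It then uses an in-memory SQLite table, which is a stand-in chosen purely for illustration and not Pulp's actual PostgreSQL schema, to show how saving two empty files under a unique sha256 column yields the same kind of violation.

```python
import hashlib
import sqlite3

# Digest reported in the "core_artifact_sha256_key" error in the description.
REPORTED = "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"

# SHA-256 of empty input: any two empty files share this digest.
empty_digest = hashlib.sha256(b"").hexdigest()
print(empty_digest == REPORTED)

# Illustrative stand-in table with a unique sha256 column.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE artifact (sha256 TEXT UNIQUE)")
conn.execute("INSERT INTO artifact VALUES (?)", (empty_digest,))
try:
    # A second empty file hashes to the same value and is rejected.
    conn.execute("INSERT INTO artifact VALUES (?)", (empty_digest,))
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
print(duplicate_rejected)
```

Both checks should print True. This only shows why two empty index files collide on the checksum; it does not explain why other empty repositories sync without error.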

Actions #8

Updated by quba42 about 3 years ago

  • Sprint/Milestone set to Katello
  • Tags Katello added

Actions #9

Updated by quba42 almost 3 years ago

  • Priority changed from Normal to Low

Setting this to low priority, since there is a simple workaround: do not sync the problematic empty components.

It should still be investigated and fixed, though.

Actions #10

Updated by knzivid almost 3 years ago

I hit the same problem trying to sync "xenial xenial-updates xenial-backports" from http://no.archive.ubuntu.com/ubuntu/

How can I tell if the components are empty on this one too?
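
One way to check, sketched under assumptions: in the standard Debian repository layout, the binary index for a component lives at dists/<distribution>/<component>/binary-<architecture>/Packages (often published only as .gz or .xz). The hypothetical helper below takes the decompressed text of such a file and counts its package stanzas. Downloading and decompressing the file is left out.

```python
def count_packages(packages_text):
    """Count package stanzas in a decompressed Debian Packages index.

    Hypothetical helper for illustration; each stanza contains exactly
    one 'Package:' field at the start of a line.
    """
    return sum(
        1 for line in packages_text.splitlines()
        if line.startswith("Package:")
    )
```

A count of 0 for a given component and architecture means that index is empty, which matches the situation described earlier in this issue.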

Actions #11

Updated by quba42 almost 3 years ago

  • Related to Issue #6920: Pulp 3 - pulp-deb : Issue synchronizing bullseye-security repo added

Actions #12

Updated by quba42 over 2 years ago

  • Status changed from NEW to CLOSED - WORKSFORME

I can no longer reproduce this issue with the latest pulpcore and pulp_deb.

It looks like this was fixed as a side effect of something else.

I am closing this ticket as "CLOSED - WORKSFORME".

If somebody can find another reproducer, feel free to re-open or open a new ticket!

Actions #13

Updated by quba42 over 2 years ago

  • Sprint/Milestone deleted (Katello)
