Issue #6217

closed

First repo_version can contain duplicates

Added by ipanova@redhat.com about 4 years ago. Updated over 3 years ago.

Status:
CLOSED - CURRENTRELEASE
Priority:
Normal
Assignee:
-
Category:
-
Sprint/Milestone:
-
Start date:
Due date:
Estimated time:
Severity:
2. Medium
Version:
Platform Release:
OS:
Triaged:
Yes
Groomed:
No
Sprint Candidate:
No
Tags:
Sprint:
Quarter:

Description

remove_duplicates() checks for repo_key duplicates only by comparing two repo_versions. If this is the first ever repo_version being created and it already contains duplicates, they will be saved as is [0]:

if new_content_qs.count() and existing_content.count():
    _logger.debug(_("Removing duplicates for type: {}".format(type_obj.get_pulp_type())))
    <snip>

As you can see, existing_content.count() is zero when we are creating the first ever repo version, so the condition is false and duplicates are not removed from the repo_version.

[0] https://github.com/pulp/pulpcore/blob/master/pulpcore/plugin/repo_version_utils.py#L48
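To make the failure mode concrete, here is a minimal sketch in plain Python. It is a simplified model of the guard quoted above, not the pulpcore implementation: `Tag`, `repo_key`, `guarded_dedup`, and `dedup_within` are hypothetical names, lists stand in for querysets, and "keep the last unit per key" is an assumed policy.

```python
from collections import namedtuple

# Hypothetical stand-in for a content unit; real Pulp content units are Django models.
Tag = namedtuple("Tag", ["name", "manifest"])


def repo_key(unit, repo_key_fields=("name",)):
    """Build the uniqueness key of a unit from its repo key fields."""
    return tuple(getattr(unit, field) for field in repo_key_fields)


def guarded_dedup(existing, new):
    """Simplified model of the guard: act only when both sides are non-empty."""
    if len(new) and len(existing):
        new_keys = {repo_key(unit) for unit in new}
        # Drop existing units that collide with incoming ones.
        existing = [unit for unit in existing if repo_key(unit) not in new_keys]
    return list(existing) + list(new)


def dedup_within(units):
    """Assumed policy: keep the last unit seen for each repo key."""
    by_key = {}
    for unit in units:
        by_key[repo_key(unit)] = unit
    return list(by_key.values())


# First ever repo version: nothing exists yet, and the new content has a duplicate name.
first_version = [Tag("ml_i", "m1"), Tag("ml_i", "m2"), Tag("ml_ii", "m3")]

kept = guarded_dedup([], first_version)
fixed = dedup_within(first_version)
print(len(kept), len(fixed))
```

With an empty `existing` list the guard never fires, so both `ml_i` tags survive. Deduplicating within the new content itself, independent of whether a previous version exists, is one way the gap could be closed.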

Currently I can easily reproduce this when migrating docker content from pulp2 to pulp3.

Tag repo keys in pulp2: https://github.com/pulp/pulp_docker/blob/2-master/plugins/pulp_docker/plugins/models.py#L440
Tag repo_key_fields for pulp3: https://github.com/pulp/pulp_container/blob/master/pulp_container/app/models.py#L169

In pulp2 I have a docker repo that has 18 tags, of which only 9 have distinct names. In pulp3 I should therefore have 9 tags, but the first repo version contains all 18.

> db.units_docker_tag.count()
18

> db.units_docker_tag.distinct("name").length
9

$ http GET :24817/pulp/api/v3/repositories/container/container/e03616bb-9a1b-4b1c-bb8f-c7e3b0eb87e2/versions/1/
HTTP/1.1 200 OK
Allow: GET, DELETE, HEAD, OPTIONS
Connection: close
Content-Length: 1417
Content-Type: application/json
Date: Sun, 23 Feb 2020 18:08:15 GMT
Server: gunicorn/20.0.4
Vary: Accept, Cookie
X-Frame-Options: SAMEORIGIN

{
    "base_version": null,
    "content_summary": {
        "added": {
            "container.blob": {
                "count": 10,
                "href": "/pulp/api/v3/content/container/blobs/?repository_version_added=/pulp/api/v3/repositories/container/container/e03616bb-9a1b-4b1c-bb8f-c7e3b0eb87e2/versions/1/"
            },
            "container.manifest": {
                "count": 18,
                "href": "/pulp/api/v3/content/container/manifests/?repository_version_added=/pulp/api/v3/repositories/container/container/e03616bb-9a1b-4b1c-bb8f-c7e3b0eb87e2/versions/1/"
            },
            "container.tag": {
                "count": 18,
                "href": "/pulp/api/v3/content/container/tags/?repository_version_added=/pulp/api/v3/repositories/container/container/e03616bb-9a1b-4b1c-bb8f-c7e3b0eb87e2/versions/1/"
            }
        },
        "present": {
            "container.blob": {
                "count": 10,
                "href": "/pulp/api/v3/content/container/blobs/?repository_version=/pulp/api/v3/repositories/container/container/e03616bb-9a1b-4b1c-bb8f-c7e3b0eb87e2/versions/1/"
            },
            "container.manifest": {
                "count": 18,
                "href": "/pulp/api/v3/content/container/manifests/?repository_version=/pulp/api/v3/repositories/container/container/e03616bb-9a1b-4b1c-bb8f-c7e3b0eb87e2/versions/1/"
            },
            "container.tag": {
                "count": 18,
                "href": "/pulp/api/v3/content/container/tags/?repository_version=/pulp/api/v3/repositories/container/container/e03616bb-9a1b-4b1c-bb8f-c7e3b0eb87e2/versions/1/"
            }
        },
        "removed": {}
    },
    "number": 1,
    "pulp_created": "2020-02-23T18:08:00.140745Z",
    "pulp_href": "/pulp/api/v3/repositories/container/container/e03616bb-9a1b-4b1c-bb8f-c7e3b0eb87e2/versions/1/"
}

(pulp) [vagrant@pulp2-nightly-pulp3-source-centos7 ~]$ http GET :24817/pulp/api/v3/content/container/tags/?repository_version=/pulp/api/v3/repositories/container/container/e03616bb-9a1b-4b1c-bb8f-c7e3b0eb87e2/versions/1/
HTTP/1.1 200 OK
Allow: GET, HEAD, OPTIONS
Connection: close
Content-Length: 5916
Content-Type: application/json
Date: Sun, 23 Feb 2020 18:08:25 GMT
Server: gunicorn/20.0.4
Vary: Accept, Cookie
X-Frame-Options: SAMEORIGIN

{
    "count": 18,
    "next": null,
    "previous": null,
    "results": [
        {
            "artifact": "/pulp/api/v3/artifacts/d1fe4c88-ecab-4170-aff4-40c93aad851f/",
            "name": "manifest_c",
            "pulp_created": "2020-02-23T18:08:00.039824Z",
            "pulp_href": "/pulp/api/v3/content/container/tags/011823d9-df72-479e-a6b2-30bafc35d1ed/",
            "tagged_manifest": "/pulp/api/v3/content/container/manifests/3d818ea7-e327-4b84-aa7d-18ca2a5a9913/"
        },
        {
            "artifact": "/pulp/api/v3/artifacts/713cd855-d869-462a-82a0-f70da4cbf550/",
            "name": "ml_iii",
            "pulp_created": "2020-02-23T18:08:00.061319Z",
            "pulp_href": "/pulp/api/v3/content/container/tags/2d31ed64-82fd-448c-abd3-ca702a82efb2/",
            "tagged_manifest": "/pulp/api/v3/content/container/manifests/daf31263-088a-49a9-8ef1-5d297af89644/"
        },
        {
            "artifact": "/pulp/api/v3/artifacts/c5999abb-acca-4db0-a9c5-7b9d50da1e4f/",
            "name": "ml_i",
            "pulp_created": "2020-02-23T18:08:00.051736Z",
            "pulp_href": "/pulp/api/v3/content/container/tags/e17ebed2-a006-41a0-9bca-dca59bf7c3e0/",
            "tagged_manifest": "/pulp/api/v3/content/container/manifests/1ab986b6-d0f9-4c8b-b8c1-48fc4a711a37/"
        },
        {
            "artifact": "/pulp/api/v3/artifacts/e07234d0-4d24-456c-81f2-e129db487657/",
            "name": "manifest_b",
            "pulp_created": "2020-02-23T18:08:00.032676Z",
            "pulp_href": "/pulp/api/v3/content/container/tags/73a47aa0-310e-4588-98d0-4ef463ad1da4/",
            "tagged_manifest": "/pulp/api/v3/content/container/manifests/0f42ebbb-5fe3-4a2f-9db4-bf2d95508772/"
        },
        {
            "artifact": "/pulp/api/v3/artifacts/7495598c-1dc2-41b9-b0ca-58c18a3564b0/",
            "name": "ml_iv",
            "pulp_created": "2020-02-23T18:08:00.068199Z",
            "pulp_href": "/pulp/api/v3/content/container/tags/019a4ede-8e44-4e95-81fc-a722f681affc/",
            "tagged_manifest": "/pulp/api/v3/content/container/manifests/02629bc4-068a-4f64-bc48-ebffff460e77/"
        },
        {
            "artifact": "/pulp/api/v3/artifacts/e0f90edf-8cac-4711-b77f-0495e96c27fa/",
            "name": "ml_ii",
            "pulp_created": "2020-02-23T18:08:00.056432Z",
            "pulp_href": "/pulp/api/v3/content/container/tags/44ea6a74-146b-41c6-a10d-6b66f4d4f67d/",
            "tagged_manifest": "/pulp/api/v3/content/container/manifests/04eb5d66-2ed4-4f37-a222-6f8781b0f1e1/"
        },
        {
            "artifact": "/pulp/api/v3/artifacts/7852cb4c-12cb-4dfa-a462-0ca395de46d8/",
            "name": "ml_ii",
            "pulp_created": "2020-02-23T18:08:00.058945Z",
            "pulp_href": "/pulp/api/v3/content/container/tags/9ac86bd0-f2cb-43c2-9b2f-7a05bf6d6c96/",
            "tagged_manifest": "/pulp/api/v3/content/container/manifests/30588caf-1adf-4830-a207-b272e5035ec1/"
        },
        {
            "artifact": "/pulp/api/v3/artifacts/dc5bdd7f-6712-4d8b-846e-7dbb531bf4eb/",
            "name": "manifest_e",
            "pulp_created": "2020-02-23T18:08:00.049314Z",
            "pulp_href": "/pulp/api/v3/content/container/tags/41d70622-4159-465c-8275-4cff184f61d4/",
            "tagged_manifest": "/pulp/api/v3/content/container/manifests/17f21696-0914-47b5-a2a8-4fc58311236d/"
        },
        {
            "artifact": "/pulp/api/v3/artifacts/130f12d9-f9ce-4ad1-8b65-e0bfbd5bd4e8/",
            "name": "manifest_b",
            "pulp_created": "2020-02-23T18:08:00.034979Z",
            "pulp_href": "/pulp/api/v3/content/container/tags/2b72ef2a-0032-4e8b-ae32-5ef76eb7967b/",
            "tagged_manifest": "/pulp/api/v3/content/container/manifests/fc297c89-3fc3-4d0f-9bb5-d7224a603abe/"
        },
        {
            "artifact": "/pulp/api/v3/artifacts/7495598c-1dc2-41b9-b0ca-58c18a3564b0/",
            "name": "ml_iii",
            "pulp_created": "2020-02-23T18:08:00.063686Z",
            "pulp_href": "/pulp/api/v3/content/container/tags/a44ddeeb-5d45-4de6-bc8f-c71c3c0385ea/",
            "tagged_manifest": "/pulp/api/v3/content/container/manifests/02629bc4-068a-4f64-bc48-ebffff460e77/"
        },
        {
            "artifact": "/pulp/api/v3/artifacts/3c33bb45-1add-424b-98dd-560b9b2b1e0f/",
            "name": "manifest_a",
            "pulp_created": "2020-02-23T18:08:00.030177Z",
            "pulp_href": "/pulp/api/v3/content/container/tags/d06cf29b-2463-4083-b8e4-eae1dc658945/",
            "tagged_manifest": "/pulp/api/v3/content/container/manifests/e356e5fa-ec4a-4ea2-b25c-d4ea97c55e48/"
        },
        {
            "artifact": "/pulp/api/v3/artifacts/368f1d10-faaf-4743-9f19-cc0fd2db3a8d/",
            "name": "ml_iv",
            "pulp_created": "2020-02-23T18:08:00.065909Z",
            "pulp_href": "/pulp/api/v3/content/container/tags/5f347494-11c2-44dd-92f0-c81d9b2a866c/",
            "tagged_manifest": "/pulp/api/v3/content/container/manifests/50b528b7-8eb9-44be-92af-5c2dc87a0559/"
        },
        {
            "artifact": "/pulp/api/v3/artifacts/406b6000-4973-427b-930d-931176ab7e97/",
            "name": "manifest_d",
            "pulp_created": "2020-02-23T18:08:00.044550Z",
            "pulp_href": "/pulp/api/v3/content/container/tags/d068075c-8a9c-479d-8ac2-1a0e33e0a8fa/",
            "tagged_manifest": "/pulp/api/v3/content/container/manifests/cc1243bf-495b-4a66-8376-64572ce7080a/"
        },
        {
            "artifact": "/pulp/api/v3/artifacts/f4d8b643-ab94-4724-a84e-4fc24154c9c8/",
            "name": "manifest_a",
            "pulp_created": "2020-02-23T18:08:00.027247Z",
            "pulp_href": "/pulp/api/v3/content/container/tags/af2355ff-824d-4b5f-a202-0fcd3423a4bb/",
            "tagged_manifest": "/pulp/api/v3/content/container/manifests/c9a47f96-1a9d-4fda-92a5-038c180774c4/"
        },
        {
            "artifact": "/pulp/api/v3/artifacts/5b3ec4fa-50a9-44e6-8766-cf972571cf53/",
            "name": "manifest_d",
            "pulp_created": "2020-02-23T18:08:00.042140Z",
            "pulp_href": "/pulp/api/v3/content/container/tags/af766015-4dc6-42aa-99bb-1090cf649063/",
            "tagged_manifest": "/pulp/api/v3/content/container/manifests/cbcd6ae9-4d32-44c4-974a-369cbc5db9c1/"
        },
        {
            "artifact": "/pulp/api/v3/artifacts/2bd3cdde-6a53-4215-96e5-4f5f79168ebd/",
            "name": "manifest_e",
            "pulp_created": "2020-02-23T18:08:00.046863Z",
            "pulp_href": "/pulp/api/v3/content/container/tags/bb750482-c072-4813-9170-b071c8a6f6f0/",
            "tagged_manifest": "/pulp/api/v3/content/container/manifests/c4c77ec6-42c2-40e5-9f19-9afde52236f9/"
        },
        {
            "artifact": "/pulp/api/v3/artifacts/8b312539-7101-403a-aa2b-6222cf177ad7/",
            "name": "manifest_c",
            "pulp_created": "2020-02-23T18:08:00.037448Z",
            "pulp_href": "/pulp/api/v3/content/container/tags/2f2d774b-5b7d-4775-9539-dfccfaf0990f/",
            "tagged_manifest": "/pulp/api/v3/content/container/manifests/f39cb32d-5281-4199-9a59-4dbadc7ca153/"
        },
        {
            "artifact": "/pulp/api/v3/artifacts/02f9a804-bc6d-401c-88f9-b084603b8276/",
            "name": "ml_i",
            "pulp_created": "2020-02-23T18:08:00.054033Z",
            "pulp_href": "/pulp/api/v3/content/container/tags/bf22e7a9-a88f-4bf3-90be-f8a481022510/",
            "tagged_manifest": "/pulp/api/v3/content/container/manifests/08346abe-391f-43f7-9d8f-b487be36905f/"
        }
    ]
}
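The duplication in the response above can be checked with a short Python sketch. The tag names are copied from the "results" list in response order; the variable names are only for illustration.

```python
from collections import Counter

# Tag names copied from the "results" list above, in response order.
names = [
    "manifest_c", "ml_iii", "ml_i", "manifest_b", "ml_iv", "ml_ii",
    "ml_ii", "manifest_e", "manifest_b", "ml_iii", "manifest_a", "ml_iv",
    "manifest_d", "manifest_a", "manifest_d", "manifest_e", "manifest_c", "ml_i",
]

counts = Counter(names)
duplicated = sorted(name for name, count in counts.items() if count > 1)

print("tags in repo version:", len(names))
print("distinct tag names:", len(counts))
print("names appearing more than once:", duplicated)
```

Every one of the 9 distinct names appears exactly twice, which matches the 18 versus 9 counts from the pulp2 database queries.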

Related issues

Related to Pulp - Issue #6362: Check for duplicated content happens without plugin input (CLOSED - CURRENTRELEASE, daviddavis)

Actions #1

Updated by daviddavis about 4 years ago

  • Triaged changed from No to Yes
  • Sprint set to Sprint 67

Adding to sprint per triage.

Actions #2

Updated by daviddavis about 4 years ago

Note that we solved a very similar issue (#5567) which probably should have fixed this issue. We should test to confirm.

Actions #3

Updated by rchan about 4 years ago

  • Sprint changed from Sprint 67 to Sprint 68
Actions #4

Updated by ttereshc about 4 years ago

  • Related to Issue #6362: Check for duplicated content happens without plugin input added
Actions #5

Updated by rchan about 4 years ago

  • Sprint deleted (Sprint 68)
Actions #6

Updated by rchan about 4 years ago

Removing from sprint. Re-evaluate after some related changes land to determine what remains to be done.

Actions #7

Updated by dkliban@redhat.com over 3 years ago

  • Status changed from NEW to CLOSED - CURRENTRELEASE

Plugins now handle this use case.
