Project

Profile

Help

Issue #7887

If orphaned content is removed in pulp 2 between migration re-runs, FileNotFoundError is raised

Added by ttereshc about 2 months ago. Updated about 2 months ago.

Status:
NEW
Priority:
Normal
Assignee:
-
Start date:
Due date:
Estimated time:
Severity:
2. Medium
Platform Release:
OS:
Triaged:
No
Groomed:
No
Sprint Candidate:
No
Tags:
Sprint:
Quarter:

Description

It happens only if orphaned content hasn't been fully migrated with the first run.

To reproduce:

  • have orphaned content in pulp 2
  • run migration, stop it before the content migration is done (ensure that at least one of the orphaned content units was pre-migrated but not migrated.)
  • clean orphans in pulp2
  • run migration again
{
    "child_tasks": [],
    "created_resources": [
        "/pulp/api/v3/task-groups/f6d728d2-61ea-49f2-b461-f8e0baf33b9b/"
    ],
    "error": {
        "description": "[Errno 2] No such file or directory: '/var/lib/pulp/content/units/iso/d7/269cf0f9afe9445d5e31c82cf13be3b17f75af57312a1638659ac592c221fc/1.iso'",
        "traceback": "  File \"/usr/local/lib/pulp/lib64/python3.6/site-packages/rq/worker.py\", line 936, in perform_job\n    rv = job.perform()\n  File \"/usr/local/lib/pulp/lib64/python3.6/site-packages/rq/job.py\", line 684, in perform\n    self._result = self._execute()\n  File \"/usr/local/lib/pulp/lib64/python3.6/site-packages/rq/job.py\", line 690, in _execute\n    return self.func(*self.args, **self.kwargs)\n  File \"/home/vagrant/devel/pulp-2to3-migration/pulp_2to3_migration/app/tasks/migrate.py\", line 141, in migrate_from_pulp2\n    migrate_content(plan, skip_corrupted=skip_corrupted)\n  File \"/home/vagrant/devel/pulp-2to3-migration/pulp_2to3_migration/app/migration.py\", line 47, in migrate_content\n    plugin.migrator.migrate_content_to_pulp3(skip_corrupted=skip_corrupted)\n  File \"/home/vagrant/devel/pulp-2to3-migration/pulp_2to3_migration/app/plugin/iso/migrator.py\", line 64, in migrate_content_to_pulp3\n    loop.run_until_complete(dm.create())\n  File \"/usr/lib64/python3.6/asyncio/base_events.py\", line 484, in run_until_complete\n    return future.result()\n  File \"/home/vagrant/devel/pulp-2to3-migration/pulp_2to3_migration/app/plugin/content.py\", line 89, in create\n    await pipeline\n  File \"/home/vagrant/devel/pulpcore/pulpcore/plugin/stages/api.py\", line 225, in create_pipeline\n    await asyncio.gather(*futures)\n  File \"/home/vagrant/devel/pulpcore/pulpcore/plugin/stages/api.py\", line 43, in __call__\n    await self.run()\n  File \"/home/vagrant/devel/pulp-2to3-migration/pulp_2to3_migration/app/plugin/content.py\", line 178, in run\n    self.migrate_to_pulp3(cmodel, ctype)\n  File \"/home/vagrant/devel/pulp-2to3-migration/pulp_2to3_migration/app/plugin/content.py\", line 378, in migrate_to_pulp3\n    downloaded=pulp2content.downloaded\n  File \"/home/vagrant/devel/pulp-2to3-migration/pulp_2to3_migration/app/plugin/content.py\", line 128, in create_artifact\n    expected_size=expected_size)\n  File \"/home/vagrant/devel/pulpcore/pulpcore/app/models/content.py\", line 277, in init_and_validate\n    with open(file, \"rb\") as f:\n"
    },
    "finished_at": "2020-11-23T11:12:54.853991Z",
    "name": "pulp_2to3_migration.app.tasks.migrate.migrate_from_pulp2",
    "parent_task": null,
    "progress_reports": [
        {
            "code": "creating.repositories",
            "done": 0,
            "message": "Creating repositories in Pulp 3",
            "state": "completed",
            "suffix": null,
            "total": 0
        },
        {
            "code": "migrating.importers",
            "done": 0,
            "message": "Migrating importers to Pulp 3",
            "state": "completed",
            "suffix": null,
            "total": 0
        },
        {
            "code": "migrating.content",
            "done": 0,
            "message": "Migrating content to Pulp 3",
            "state": "failed",
            "suffix": null,
            "total": 0
        },
        {
            "code": "migrating.iso.content",
            "done": 13,
            "message": "Migrating iso content to Pulp 3 iso",
            "state": "failed",
            "suffix": null,
            "total": 276
        },
        {
            "code": "processing.repositories",
            "done": 4,
            "message": "Processing Pulp 2 repositories, importers, distributors",
            "state": "completed",
            "suffix": null,
            "total": 4
        },
        {
            "code": "premigrating.content.general",
            "done": 0,
            "message": "Pre-migrating Pulp 2 ISO content (general info)",
            "state": "completed",
            "suffix": null,
            "total": 0
        },
        {
            "code": "premigrating.content.detail",
            "done": 0,
            "message": "Pre-migrating Pulp 2 ISO content (detail info)",
            "state": "completed",
            "suffix": null,
            "total": 0
        }
    ],
    "pulp_created": "2020-11-23T11:12:54.394083Z",
    "pulp_href": "/pulp/api/v3/tasks/e1a1a18e-bebe-4f74-bde8-804432964765/",
    "reserved_resources_record": [
        "pulp_2to3_migration"
    ],
    "started_at": "2020-11-23T11:12:54.538645Z",
    "state": "failed",
    "task_group": "/pulp/api/v3/task-groups/f6d728d2-61ea-49f2-b461-f8e0baf33b9b/",
    "worker": "/pulp/api/v3/workers/74ba07cd-4830-4527-9c70-452f16fe2de5/"
}

History

#1 Updated by ttereshc about 2 months ago

  • Description updated (diff)

#2 Updated by dalley about 2 months ago

At the scales we are expecting, it is plausible that we could brute force this problem by:

  • Extract all of the pulp 2 content IDs from mongo
  • Extract all of the pulp 2 content IDs from the premigrated content table in postgresql
  • Dump them into sets
  • Take a difference of the sets
  • Delete the pre-migrated content which was removed from Pulp 2

MongoDB object IDs are 24 character long hexidecimal strings. 24 bytes x 2 lists x 4,000,000 content (which would be a very large installation) would yield about 200 megabytes of memory consumption. Since Python uses string interning and the contents of the lists are expected to be mostly duplicate, the actual number should be ~max 100 megabytes.

If we have to do this work anyway, we might also be able to use it to get rid of some corner cases.

https://github.com/pulp/pulp-2to3-migration/blob/master/pulp_2to3_migration/app/pre_migration.py#L140-L152 https://github.com/pulp/pulp-2to3-migration/blob/master/pulp_2to3_migration/app/pre_migration.py#L120-L124

Please register to edit this issue

Also available in: Atom PDF