Issue #1944

closed

YumMetadataFile copy does not save its new storage_path

Added by mhrivnak almost 8 years ago. Updated almost 4 years ago.

Status: CLOSED - CURRENTRELEASE
Priority: High
Sprint/Milestone: -
Start date:
Due date:
Estimated time:
Severity: 3. High
Version: 2.8.3
Platform Release: 2.8.5
OS:
Triaged: Yes
Groomed: No
Sprint Candidate: No
Tags: Pulp 2
Sprint: Sprint 3
Quarter:

Description

This bug was introduced in pulp 2.8.

When a YumMetadataFile is copied from repo A to repo B, a new unit gets created. A new copy of the file also gets made.

As of pulp 2.8.3, the code works like this:

1. create a new unit (or find a pre-existing one)
2. save the new unit
3. calculate a new storage path for the new unit
4. copy the file to the new storage path

The problem is that the new storage path calculated in step 3 never gets saved. Swapping the order of 2 and 3 fixes the problem.

The effect is that the new unit's storage_path references the old unit's file. If either unit is deleted, the file is deleted with it, leaving one or more remaining units referencing a nonexistent file.
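To make the ordering problem concrete, here is a minimal toy sketch. It does not use the real pulp models: ToyUnit, DB, copy_unit, and the "/content/<repo_id>/<filename>" layout are all hypothetical stand-ins, chosen only to show why a path set after save() never reaches the database.

```python
# Toy stand-ins only; none of this is the real pulp code.
DB = {}  # pretend database: repo_id -> saved fields


class ToyUnit(object):
    def __init__(self, repo_id, storage_path):
        self.repo_id = repo_id
        self.storage_path = storage_path

    def set_storage_path(self, filename):
        # Hypothetical layout: one directory per repo.
        self.storage_path = "/content/%s/%s" % (self.repo_id, filename)

    def save(self):
        # Persist a snapshot of the fields as they are right now.
        DB[self.repo_id] = {"storage_path": self.storage_path}


def copy_unit(unit, dest_repo_id, path_before_save):
    # The clone starts out with the source unit's storage path.
    new_unit = ToyUnit(dest_repo_id, unit.storage_path)
    filename = unit.storage_path.rsplit("/", 1)[-1]
    if path_before_save:
        # Fixed order: calculate the path, then save.
        new_unit.set_storage_path(filename)
        new_unit.save()
    else:
        # Buggy order: save, then calculate the path (in memory only).
        new_unit.save()
        new_unit.set_storage_path(filename)
    return new_unit


source = ToyUnit("A", "/content/A/productid")
source.save()

copy_unit(source, "B", path_before_save=False)
buggy_path = DB["B"]["storage_path"]  # still points at repo A's file

copy_unit(source, "C", path_before_save=True)
fixed_path = DB["C"]["storage_path"]  # points at repo C's own file
```

With the buggy order, the saved record for B keeps A's path, which is the shared-file situation described above.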

You can reproduce this by syncing a repo with a metadata file, such as a "productid", and then copying that unit to another repo. You can then see in the database that two units exist referencing the same file on disk.
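One way to spot affected units after reproducing is to group unit documents by storage path and look for paths shared by more than one unit. The sketch below works on plain dicts; the field names "_storage_path" and "repo_id" are assumptions based on the attributes used in the diff below, and loading the documents from the database is left out.

```python
from collections import defaultdict


def find_shared_storage_paths(units):
    """Return {path: [repo_id, ...]} for paths referenced by 2+ units.

    `units` is an iterable of dicts; the keys "_storage_path" and
    "repo_id" are assumed field names, not confirmed ones.
    """
    by_path = defaultdict(list)
    for unit in units:
        by_path[unit["_storage_path"]].append(unit["repo_id"])
    return dict(
        (path, sorted(repos))
        for path, repos in by_path.items()
        if len(repos) > 1
    )


# Made-up sample documents: B was copied from A with the buggy code.
sample = [
    {"repo_id": "A", "_storage_path": "/content/A/productid"},
    {"repo_id": "B", "_storage_path": "/content/A/productid"},
    {"repo_id": "C", "_storage_path": "/content/C/productid"},
]
shared = find_shared_storage_paths(sample)
```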

This diff corrects the behavior:

diff --git a/plugins/pulp_rpm/plugins/importers/yum/associate.py b/plugins/pulp_rpm/plugins/importers/yum/associate.py
index 2cb4088..b86f31c 100644
--- a/plugins/pulp_rpm/plugins/importers/yum/associate.py
+++ b/plugins/pulp_rpm/plugins/importers/yum/associate.py
@@ -354,6 +354,8 @@ def associate_copy_for_repo(unit, dest_repo, set_content=False):
     """
     new_unit = unit.clone()
     new_unit.repo_id = dest_repo.repo_id
+    # calculate a new storage path since the unit key has changed
+    new_unit.set_storage_path(os.path.basename(unit.storage_path))

     try:
         new_unit.save()
@@ -365,7 +367,6 @@ def associate_copy_for_repo(unit, dest_repo, set_content=False):
         new_unit.save()

     if set_content:
-        new_unit.set_storage_path(os.path.basename(unit._storage_path))
         new_unit.safe_import_content(unit._storage_path)

     repo_controller.associate_single_unit(repository=dest_repo, unit=new_unit)

A migration will also be required to fix up existing units. Probably the best we can do is re-calculate each unit's storage path, copy its referenced file to that location, and then save the unit with the new storage path.
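The migration steps could look roughly like the sketch below. It is not the actual migration: units are plain dicts, the keys "repo_id" and "storage_path" are placeholders, and the "<content_root>/<repo_id>/<filename>" layout is a hypothetical stand-in for whatever set_storage_path() really calculates.

```python
import os
import shutil
import tempfile


def migrate_unit(unit, content_root):
    """Give `unit` its own file; return True if anything changed."""
    filename = os.path.basename(unit["storage_path"])
    # Step 1: re-calculate the storage path (hypothetical layout).
    new_path = os.path.join(content_root, unit["repo_id"], filename)
    if new_path == unit["storage_path"]:
        return False  # already references its own file
    # Step 2: copy the referenced file to the new location.
    new_dir = os.path.dirname(new_path)
    if not os.path.isdir(new_dir):
        os.makedirs(new_dir)
    shutil.copyfile(unit["storage_path"], new_path)
    # Step 3: record the new path (a real migration would save the unit).
    unit["storage_path"] = new_path
    return True


# Small demonstration in a temporary directory.
root = tempfile.mkdtemp()
orig_path = os.path.join(root, "A", "productid")
os.makedirs(os.path.dirname(orig_path))
with open(orig_path, "w") as f:
    f.write("example metadata")

unit_a = {"repo_id": "A", "storage_path": orig_path}
unit_b = {"repo_id": "B", "storage_path": orig_path}  # affected copy

changed_a = migrate_unit(unit_a, root)
changed_b = migrate_unit(unit_b, root)
both_exist = os.path.isfile(unit_a["storage_path"]) and os.path.isfile(
    unit_b["storage_path"]
)
shutil.rmtree(root)
```

The sketch assumes the referenced file still exists; a unit whose file was already deleted would need separate handling.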

Unfortunately that doesn't fix published data. We might just have to live with that, and recommend that users re-publish any repos with YumMetadataFile units in them. This is the problem users could encounter:

Suppose a YumMetadataFile gets synced into repo A. It then gets copied from A to B, and from B to C. Then our migration runs, and it ensures each unit references a unique file. The problem is that all three publications have symlinks to the file in A.

If/when a sync of A replaces that file with a new one, B and C will have a published symlink that is broken. The easiest fix is to just re-publish B and C, but we generally try to avoid that in migrations.
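To find out which publications are affected, one option is to scan a publish directory for symlinks whose target no longer exists. This is a generic filesystem sketch; the directory names in the demonstration are made up and do not reflect pulp's real publish layout.

```python
import os
import shutil
import tempfile


def find_broken_symlinks(publish_root):
    """Return a sorted list of symlinks under publish_root with no target."""
    broken = []
    for dirpath, dirnames, filenames in os.walk(publish_root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            # islink() looks at the link itself; exists() follows it.
            if os.path.islink(path) and not os.path.exists(path):
                broken.append(path)
    return sorted(broken)


# Demonstration with made-up directories.
root = tempfile.mkdtemp()
content_dir = os.path.join(root, "content", "A")
publish_dir = os.path.join(root, "published", "B")
os.makedirs(content_dir)
os.makedirs(publish_dir)

target = os.path.join(content_dir, "productid")
with open(target, "w") as f:
    f.write("example metadata")
link = os.path.join(publish_dir, "productid")
os.symlink(target, link)

before = find_broken_symlinks(os.path.join(root, "published"))
os.remove(target)  # simulate repo A's file going away
after = find_broken_symlinks(os.path.join(root, "published"))
shutil.rmtree(root)
```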


Related issues

Related to RPM Support - Task #1935: Redesign the yum_repo_metadata_file model (CLOSED - WONTFIX)

Related to RPM Support - Issue #2248: metadata file copy results in error 'Content import of FILENAME failed - must be an existing file' (CLOSED - WONTFIX)
