Issue #9395

RemoteArtifacts are not being saved properly

Added by dalley 2 days ago. Updated 1 day ago.

Status: POST
Priority: High
Assignee:
Category: -
Sprint/Milestone:
Start date:
Due date:
Estimated time:
Severity: 4. Urgent
Version:
Platform Release:
OS:
Triaged: Yes
Groomed: No
Sprint Candidate: No
Tags:
Sprint: Sprint 105
Quarter:

Description

This is the result of a long discussion on the Katello forums: https://community.theforeman.org/t/katello-4-1-2-1-404-error-through-content-proxy-due-to-incorrect-location-href/24812/26?u=dralley

TL;DR: if you sync a repository on-demand multiple times against different upstream repos, only the first set of RemoteArtifacts is saved. If the layout of that first upstream repository changes, or the repository disappears, all of the saved URLs break. The RemoteArtifacts are then unusable, even though the content was supposed to have multiple potential sources.

I've confirmed this by syncing a single repository, changing the layout of that repository, and resyncing. Only the original RemoteArtifacts exist afterwards; the new ones are never created. A script is attached to demonstrate this (look in the DB afterwards to see the result).

This is especially severe because "metadata mirroring" and standard syncs produce entirely different layouts. Re-publishing a repository that was previously mirrored, or starting to mirror one that previously was not, changes all of the URLs and leaves the repository broken.

Attachment: no_new_remoteartifact.py (2.43 KB), added by dalley, 09/14/2021 05:33 AM

Related issues

Copied to Pulp - Backport #9400: Backport #9395 "RemoteArtifacts are not being saved properly" to 3.14.z (NEW)

History

#1 Updated by dalley 2 days ago

#2 Updated by dalley 2 days ago

  • Description updated (diff)

#3 Updated by dalley 2 days ago

  • Description updated (diff)

#4 Updated by dalley 2 days ago

  • Description updated (diff)

#5 Updated by dkliban@redhat.com 2 days ago

  • Triaged changed from No to Yes
  • Sprint set to Sprint 105

#6 Updated by dalley 1 day ago

More context: the way the uniqueness constraint for RemoteArtifact is set up precludes us from having more than one RemoteArtifact saved per content artifact for any given remote.

https://github.com/pulp/pulpcore/blob/master/pulpcore/app/models/content.py#L652

This means that if the remote URL changes but the "relative path" stays the same, new RemoteArtifacts cannot be created because of this constraint. As mentioned above, this is a bigger problem for the RPM plugin, which uses the filename as the relative_path, so the layout of the upstream repository can change underneath you while the relative_path stays identical.
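To make the effect of the constraint concrete, here is a minimal standalone sketch using sqlite3 rather than the real Pulp models. The table, the column names, the example URLs, and the `save_remote_artifact` helper are all hypothetical, and the constraint is assumed to be on ("content_artifact", "remote"), which is what the proposed change further down implies.

```python
import sqlite3

# Hypothetical, simplified stand-in for the RemoteArtifact table.
# Assumption: uniqueness is enforced on (content_artifact, remote).
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE remote_artifact (
        id INTEGER PRIMARY KEY,
        content_artifact TEXT NOT NULL,
        remote TEXT NOT NULL,
        url TEXT NOT NULL,
        UNIQUE (content_artifact, remote)
    )
    """
)


def save_remote_artifact(content_artifact, remote, url):
    """Try to store a row; return False if the uniqueness constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO remote_artifact (content_artifact, remote, url) "
                "VALUES (?, ?, ?)",
                (content_artifact, remote, url),
            )
        return True
    except sqlite3.IntegrityError:
        return False


OLD_URL = "https://example.com/repo/Packages/f/foo-1.0.rpm"
NEW_URL = "https://example.com/repo/foo-1.0.rpm"

# First sync: the row is stored.
first = save_remote_artifact("foo-1.0.rpm", "remote-1", OLD_URL)
# Resync after the upstream layout changed: same content, same remote, new URL.
second = save_remote_artifact("foo-1.0.rpm", "remote-1", NEW_URL)

stored_urls = [row[0] for row in conn.execute("SELECT url FROM remote_artifact")]
```

In this sketch the second insert is rejected, so only the stale URL remains in the table, which mirrors the behaviour described in this issue.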

There are really only two potential solutions for this:

  • Allow the RemoteArtifactSaver stage to update the URLs of existing RemoteArtifacts to match the current URL of the remote during a resync
    • This feels like the more correct option and is not too difficult to implement
  • Relax the uniqueness constraint to allow more than one RemoteArtifact to be stored for a given remote e.g. change the constraint to ("content_artifact", "url", "remote").
    • This would probably not be backportable due to the migration.
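The difference between the two options can be sketched in plain Python, with a dict standing in for the database table. This is only an illustration under stated assumptions: the `RemoteArtifact` dataclass, both `resync_*` functions, and the example URLs are hypothetical and are not the actual RemoteArtifactSaver code.

```python
from dataclasses import dataclass


@dataclass
class RemoteArtifact:
    # Hypothetical, simplified fields; the real model has more.
    content_artifact: str
    remote: str
    url: str


def resync_updating_urls(store, incoming):
    """Option 1: keep one row per (content_artifact, remote) and refresh its URL."""
    for ra in incoming:
        key = (ra.content_artifact, ra.remote)
        existing = store.get(key)
        if existing is None:
            store[key] = ra
        elif existing.url != ra.url:
            existing.url = ra.url  # point the existing row at the current URL
    return store


def resync_keeping_all_urls(store, incoming):
    """Option 2: widen the key to (content_artifact, url, remote); one row per URL."""
    for ra in incoming:
        store.setdefault((ra.content_artifact, ra.url, ra.remote), ra)
    return store


OLD_URL = "https://example.com/repo/Packages/f/foo-1.0.rpm"
NEW_URL = "https://example.com/repo/foo-1.0.rpm"


def make(url):
    """Build a fresh record for the same content and remote with the given URL."""
    return RemoteArtifact("foo-1.0.rpm", "remote-1", url)


# Option 1: after the resync, the single row carries the new URL.
option1 = {}
resync_updating_urls(option1, [make(OLD_URL)])
resync_updating_urls(option1, [make(NEW_URL)])
option1_urls = [ra.url for ra in option1.values()]

# Option 2: after the resync, both URLs are stored side by side.
option2 = {}
resync_keeping_all_urls(option2, [make(OLD_URL)])
resync_keeping_all_urls(option2, [make(NEW_URL)])
option2_urls = {ra.url for ra in option2.values()}
```

With option 1 the stale URL is replaced and no schema change is involved in this sketch. With option 2 the old URL is retained as a second potential source, at the cost of a different uniqueness key, which corresponds to the migration concern noted above.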

#7 Updated by pulpbot 1 day ago

  • Status changed from ASSIGNED to POST

#9 Updated by dalley 1 day ago

  • Sprint/Milestone set to 3.16.0

#10 Updated by dalley 1 day ago

  • Copied to Backport #9400: Backport #9395 "RemoteArtifacts are not being saved properly" to 3.14.z added
