Issue #9395

RemoteArtifacts are not being saved properly

Added by dalley 3 months ago. Updated 2 months ago.

Status: CLOSED - CURRENTRELEASE
Priority: High
Assignee:
Category: -
Sprint/Milestone:
Start date:
Due date:
Estimated time:
Severity: 4. Urgent
Version:
Platform Release:
OS:
Triaged: Yes
Groomed: No
Sprint Candidate: No
Tags:
Sprint: Sprint 105
Quarter:

Description

This is the result of a long discussion on the Katello forums: https://community.theforeman.org/t/katello-4-1-2-1-404-error-through-content-proxy-due-to-incorrect-location-href/24812/26?u=dralley

TL;DR: if you sync a repository on-demand multiple times against different upstream repos, only the first set of RemoteArtifacts is saved. If the layout of the original repository changes or the repository disappears, all of the saved URLs break, so the RemoteArtifacts are broken even though they were supposed to have multiple potential sources.

I've confirmed this by syncing a single repository, changing the layout of that repository, and resyncing. Only the original RemoteArtifacts exist afterwards; the new ones are never created. A script is attached that demonstrates this (inspect the DB after running it).

This is an especially severe issue because "metadata mirroring" and standard syncs use entirely different layouts, so re-publishing a mirrored repository, or mirroring a repository that was previously synced without mirroring, results in broken repositories because all of the URLs change.

Attachment: no_new_remoteartifact.py (2.43 KB), added by dalley, 09/14/2021 05:33 AM

Related issues

Copied to Pulp - Backport #9400: Backport #9395 "RemoteArtifacts are not being saved properly" to 3.14.z (CLOSED - CURRENTRELEASE)

Associated revisions

Revision 489156eb
Added by dalley 2 months ago

Update remote artifact urls on sync if the remote or repo changes

closes: #9395 https://pulp.plan.io/issues/9395

History

#1 Updated by dalley 3 months ago

#2 Updated by dalley 3 months ago

  • Description updated (diff)

#3 Updated by dalley 3 months ago

  • Description updated (diff)

#4 Updated by dalley 3 months ago

  • Description updated (diff)

#5 Updated by dkliban@redhat.com 3 months ago

  • Triaged changed from No to Yes
  • Sprint set to Sprint 105

#6 Updated by dalley 3 months ago

More context: the uniqueness constraint on RemoteArtifact prevents us from saving more than one RemoteArtifact for a given ContentArtifact and remote.

https://github.com/pulp/pulpcore/blob/master/pulpcore/app/models/content.py#L652

This means that if the remote URL changes but the "relative path" stays the same, new RemoteArtifacts cannot be created because of this constraint. As mentioned above, this is a bigger problem for the RPM plugin, which uses the filename as the relative_path, so the layout of the repository can change underneath you while the relative_path stays the same.
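To make the failure mode concrete, here is a minimal in-memory simulation. It is not pulpcore code: the `Store` class, the `save_if_missing` helper, and the example URLs are hypothetical stand-ins, and the sketch assumes the saver skips creation whenever a row with the same (content_artifact, remote) key already exists.

```python
from dataclasses import dataclass


@dataclass
class RemoteArtifact:
    content_artifact: str  # stand-in for a ContentArtifact (here: its relative_path)
    remote: str            # stand-in for a Remote
    url: str


class Store:
    """Hypothetical in-memory table with a unique (content_artifact, remote) key."""

    def __init__(self):
        self.rows = {}

    def save_if_missing(self, ra):
        # Mimics the constraint: a second row for the same key is never stored.
        key = (ra.content_artifact, ra.remote)
        if key in self.rows:
            return False
        self.rows[key] = ra
        return True


store = Store()

# First sync: the package sits at the top level of the repository.
first = store.save_if_missing(
    RemoteArtifact("foo-1.0.rpm", "remote-a", "https://example.com/repo/foo-1.0.rpm")
)

# Resync after the upstream layout changed: same relative_path, new URL.
second = store.save_if_missing(
    RemoteArtifact("foo-1.0.rpm", "remote-a", "https://example.com/repo/Packages/f/foo-1.0.rpm")
)

# The stored URL still points at the old location.
stale_url = store.rows[("foo-1.0.rpm", "remote-a")].url
```

After the second call, `second` is `False` and `stale_url` is still the original URL, which is the broken state described above.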

There are really only two potential solutions for this:

  • Allow the RemoteArtifactSaver stage to update the URLs of existing RemoteArtifacts to match the current URL of the remote during a resync
    • This feels like the more correct option and is not too difficult to implement
  • Relax the uniqueness constraint to allow more than one RemoteArtifact to be stored for a given remote e.g. change the constraint to ("content_artifact", "url", "remote").
    • This would probably not be backportable due to the migration.
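A rough sketch of how the two options differ, using plain Python containers rather than the real models. The function names and the `(content_artifact, remote, url)` tuple shape are assumptions made for illustration; the actual RemoteArtifactSaver stage and migration would look different.

```python
def sync_update_urls(rows, incoming):
    """Option 1: keep the (content_artifact, remote) key, overwrite stale URLs.

    `rows` is a dict mapping (content_artifact, remote) -> url.
    Returns the number of rows created or updated.
    """
    changed = 0
    for content_artifact, remote, url in incoming:
        key = (content_artifact, remote)
        if rows.get(key) != url:
            rows[key] = url
            changed += 1
    return changed


def sync_relaxed_constraint(rows, incoming):
    """Option 2: key on (content_artifact, url, remote), keep every URL seen.

    `rows` is a set of (content_artifact, url, remote) tuples.
    Returns the number of rows added.
    """
    added = 0
    for content_artifact, remote, url in incoming:
        key = (content_artifact, url, remote)
        if key not in rows:
            rows.add(key)
            added += 1
    return added


OLD = "https://example.com/repo/foo-1.0.rpm"
NEW = "https://example.com/repo/Packages/f/foo-1.0.rpm"
resync = [("foo-1.0.rpm", "remote-a", NEW)]

# Option 1: one row per (content_artifact, remote); the URL follows the remote.
rows_opt1 = {("foo-1.0.rpm", "remote-a"): OLD}
updated = sync_update_urls(rows_opt1, resync)

# Option 2: old and new URLs coexist as separate rows.
rows_opt2 = {("foo-1.0.rpm", OLD, "remote-a")}
added = sync_relaxed_constraint(rows_opt2, resync)
```

With option 1 the table stays the same size and only the URL changes, so no schema change is needed. With option 2 the row count grows and the key itself changes, which is why it would need a migration.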

#7 Updated by pulpbot 3 months ago

  • Status changed from ASSIGNED to POST

#9 Updated by dalley 3 months ago

  • Sprint/Milestone set to 3.16.0

#10 Updated by dalley 3 months ago

  • Copied to Backport #9400: Backport #9395 "RemoteArtifacts are not being saved properly" to 3.14.z added

#11 Updated by dalley 2 months ago

  • Status changed from POST to MODIFIED

#13 Updated by pulpbot 2 months ago

  • Status changed from MODIFIED to CLOSED - CURRENTRELEASE
