Project

Profile

Help

Issue #5304

Pulp 3 publishes metadata outside of artifact storage

Added by dkliban@redhat.com about 2 months ago. Updated 18 days ago.

Status:
MODIFIED
Priority:
Normal
Category:
-
Sprint/Milestone:
Start date:
Due date:
Severity:
2. Medium
Version:
Platform Release:
Blocks Release:
OS:
Backwards Incompatible:
No
Triaged:
Yes
Groomed:
Yes
Sprint Candidate:
Yes
Tags:
QA Contact:
Complexity:
Smash Test:
Verified:
No
Verification Required:
No
Sprint:
Sprint 59

Description

The initial bug report has led to the identification of the following problems:
  • pulp 3 cannot create PublishedMetadata files in /var/lib/pulp/published/ because POSIX permissions prevent it on a machine where Pulp 2 has already published content
  • PublishedMetadata are the only files that are stored in /var/lib/pulp/published, everything else is stored in /var/lib/pulp/artifact
  • Plugin API provides 2 ways to represent content: Content (made up of Artifacts) and PublishedMetadata
    • content app has two code paths for serving PublishedMetada and Content
    • metadata that is mirrored exactly as published is represented as a Content model and the same metadata generated by Pulp is represented as PublishedArtifcat
    • any code that exports a publication to a filesystem needs to have 2 code paths to export the contents of a publication

The solution to all these problems is to make PublishedMetadata inherit from Content. This way the artifact backing the PublishedMetadata would be stored in artifact storage. The plugin writer experience can be simplified by providing a constructor that allows the plugin writer to pass in a file, a relative path, and a publication. The creation of Artifact, ContentArtifact, and PublishedArtifact is handled by the constructor. The constructor will also save the PublishedMetada to the database.

Here is some example code from the publish task in the File plugin:

with WorkingDirectory():
    with FilePublication.create(repo_version, pass_through=True) as publication:
        manifest = Manifest(manifest)
        manifest.write(populate(publication))
        metadata = PublishedMetadata(
            relative_path=os.path.basename(manifest.relative_path),
            publication=publication,
            file=File(open(manifest.relative_path, "rb")),
        )
        metadata.save()

So the only difference for the plugin writer will be not having to call save():

with WorkingDirectory():
    with FilePublication.create(repo_version, pass_through=True) as publication:
        manifest = Manifest(manifest)
        manifest.write(populate(publication))
        metadata = PublishedMetadata(
            relative_path=os.path.basename(manifest.relative_path),
            publication=publication,
            file=File(open(manifest.relative_path, "rb"))
        )

The orphan cleanup query needs to be updated to consider content that is part of a publication to not be considered orphaned.


Related issues

Duplicated by Pulp - Issue #4834: File directories between pulp 2 and pulp 3 conflict CLOSED - DUPLICATE Actions

Associated revisions

Revision c1ed1488 View on GitHub
Added by dkliban@redhat.com 21 days ago

Store PublishedMetadata files in artifact storage

This patch changes PublishedMetadata into Content. This requires migrating the database.
The migration that comes with this change will only work on an empty database. It will
fail to migrate any existing PublishedMetadata. This is a backwards incompatible change.

The PublishedArtifact now has a classmethod called 'create_from_file'. This method creates
PublishedMetadata along with an Artifact, ContentArtifact, and PublishedArtifact.

Required PR: https://github.com/pulp/pulp_file/pull/278/

closes: #5304
https://pulp.plan.io/issues/5304

Revision 930382ac View on GitHub
Added by dkliban@redhat.com 20 days ago

Switch to using PublishedMetada.create_from_file()

re: #5304
https://pulp.plan.io/issues/5304

Revision c4e70eb3 View on GitHub
Added by dkliban@redhat.com 20 days ago

Add change log entries for PublishedMetadata change

re: #5304
https://pulp.plan.io/issues/5304

Revision 2baf6153 View on GitHub
Added by dkliban@redhat.com 19 days ago

Switch to using PublishedMetada.create_from_file()

re: #5304
https://pulp.plan.io/issues/5304

Revision 90580ccc View on GitHub
Added by dkliban@redhat.com 19 days ago

Switch to using PublishedMetada.create_from_file()

re: #5304
https://pulp.plan.io/issues/5304

History

#1 Updated by amacdona@redhat.com about 2 months ago

  • Triaged changed from No to Yes

#2 Updated by daviddavis about 1 month ago

  • Duplicated by Issue #4834: File directories between pulp 2 and pulp 3 conflict added

#3 Updated by dkliban@redhat.com about 1 month ago

Pulp 3 should not create any files outside of the artifact storage location. We should update the publish code to simply create artifacts.

#4 Updated by dkliban@redhat.com about 1 month ago

  • Subject changed from Pulp 3 can't publish metadata when installed together with Pulp 2 to Pulp 3 publishes metadata outside of artifact storage
  • Description updated (diff)

#5 Updated by dkliban@redhat.com about 1 month ago

  • Description updated (diff)

#6 Updated by dkliban@redhat.com about 1 month ago

  • Tags deleted (Pulp 2, Pulp 3 installer)

#7 Updated by dkliban@redhat.com about 1 month ago

  • Description updated (diff)

#8 Updated by dkliban@redhat.com about 1 month ago

  • Description updated (diff)

#9 Updated by daviddavis about 1 month ago

This looks great. Thanks for updating this. What fields will be on PublishedMetadata?

#10 Updated by daviddavis about 1 month ago

  • Sprint/Milestone changed from 3.0 to 71

#11 Updated by daviddavis about 1 month ago

  • Groomed changed from No to Yes
  • Sprint Candidate changed from No to Yes

#12 Updated by bmbouter about 1 month ago

  • Sprint/Milestone changed from 71 to 3.0

#13 Updated by rchan about 1 month ago

  • Sprint set to Sprint 59

#14 Updated by rchan about 1 month ago

  • Assignee set to dkliban@redhat.com

#15 Updated by daviddavis about 1 month ago

  • Status changed from NEW to ASSIGNED

#16 Updated by dkliban@redhat.com 28 days ago

The content app relies on PublishedArtifact to serve published content0. ContentArtifacts are only needed for 'pass-through' publications. This means that PublishedMetadata is not needed. Only a PublishedArtifact is needed. What if we simply added a staticmethod to PublishedArtifact that would create a PublishedArtifact from a file, relative_path, and publication?

The code snippet from above would become:

with WorkingDirectory():
    with FilePublication.create(repo_version, pass_through=True) as publication:
        manifest = Manifest(manifest)
        manifest.write(populate(publication))
        metadata = PublishedArtifact._create_from_file(
            relative_path=os.path.basename(manifest.relative_path),
            publication=publication,
            file=File(open(manifest.relative_path, "rb"))
        )

[0] https://github.com/pulp/pulpcore/blob/master/pulpcore/content/handler.py#L206

#17 Updated by daviddavis 28 days ago

What models would _create_from_file create? Because a PublishedArtifact requires a ContentArtifact. Would _create_from_file create that ContentArtifact or would you rework PublishedArtifact to not require a ContentArtifact.

#18 Updated by dkliban@redhat.com 27 days ago

I forgot about the ContentArtifact being needed for PublishedArtifact. I'll continue with the original plan.

#19 Updated by dkliban@redhat.com 27 days ago

This change breaks the very first migration here0 because PublishedMetadata no longer has a _storage_path attribute. This will require regenerating a new 1st migration.

[0] https://github.com/pulp/pulpcore/blob/master/pulpcore/app/migrations/0001_initial.py#L429

#20 Updated by dkliban@redhat.com 27 days ago

  • Status changed from ASSIGNED to POST

#21 Updated by dkliban@redhat.com 26 days ago

  • Status changed from POST to ASSIGNED

I ran into a problem when overriding the init. Then I learned that it's possible to do, but Django docs do not recommend it0. We should add a class method with the following signature: PublishedMetadata.create_from_file(file, relative_path=None). 'file' is a django.core.files.File object. When relative_path is omitted, the name in the File object is used as the relative path for the PublishedMetadata.

[0] https://docs.djangoproject.com/en/2.2/ref/models/instances/#creating-objects

#22 Updated by dkliban@redhat.com 20 days ago

  • Status changed from ASSIGNED to MODIFIED

#23 Updated by dkliban@redhat.com 20 days ago

  • Status changed from MODIFIED to POST

#24 Updated by dkliban@redhat.com 18 days ago

  • Status changed from POST to MODIFIED

Please register to edit this issue

Also available in: Atom PDF