Project

Profile

Help

Issue #5304

closed

Pulp 3 publishes metadata outside of artifact storage

Added by dkliban@redhat.com over 4 years ago. Updated over 4 years ago.

Status:
CLOSED - CURRENTRELEASE
Priority:
Normal
Category:
-
Sprint/Milestone:
Start date:
Due date:
Estimated time:
Severity:
2. Medium
Version:
Platform Release:
OS:
Triaged:
Yes
Groomed:
Yes
Sprint Candidate:
Yes
Tags:
Sprint:
Sprint 59
Quarter:

Description

The initial bug report has led to the identification of the following problems:

  • pulp 3 cannot create PublishedMetadata files in /var/lib/pulp/published/ because POSIX permissions prevent it on a machine where Pulp 2 has already published content
  • PublishedMetadata are the only files that are stored in /var/lib/pulp/published, everything else is stored in /var/lib/pulp/artifact
  • Plugin API provides 2 ways to represent content: Content (made up of Artifacts) and PublishedMetadata
    • content app has two code paths for serving PublishedMetada and Content
    • metadata that is mirrored exactly as published is represented as a Content model and the same metadata generated by Pulp is represented as PublishedArtifcat
    • any code that exports a publication to a filesystem needs to have 2 code paths to export the contents of a publication

The solution to all these problems is to make PublishedMetadata inherit from Content. This way the artifact backing the PublishedMetadata would be stored in artifact storage. The plugin writer experience can be simplified by providing a constructor that allows the plugin writer to pass in a file, a relative path, and a publication. The creation of Artifact, ContentArtifact, and PublishedArtifact is handled by the constructor. The constructor will also save the PublishedMetada to the database.

Here is some example code from the publish task in the File plugin:

with WorkingDirectory():
    with FilePublication.create(repo_version, pass_through=True) as publication:
        manifest = Manifest(manifest)
        manifest.write(populate(publication))
        metadata = PublishedMetadata(
            relative_path=os.path.basename(manifest.relative_path),
            publication=publication,
            file=File(open(manifest.relative_path, "rb")),
        )
        metadata.save()

So the only difference for the plugin writer will be not having to call save():

with WorkingDirectory():
    with FilePublication.create(repo_version, pass_through=True) as publication:
        manifest = Manifest(manifest)
        manifest.write(populate(publication))
        metadata = PublishedMetadata(
            relative_path=os.path.basename(manifest.relative_path),
            publication=publication,
            file=File(open(manifest.relative_path, "rb"))
        )

The orphan cleanup query needs to be updated to consider content that is part of a publication to not be considered orphaned.


Related issues

Has duplicate Pulp - Issue #4834: File directories between pulp 2 and pulp 3 conflictCLOSED - DUPLICATEActions
Actions #1

Updated by amacdona@redhat.com over 4 years ago

  • Triaged changed from No to Yes
Actions #2

Updated by daviddavis over 4 years ago

  • Has duplicate Issue #4834: File directories between pulp 2 and pulp 3 conflict added
Actions #3

Updated by dkliban@redhat.com over 4 years ago

Pulp 3 should not create any files outside of the artifact storage location. We should update the publish code to simply create artifacts.

Actions #4

Updated by dkliban@redhat.com over 4 years ago

  • Subject changed from Pulp 3 can't publish metadata when installed together with Pulp 2 to Pulp 3 publishes metadata outside of artifact storage
  • Description updated (diff)
Actions #5

Updated by dkliban@redhat.com over 4 years ago

  • Description updated (diff)
Actions #6

Updated by dkliban@redhat.com over 4 years ago

  • Tags deleted (Pulp 2, Pulp 3 installer)
Actions #7

Updated by dkliban@redhat.com over 4 years ago

  • Description updated (diff)
Actions #8

Updated by dkliban@redhat.com over 4 years ago

  • Description updated (diff)
Actions #9

Updated by daviddavis over 4 years ago

This looks great. Thanks for updating this. What fields will be on PublishedMetadata?

Actions #10

Updated by daviddavis over 4 years ago

  • Sprint/Milestone changed from 3.0.0 to 71
Actions #11

Updated by daviddavis over 4 years ago

  • Groomed changed from No to Yes
  • Sprint Candidate changed from No to Yes
Actions #12

Updated by bmbouter over 4 years ago

  • Sprint/Milestone changed from 71 to 3.0.0
Actions #13

Updated by rchan over 4 years ago

  • Sprint set to Sprint 59
Actions #14

Updated by rchan over 4 years ago

  • Assignee set to dkliban@redhat.com
Actions #15

Updated by daviddavis over 4 years ago

  • Status changed from NEW to ASSIGNED
Actions #16

Updated by dkliban@redhat.com over 4 years ago

The content app relies on PublishedArtifact to serve published content[0]. ContentArtifacts are only needed for 'pass-through' publications. This means that PublishedMetadata is not needed. Only a PublishedArtifact is needed. What if we simply added a staticmethod to PublishedArtifact that would create a PublishedArtifact from a file, relative_path, and publication?

The code snippet from above would become:

with WorkingDirectory():
    with FilePublication.create(repo_version, pass_through=True) as publication:
        manifest = Manifest(manifest)
        manifest.write(populate(publication))
        metadata = PublishedArtifact._create_from_file(
            relative_path=os.path.basename(manifest.relative_path),
            publication=publication,
            file=File(open(manifest.relative_path, "rb"))
        )

[0] https://github.com/pulp/pulpcore/blob/master/pulpcore/content/handler.py#L206

Actions #17

Updated by daviddavis over 4 years ago

What models would _create_from_file create? Because a PublishedArtifact requires a ContentArtifact. Would _create_from_file create that ContentArtifact or would you rework PublishedArtifact to not require a ContentArtifact.

Actions #18

Updated by dkliban@redhat.com over 4 years ago

I forgot about the ContentArtifact being needed for PublishedArtifact. I'll continue with the original plan.

Actions #19

Updated by dkliban@redhat.com over 4 years ago

This change breaks the very first migration here[0] because PublishedMetadata no longer has a _storage_path attribute. This will require regenerating a new 1st migration.

[0] https://github.com/pulp/pulpcore/blob/master/pulpcore/app/migrations/0001_initial.py#L429

Actions #20

Updated by dkliban@redhat.com over 4 years ago

  • Status changed from ASSIGNED to POST
Actions #21

Updated by dkliban@redhat.com over 4 years ago

  • Status changed from POST to ASSIGNED

I ran into a problem when overriding the init. Then I learned that it's possible to do, but Django docs do not recommend it[0]. We should add a class method with the following signature: PublishedMetadata.create_from_file(file, relative_path=None). 'file' is a django.core.files.File object. When relative_path is omitted, the name in the File object is used as the relative path for the PublishedMetadata.

[0] https://docs.djangoproject.com/en/2.2/ref/models/instances/#creating-objects

Added by dkliban@redhat.com over 4 years ago

Revision c1ed1488 | View on GitHub

Store PublishedMetadata files in artifact storage

This patch changes PublishedMetadata into Content. This requires migrating the database. The migration that comes with this change will only work on an empty database. It will fail to migrate any existing PublishedMetadata. This is a backwards incompatible change.

The PublishedArtifact now has a classmethod called 'create_from_file'. This method creates PublishedMetadata along with an Artifact, ContentArtifact, and PublishedArtifact.

Required PR: https://github.com/pulp/pulp_file/pull/278/

closes: #5304 https://pulp.plan.io/issues/5304

Actions #22

Updated by dkliban@redhat.com over 4 years ago

  • Status changed from ASSIGNED to MODIFIED

Added by dkliban@redhat.com over 4 years ago

Revision 930382ac | View on GitHub

Switch to using PublishedMetada.create_from_file()

re: #5304 https://pulp.plan.io/issues/5304

Actions #23

Updated by dkliban@redhat.com over 4 years ago

  • Status changed from MODIFIED to POST

Added by dkliban@redhat.com over 4 years ago

Revision 2baf6153 | View on GitHub

Switch to using PublishedMetada.create_from_file()

re: #5304 https://pulp.plan.io/issues/5304

Added by dkliban@redhat.com over 4 years ago

Revision 90580ccc | View on GitHub

Switch to using PublishedMetada.create_from_file()

re: #5304 https://pulp.plan.io/issues/5304

Actions #24

Updated by dkliban@redhat.com over 4 years ago

  • Status changed from POST to MODIFIED
Actions #25

Updated by bmbouter over 4 years ago

  • Status changed from MODIFIED to CLOSED - CURRENTRELEASE

Also available in: Atom PDF