Issue #5304
closed
Pulp 3 publishes metadata outside of artifact storage
Description
The initial bug report has led to the identification of the following problems:
- Pulp 3 cannot create PublishedMetadata files in /var/lib/pulp/published/ because POSIX permissions prevent it on a machine where Pulp 2 has already published content
- PublishedMetadata are the only files that are stored in /var/lib/pulp/published, everything else is stored in /var/lib/pulp/artifact
- Plugin API provides 2 ways to represent content: Content (made up of Artifacts) and PublishedMetadata
- The content app has two code paths for serving PublishedMetadata and Content
- metadata that is mirrored exactly as published is represented as a Content model and the same metadata generated by Pulp is represented as PublishedArtifact
- any code that exports a publication to a filesystem needs to have 2 code paths to export the contents of a publication
The solution to all these problems is to make PublishedMetadata inherit from Content. This way the artifact backing the PublishedMetadata would be stored in artifact storage. The plugin writer experience can be simplified by providing a constructor that allows the plugin writer to pass in a file, a relative path, and a publication. The creation of Artifact, ContentArtifact, and PublishedArtifact is handled by the constructor. The constructor will also save the PublishedMetadata to the database.
Here is some example code from the publish task in the File plugin:
with WorkingDirectory():
    with FilePublication.create(repo_version, pass_through=True) as publication:
        manifest = Manifest(manifest)
        manifest.write(populate(publication))
        metadata = PublishedMetadata(
            relative_path=os.path.basename(manifest.relative_path),
            publication=publication,
            file=File(open(manifest.relative_path, "rb")),
        )
        metadata.save()
So the only difference for the plugin writer will be not having to call save():
with WorkingDirectory():
    with FilePublication.create(repo_version, pass_through=True) as publication:
        manifest = Manifest(manifest)
        manifest.write(populate(publication))
        metadata = PublishedMetadata(
            relative_path=os.path.basename(manifest.relative_path),
            publication=publication,
            file=File(open(manifest.relative_path, "rb"))
        )
The orphan cleanup query needs to be updated so that content that is part of a publication is not considered orphaned.
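The orphan rule described here can be sketched in plain Python, independent of the Django ORM. This is only an illustration of the intended logic, not pulpcore's actual query: the `ContentUnit` class, the `find_orphans` helper, and the idea of passing in sets of primary keys are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass(frozen=True)
class ContentUnit:
    """Hypothetical stand-in for a row in the Content table."""
    pk: int
    name: str


def find_orphans(
    all_content: Iterable[ContentUnit],
    in_repo_versions: Set[int],
    in_publications: Set[int],
) -> List[ContentUnit]:
    """Return content that no repository version and no publication references."""
    # Content is protected if either a repository version or a publication uses it.
    protected = in_repo_versions | in_publications
    return [unit for unit in all_content if unit.pk not in protected]


content = [
    ContentUnit(1, "foo.iso"),        # still in a repository version
    ContentUnit(2, "PULP_MANIFEST"),  # published metadata, only in a publication
    ContentUnit(3, "old.iso"),        # referenced by nothing
]
orphans = find_orphans(content, in_repo_versions={1}, in_publications={2})
print([unit.name for unit in orphans])
```

Without the publication check, the published metadata in this example would be removed along with the genuinely unreferenced unit.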
Related issues
Updated by daviddavis over 5 years ago
- Has duplicate Issue #4834: File directories between pulp 2 and pulp 3 conflict added
Updated by dkliban@redhat.com over 5 years ago
Pulp 3 should not create any files outside of the artifact storage location. We should update the publish code to simply create artifacts.
Updated by dkliban@redhat.com over 5 years ago
- Subject changed from Pulp 3 can't publish metadata when installed together with Pulp 2 to Pulp 3 publishes metadata outside of artifact storage
- Description updated (diff)
Updated by dkliban@redhat.com over 5 years ago
- Tags deleted (Pulp 2, Pulp 3 installer)
Updated by daviddavis over 5 years ago
This looks great. Thanks for updating this. What fields will be on PublishedMetadata?
Updated by daviddavis over 5 years ago
- Sprint/Milestone changed from 3.0.0 to 71
Updated by daviddavis over 5 years ago
- Groomed changed from No to Yes
- Sprint Candidate changed from No to Yes
Updated by dkliban@redhat.com over 5 years ago
The content app relies on PublishedArtifact to serve published content[0]. ContentArtifacts are only needed for 'pass-through' publications. This means that PublishedMetadata is not needed. Only a PublishedArtifact is needed. What if we simply added a staticmethod to PublishedArtifact that would create a PublishedArtifact from a file, relative_path, and publication?
The code snippet from above would become:
with WorkingDirectory():
    with FilePublication.create(repo_version, pass_through=True) as publication:
        manifest = Manifest(manifest)
        manifest.write(populate(publication))
        metadata = PublishedArtifact._create_from_file(
            relative_path=os.path.basename(manifest.relative_path),
            publication=publication,
            file=File(open(manifest.relative_path, "rb"))
        )
[0] https://github.com/pulp/pulpcore/blob/master/pulpcore/content/handler.py#L206
Updated by daviddavis over 5 years ago
What models would _create_from_file create? I ask because a PublishedArtifact requires a ContentArtifact. Would _create_from_file create that ContentArtifact, or would you rework PublishedArtifact to not require a ContentArtifact?
Updated by dkliban@redhat.com over 5 years ago
I forgot about the ContentArtifact being needed for PublishedArtifact. I'll continue with the original plan.
Updated by dkliban@redhat.com over 5 years ago
This change breaks the very first migration here[0] because PublishedMetadata no longer has a _storage_path attribute. This will require regenerating the first migration.
[0] https://github.com/pulp/pulpcore/blob/master/pulpcore/app/migrations/0001_initial.py#L429
Updated by dkliban@redhat.com over 5 years ago
- Status changed from ASSIGNED to POST
Updated by dkliban@redhat.com over 5 years ago
- Status changed from POST to ASSIGNED
I ran into a problem when overriding the init. Then I learned that it's possible to do, but Django docs do not recommend it[0]. We should add a class method with the following signature: PublishedMetadata.create_from_file(file, relative_path=None). 'file' is a django.core.files.File object. When relative_path is omitted, the name in the File object is used as the relative path for the PublishedMetadata.
[0] https://docs.djangoproject.com/en/2.2/ref/models/instances/#creating-objects
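A rough sketch of how such a class method could behave, written with plain dataclasses rather than Django models so that it runs on its own. Everything here is illustrative: the stand-in classes, the `publication` argument (which is not part of the signature quoted above), the use of `os.path.basename` on the file name, and the SHA-256 digest are assumptions, not pulpcore's implementation.

```python
import hashlib
import io
import os
from dataclasses import dataclass
from typing import BinaryIO, Optional


@dataclass
class Artifact:
    """Hypothetical stand-in: a file held in artifact storage."""
    sha256: str
    size: int


@dataclass
class ContentArtifact:
    """Hypothetical stand-in: links content to its artifact."""
    artifact: Artifact
    relative_path: str


@dataclass
class PublishedArtifact:
    """Hypothetical stand-in: exposes a ContentArtifact in a publication."""
    content_artifact: ContentArtifact
    publication: str
    relative_path: str


@dataclass
class PublishedMetadata:
    relative_path: str
    publication: str
    published_artifact: Optional[PublishedArtifact] = None

    @classmethod
    def create_from_file(
        cls,
        file: BinaryIO,
        publication: str,  # assumption: passed explicitly in this sketch
        relative_path: Optional[str] = None,
    ) -> "PublishedMetadata":
        # Fall back to the file's own name when no relative path is given.
        if relative_path is None:
            relative_path = os.path.basename(file.name)
        data = file.read()
        # Build the whole chain in one place so the caller never has to.
        artifact = Artifact(hashlib.sha256(data).hexdigest(), len(data))
        content_artifact = ContentArtifact(artifact, relative_path)
        published = PublishedArtifact(content_artifact, publication, relative_path)
        return cls(relative_path, publication, published)


manifest = io.BytesIO(b"foo.iso,abc123,42\n")
manifest.name = "/tmp/work/PULP_MANIFEST"  # mimic a named file object
metadata = PublishedMetadata.create_from_file(manifest, publication="pub-1")
print(metadata.relative_path)
```

Using a class method instead of overriding `__init__` keeps model construction untouched, which is the approach the Django documentation suggests for customized object creation.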
Added by dkliban@redhat.com over 5 years ago
Added by dkliban@redhat.com about 5 years ago
Revision c1ed1488 | View on GitHub
Store PublishedMetadata files in artifact storage
This patch changes PublishedMetadata into Content. This requires migrating the database. The migration that comes with this change will only work on an empty database. It will fail to migrate any existing PublishedMetadata. This is a backwards incompatible change.
PublishedMetadata now has a classmethod called 'create_from_file'. This method creates PublishedMetadata along with an Artifact, ContentArtifact, and PublishedArtifact.
Required PR: https://github.com/pulp/pulp_file/pull/278/
Updated by dkliban@redhat.com about 5 years ago
- Status changed from ASSIGNED to MODIFIED
Applied in changeset pulpcore|c1ed1488c24bedc4841ae77f1ede01852745dc5d.
Added by dkliban@redhat.com about 5 years ago
Revision 930382ac | View on GitHub
Switch to using PublishedMetadata.create_from_file()
Updated by dkliban@redhat.com about 5 years ago
- Status changed from MODIFIED to POST
Added by dkliban@redhat.com about 5 years ago
Revision 2baf6153 | View on GitHub
Switch to using PublishedMetadata.create_from_file()
Added by dkliban@redhat.com about 5 years ago
Revision 90580ccc | View on GitHub
Switch to using PublishedMetadata.create_from_file()
Updated by dkliban@redhat.com about 5 years ago
- Status changed from POST to MODIFIED
Updated by bmbouter about 5 years ago
- Status changed from MODIFIED to CLOSED - CURRENTRELEASE
Switch to using PublishedMetadata.create_from_file()
Required PR: https://github.com/pulp/pulpcore/pull/303
re: #5304 https://pulp.plan.io/issues/5304