Issue #5304
Pulp 3 publishes metadata outside of artifact storage
Description
The initial bug report has led to the identification of the following problems:
- pulp 3 cannot create PublishedMetadata files in /var/lib/pulp/published/ because POSIX permissions prevent it on a machine where Pulp 2 has already published content
- PublishedMetadata are the only files that are stored in /var/lib/pulp/published, everything else is stored in /var/lib/pulp/artifact
- Plugin API provides 2 ways to represent content: Content (made up of Artifacts) and PublishedMetadata
- content app has two code paths for serving PublishedMetada and Content
- metadata that is mirrored exactly as published is represented as a Content model and the same metadata generated by Pulp is represented as PublishedArtifcat
- any code that exports a publication to a filesystem needs to have 2 code paths to export the contents of a publication
The solution to all these problems is to make PublishedMetadata inherit from Content. This way the artifact backing the PublishedMetadata would be stored in artifact storage. The plugin writer experience can be simplified by providing a constructor that allows the plugin writer to pass in a file, a relative path, and a publication. The creation of Artifact, ContentArtifact, and PublishedArtifact is handled by the constructor. The constructor will also save the PublishedMetada to the database.
Here is some example code from the publish task in the File plugin:
with WorkingDirectory():
with FilePublication.create(repo_version, pass_through=True) as publication:
manifest = Manifest(manifest)
manifest.write(populate(publication))
metadata = PublishedMetadata(
relative_path=os.path.basename(manifest.relative_path),
publication=publication,
file=File(open(manifest.relative_path, "rb")),
)
metadata.save()
So the only difference for the plugin writer will be not having to call save():
with WorkingDirectory():
with FilePublication.create(repo_version, pass_through=True) as publication:
manifest = Manifest(manifest)
manifest.write(populate(publication))
metadata = PublishedMetadata(
relative_path=os.path.basename(manifest.relative_path),
publication=publication,
file=File(open(manifest.relative_path, "rb"))
)
The orphan cleanup query needs to be updated to consider content that is part of a publication to not be considered orphaned.
Related issues
Associated revisions
Revision c1ed1488
View on GitHub
Store PublishedMetadata files in artifact storage
This patch changes PublishedMetadata into Content. This requires migrating the database. The migration that comes with this change will only work on an empty database. It will fail to migrate any existing PublishedMetadata. This is a backwards incompatible change.
The PublishedArtifact now has a classmethod called 'create_from_file'. This method creates PublishedMetadata along with an Artifact, ContentArtifact, and PublishedArtifact.
Required PR: https://github.com/pulp/pulp_file/pull/278/
Revision 930382ac
View on GitHub
Switch to using PublishedMetada.create_from_file()
Revision 2baf6153
View on GitHub
Switch to using PublishedMetada.create_from_file()
Revision 90580ccc
View on GitHub
Switch to using PublishedMetada.create_from_file()
History
#1
Updated by amacdona@redhat.com over 1 year ago
- Triaged changed from No to Yes
#2
Updated by daviddavis over 1 year ago
- Has duplicate Issue #4834: File directories between pulp 2 and pulp 3 conflict added
#3
Updated by dkliban@redhat.com over 1 year ago
Pulp 3 should not create any files outside of the artifact storage location. We should update the publish code to simply create artifacts.
#4
Updated by dkliban@redhat.com over 1 year ago
- Subject changed from Pulp 3 can't publish metadata when installed together with Pulp 2 to Pulp 3 publishes metadata outside of artifact storage
- Description updated (diff)
#5
Updated by dkliban@redhat.com over 1 year ago
- Description updated (diff)
#6
Updated by dkliban@redhat.com over 1 year ago
- Tags deleted (
Pulp 2, Pulp 3 installer)
#7
Updated by dkliban@redhat.com over 1 year ago
- Description updated (diff)
#8
Updated by dkliban@redhat.com over 1 year ago
- Description updated (diff)
#9
Updated by daviddavis over 1 year ago
This looks great. Thanks for updating this. What fields will be on PublishedMetadata?
#10
Updated by daviddavis over 1 year ago
- Sprint/Milestone changed from 3.0.0 to 71
#11
Updated by daviddavis over 1 year ago
- Groomed changed from No to Yes
- Sprint Candidate changed from No to Yes
#12
Updated by bmbouter over 1 year ago
- Sprint/Milestone changed from 71 to 3.0.0
#13
Updated by rchan over 1 year ago
- Sprint set to Sprint 59
#14
Updated by rchan over 1 year ago
- Assignee set to dkliban@redhat.com
#15
Updated by daviddavis over 1 year ago
- Status changed from NEW to ASSIGNED
#16
Updated by dkliban@redhat.com over 1 year ago
The content app relies on PublishedArtifact to serve published content[0]. ContentArtifacts are only needed for 'pass-through' publications. This means that PublishedMetadata is not needed. Only a PublishedArtifact is needed. What if we simply added a staticmethod to PublishedArtifact that would create a PublishedArtifact from a file, relative_path, and publication?
The code snippet from above would become:
with WorkingDirectory():
with FilePublication.create(repo_version, pass_through=True) as publication:
manifest = Manifest(manifest)
manifest.write(populate(publication))
metadata = PublishedArtifact._create_from_file(
relative_path=os.path.basename(manifest.relative_path),
publication=publication,
file=File(open(manifest.relative_path, "rb"))
)
[0] https://github.com/pulp/pulpcore/blob/master/pulpcore/content/handler.py#L206
#17
Updated by daviddavis over 1 year ago
What models would _create_from_file
create? Because a PublishedArtifact requires a ContentArtifact. Would _create_from_file
create that ContentArtifact or would you rework PublishedArtifact to not require a ContentArtifact.
#18
Updated by dkliban@redhat.com over 1 year ago
I forgot about the ContentArtifact being needed for PublishedArtifact. I'll continue with the original plan.
#19
Updated by dkliban@redhat.com over 1 year ago
This change breaks the very first migration here[0] because PublishedMetadata no longer has a _storage_path attribute. This will require regenerating a new 1st migration.
[0] https://github.com/pulp/pulpcore/blob/master/pulpcore/app/migrations/0001_initial.py#L429
#20
Updated by dkliban@redhat.com over 1 year ago
- Status changed from ASSIGNED to POST
#21
Updated by dkliban@redhat.com over 1 year ago
- Status changed from POST to ASSIGNED
I ran into a problem when overriding the init. Then I learned that it's possible to do, but Django docs do not recommend it[0]. We should add a class method with the following signature: PublishedMetadata.create_from_file(file, relative_path=None). 'file' is a django.core.files.File object. When relative_path is omitted, the name in the File object is used as the relative path for the PublishedMetadata.
[0] https://docs.djangoproject.com/en/2.2/ref/models/instances/#creating-objects
#22
Updated by dkliban@redhat.com over 1 year ago
- Status changed from ASSIGNED to MODIFIED
Applied in changeset pulpcore|c1ed1488c24bedc4841ae77f1ede01852745dc5d.
#23
Updated by dkliban@redhat.com over 1 year ago
- Status changed from MODIFIED to POST
#24
Updated by dkliban@redhat.com over 1 year ago
- Status changed from POST to MODIFIED
#25
Updated by bmbouter about 1 year ago
- Status changed from MODIFIED to CLOSED - CURRENTRELEASE
Please register to edit this issue
Switch to using PublishedMetada.create_from_file()
Required PR: https://github.com/pulp/pulpcore/pull/303
re: #5304 https://pulp.plan.io/issues/5304