Story #4020

Extend Content App to serve Artifacts from ContentArtifact.relative_path data associated w/ the repo_version associated with the publication

Added by bmbouter about 1 year ago. Updated 6 months ago.

Status: MODIFIED
Priority: Normal
Category: -
Sprint/Milestone:
Start date:
Due date:
% Done: 100%
Platform Release:
Blocks Release:
Backwards Incompatible: No
Groomed: Yes
Sprint Candidate: No
Tags:
QA Contact:
Complexity:
Smash Test:
Verified: No
Verification Required: No
Sprint: Sprint 43

Description

Problem

Publishing a large repository many times creates a huge number of PublishedArtifact objects. These effectively duplicate data that already exists in Pulp, e.g. ContentArtifact.relative_path.

Also, a plugin writer recently pointed out that they don't need PublishedMetadata objects; the content itself being served by a distribution at ContentArtifact.relative_path is enough. They indicated that (a) having to provide a publisher didn't add value for them and (b) having to create a huge number of duplicate records they don't need was concerning.

Solution

Introduce an attribute on Publication called pass_through that defaults to False. When handling a request, the content app would:

1. Query as it does normally, looking for PublishedArtifact and PublishedMetadata objects.
2. If nothing matched and Publication.pass_through == True, also search ContentArtifact.relative_path for units in the associated RepositoryVersion.

This is pretty easy to add because Publication already has a ForeignKey to RepositoryVersion. Also note that PublishedArtifact and PublishedMetadata are still searched first.
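
The lookup order described above can be sketched as follows. This is only an illustration under stated assumptions, not the actual content app code: the dataclasses are plain-Python stand-ins for the Django models, the dictionaries stand in for database queries, and the function name resolve is hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class RepositoryVersion:
    # Stand-in for a query: ContentArtifact.relative_path -> artifact identifier.
    content_artifacts: Dict[str, str] = field(default_factory=dict)


@dataclass
class Publication:
    repository_version: RepositoryVersion  # mirrors the existing ForeignKey
    pass_through: bool = False             # proposed attribute, off by default
    # Stand-ins for PublishedArtifact / PublishedMetadata rows, keyed by relative_path.
    published_artifacts: Dict[str, str] = field(default_factory=dict)
    published_metadata: Dict[str, str] = field(default_factory=dict)


def resolve(publication: Publication, path: str) -> Optional[str]:
    """Return the artifact served at `path`, or None if nothing matches."""
    # Step 1: the normal lookup, always performed first.
    for table in (publication.published_artifacts, publication.published_metadata):
        if path in table:
            return table[path]
    # Step 2: only for pass-through publications, fall back to the
    # ContentArtifact.relative_path values of the repository version.
    if publication.pass_through:
        return publication.repository_version.content_artifacts.get(path)
    return None
```

With a repository version holding a.rpm, resolve() finds it only when pass_through is True, and an explicit PublishedArtifact at the same path still wins.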

Benefits

  • Users who don't need PublishedMetadata will have a simpler experience (as they requested)

Checklist


Related issues

Blocks: File Support - Task #4034: Use the pass_through option when generating new publications (MODIFIED)

Associated revisions

Revision 1c8ef717
Added by jortel@redhat.com about 1 year ago

Add support for pass-through publications.
closes #4020

History

#1 Updated by dkliban@redhat.com about 1 year ago

Our current design allows the plugin writer to create publishers that can create publications that filter out some content from a repository version. This means users have two opportunities to compose a repository - at repository version creation time and when creating a publication. I would prefer to provide only one such opportunity at repository version creation time.

Another content type that does not require generating metadata at publish time is Maven. All the metadata is part of the content. So I can see a benefit for that plugin.

#2 Updated by daviddavis about 1 year ago

+1 from me. I think PublishedArtifact is a remnant from before repo versions. Being able to remove the table and the need to create a ton of records for every publication would be a big improvement and simplification.

Also, publication already has a FK to repo version so I think we're good there.

#3 Updated by jortel@redhat.com about 1 year ago

The purpose of the PublishedArtifact was to provide the publisher with the opportunity to publish each Artifact with a custom relative path.

For example: A (remote) DNF repository that looks like:

a.rpm
b.rpm
repodata/

The ContentArtifact.relative_path (and PublishedArtifact) for each artifact would be:

a.rpm
b.rpm

The PublishedArtifact provides for publishing in a different (custom) structure, e.g. under a packages/ directory:

packages/
    a.rpm
    b.rpm
repodata/

The PublishedArtifact.relative_path (for rpms) would be:

packages/a.rpm
packages/b.rpm
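
The remapping in this example can be sketched as follows. It is only an illustration: the function name published_paths and the prefix parameter are hypothetical and not part of Pulp.

```python
from typing import Dict, Iterable


def published_paths(relative_paths: Iterable[str], prefix: str = "packages") -> Dict[str, str]:
    """Map each ContentArtifact.relative_path to a custom published path.

    Hypothetical helper: a publisher would store the mapped value as
    PublishedArtifact.relative_path.
    """
    return {path: f"{prefix}/{path}" for path in relative_paths}


# The two RPMs from the example land under packages/.
layout = published_paths(["a.rpm", "b.rpm"])
```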

Perhaps we can support the publish-as-is use case by providing something like an as-is attribute to Publication since it already has a FK to the version. The content app could then resolve to ContentArtifact.relative_path as suggested when not matched to PublishedMetadata or PublishedArtifact. The core could provide base publishing that produces a PublishedVersion.

#4 Updated by bmbouter about 1 year ago

@jortel I see what you're saying about it providing that customization point, but is anyone using it? I was trying to think if anyone was, but all the plugins I found were using ContentArtifact.relative_path as-is. Even RPM uses it as-is: https://github.com/pulp/pulp_rpm/blob/9cd7b8237194bc79b5454c7e53eaba673a21077a/pulp_rpm/app/tasks/publishing.py#L194

Everything we add takes away from the other things so if this feature isn't being used by any plugins we should consider removing it until it's needed by someone (I think). Are plugin writers using this?

#5 Updated by gmbnomis about 1 year ago

Doesn't this proposal mean that, on content creation, the plugin has to decide on the exact relative location of the artifact in every publication that will ever occur?

What if:
- The plugin writer wants to change the publication layout, as was done for RPM in Pulp 2.12 [0]
- The plugin writer wants to support user-selectable layouts when publishing
- There is a new "v2" of the repo structure, which necessitates path changes. The plugin writer wants to support publishing existing artifacts using the new structure.

I am not saying that this is absolutely necessary, but we should know the consequences of this decision. We could say that these cases are use cases for a live API because it is more complex than what Pulp core is willing to handle.

The other idea I had is what @jortel just described. :-)

[0] https://pulpproject.org/2016/11/14/yum-repo-layout-changes/

#6 Updated by gmbnomis about 1 year ago

bmbouter wrote:

Everything we add takes away from the other things so if this feature isn't being used by any plugins we should consider removing it until it's needed by someone (I think). Are plugin writers using this?

Yes, I am using this in pulp_cookbook. But only because it is convenient. Content units just have the name of the tar file as relative path (e.g. pulp-1.2.3.tar.gz). Publication is at cookbook_files/pulp/1_2_3/pulp-1.2.3.tar.gz. But I could use this path at content creation already.
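
As a rough sketch of that mapping (the function name is hypothetical, and the rule is inferred only from the single example path above, not taken from the pulp_cookbook source):

```python
def cookbook_publication_path(name: str, version: str) -> str:
    """Build the published path for a cookbook tarball.

    Assumption: dots in the version become underscores in the directory
    name, as in the example path from this comment.
    """
    tarball = f"{name}-{version}.tar.gz"  # the content unit's relative path
    return f"cookbook_files/{name}/{version.replace('.', '_')}/{tarball}"


path = cookbook_publication_path("pulp", "1.2.3")
```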

#7 Updated by bmbouter about 1 year ago

  • Subject changed from Replace PublishedArtifact with a reference to a RepositoryVersion to Extend Content App to serve Artifacts from ContentArtifact.relative_path data associated w/ the repo_version associated with the publication
  • Description updated (diff)

I'm convinced from these posts that there are valid use cases for PublishedArtifact and PublishedMetadata. I rewrote this ticket to leave those things alone, and instead extend the content app.

The plugin writer I'm working w/ would really like to have this soon.

#8 Updated by bmbouter about 1 year ago

  • Checklist item add to serializer added
  • Checklist item add to model layer as Boolean w/ default=False added
  • Checklist item update docs (somewhere) added
  • Checklist item update the content app to check for the pass_through=True and respond accordingly added
  • Description updated (diff)

Through mailing list and IRC discussion, we concluded that this should be an option that defaults to off. I've revised the ticket accordingly.

#9 Updated by jortel@redhat.com about 1 year ago

  • Groomed changed from No to Yes

#10 Updated by jortel@redhat.com about 1 year ago

  • Sprint set to Sprint 43

#11 Updated by jortel@redhat.com about 1 year ago

  • Status changed from NEW to ASSIGNED
  • Assignee set to jortel@redhat.com

#12 Updated by bmbouter about 1 year ago

I had not written this before but a main goal of this (for me) is to remove a step for the user! Currently you need to create a "publisher". I believe with this change they could just POST to create a publication w/ pass_through=True and the repository version and they could be done. One more call to a distributor and that content is live.

@jortel does ^ make sense to you? Can ^ be done as part of this piece of work?

#13 Updated by jortel@redhat.com about 1 year ago

bmbouter wrote:

I had not written this before but a main goal of this (for me) is to remove a step for the user! Currently you need to create a "publisher". I believe with this change they could just POST to create a publication w/ pass_through=True and the repository version and they could be done. One more call to a distributor and that content is live.

@jortel does ^ make sense to you? Can ^ be done as part of this piece of work?

Yes. I envisioned the same thing. Will include it.

#14 Updated by jortel@redhat.com about 1 year ago

  • Checklist item add to serializer set to Done
  • Checklist item add to model layer as Boolean w/ default=False set to Done
  • Checklist item update the content app to check for the pass_through=True and respond accordingly set to Done
  • Status changed from ASSIGNED to POST

#15 Updated by daviddavis about 1 year ago

  • Blocks Task #4034: Use the pass_through option when generating new publications added

#16 Updated by jortel@redhat.com about 1 year ago

  • Status changed from POST to MODIFIED
  • % Done changed from 0 to 100

#17 Updated by daviddavis 6 months ago

  • Sprint/Milestone set to 3.0

#18 Updated by bmbouter 6 months ago

  • Tags deleted (Pulp 3)
