Story #4020

closed

Extend Content App to serve Artifacts from ContentArtifact.relative_path data associated w/ the repo_version associated with the publication

Added by bmbouter over 6 years ago. Updated about 5 years ago.

Status: CLOSED - CURRENTRELEASE
Priority: Normal
Category: -
Sprint/Milestone:
Start date:
Due date:
% Done: 100%
Estimated time:
Platform Release:
Groomed: Yes
Sprint Candidate: No
Tags:
Sprint: Sprint 43
Quarter:

Description

Problem

Publishing a large repository many times creates a huge number of PublishedArtifact objects. These effectively duplicate data that already exists in Pulp, e.g. ContentArtifact.relative_path.

Also, a plugin writer recently pointed out that they don't need PublishedMetadata objects; having the content itself served by a distribution at ContentArtifact.relative_path is enough. They indicated that (a) having to provide a publisher didn't add value for them and (b) having to create a huge number of duplicate records they don't need was concerning.

Solution

Introduce an attribute on Publication called pass_through that defaults to False. When handling a request, the content app would:

1. Query as it does normally, looking for PublishedArtifact and PublishedMetadata objects.
2. If Publication.pass_through == True, also search ContentArtifact.relative_path for units in the associated RepositoryVersion.

This is pretty easy to add because Publication already has a ForeignKey to RepositoryVersion. Also note that PublishedArtifact and PublishedMetadata are still searched first.
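
The lookup order described above could be sketched roughly as follows. This is a plain-Python illustration, not the content app's actual code: the dataclasses stand in for the Django models, the dictionaries stand in for database queries, and the `resolve` function name is hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class RepositoryVersion:
    # Stand-in for the ContentArtifact rows of a repository version:
    # maps ContentArtifact.relative_path -> an artifact identifier.
    content_artifacts: Dict[str, str] = field(default_factory=dict)


@dataclass
class Publication:
    repository_version: RepositoryVersion
    pass_through: bool = False
    # Stand-ins for PublishedArtifact and PublishedMetadata rows,
    # keyed by their relative_path.
    published_artifacts: Dict[str, str] = field(default_factory=dict)
    published_metadata: Dict[str, str] = field(default_factory=dict)


def resolve(publication: Publication, path: str) -> Optional[str]:
    """Return the artifact identifier served at `path`, or None."""
    # Step 1: PublishedArtifact and PublishedMetadata are searched first.
    for table in (publication.published_artifacts, publication.published_metadata):
        if path in table:
            return table[path]
    # Step 2: only for pass-through publications, fall back to the
    # ContentArtifact.relative_path of units in the repository version.
    if publication.pass_through:
        return publication.repository_version.content_artifacts.get(path)
    return None


version = RepositoryVersion(content_artifacts={"a.rpm": "artifact-a"})
print(resolve(Publication(version), "a.rpm"))
print(resolve(Publication(version, pass_through=True), "a.rpm"))
```

With pass_through left at its default, the first call finds nothing because no PublishedArtifact exists; the second call falls through to the repository version's content.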

Benefits

  • Users who don't need PublishedMetadata will have a simpler experience (as they requested)

Related issues

Blocks File Support - Task #4034: Use the pass_through option when generating new publications (CLOSED - CURRENTRELEASE, ppicka)

Actions #1

Updated by dkliban@redhat.com over 6 years ago

Our current design allows the plugin writer to create publishers that can create publications that filter out some content from a repository version. This means users have two opportunities to compose a repository - at repository version creation time and when creating a publication. I would prefer to provide only one such opportunity at repository version creation time.

Another content type that does not require generating metadata at publish time is Maven. All the metadata is part of the content. So I can see a benefit for that plugin.

Actions #2

Updated by daviddavis over 6 years ago

+1 from me. I think PublishedArtifact is a remnant from before repo versions. Being able to remove the table and the need to create a ton of records for every publication would be a big improvement and simplification.

Also, publication already has a FK to repo version so I think we're good there.

Actions #3

Updated by jortel@redhat.com over 6 years ago

The purpose of the PublishedArtifact was to provide the publisher with the opportunity to publish each Artifact with a custom relative path.

For example: A (remote) DNF repository that looks like:

a.rpm
b.rpm
repodata/

The ContentArtifact.relative_path (and PublishedArtifact) for each artifact would be:

a.rpm
b.rpm

The PublishedArtifact provides for publishing in a different (custom) structure. Eg: to a packages/ directory.

packages/
    a.rpm
    b.rpm
repodata/

The PublishedArtifact.relative_path (for rpms) would be:

packages/a.rpm
packages/b.rpm

Perhaps we can support the publish-as-is use case by providing something like an as-is attribute to Publication since it already has a FK to the version. The content app could then resolve to ContentArtifact.relative_path as suggested when not matched to PublishedMetadata or PublishedArtifact. The core could provide base publishing that produces a PublishedVersion.
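
As a rough illustration of the two layouts in this comment, the sketch below maps each ContentArtifact.relative_path to the path a publisher might assign. The `published_paths` helper and its `prefix` parameter are hypothetical and are not part of the Pulp plugin API.

```python
import posixpath
from typing import Dict, Iterable


def published_paths(content_paths: Iterable[str], prefix: str = "") -> Dict[str, str]:
    """Map each ContentArtifact.relative_path to a published relative path.

    An empty prefix models the publish-as-is case; a non-empty prefix
    models a publisher that places artifacts in a custom directory.
    """
    return {path: posixpath.join(prefix, path) for path in content_paths}


as_is = published_paths(["a.rpm", "b.rpm"])
custom = published_paths(["a.rpm", "b.rpm"], prefix="packages")
print(as_is)
print(custom)
```

In the as-is case the mapping is the identity, which is why a pass-through publication would not need a PublishedArtifact record per artifact.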

Actions #4

Updated by bmbouter over 6 years ago

@jortel I see what you're saying about it providing that customization point, but is anyone using it? I was trying to think if anyone was, but all the plugins I found were using ContentArtifact.relative_path as-is. Even RPM uses it as-is: https://github.com/pulp/pulp_rpm/blob/9cd7b8237194bc79b5454c7e53eaba673a21077a/pulp_rpm/app/tasks/publishing.py#L194

Everything we add takes away from the other things so if this feature isn't being used by any plugins we should consider removing it until it's needed by someone (I think). Are plugin writers using this?

Actions #5

Updated by gmbnomis over 6 years ago

Doesn't this proposal mean that, on content creation, the plugin has to decide on the exact relative location of the artifact in every publication that will ever occur?

What if:
- The plugin writer wants to change the publication layout, as was done for RPM in Pulp 2.12 [0]
- The plugin writer wants to support user-selectable layouts when publishing
- There is a new "v2" of the repo structure, which necessitates path changes. The plugin writer wants to support publishing existing artifacts using the new structure.

I am not saying that this is absolutely necessary, but we should know the consequences of this decision. We could say that these cases are use cases for a live API because it is more complex than what Pulp core is willing to handle.

The other idea I had is what @jortel just described. :-)

[0] https://pulpproject.org/2016/11/14/yum-repo-layout-changes/

Actions #6

Updated by gmbnomis over 6 years ago

bmbouter wrote:

Everything we add takes away from the other things so if this feature isn't being used by any plugins we should consider removing it until it's needed by someone (I think). Are plugin writers using this?

Yes, I am using this in pulp_cookbook. But only because it is convenient. Content units just have the name of the tar file as relative path (e.g. pulp-1.2.3.tar.gz). Publication is at cookbook_files/pulp/1_2_3/pulp-1.2.3.tar.gz. But I could use this path at content creation already.
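
For illustration, the mapping between the two paths mentioned above might look like the sketch below. The layout rule is inferred from the single example in this comment, and the function names are hypothetical rather than taken from pulp_cookbook.

```python
def cookbook_content_path(name: str, version: str) -> str:
    """Relative path of the content unit: just the tar file name."""
    return f"{name}-{version}.tar.gz"


def cookbook_publication_path(name: str, version: str) -> str:
    """Relative path in the publication, assuming dots in the version
    become underscores in the directory name (inferred from one example)."""
    version_dir = version.replace(".", "_")
    return f"cookbook_files/{name}/{version_dir}/{cookbook_content_path(name, version)}"


print(cookbook_content_path("pulp", "1.2.3"))
print(cookbook_publication_path("pulp", "1.2.3"))
```

Using the second function's result as ContentArtifact.relative_path at content creation time is what would make a pass-through publication sufficient here.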

Actions #7

Updated by bmbouter over 6 years ago

  • Subject changed from Replace PublishedArtifact with a reference to a RepositoryVersion to Extend Content App to serve Artifacts from ContentArtifact.relative_path data associated w/ the repo_version associated with the publication
  • Description updated (diff)

I'm convinced from these posts that there are valid use cases for PublishedArtifact and PublishedMetadata. I rewrote this ticket to leave those things alone, and instead extend the content app.

The plugin writer I'm working w/ would really like to have this soon.

Actions #8

Updated by bmbouter over 6 years ago

  • Description updated (diff)

Through mailing list and IRC discussion, we concluded that this should be an option that defaults to off. I've revised the ticket accordingly.

Actions #9

Updated by jortel@redhat.com over 6 years ago

  • Groomed changed from No to Yes
Actions #10

Updated by jortel@redhat.com over 6 years ago

  • Sprint set to Sprint 43
Actions #11

Updated by jortel@redhat.com over 6 years ago

  • Status changed from NEW to ASSIGNED
  • Assignee set to jortel@redhat.com
Actions #12

Updated by bmbouter over 6 years ago

I had not written this before but a main goal of this (for me) is to remove a step for the user! Currently you need to create a "publisher". I believe with this change they could just POST to create a publication w/ pass_through=True and the repository version and they could be done. One more call to a distributor and that content is live.

@jortel does ^ make sense to you? Can ^ be done as part of this piece of work?

Actions #13

Updated by jortel@redhat.com over 6 years ago

bmbouter wrote:

I had not written this before but a main goal of this (for me) is to remove a step for the user! Currently you need to create a "publisher". I believe with this change they could just POST to create a publication w/ pass_through=True and the repository version and they could be done. One more call to a distributor and that content is live.

@jortel does ^ make sense to you? Can ^ be done as part of this piece of work?

Yes. I envisioned the same thing. Will include it.

Actions #14

Updated by jortel@redhat.com over 6 years ago

  • Status changed from ASSIGNED to POST
Actions #15

Updated by daviddavis over 6 years ago

  • Blocks Task #4034: Use the pass_through option when generating new publications added

Added by jortel@redhat.com over 6 years ago

Revision 1c8ef717 | View on GitHub

Add support for pass-through publications. closes #4020

Actions #16

Updated by jortel@redhat.com over 6 years ago

  • Status changed from POST to MODIFIED
  • % Done changed from 0 to 100
Actions #17

Updated by daviddavis over 5 years ago

  • Sprint/Milestone set to 3.0.0
Actions #18

Updated by bmbouter over 5 years ago

  • Tags deleted (Pulp 3)
Actions #19

Updated by bmbouter about 5 years ago

  • Status changed from MODIFIED to CLOSED - CURRENTRELEASE
