Story #4832

As a user, I can see what repo versions and publications a content unit belongs to

Added by daviddavis about 2 years ago. Updated 12 days ago.

Status: MODIFIED
Priority: Normal
Assignee:
Category: -
Sprint/Milestone: -
Start date:
Due date:
% Done: 100%
Estimated time:
Platform Release:
Groomed: No
Sprint Candidate: No
Tags:
Sprint:
Quarter: Q2-2021

Description

If you want to delete a content unit, there's no way to check which repo versions and publications it belongs to. This means that deleting content is virtually impossible unless it happens to be orphaned.

We need to give users a way to see what repo versions and publications a piece of content belongs to.

See comment https://pulp.plan.io/issues/4832#note-25 for implementation details.


Related issues

Related to Pulp - Story #4831: As a user, I have docs on how to delete content (ASSIGNED)

Has duplicate Pulp - Story #8372: As a user, I can remove a specific content unit from Pulp easier (CLOSED - DUPLICATE)

Associated revisions

Revision 24ad202b
Added by ppicka 12 days ago

Content contained in RepoVer and Publication

Views to show the repository_versions or publications that contain specific content.

closes: #4832 https://pulp.plan.io/issues/4832

History

#1 Updated by daviddavis about 2 years ago

  • Related to Story #4831: As a user, I have docs on how to delete content added

#2 Updated by daviddavis about 2 years ago

  • Subject changed from As a user, I can filter repo versions by content unit to As a user, I can filter all repo versions by content unit

Looks like there's a repo version filter already for filtering by content:

https://github.com/pulp/pulpcore/blob/master/pulpcore/app/viewsets/repository.py#L96

The only problem is that this is nested under repos.

#3 Updated by amacdona@redhat.com about 2 years ago

I am not very fond of the idea that users will need to "delete history" and remove all versions that contained a particular content unit just to remove it. As an alternative, I think it would be nice to add a boolean field to content like `disable_distribution`, which will allow the content to be added/removed from repository versions, but will not be served by the content app. Perhaps disabled content should be omitted from publications as well on a plugin-to-plugin basis.

Here's an example of why this is preferable. Say repo x is published and distributed in production, and one of the packages is determined to have a major security flaw. This package has been in the repository since the beginning. Is the user now expected to completely wipe all their history just to prevent the content from being distributed?
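
A minimal sketch of the idea in this comment, assuming a boolean `disable_distribution` flag on content. The `Content` class and `servable` helper are hypothetical illustrations and not Pulp code; the point is only that the flagged unit stays in the repository version while being withheld from serving.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Content:
    """Hypothetical stand-in for a content unit (not the Pulp model)."""
    name: str
    # Assumed flag from the comment above: keep the unit in history, but do not serve it.
    disable_distribution: bool = False


def servable(version_content):
    """Return the units a content app would serve: everything not disabled."""
    return [unit for unit in version_content if not unit.disable_distribution]


# The repository version still records both units; only one would be served.
version_content = [Content("good-pkg"), Content("flawed-pkg", disable_distribution=True)]
print([unit.name for unit in servable(version_content)])
```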

#4 Updated by daviddavis about 2 years ago

@asmacdo, I think your point makes sense. Let me give another example though: a user uploaded a package with sensitive information like customer info or a password. Even git lets you rewrite history to remove stuff. It's painful but possible.

Also, this story isn't entirely tied to deletion. What if a user wants to see how many/which repo versions have a particular content unit? Like in the case they're doing a security audit or something.

#5 Updated by amacdona@redhat.com about 2 years ago

+1 this story is necessary even outside of the content unit deletion use case.

#6 Updated by bmbouter about 2 years ago

I think there are several use cases here and we probably need a variety of mechanisms to cover them all.

1. A security-unsafe package is in Pulp and must be removed. We gotta get it out, quickly, and no matter what.
2. Which repo versions contain unit X?
3. Where is unit X being served right now?

What other aspects of this area could we consider use cases? Do the use cases above make sense?

#7 Updated by daviddavis over 1 year ago

  • Sprint/Milestone deleted (3.0.0)

#8 Updated by dkliban@redhat.com 7 months ago

Here are some options for implementing this:

  1. A new API endpoint at /pulp/api/v3/content-search/

  2. An additional GET parameter for the /pulp/api/v3/content/// endpoint. When 'include_repository_versions' is set to 'true', a 'repository_versions' list will be populated for each content requested.

I am in favor of number 2.

#9 Updated by daviddavis 7 months ago

For option one, I'm not sure how this would work. Would this return content units or repository versions? I would expect it to return the former but then I am not sure how we'd know which repository versions a content unit belongs to unless we list the repo versions for each content unit. If we do that though, calling this endpoint would be extremely expensive as content units could belong to hundreds or thousands of repo versions.

For option two, would this repository_versions field always be visible regardless of whether include_repository_versions is supplied or not? I seem to remember some problems in the api/bindings with response objects having optional fields.

I'd also like to call out a third option: having a repo version search endpoint that searches across repositories and accepts a content unit href. I think this would be useful for other potential use cases outside of this one.

#10 Updated by ipanova@redhat.com 7 months ago

I think it would greatly simplify our lives if we filter by only one content unit and not a set of them; otherwise we'd need to provide some sort of mapping.

How do you imagine that endpoint? GET /pulp/api/v3/repositories/versions/?content_unit_href=/pulp/api/v3/content/rpm/rpm/d17b1f9a-093f-49e1-b486-a993f0054b05/ or rather a POST one?
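
The mapping concern can be illustrated with a small in-memory sketch. Everything here is hypothetical (the `MEMBERSHIP` table, the shortened hrefs, and both function names); it only shows that a single-unit filter can return a flat list, while a multi-unit filter has to say which version matched which unit.

```python
# Hypothetical membership table: repo version href -> content hrefs it contains.
MEMBERSHIP = {
    "/repositories/a/versions/1/": {"/content/x/", "/content/y/"},
    "/repositories/a/versions/2/": {"/content/y/"},
    "/repositories/b/versions/1/": {"/content/x/"},
}


def versions_containing(content_href):
    """Single unit: a flat, sorted list of repo version hrefs is enough."""
    return sorted(
        version for version, content in MEMBERSHIP.items() if content_href in content
    )


def versions_containing_each(content_hrefs):
    """Several units: the result needs a mapping from each unit to its versions."""
    return {href: versions_containing(href) for href in content_hrefs}


print(versions_containing("/content/x/"))
print(versions_containing_each(["/content/x/", "/content/y/"]))
```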

#11 Updated by daviddavis 7 months ago

@ipanova, Yes or maybe /pulp/api/v3/repository_versions/ or /pulp/api/v3/repository-versions/.

#12 Updated by ipanova@redhat.com 7 months ago

Sounds good, I'm on board.

#13 Updated by ipanova@redhat.com 7 months ago

  • Quarter set to Q1-2021

#14 Updated by ipanova@redhat.com 2 months ago

  • Related to Story #8372: As a user, I can remove a specific content unit from Pulp easier added

#15 Updated by daviddavis 2 months ago

  • Subject changed from As a user, I can filter all repo versions by content unit to As a user, I can see what repo versoins and publications a content unit belongs to

#16 Updated by daviddavis 2 months ago

  • Description updated (diff)

#17 Updated by daviddavis 2 months ago

  • Subject changed from As a user, I can see what repo versoins and publications a content unit belongs to to As a user, I can see what repo versions and publications a content unit belongs to

#18 Updated by daviddavis 2 months ago

  • Description updated (diff)

#19 Updated by daviddavis 2 months ago

  • Description updated (diff)

#20 Updated by daviddavis 2 months ago

  • Related to deleted (Story #8372: As a user, I can remove a specific content unit from Pulp easier)

#21 Updated by daviddavis 2 months ago

  • Has duplicate Story #8372: As a user, I can remove a specific content unit from Pulp easier added

#22 Updated by bmbouter 2 months ago

What about having the response body for the inspection of a single content unit, e.g. https://docs.pulpproject.org/pulpcore/restapi.html#operation/content_file_files_read, return an additional section including:

{
<snip>
    "repository_versions": ["/pulp/api/v3/repositories/file/file/:UUID/", ... "/pulp/api/v3/repositories/file/file/:UUID/"],
    "publications": ["/pulp/api/v3/publication/file/file/:UUID/", ... "/pulp/api/v3/publication/file/file/:UUID/"]
</snip>
}

It's possible this could be customizable on the detail model to only include Publications if that unit type has publications, and plugins could "opt-in" to enable it via the detail ContentUnit definition.
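
A rough sketch of the opt-in behaviour described here, assuming the detail response is assembled as a plain dictionary. The `content_detail` function and its parameters are hypothetical and do not reflect Pulp's serializers; passing `publications=None` stands for a content type that has not opted in.

```python
def content_detail(base_fields, repository_versions, publications=None):
    """Build a detail response body for one content unit.

    publications=None models a content type that did not opt in, so the
    "publications" key is left out of the response entirely.
    """
    detail = dict(base_fields)
    detail["repository_versions"] = list(repository_versions)
    if publications is not None:
        detail["publications"] = list(publications)
    return detail


# A type without publications versus a type that opted in (possibly with none yet).
print(content_detail({"pulp_href": "/content/1/"}, ["/versions/1/"]))
print(content_detail({"pulp_href": "/content/1/"}, ["/versions/1/"], publications=[]))
```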

#23 Updated by daviddavis 2 months ago

If it's only for the read endpoint (and not the list endpoint), I think that would be ok. I still lean towards having generic endpoints for repo versions and publications because I think doing so would be more powerful (e.g. we could add more filters in the future).

#24 Updated by bmbouter 2 months ago

Let's explore that too. Can you write out the idea in more detail?

#25 Updated by daviddavis 2 months ago

Today we have a RepositoryVersionContentFilter:

GET http :/pulp/api/v3/repositories/file/file/:UUID/versions/?content=<content_href>

I believe we could reuse this and add some generic endpoints (see the ListRepositoryViewset as an example of a generic endpoint):

GET http :/pulp/api/v3/repository_versions/?content=<content_href>

[
  {
    "pulp_href": "/pulp/api/v3/repositories/rpm/rpm/07c41c5f-59e4-4371-942a-b6a006a6d2cf/versions/1/",
    "pulp_created": "2021-03-18T19:23:31.661940Z",
    "repository_href": "/pulp/api/v3/repositories/rpm/rpm/07c41c5f-59e4-4371-942a-b6a006a6d2cf/",
    "version": 1
  }
  ...
]

GET http :/pulp/api/v3/publications/?content=<content_href>

[
  {
    "pulp_href": "/pulp/api/v3/publications/rpm/rpm/4d5bb614-4318-4408-8c18-ab8b6b4c016f/",
    "pulp_created": "2021-03-17T12:24:31.661940Z",
    "repository_version_href": "/pulp/api/v3/repositories/rpm/rpm/07c41c5f-59e4-4371-942a-b6a006a6d2cf/versions/1/"
  }
  ...
]
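
A client-side sketch of how the two proposed lookups might be used together, using only the Python standard library. It assumes the endpoint paths and the `content` query parameter exactly as proposed in this comment; the function names and the example base URL are hypothetical, and no request is actually sent.

```python
from urllib.parse import urlencode

# Paths as proposed in this comment; a proposal, not a confirmed released API.
REPOSITORY_VERSIONS_PATH = "/pulp/api/v3/repository_versions/"
PUBLICATIONS_PATH = "/pulp/api/v3/publications/"


def lookup_urls(base_url, content_href):
    """Build both lookup URLs for a single content unit href."""
    query = urlencode({"content": content_href})  # percent-encodes the href
    base = base_url.rstrip("/")
    return {
        "repository_versions": f"{base}{REPOSITORY_VERSIONS_PATH}?{query}",
        "publications": f"{base}{PUBLICATIONS_PATH}?{query}",
    }


def hrefs(results):
    """Collect the pulp_href of every item in a decoded JSON response list."""
    return [item["pulp_href"] for item in results]


urls = lookup_urls("http://localhost", "/pulp/api/v3/content/file/files/abc/")
print(urls["repository_versions"])
print(urls["publications"])
```

Fetching each URL and passing the decoded list to `hrefs` would give the repo versions and publications that would need attention before the content unit could be removed.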

#26 Updated by bmbouter 2 months ago

+1 to your proposal @daviddavis. I like it because it allows for a more powerful expression and also isn't a completely new endpoint, just another subresource under existing ones. It also nicely avoids the issue where some content doesn't need publication responses; users who never expect those just won't make those calls.

#27 Updated by daviddavis about 2 months ago

  • Description updated (diff)

#28 Updated by ppicka about 2 months ago

  • Status changed from NEW to ASSIGNED
  • Assignee set to ppicka

#29 Updated by pulpbot about 1 month ago

  • Status changed from ASSIGNED to POST

#30 Updated by daviddavis 21 days ago

  • Quarter changed from Q1-2021 to Q2-2021

#31 Updated by ppicka 12 days ago

  • Status changed from POST to MODIFIED
  • % Done changed from 0 to 100
