Story #4832
closed
As a user, I can see what repo versions and publications a content unit belongs to
100%
Description
If you want to delete a content unit, there's no way to check which repo versions and publications it belongs to. This means that deleting content is virtually impossible unless it happens to be orphaned.
We need to give users a way to see what repo versions and publications a piece of content belongs to.
See comment https://pulp.plan.io/issues/4832#note-25 for implementation details.
Related issues
Updated by daviddavis over 5 years ago
- Related to Story #4831: As a user, I have docs on how to delete content added
Updated by daviddavis over 5 years ago
- Subject changed from As a user, I can filter repo versions by content unit to As a user, I can filter all repo versions by content unit
Looks like there's a repo version filter already for filtering by content:
https://github.com/pulp/pulpcore/blob/master/pulpcore/app/viewsets/repository.py#L96
The only problem is that this is nested under repos.
Updated by amacdona@redhat.com over 5 years ago
I am not very fond of the idea that users will need to "delete history" and remove all versions that contained a particular content unit just to remove it. As an alternative, I think it would be nice to add a boolean field to content like `disable_distribution`, which will allow the content to be added/removed from repository versions, but will not be served by the content app. Perhaps disabled content should be omitted from publications as well on a plugin-by-plugin basis.
Here's an example of why this is preferable. Say repo x is published and distributed in production, and one of the packages is determined to have a major security flaw. This package has been in the repository since the beginning. Is the user now expected to completely wipe all their history just to prevent the content from being distributed?
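A minimal sketch of the `disable_distribution` idea, assuming a content app that looks units up by path. The `ContentUnit` class and `serve` function are hypothetical stand-ins, not pulpcore code; only the field name comes from the comment above.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class ContentUnit:
    """Hypothetical content unit; not pulpcore's model."""

    relative_path: str
    # Field name proposed in the comment above.
    disable_distribution: bool = False


def serve(published: Dict[str, ContentUnit], path: str) -> Optional[ContentUnit]:
    """Return the unit to serve, or None if it is missing or disabled."""
    unit = published.get(path)
    if unit is None or unit.disable_distribution:
        return None
    return unit
```

The flagged unit stays in the mapping, so repository history is untouched; it is only hidden at serving time.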
Updated by daviddavis over 5 years ago
@asmacdo, I think your point makes sense. Let me give another example though: a user uploaded a package with sensitive information like customer info or a password. Even git lets you rewrite history to remove stuff. It's painful but possible.
Also, this story isn't entirely tied to deletion. What if a user wants to see how many/which repo versions have a particular content unit? Like in the case they're doing a security audit or something.
Updated by amacdona@redhat.com over 5 years ago
+1 this story is necessary even outside of the content unit deletion use case.
Updated by bmbouter over 5 years ago
I think there are several use cases here and we probably need a variety of mechanisms to cover them all.
1. A security-unsafe package is in Pulp and must be removed. We gotta get it out, quickly, and no matter what.
2. Which repo versions contain unit X?
3. Where is unit x being served right now?
What other aspects of this area could we consider use cases? What about ^ use cases, do they make any sense?
Updated by dkliban@redhat.com over 4 years ago
Here are some options for implementing this:
1. A new API endpoint at /pulp/api/v3/content-search/
2. An additional GET parameter for the /pulp/api/v3/content/// endpoint. When 'include_repository_versions' is set to 'true', a 'repository_versions' list will be populated for each content requested.
I am in favor of number 2.
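A rough sketch of how option 2 could behave, assuming a plain dictionary of query parameters and an in-memory index. The `VERSIONS_BY_CONTENT` index, the `serialize_content` function, and the shortened hrefs are made up for illustration; only the `include_repository_versions` parameter and the `repository_versions` field come from the proposal.

```python
from typing import Any, Dict, List

# Hypothetical index: content href -> hrefs of repo versions containing it.
VERSIONS_BY_CONTENT: Dict[str, List[str]] = {
    "/pulp/api/v3/content/file/files/1/": [
        "/pulp/api/v3/repositories/file/file/a/versions/1/",
        "/pulp/api/v3/repositories/file/file/a/versions/2/",
    ],
}


def serialize_content(href: str, query_params: Dict[str, str]) -> Dict[str, Any]:
    """Serialize one content unit, adding repo versions only on request."""
    data: Dict[str, Any] = {"pulp_href": href}
    # The potentially expensive lookup runs only when the caller opts in.
    if query_params.get("include_repository_versions", "").lower() == "true":
        data["repository_versions"] = list(VERSIONS_BY_CONTENT.get(href, []))
    return data
```

Note that in this sketch the `repository_versions` key is absent unless requested, so the response shape varies with the query string.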
Updated by daviddavis over 4 years ago
dkliban@redhat.com wrote:
Here are some options for implementing this:
A new API endpoint at /pulp/api/v3/content-search/
An additional GET parameter for the /pulp/api/v3/content/// endpoint. When 'include_repository_versions' is set to 'true', a 'repository_versions' list will be populated for each content requested.
I am in favor of number 2.
For option one, I'm not sure how this would work. Would this return content units or repository versions? I would expect it to return the former but then I am not sure how we'd know which repository versions a content unit belongs to unless we list the repo versions for each content unit. If we do that though, calling this endpoint would be extremely expensive as content units could belong to hundreds or thousands of repo versions.
For option two, would this `repository_versions` field always be visible regardless of whether `include_repository_versions` is supplied or not? I seem to remember some problems in the api/bindings with response objects having optional fields.
I'd also like to call out a third option: having a repo version search endpoint that searches across repositories and accepts a content unit href. I think this would be useful for other potential use cases outside of this one.
Updated by ipanova@redhat.com about 4 years ago
daviddavis wrote:
dkliban@redhat.com wrote:
Here are some options for implementing this:
A new API endpoint at /pulp/api/v3/content-search/
An additional GET parameter for the /pulp/api/v3/content/// endpoint. When 'include_repository_versions' is set to 'true', a 'repository_versions' list will be populated for each content requested.
I am in favor of number 2.
For option one, I'm not sure how this would work. Would this return content units or repository versions? I would expect it to return the former but then I am not sure how we'd know which repository versions a content unit belongs to unless we list the repo versions for each content unit. If we do that though, calling this endpoint would be extremely expensive as content units could belong to hundreds or thousands of repo versions.
For option two, would this `repository_versions` field always be visible regardless of whether `include_repository_versions` is supplied or not? I seem to remember some problems in the api/bindings with response objects having optional fields.
I'd also like to call out a third option: having a repo version search endpoint that searches across repositories and accepts a content unit href. I think this would be useful for other potential use cases outside of this one.
I think it would greatly simplify our lives if we filtered by a single content unit rather than a set of them; otherwise we'd need to provide some sort of mapping.
How do you imagine that endpoint? GET /pulp/api/v3/repositories/versions/?content_unit_href=/pulp/api/v3/content/rpm/rpm/d17b1f9a-093f-49e1-b486-a993f0054b05/ or rather a POST one?
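For a single content unit, a GET with the href in the query string is enough, as long as the href is percent-encoded. Here is a small sketch using only the standard library; the endpoint path and the `content_unit_href` parameter name are taken from the question above and were not a settled API at this point, and both helper functions are hypothetical.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Assumed endpoint path, copied from the question above.
ENDPOINT = "/pulp/api/v3/repositories/versions/"


def build_search_url(content_unit_href: str) -> str:
    """Return the GET URL with the content href percent-encoded."""
    return ENDPOINT + "?" + urlencode({"content_unit_href": content_unit_href})


def read_content_unit_href(url: str) -> str:
    """Recover the single content href from the query string."""
    values = parse_qs(urlsplit(url).query)["content_unit_href"]
    if len(values) != 1:
        raise ValueError("exactly one content unit href is expected")
    return values[0]
```

Rejecting more than one value keeps the response a flat list of repo versions, with no need for a mapping from content unit to versions.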
Updated by daviddavis about 4 years ago
@ipanova, Yes or maybe /pulp/api/v3/repository_versions/ or /pulp/api/v3/repository-versions/.
Updated by ipanova@redhat.com about 4 years ago
daviddavis wrote:
@ipanova, Yes or maybe /pulp/api/v3/repository_versions/ or /pulp/api/v3/repository-versions/.
sounds good, i'm on-board
Updated by ipanova@redhat.com almost 4 years ago
- Related to Story #8372: As a user, I can remove a specific content unit from Pulp easier added
Updated by daviddavis almost 4 years ago
- Subject changed from As a user, I can filter all repo versions by content unit to As a user, I can see what repo versoins and publications a content unit belongs to
Updated by daviddavis almost 4 years ago
- Subject changed from As a user, I can see what repo versoins and publications a content unit belongs to to As a user, I can see what repo versions and publications a content unit belongs to
Updated by daviddavis almost 4 years ago
- Related to deleted (Story #8372: As a user, I can remove a specific content unit from Pulp easier)
Updated by daviddavis almost 4 years ago
- Has duplicate Story #8372: As a user, I can remove a specific content unit from Pulp easier added
Updated by bmbouter almost 4 years ago
What about having the response body for the inspection of a single content unit, e.g. https://docs.pulpproject.org/pulpcore/restapi.html#operation/content_file_files_read, return an additional section including:
{
    <snip>
    "repository_versions": ["/pulp/api/v3/repositories/file/file/:UUID/", ... "/pulp/api/v3/repositories/file/file/:UUID/"],
    "publications": ["/pulp/api/v3/publication/file/file/:UUID/", ... "/pulp/api/v3/publication/file/file/:UUID/"]
    </snip>
}
It's possible this could be customizable on the detail model to only include Publications if that unit type has publications, and plugins could "opt-in" to enable it via the detail ContentUnit definition.
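One way the plugin opt-in might look, sketched with plain Python classes. Everything here is hypothetical: the `Content` base class, the `reports_publications` attribute, and the `serialize_detail` function are illustrative names, not pulpcore's API.

```python
from typing import Any, Dict, List


class Content:
    """Hypothetical base content model; not pulpcore's actual class."""

    # Plugins would set this to True to opt in (assumed attribute name).
    reports_publications = False

    def __init__(self, href: str) -> None:
        self.href = href


class FileContent(Content):
    """Hypothetical unit type whose plugin has publications."""

    reports_publications = True


class TagContent(Content):
    """Hypothetical unit type that is never published."""


def serialize_detail(
    unit: Content, repo_versions: List[str], publications: List[str]
) -> Dict[str, Any]:
    """Build the detail response, adding publications only for opted-in types."""
    data: Dict[str, Any] = {
        "pulp_href": unit.href,
        "repository_versions": list(repo_versions),
    }
    if unit.reports_publications:
        data["publications"] = list(publications)
    return data
```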
Updated by daviddavis almost 4 years ago
If it's only for the read endpoint (and not the list endpoint), I think that would be ok. I still lean towards having generic endpoints for repo versions and publications because I think doing so would be more powerful (e.g. we could add more filters in the future).
Updated by bmbouter almost 4 years ago
daviddavis wrote:
If it's only for the read endpoint (and not the list endpoint), I think that would be ok. I still lean towards having generic endpoints for repo versions and publications because I think doing so would be more powerful (e.g. we could add more filters in the future).
Let's explore that too. Can you write out the idea some?
Updated by daviddavis almost 4 years ago
Today we have a `RepositoryVersionContentFilter`:
GET http :/pulp/api/v3/repositories/file/file/:UUID/versions/?content=<content_href>
I believe we could reuse this with some generic endpoints (see the ListRepositoryViewset as an example of a generic endpoint):
GET http :/pulp/api/v3/repository_versions/?content=<content_href>
[
    {
        "pulp_href": "/pulp/api/v3/repositories/rpm/rpm/07c41c5f-59e4-4371-942a-b6a006a6d2cf/versions/1/",
        "pulp_created": "2021-03-18T19:23:31.661940Z",
        "repository_href": "/pulp/api/v3/repositories/rpm/rpm/07c41c5f-59e4-4371-942a-b6a006a6d2cf/",
        "version": 1
    }
    ...
]
GET http :/pulp/api/v3/publications/?content=<content_href>
[
    {
        "pulp_href": "/pulp/api/v3/publications/rpm/rpm/4d5bb614-4318-4408-8c18-ab8b6b4c016f/",
        "pulp_created": "2021-03-17T12:24:31.661940Z",
        "repository_version_href": "/pulp/api/v3/repositories/rpm/rpm/07c41c5f-59e4-4371-942a-b6a006a6d2cf/versions/1/"
    }
    ...
]
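The semantics of the proposed `?content=<content_href>` filter on a generic, cross-repository endpoint can be modeled in memory as follows. This is only a sketch of the intended behavior: the `RepoVersion` dataclass and `filter_by_content` function are hypothetical and stand in for the real models and filter class.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, List


@dataclass(frozen=True)
class RepoVersion:
    """Hypothetical stand-in for a repository version."""

    repository_href: str
    number: int
    content: FrozenSet[str] = field(default_factory=frozenset)

    @property
    def pulp_href(self) -> str:
        # Mirrors the href layout shown in the example responses.
        return f"{self.repository_href}versions/{self.number}/"


def filter_by_content(
    versions: List[RepoVersion], content_href: str
) -> List[RepoVersion]:
    """Keep the versions, from any repository, that contain the content unit."""
    return [v for v in versions if content_href in v.content]
```

Because the filter runs over versions from every repository at once, a caller does not need to know in advance which repositories to inspect.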
Updated by bmbouter almost 4 years ago
+1 to your proposal @daviddavis. I like it because it does allow for a more powerful expression and also isn't a completely new endpoint, just another subresource under existing ones. It also nicely avoids the issue where some content doesn't need publication responses: users who never expect those just won't make those calls.
Updated by ppicka almost 4 years ago
- Status changed from NEW to ASSIGNED
- Assignee set to ppicka
Updated by pulpbot almost 4 years ago
- Status changed from ASSIGNED to POST
Added by ppicka over 3 years ago
Updated by ppicka over 3 years ago
- Status changed from POST to MODIFIED
- % Done changed from 0 to 100
Applied in changeset pulpcore|24ad202bc7d007623c2b7096a4cc3ba48f79e041.
Updated by pulpbot over 3 years ago
- Status changed from MODIFIED to CLOSED - CURRENTRELEASE
Content contained in RepoVer and Publication
Views to show repository_versions or publications which contains specific content.
closes: #4832 https://pulp.plan.io/issues/4832