Story #7127
closed
Add object labels
100%
Description
Proposal
As a developer, it would be nice to be able to label Pulp objects such as repositories. To accomplish this, I propose adding Kubernetes-style labels to Pulp objects (more information on Kubernetes labels here: https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/).
Labels are arbitrary key-value pairs that can be applied to objects and used to query them. For example, I could attach the labels label1=foo, label2=foo, and label3=foobar to a repository. Labeled objects could then be queried with any of the following selectors:
- foo=bar: selects all objects that have the foo label set to bar
- foo!=bar: selects all objects that have the foo label set to anything except bar
- foo in (bar, foobar): selects all objects whose foo label is set to either bar or foobar
- foo notin (bar, foobar): selects all objects whose foo label is set to neither bar nor foobar
- foo: selects all objects that contain the label foo (regardless of value)
- !foo: selects all objects that don't contain the label foo
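The selector semantics listed above can be sketched in plain Python against a dict of labels. This is only an illustration, not Pulp code: the `matches` helper and its regular expressions are hypothetical, and it assumes (following the wording above) that `!=` and `notin` only match objects that actually carry the label.

```python
import re

# Hypothetical patterns for one selector requirement (not part of Pulp).
_SET_RE = re.compile(r"^\s*([^\s!=()]+)\s+(in|notin)\s+\(([^)]*)\)\s*$")
_EQ_RE = re.compile(r"^\s*([^\s!=()]+)\s*(!=|=)\s*(\S+)\s*$")


def matches(labels, requirement):
    """Return True if a dict of labels satisfies a single requirement."""
    m = _SET_RE.match(requirement)
    if m:
        key, op, raw = m.groups()
        values = {v.strip() for v in raw.split(",")}
        if op == "in":
            return key in labels and labels[key] in values
        # Assumption: "notin" requires the label to be present.
        return key in labels and labels[key] not in values
    m = _EQ_RE.match(requirement)
    if m:
        key, op, value = m.groups()
        if op == "=":
            return labels.get(key) == value
        # Assumption: "!=" requires the label to be present.
        return key in labels and labels[key] != value
    req = requirement.strip()
    if req.startswith("!"):
        return req[1:] not in labels  # "!foo": label must be absent
    return req in labels  # "foo": label must be present


repo_labels = {"foo": "bar"}
print(matches(repo_labels, "foo in (bar, foobar)"))
print(matches(repo_labels, "!foo"))
```

Note that Kubernetes itself treats `foo!=bar` as also matching objects that have no `foo` label at all; which behavior Pulp should adopt is left open here.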
Use cases
In galaxy_ng we need a way to filter repositories based on what they are used for. For example, we are going to have the following repositories:
- published - contains locally published collections. Should be searchable. Is the default repo if no repo is provided
- staging - content waiting for approval. Not searchable
- rejected - content that has been rejected. Not searchable
- rh-certified - content synced from Automation Hub. Contains Red Hat certified content. Searchable
- community - content synced from Galaxy. Searchable
There will also be an arbitrary number of repositories named inbound-<namespace_name> for uploading collections. These are not searchable.
Right now the names for these repos are hard-coded as a way for the API to identify which content should go where. With label support, we could instead add the following labels:
- searchable = true | false
- default = true | false
- content-readiness = production | inbound | staging
- certification = community | rh-certified | local
With these labels we can:
- limit search results to repos that match searchable=true
- separate out inbound repos with content-readiness!=inbound
- identify which repositories contain certified content with certification=rh-certified
Design
API Design
Filtering
Labels can be filtered by passing a URL-encoded string to a label_selector parameter.
Some examples based on the Kubernetes documentation:
- ?label_selector=environment%3Dproduction,tier%3Dfrontend
  - Evaluates to environment=production,tier=frontend
- ?label_selector=environment+in+%28production%2Cqa%29%2Ctier+in+%28frontend%29
  - Evaluates to environment in (production,qa),tier in (frontend)
Note: Ansible Galaxy and RHUI have agreed that for a first pass, we could just support a subset of operators (i.e. = and !=).
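As a rough illustration of the encoding above, the following sketch uses Python's standard `urllib.parse` to build and decode a `label_selector` query string. The helper names `encode_selector` and `decode_selector` are made up for this example and are not part of any Pulp client.

```python
from urllib.parse import parse_qs, quote_plus


def encode_selector(selector):
    """Build a query string carrying a URL-encoded label selector."""
    return "?label_selector=" + quote_plus(selector)


def decode_selector(query_string):
    """Recover the plain selector from a query string."""
    return parse_qs(query_string.lstrip("?"))["label_selector"][0]


print(encode_selector("environment in (production,qa),tier in (frontend)"))
print(decode_selector("?label_selector=environment%3Dproduction,tier%3Dfrontend"))
```

`quote_plus` also encodes the comma as `%2C`, so its output for the first example differs slightly from the form shown above, but both forms decode to the same selector.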
LabelSelectFilter
LabelSelectFilter would be a django_filters.Filter that parses the label_selector parameter and then filters the queryset. It can be applied to a queryset of any model with labels.
Note: for an example of a complex Filter, see the RepositoryVersionFilter.
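The parsing half of such a filter might look like the sketch below, which covers only the = and != subset agreed on for a first pass. It is plain Python with no Django dependency; `parse_label_selector` is a hypothetical name, and how the resulting tuples would be turned into queryset filter or exclude calls is deliberately left out.

```python
def parse_label_selector(selector):
    """Split a decoded selector into (key, operator, value) tuples.

    Only "=" and "!=" are handled; anything else raises ValueError.
    """
    requirements = []
    for part in selector.split(","):
        part = part.strip()
        if not part:
            continue
        # Try "!=" before "=" because "!=" also contains "=".
        for op in ("!=", "="):
            key, sep, value = part.partition(op)
            if sep:
                requirements.append((key.strip(), op, value.strip()))
                break
        else:
            raise ValueError(f"unsupported requirement: {part!r}")
    return requirements


print(parse_label_selector("environment=production,tier!=frontend"))
```

Rejecting unsupported operators with an error, rather than silently ignoring them, would let the API return a clear validation message until `in` and `notin` are added.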
LabelSerializer
Create a new LabelSerializer that can be nested into other model serializers as a field (much like the CreatedResourceSerializer). This serializer should be both readable and writable and should enable the following API calls.
Reading
# GET /pulp/api/v3/repositories/file/file/
{
...
"labels": {"foo": "bar", "foo2": "baz"},
...
}
Setting/Updating
# POST /pulp/api/v3/repositories/file/file/ name=test labels:='{"foo": "bar"}'
{
...
"labels": {"foo": "bar"},
...
}
# PATCH /pulp/api/v3/repositories/file/file/<uuid>/ labels:='{"something": "else"}'
{
...
"labels": {"something": "else"},
...
}
# PATCH /pulp/api/v3/repositories/file/file/<uuid>/ labels:='{}'
{
...
"labels": {},
...
}
Database Design
Label (extends GenericRelationModel)
- resource - generic foreign key
- key (CharField) - the key of the label
- value (TextField) - the value for a label
Notes
- Label resource and key are unique together
- We should also limit key to alphanumerics
- The Label gets deleted when the resource gets deleted (which GenericRelationModel does)
- key and value should be indexed (db_index=True)
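To make the constraints in these notes concrete, here is a small sketch using Python's built-in `sqlite3` rather than Django models or PostgreSQL. The table layout, the `resource_type`/`resource_id` pair standing in for the generic foreign key, and the helper names are all assumptions made for illustration. Deleting labels along with their resource is not modeled here.

```python
import re
import sqlite3

# Assumption: "alphanumerics" means ASCII letters and digits only.
KEY_PATTERN = re.compile(r"[A-Za-z0-9]+")


def create_schema(conn):
    """Create a label table with the uniqueness and index rules from the notes."""
    conn.executescript(
        """
        CREATE TABLE label (
            id INTEGER PRIMARY KEY,
            resource_type TEXT NOT NULL,  -- stand-in for the generic FK
            resource_id TEXT NOT NULL,
            key TEXT NOT NULL,
            value TEXT NOT NULL,
            UNIQUE (resource_type, resource_id, key)
        );
        CREATE INDEX label_key_idx ON label (key);
        CREATE INDEX label_value_idx ON label (value);
        """
    )


def set_label(conn, resource_type, resource_id, key, value):
    """Insert a label, replacing any existing value for the same key."""
    if not KEY_PATTERN.fullmatch(key):
        raise ValueError(f"label key must be alphanumeric: {key!r}")
    conn.execute(
        "INSERT OR REPLACE INTO label (resource_type, resource_id, key, value) "
        "VALUES (?, ?, ?, ?)",
        (resource_type, resource_id, key, value),
    )


conn = sqlite3.connect(":memory:")
create_schema(conn)
set_label(conn, "repository", "1", "foo", "bar")
set_label(conn, "repository", "1", "foo", "baz")  # same key: value is replaced
print(conn.execute("SELECT key, value FROM label").fetchall())
```

Because resource and key are unique together, each resource holds at most one value per key, which matches the dictionary shape of the "labels" field in the API examples.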
Related issues
Updated by bmbouter over 4 years ago
- Has duplicate Story #5279: As a user I can label pulp resources added
Updated by fao89 over 4 years ago
- Tracker changed from Issue to Story
- % Done set to 0
- Severity deleted (2. Medium)
- Triaged deleted (No)
Updated by ipanova@redhat.com over 4 years ago
After hearing from other stakeholders, the only limitation that would be good to drop is the Kubernetes-style label rules. Namely:
- drop the 63-character limit for both key and value
- relax the rule that a value must be empty or begin and end with an alphanumeric character ([a-z0-9A-Z]) with dashes (-), underscores (_), dots (.), and alphanumerics between, so that, for example, '/' is allowed.
Updated by ipanova@redhat.com about 4 years ago
- Related to Story #5510: I would like to have an option to associate key-value pair to repository - former "Notes" field in Pulp 2 added
Updated by daviddavis about 4 years ago
A couple possibilities for how the API param might work:
- labelSelector=environment%3Dproduction,tier%3Dfrontend (from the Kubernetes docs)
- labelSelector=foo=bar&labelSelector=bar=foo
Updated by dalley about 4 years ago
I'd kind of like to do this in a way that might solve multiple pain points at once.
Create a new "resource" model type which holds the following data:
- The pulp_href, stored pre-computed, with a unique index
- The tags. We can probably use the PostgreSQL HStoreField for this; it's where we should start. If not, then we could attach whatever data we want to this model.
- A generic foreign key to the database object.
"Resource" would take over from the "CreatedResource", and task would have a nullable foreign key to Resource rather than CreatedResource having a FK to Task.
Benefits:
- Tags only need to be stored in one place
- It's possible to search tags across all resource types at once
- It enables solving the problem where CreatedResource objects in the task records go to null
Downsides:
- Managing a separate surrogate model that needs to be created whenever a new "resource" is created, and deleted whenever that resource is deleted, may be difficult.
Updated by ipanova@redhat.com about 4 years ago
I suggest adding a unique_together of (resource, key, value). In the current proposal if you label your resource once with label X you cannot really unlabel unless the resource is gone.
Clarified on irc.
Updated by daviddavis about 4 years ago
EXD would like to search within the value. I imagine, for example, that os~RHEL could be used to search for objects whose os label contains "RHEL". This would match RHEL6, RHEL7, MyRHEL, etc.
Updated by ipanova@redhat.com about 4 years ago
- Sprint Candidate changed from No to Yes
Updated by fao89 about 4 years ago
- Status changed from NEW to ASSIGNED
- Assignee set to fao89
- Sprint Candidate changed from Yes to No
- Sprint set to Sprint 87
Updated by fao89 about 4 years ago
- Status changed from ASSIGNED to NEW
- Assignee deleted (fao89)
Tuning the collections query is taking more time than I expected, so I'm unassigning for now.
Updated by daviddavis almost 4 years ago
- Status changed from NEW to ASSIGNED
- Assignee set to daviddavis
Updated by daviddavis almost 4 years ago
- Related to Story #8174: As a user, I can filter labels using `in` and `notin` added
Updated by daviddavis almost 4 years ago
- Status changed from ASSIGNED to CLOSED - CURRENTRELEASE