Task #3522

closed

Plan Master/Detail Tasks

Added by amacdona@redhat.com about 6 years ago. Updated almost 5 years ago.

Status: CLOSED - WONTFIX
Priority: Normal
Assignee: -
Category: -
Sprint/Milestone:
Start date:
Due date:
% Done: 0%
Estimated time:
Platform Release:
Groomed: No
Sprint Candidate: No
Tags:
Sprint:
Quarter:

Description

End to end design.

When the design reaches consensus, sub-tasks for core and plugins should be created.

Motivations:

Plugin involvement in add/remove

The original motivation for this design change was a problem: core should not add or remove content units without plugin involvement.

We can solve this problem by removing the core endpoint that creates versions and giving plugin writers complete control. This approach is conceptually simple, and it is the one used in this plan. There are other ideas as well (hooks), but they require a much fuller discussion of the complexity of the issue than this approach does. To keep this issue clear, that discussion should take place on the issue describing the problem: https://pulp.plan.io/issues/3541

Design Pattern Consistency

Pulp 3 is built on a simple pattern.

Django is the base. It provides a framework for development, but does not provide behavior.
Django Rest Framework is the next level. Built on Django, it fills out the framework but again does not provide behavior.
Pulpcore is the next level. Built on DRF, it provides a framework for plugin development. It also implements the basic objects that are used identically between all plugins. In this plan there are 2 objects provided by core: Repository and Artifact. These objects are also managed by core, and do not need plugin involvement. All other objects are the responsibility of the plugin. Pulpcore simplifies the plugin responsibilities by providing abstract base classes.

A generic simple plugin needs to:
1) Implement a ContentUnit (Model, ViewSet, Serializer)
2) Implement a Remote (Model, ViewSet, Serializer)
3) Implement a Publisher (Model, ViewSet, Serializer)
4) Implement an AddRemoveTask (Model, ViewSet, Serializer, optional function/task)
5) Implement a SyncTask (Model, ViewSet, Serializer, function/task)
6) Implement a PublishTask (Model, ViewSet, Serializer, function/task)

By following this pattern for the whole plugin API, we can make the following guarantees to the plugin writers:
1) If your problem domain is simple and standard, most of the work is cheap. Just implement your classes and Pulp will handle the rest.
2) If your problem domain is more complex, you can override parts as necessary. The concerns are well separated, so most validation can be done with minimal overrides. Tools for customization are identical for all objects, and they are well documented by DRF.
3) If your problem domain is eccentric, Pulp will stay out of your way. You are in control from the request until the return, if you need to be.

Example add/remove with pulp_file

Create an AddRemoveTask (Model, ViewSet, Serializer)

The ViewSet is just like all other ViewSets. Please be polite and namespace the endpoint by plugin.

class FileAddRemoveTaskViewSet(TaskViewSet, mixins.CreateModelMixin):

    endpoint_name = 'file/add-remove'
    queryset = FileAddRemoveTask.objects.all()
    model = FileAddRemoveTask
    serializer_class = FileAddRemoveTaskSerializer

The Model is just like other Models. The fields represent the parameters for the function/task.

class FileAddRemoveTask(FileTask):

    TYPE = 'add-remove'
    repository = models.ForeignKey(Repository)
    add_content_units = models.ManyToManyField(FileContent, related_name="added")
    remove_content_units = models.ManyToManyField(FileContent, related_name="removed")

Serializers for tasks are also the same as all other serializers. I took the liberty of overriding `create` in the base TaskSerializer to auto-deploy tasks. This keeps the plugin classes simple, and the behavior is easily overridden by defining your own `create`. In this case, the file plugin deploys the add/remove task from pulpcore. Plugin writers could implement their own task if the core task doesn't suit their needs.

class FileAddRemoveTaskSerializer(TaskSerializer):

    repository = serializers.HyperlinkedRelatedField(
        view_name='repositories-detail',
        queryset=Repository.objects.all(),
    )

    add_content_units = DetailRelatedField(
        queryset=FileContent.objects.all(),
        many=True,
    )
    remove_content_units = DetailRelatedField(
        queryset=FileContent.objects.all(),
        many=True,
    )

    reservation_structure = ["repository"]

    # If there is custom logic related to dependencies, validation, etc., the plugin could create
    # its own task rather than using the general add/remove from pulpcore.
    celery_task = core_tasks.add_and_remove

    @property
    def task_kwargs(self):
        add_pks = [content_unit.pk for content_unit in self.task.add_content_units.all()]
        rm_pks = [content_unit.pk for content_unit in self.task.remove_content_units.all()]
        return {'repository_pk': self.task.repository.pk,
                'add_content_units': add_pks,
                'remove_content_units': rm_pks}

    # def validate(self, data):
    #     """
    #     OPTIONAL!
    #     Here, the plugin writer can provide **synchronous** validation. The plugin writer also has
    #     the opportunity to alter/clean the data.
    #
    #     Warning: The content in a repository could change between request time and task time.
    #     """
    #     for content_unit in data['add_content_units']:
    #         if content_unit in data['remove_content_units']:
    #             raise serializers.ValidationError("Cannot add and remove a single content unit")
    #     return data

    class Meta:
        model = FileAddRemoveTask
        fields = TaskSerializer.Meta.fields + ("add_content_units", "remove_content_units",
                                               "repository")

Sync is exactly the same:
ViewSet: https://github.com/asmacdo/pulp_file/blob/bf4138957aa4ac94319bba00c88e0043f8b26a03/pulp_file/app/viewsets.py#L33-L38
Model: https://github.com/asmacdo/pulp_file/blob/bf4138957aa4ac94319bba00c88e0043f8b26a03/pulp_file/app/models.py#L67-L71
Serializer: https://github.com/asmacdo/pulp_file/blob/bf4138957aa4ac94319bba00c88e0043f8b26a03/pulp_file/app/serializers.py#L14-L31
function/task (no change): https://github.com/asmacdo/pulp_file/blob/bf4138957aa4ac94319bba00c88e0043f8b26a03/pulp_file/app/tasks/synchronizing.py
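
The auto-deploy behavior described for the base TaskSerializer can be illustrated without Django or DRF. The following is a minimal sketch, assuming `create` validates the data, stores the task record, and then dispatches `celery_task` with `task_kwargs` while reserving the resources named in `reservation_structure`. `SketchTaskSerializer`, `SketchAddRemoveSerializer`, `dispatch`, and `DISPATCHED` are hypothetical stand-ins, not pulpcore APIs, and the base class in the proof of concept may work differently.

```python
"""Framework-free sketch of a base task serializer that auto-deploys on create.

Every name here is a hypothetical stand-in; none of it is pulpcore's API.
"""

DISPATCHED = []  # records what would have been handed to the task queue


def dispatch(func, reservations, kwargs):
    """Stand-in for queuing a task: record the call instead of running it."""
    DISPATCHED.append(
        {"func": func.__name__, "reservations": reservations, "kwargs": kwargs}
    )


def add_and_remove(repository_pk, add_content_units, remove_content_units):
    """Placeholder for the core add/remove task; its body is irrelevant here."""


class SketchTaskSerializer:
    reservation_structure = []  # names of fields whose values must be reserved
    celery_task = None          # callable to deploy; set by subclasses

    def task_kwargs(self, task):
        raise NotImplementedError

    def validate(self, data):
        """Synchronous validation hook; the default accepts everything."""
        return data

    def create(self, data):
        """Validate, build the task record, then deploy the task."""
        data = self.validate(data)
        task = dict(data, state="waiting")
        reservations = [task[name] for name in self.reservation_structure]
        dispatch(self.celery_task, reservations, self.task_kwargs(task))
        return task


class SketchAddRemoveSerializer(SketchTaskSerializer):
    reservation_structure = ["repository"]
    # staticmethod keeps the plain function from being bound to the instance.
    celery_task = staticmethod(add_and_remove)

    def validate(self, data):
        overlap = set(data["add_content_units"]) & set(data["remove_content_units"])
        if overlap:
            raise ValueError("Cannot add and remove a single content unit")
        return data

    def task_kwargs(self, task):
        return {
            "repository_pk": task["repository"],
            "add_content_units": list(task["add_content_units"]),
            "remove_content_units": list(task["remove_content_units"]),
        }
```

A subclass only declares what differs (the task, its keyword arguments, the reserved fields, and optional validation), which is the point of putting the dispatch in the base class. A plugin with unusual needs would override `create` instead.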

Analysis

User Perspective

As a user, I interact with Pulp and plugins by RESTful CRUD of objects.

  • When I POST to v3/<object>, I get back a serialized <object>

As a user, I can tell Pulp to "do something" by creating a Task

  • All tasks are in the same place `v3/tasks/ ...`
  • Each Task has a separate endpoint
    • Each endpoint is autodocumented, including parameters
    • Each endpoint validates parameters
    • Each endpoint can autogenerate a binding for clients
  • I can view and filter tasks with a GET request to the same endpoint where I created the task
  • Task history includes parameters
    • As a user, I can filter "sync" tasks by "remote/importer" or "repository"
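
Because each task record carries its own parameters, filtering task history reduces to matching fields. A minimal sketch of that idea on plain dictionaries follows; `filter_tasks` and the sample records are hypothetical, and a real viewset would presumably express the same thing as a queryset filter.

```python
def filter_tasks(tasks, **params):
    """Return the task records whose fields equal every given parameter."""
    return [
        task for task in tasks
        if all(task.get(name) == value for name, value in params.items())
    ]


# Hypothetical sync history; the field names mirror the sync task parameters.
SYNC_HISTORY = [
    {"_href": "/tasks/file/syncs/1/", "repository": "/repositories/a/",
     "importer": "/importers/file/x/", "state": "completed"},
    {"_href": "/tasks/file/syncs/2/", "repository": "/repositories/b/",
     "importer": "/importers/file/x/", "state": "completed"},
    {"_href": "/tasks/file/syncs/3/", "repository": "/repositories/a/",
     "importer": "/importers/file/y/", "state": "waiting"},
]

# All syncs into repository "a", then only those that used importer "x".
syncs_into_a = filter_tasks(SYNC_HISTORY, repository="/repositories/a/")
syncs_into_a_from_x = filter_tasks(
    SYNC_HISTORY, repository="/repositories/a/", importer="/importers/file/x/"
)
```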

Plugin writer perspective

I covered a lot of this above, but I'll reiterate:

As a plugin writer, I have a single learning curve:

  • How to implement ViewSets
  • How to implement Serializers
  • How to implement Models
  • How to customize each
  • How to implement logic specific to my plugin (connected via ^ customization)

I can't stress this enough. If we make this change, plugin writing is literally just Models, ViewSets, Serializers, and custom plugin-specific logic. This design will probably add more lines of code, in exchange for conceptual simplicity.

Here's the pulp_file app.
https://github.com/asmacdo/pulp_file/tree/bf4138957aa4ac94319bba00c88e0043f8b26a03/pulp_file/app

In particular, compare the custom code to dispatch sync (this design) to the custom code to dispatch publish (old design)

Sync: https://github.com/asmacdo/pulp_file/blob/bf4138957aa4ac94319bba00c88e0043f8b26a03/pulp_file/app/serializers.py#L14-L31

Publish:
https://github.com/asmacdo/pulp_file/blob/bf4138957aa4ac94319bba00c88e0043f8b26a03/pulp_file/app/viewsets.py#L88-L123

Example user experience, sync

The user creates a repository and importer the same way they do today.

Deploy a sync task

$ http http://pulp3.dev:8000/api/v3/tasks/file/syncs/ importer=http://pulp3.dev:8000/api/v3/importers/file/7a8866a4-f6f4-4ab3-ade6-678e2328b238/ repository=http://pulp3.dev:8000/api/v3/repositories/7e91a5c3-4a96-4d27-b5f2-99ccc563eaab/
HTTP/1.0 201 Created
Allow: GET, POST, HEAD, OPTIONS
Content-Length: 412
Content-Type: application/json
Date: Fri, 23 Mar 2018 16:47:17 GMT
Location: http://pulp3.dev:8000/api/v3/tasks/file/syncs/9d78f7af-80e6-4b95-9a8c-f315eb6872cf/
Server: WSGIServer/0.2 CPython/3.6.4
Vary: Accept, Cookie
X-Frame-Options: SAMEORIGIN

{
    "_href": "http://pulp3.dev:8000/api/v3/tasks/file/syncs/9d78f7af-80e6-4b95-9a8c-f315eb6872cf/",
    "created_resources": [],
    "error": null,
    "finished_at": null,
    "importer": "http://pulp3.dev:8000/api/v3/importers/file/7a8866a4-f6f4-4ab3-ade6-678e2328b238/",
    "non_fatal_errors": [],
    "repository": "http://pulp3.dev:8000/api/v3/repositories/7e91a5c3-4a96-4d27-b5f2-99ccc563eaab/",
    "started_at": null,
    "state": "waiting",
    "worker": null
}

Retrieve Sync Task:

$ http http://pulp3.dev:8000/api/v3/tasks/file/syncs/9d78f7af-80e6-4b95-9a8c-f315eb6872cf/
HTTP/1.0 200 OK
Allow: GET, HEAD, OPTIONS
Content-Length: 624
Content-Type: application/json
Date: Fri, 23 Mar 2018 16:47:54 GMT
Server: WSGIServer/0.2 CPython/3.6.4
Vary: Accept, Cookie
X-Frame-Options: SAMEORIGIN

{
    "_href": "http://pulp3.dev:8000/api/v3/tasks/file/syncs/9d78f7af80e64b959a8cf315eb6872cf/",
    "created_resources": [
        "http://pulp3.dev:8000/api/v3/repositories/7e91a5c3-4a96-4d27-b5f2-99ccc563eaab/versions/6/"
    ],
    "error": null,
    "finished_at": "2018-03-23T16:47:18.484046Z",
    "importer": "http://pulp3.dev:8000/api/v3/importers/file/7a8866a4-f6f4-4ab3-ade6-678e2328b238/",
    "non_fatal_errors": [],
    "repository": "http://pulp3.dev:8000/api/v3/repositories/7e91a5c3-4a96-4d27-b5f2-99ccc563eaab/",
    "started_at": "2018-03-23T16:47:17.939148Z",
    "state": "completed",
    "worker": "http://pulp3.dev:8000/api/v3/workers/fd319bbb-789e-462b-b87a-83143543330a/"
}
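
The two requests above (create the task, then retrieve it) suggest a simple client loop: keep fetching the task href until its state is final. The sketch below is one possible way to do that. `wait_for_task` is a hypothetical helper, and only the "waiting" and "completed" states appear in the responses above; "failed" and "canceled" are assumed.

```python
import time

# Only "waiting" and "completed" appear in the responses above;
# "failed" and "canceled" are assumed terminal states.
FINAL_STATES = {"completed", "failed", "canceled"}


def wait_for_task(fetch, href, interval=1.0, timeout=60.0, sleep=time.sleep):
    """Poll `href` with `fetch` until the task reaches a final state.

    `fetch` is any callable that takes a URL and returns the decoded JSON
    body as a dict, so the choice of HTTP client is left to the caller.
    """
    deadline = time.monotonic() + timeout
    while True:
        task = fetch(href)
        if task["state"] in FINAL_STATES:
            return task
        if time.monotonic() >= deadline:
            raise TimeoutError(
                "task {} still {!r} after {}s".format(href, task["state"], timeout)
            )
        sleep(interval)
```

Passing `fetch` in keeps the loop testable without a server. With the standard library, it could be something like `lambda url: json.load(urllib.request.urlopen(url))`.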

Resultant REST API overview

This change would encourage consistency between plugins by providing an "easy, obvious way" to deploy and view tasks. It would encourage (but not require) the following URLs:

v3/tasks/docker/add-removes/
v3/tasks/docker/syncs/
v3/tasks/docker/publishes/
v3/tasks/file/add-removes/
v3/tasks/file/syncs/
v3/tasks/file/publishes/
v3/tasks/core/updates/
v3/tasks/core/deletes/

It is trivial to create "tiered viewsets".

v3/tasks/file/ <--------- list all file tasks (add-remove, syncs, publishes)
v3/tasks/ <----------- list all tasks, which are .cast() and serialized
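
The tiers fall out of the URL prefix. The sketch below shows the namespacing with plain strings only; `task_endpoint`, `tier`, and `ENDPOINTS` are hypothetical names for illustration and say nothing about how the viewsets are actually registered.

```python
def task_endpoint(plugin, task_type):
    """Build the namespaced task URL for a plugin and task type."""
    return "v3/tasks/{}/{}/".format(plugin, task_type)


# The endpoints listed above, built from (namespace, task type) pairs.
ENDPOINTS = [
    task_endpoint(plugin, task_type)
    for plugin, task_type in [
        ("docker", "add-removes"), ("docker", "syncs"), ("docker", "publishes"),
        ("file", "add-removes"), ("file", "syncs"), ("file", "publishes"),
        ("core", "updates"), ("core", "deletes"),
    ]
]


def tier(prefix, endpoints=ENDPOINTS):
    """Return every endpoint that a listing at `prefix` would cover."""
    return [endpoint for endpoint in endpoints if endpoint.startswith(prefix)]


file_tier = tier("v3/tasks/file/")  # the three file task endpoints
all_tiers = tier("v3/tasks/")       # every task endpoint
```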

Object CRUD

Creation of Objects will be exactly the same.

Update/Delete of objects that require reservations will be identical except that the task href that is returned will be namespaced. (This made sense to me, but it is not critical to the design)

$ http PATCH http://pulp3.dev:8000/api/v3/repositories/7e91a5c3-4a96-4d27-b5f2-99ccc563eaab/ name=r2
HTTP/1.0 202 Accepted
Allow: GET, PUT, PATCH, DELETE, HEAD, OPTIONS
Content-Length: 148
Content-Type: application/json
Date: Fri, 23 Mar 2018 16:41:02 GMT
Server: WSGIServer/0.2 CPython/3.6.4
Vary: Accept, Cookie
X-Frame-Options: SAMEORIGIN

[
    {
        "_href": "http://pulp3.dev:8000/api/v3/tasks/core/updates/824f60ff-13a1-4de6-91a7-a3f6dc23707b/",
        "task_id": "824f60ff-13a1-4de6-91a7-a3f6dc23707b"
    }
]

$ http http://pulp3.dev:8000/api/v3/tasks/core/updates/824f60ff-13a1-4de6-91a7-a3f6dc23707b/
HTTP/1.0 200 OK
Allow: GET, HEAD, OPTIONS
Content-Length: 348
Content-Type: application/json
Date: Fri, 23 Mar 2018 16:41:08 GMT
Server: WSGIServer/0.2 CPython/3.6.4
Vary: Accept, Cookie
X-Frame-Options: SAMEORIGIN

{
    "_href": "http://pulp3.dev:8000/api/v3/tasks/core/updates/824f60ff-13a1-4de6-91a7-a3f6dc23707b/",
    "created_resources": [],
    "error": null,
    "finished_at": "2018-03-23T16:41:03.452739Z",
    "non_fatal_errors": [],
    "started_at": "2018-03-23T16:41:03.408065Z",
    "state": "completed",
    "worker": "http://pulp3.dev:8000/api/v3/workers/fd319bbb-789e-462b-b87a-83143543330a/"
}
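
The 202 response to the PATCH above is a list of task references, so a client only needs the `_href` of each entry to follow up. A minimal sketch, with `task_hrefs` as a hypothetical helper and the body copied from the response shown above:

```python
import json


def task_hrefs(accepted_body):
    """Extract the task URLs from the JSON body of a 202 Accepted response."""
    return [entry["_href"] for entry in json.loads(accepted_body)]


# Body of the 202 response shown above.
ACCEPTED_BODY = """
[
    {
        "_href": "http://pulp3.dev:8000/api/v3/tasks/core/updates/824f60ff-13a1-4de6-91a7-a3f6dc23707b/",
        "task_id": "824f60ff-13a1-4de6-91a7-a3f6dc23707b"
    }
]
"""

hrefs = task_hrefs(ACCEPTED_BODY)
```

Each returned URL can then be retrieved with a GET, as in the second request above.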

To make sure this was feasible, I've created proof-of-concept PRs for pulpcore and pulp_file. All of the proposed changes have been shown to work, but not all of the work is done. For example, sync is implemented, but publish is not.
https://github.com/pulp/pulp/pull/3394
https://github.com/pulp/pulp_file/pull/61


Related issues

Related to Pulp - Issue #3541: Core should not add/remove content to a repository or create a repository_version without plugin input (CLOSED - CURRENTRELEASE, bmbouter)
