Task #3522
Updated by amacdona@redhat.com over 6 years ago
h2. End to end design. When the design reaches consensus, sub-tasks for core and plugins should be created. h2. Motivations: h3. Plugin involvement in add/remove Component 1. Make Tasks Master/Detail. The original motivation * Change inheritance of Tasks * Update TaskSerializer for master/detail * override TaskSerializer.create to deploy tasks * update apply_async_with_reservation to accept Tasks (the django model kind, "task_status" in this design change was a problem: code) * Update OperationPostponed to work with Core cannot add/remove content units without plugin involvement. We can solve this problem by removing the core endpoint Tasks h3. Component 2. Tasks relating to create versions, and allow the plugin writers complete control. This approach is conceptually simple, and is used in this plan. There are Update/Delete of other ideas as well (hooks) but they require a much fuller discussion of objects Some objects cause the complexity deployment of tasks for Update and Delete. (Importer, Publisher, Repository). From the issue than user's perspective, this approach. To keep this issue clear, that discussion should will be had on almost unchanged, except the issue describing href for the problem: https://pulp.plan.io/issues/3541 tasks will change: h3. Design Pattern Consistency Old: (Updates and Deletes) v3/tasks/1234 New: v3/tasks/core/updates/1234 v3/tasks/core/deletes/1234 Pulp 3 is built on a simple pattern. Django is the base. It provides a *framework* for development, but does not provide behavior. Django Rest Framework is the next level. Built on Django, it fills out the *framework* but again does not provide behavior. Pulpcore is the next level. Built on DRF, it provides a *framework* for plugin development. It also implements the basic objects that are used identically between all plugins. In this plan there are 2 objects provided by core: Repository This change will be in pulpcore, and Artifact. These objects are also managed by core, and do will not need plugin involvement. All other objects are the responsibility of the plugin. Pulpcore simplifies the plugin responsibilities by providing abstract base classes. affect plugins: A generic simple plugin needs to: * create detail task for Update 1) Implement a ContentUnit (Model, ViewSet, Serializer) * create detail task for Delete 2) Implement a Remote (Model, ViewSet, Serializer) * update viewset mixin for AsyncUpdate 3) Implement a Publisher (Model, ViewSet, Serializer) 4) Implement an AddRemoveTask (Model, ViewSet, Serializer, optional function/task) 5) Implement a SyncTask (Model, ViewSet, Serializer, function/task) 6) Implement a PublishTask (Model, ViewSet, Serializer, function/task) * update viewset mixin for AsyncDelete By following this pattern for **the whole plugin API**, we can make Allow the following guarantees direct creation of Master/Detail Tasks that are connected to the plugin writers: 1) If your problem domain is simple and standard, most of celery tasking system, replacing the work is cheap. Just implement your classes, Pulp will handle the rest. 2) If your problem domain is more complex, you can override parts as necessary. The concerns are well separated, so most validation can be done with minimal overrides. Tools for customization are identical for all objects, and they are **well documented by DRF**. 3) If your problem domain is eccentric, Pulp will stay out previous pattern of your way. You are in control from the request until the return, if you need to be. h2. Example add/remove with pulp_file "action endpoints". Create a AddRemoveTask(Model, ViewSet, Serializer) Old: POST /v3/importers/file/1234/sync/ repository=repohref New POST v3/tasks/file/syncs/ importer=importerhref repository=repositoryhref The ViewSet is just like all other ViewSets. Please be polite, namespace the endpoint by plugin. h3. Component 3: Plugins <pre><code class="python"> * Create models for Detail Tasks class FileAddRemoveTaskViewSet(TaskViewSet, mixins.CreateModelMixin): endpoint_name = 'file/add-remove' queryset = FileAddRemoveTask.objects.all() model = FileAddRemoveTask serializer_class = FileAddRemoveTaskSerializer </code></pre> The Model is just like other Models. The fields represent the parameters * Create serializers for the function/task. <pre><code class="python"> Detail Tasks ** 3 new class FileAddRemoveTask(FileTask): TYPE = 'add-remove' repository = models.ForeignKey(Repository) add_content_units = models.ManyToManyField(FileContent, related_name="added") remove_content_units = models.ManyToManyField(FileContent, related_name="removed") attrs: </code></pre> Serializers for tasks are also the same as all other serializers. I took the liberty of overriding create in the base TaskSerializer to auto-deploy tasks. This makes the plugin classes simple. This behavior is easily overridden by defining your own `create`. In this case, the file plugin is deploying the add/remove task from pulpcore. They could implement their own task if the core task doesn't suit their needs. <pre><code class="python"> *** reservation_structure class FileAddRemoveTaskSerializer(TaskSerializer): repository = serializers.HyperlinkedRelatedField( view_name='repositories-detail', queryset=Repository.objects.all(), ) add_content_units = DetailRelatedField( queryset=FileContent.objects.all(), many=True, ) remove_content_units = DetailRelatedField( queryset=FileContent.objects.all(), many=True, ) reservation_structure = ["repository"] # If there is custom logic related *** task_kwarg_structure *** celery_task ** new fields corresponding to dependencies, validation, etc, the plugin could create # their own task rather than using the general add/remove from pulpcore. celery_task = core_tasks.add_and_remove @property def task_kwargs(self): add_pks = [content_unit.pk arguments for content_unit in self.task.add_content_units.all()] rm_pks = [content_unit.pk for content_unit in self.task.remove_content_units.all()] return {'repository_pk': self.task.repository.pk, 'add_content_units': add_pks, 'remove_content_units': rm_pks} # def validate(self, data): # """ # OPTIONAL! # Here, the plugin writer can provide **synchronous** validation. The plugin writer also has # the opporunity to alter/clean the data. # # Warning: The content in a repository could change between request time and celery task time. # """ # * Create viewsets for content_unit in data['add_content_units']: # if content_unit in data['remove_content_units']: # raise serializers.ValidationError("Cannot add and Detail Tasks * remove a single content unit") # return data class Meta: model = FileAddRemoveTask fields = TaskSerializer.Meta.fields + ("add_content_units", "remove_content_units", "repository") </code></pre> "action" endpoints Sync is exactly the same: ViewSet: https://github.com/asmacdo/pulp_file/blob/bf4138957aa4ac94319bba00c88e0043f8b26a03/pulp_file/app/viewsets.py#L33-L38 Model: https://github.com/asmacdo/pulp_file/blob/bf4138957aa4ac94319bba00c88e0043f8b26a03/pulp_file/app/models.py#L67-L71 Serializer: https://github.com/asmacdo/pulp_file/blob/bf4138957aa4ac94319bba00c88e0043f8b26a03/pulp_file/app/serializers.py#L14-L31 function/task (no change): https://github.com/asmacdo/pulp_file/blob/bf4138957aa4ac94319bba00c88e0043f8b26a03/pulp_file/app/tasks/synchronizing.py h2. Analysis Motivations h3. User Perspective Some discussion on this: https://www.redhat.com/archives/pulp-dev/2018-March/msg00066.html As a user, I * Plugin code is simplified. (Plugin writers no longer need to deploy tasks. Detail Task classes require new attributes (which are easily explained) but do not need to interact with Pulp and plugins by RESTful CRUD of objects. the tasking system.) * When I POST to v3/<object>, I get back a serialized <object> As a user, I can tell Pulp to "do something" by creating a Task * All tasks parameters are in specified by the same place `v3/tasks/ ...` detail Task model, so validation is handled by the Detail Task serializer (href validation is free!) * Each Task has a separate endpoint history becomes more valuable. ** Each endpoint is autodocumented, including filter tasks based on task parameters like repository or importer ** Each endpoint validates parameters filter tasks based on created_resource, and see which importer was used ** Each endpoint can autogenerate a binding for clients Easy to repeat tasks? * I can view Tasks will be viewable by plugin, and filter tasks task type. * Replace "action endpoints" with a GET request standard DRF object CRUD Summary: The plugin writers will write less complex code, which is replaced by normal DRF objects (which are very similar to all the **same endpoint I created other work of writing plugins). This has a few advantages which are summarized under the task** * motivation section. # Task history includes parameters ** As are specified by the Detail Task model. ### Tasks are typed, so you can see whether a user, I can filter "sync" tasks by "remote/importer" version was created with a SyncTask or "repository" an AddRemoveTask h2. User Facing Comparison h3. Plugin writer perspective Object CRUD I covered a lot Creation of this above, but I'll reiterate: Objects will be exactly the same. As a plugin writer, I have a single learning curve: * How to implement ViewSets * How to implement Serializers * How to implement Models * How to customize each * How to implement logic specific to my plugin (connected via ^ customization) Update/Delete of objects that require reservations will be identical except that the task href that is returned will be namespaced. I can't stress One possible downside of this enough. If we make pattern is that "actions" are deployed to Pulp by creating Tasks, but Update/Delete tasks are side effects. Though this change, plugin writing could be confusing at first, I think it makes sense. When deploying an "action", the task creation is **literally** just Models, ViewSets, Serializers, primary and custom plugin-specific logic. This design will probably add more lines of code, in exchange for conceptual simplicity. Here's the pulp_file app. https://github.com/asmacdo/pulp_file/tree/bf4138957aa4ac94319bba00c88e0043f8b26a03/pulp_file/app In particular, compare created_resources is secondary. When deploying and "update" or "delete", the custom code to dispatch sync (this design) to object being affected is primary, and the custom code task to dispatch publish (old design) Sync: https://github.com/asmacdo/pulp_file/blob/bf4138957aa4ac94319bba00c88e0043f8b26a03/pulp_file/app/serializers.py#L14-L31 Publish: https://github.com/asmacdo/pulp_file/blob/bf4138957aa4ac94319bba00c88e0043f8b26a03/pulp_file/app/viewsets.py#L88-L123 h2. Example user experience, sync The user creates a repository and importer the same way they do today. Deploy a sync task it is secondary. <pre> $ http http://pulp3.dev:8000/api/v3/tasks/file/syncs/ importer=http://pulp3.dev:8000/api/v3/importers/file/7a8866a4-f6f4-4ab3-ade6-678e2328b238/ repository=http://pulp3.dev:8000/api/v3/repositories/7e91a5c3-4a96-4d27-b5f2-99ccc563eaab/ PATCH http://pulp3.dev:8000/api/v3/repositories/7e91a5c3-4a96-4d27-b5f2-99ccc563eaab/ name=r2 HTTP/1.0 201 Created 202 Accepted Allow: GET, POST, PUT, PATCH, DELETE, HEAD, OPTIONS Content-Length: 412 148 Content-Type: application/json Date: Fri, 23 Mar 2018 16:47:17 16:41:02 GMT Location: http://pulp3.dev:8000/api/v3/tasks/file/syncs/9d78f7af-80e6-4b95-9a8c-f315eb6872cf/ Server: WSGIServer/0.2 CPython/3.6.4 Vary: Accept, Cookie X-Frame-Options: SAMEORIGIN [ { "_href": "http://pulp3.dev:8000/api/v3/tasks/file/syncs/9d78f7af-80e6-4b95-9a8c-f315eb6872cf/", "http://pulp3.dev:8000/api/v3/tasks/core/updates/824f60ff-13a1-4de6-91a7-a3f6dc23707b/", "task_id": "824f60ff-13a1-4de6-91a7-a3f6dc23707b" "created_resources": [], "error": null, "finished_at": null, "importer": "http://pulp3.dev:8000/api/v3/importers/file/7a8866a4-f6f4-4ab3-ade6-678e2328b238/", "non_fatal_errors": [], "repository": "http://pulp3.dev:8000/api/v3/repositories/7e91a5c3-4a96-4d27-b5f2-99ccc563eaab/", "started_at": null, "state": "waiting", "worker": null } </pre> ] Retrieve Sync Task: <pre> ~ ❯ $ http http://pulp3.dev:8000/api/v3/tasks/file/syncs/9d78f7af-80e6-4b95-9a8c-f315eb6872cf/ http://pulp3.dev:8000/api/v3/tasks/core/updates/824f60ff-13a1-4de6-91a7-a3f6dc23707b/ HTTP/1.0 200 OK Allow: GET, HEAD, OPTIONS Content-Length: 624 348 Content-Type: application/json Date: Fri, 23 Mar 2018 16:47:54 16:41:08 GMT Server: WSGIServer/0.2 CPython/3.6.4 Vary: Accept, Cookie X-Frame-Options: SAMEORIGIN { "_href": "http://pulp3.dev:8000/api/v3/tasks/file/syncs/9d78f7af80e64b959a8cf315eb6872cf/", "http://pulp3.dev:8000/api/v3/tasks/core/updates/824f60ff-13a1-4de6-91a7-a3f6dc23707b/", "created_resources": [ "http://pulp3.dev:8000/api/v3/repositories/7e91a5c3-4a96-4d27-b5f2-99ccc563eaab/versions/6/" [], ], "error": null, "finished_at": "2018-03-23T16:47:18.484046Z", "2018-03-23T16:41:03.452739Z", "importer": "http://pulp3.dev:8000/api/v3/importers/file/7a8866a4-f6f4-4ab3-ade6-678e2328b238/", "non_fatal_errors": [], "repository": "http://pulp3.dev:8000/api/v3/repositories/7e91a5c3-4a96-4d27-b5f2-99ccc563eaab/", "started_at": "2018-03-23T16:47:17.939148Z", "2018-03-23T16:41:03.408065Z", "state": "completed", "worker": "http://pulp3.dev:8000/api/v3/workers/fd319bbb-789e-462b-b87a-83143543330a/" } </pre> </pre> h2. Resultant REST API overview This change would enocurage consistency between plugins by making an "easy obvious way" to deploy and view tasks. This would encourage (but not require) the following urls v3/tasks/docker/add-removes/ v3/tasks/docker/syncs/ v3/tasks/docker/publishes/ v3/tasks/file/add-removes/ v3/tasks/file/syncs/ v3/tasks/file/publishes/ v3/tasks/core/updates/ v3/tasks/core/deletes/ It is trivial to create "tiered viewsets". v3/tasks/file/ <--------- list all file tasks (add-remove, syncs, publishes) v3/tasks/ <----------- list all tasks, which are .cast() and serialized h3. Object CRUD Creation of Objects This will be exactly the same. Update/Delete of objects much more obvious than how we are doing it now, because tasks are spread out. Sync has a place now, because a sync task is associated with an importer (v3/importers/file/1234/sync/). But what about plugins that require reservations will be identical except want to allow multiple importers in a single sync? And what about tasks that don't use a plugin-defined object (like an importer)? For example, rich dependencies don't need an importer, so the task href that plugin writer is returned will be namespaced. (This made sense on their own to me, but it is not critical to the design) define everything, including dispatching their celery tasks and choosing and endpoint. h3. Sync Task example Deploy a sync task <pre> $ http PATCH http://pulp3.dev:8000/api/v3/repositories/7e91a5c3-4a96-4d27-b5f2-99ccc563eaab/ name=r2 http://pulp3.dev:8000/api/v3/tasks/file/syncs/ importer=http://pulp3.dev:8000/api/v3/importers/file/7a8866a4-f6f4-4ab3-ade6-678e2328b238/ repository=http://pulp3.dev:8000/api/v3/repositories/7e91a5c3-4a96-4d27-b5f2-99ccc563eaab/ HTTP/1.0 202 Accepted 201 Created Allow: GET, PUT, PATCH, DELETE, POST, HEAD, OPTIONS Content-Length: 148 412 Content-Type: application/json Date: Fri, 23 Mar 2018 16:41:02 16:47:17 GMT Location: http://pulp3.dev:8000/api/v3/tasks/file/syncs/9d78f7af-80e6-4b95-9a8c-f315eb6872cf/ Server: WSGIServer/0.2 CPython/3.6.4 Vary: Accept, Cookie X-Frame-Options: SAMEORIGIN [ { { "_href": "http://pulp3.dev:8000/api/v3/tasks/core/updates/824f60ff-13a1-4de6-91a7-a3f6dc23707b/", "task_id": "824f60ff-13a1-4de6-91a7-a3f6dc23707b" "http://pulp3.dev:8000/api/v3/tasks/file/syncs/9d78f7af-80e6-4b95-9a8c-f315eb6872cf/", "created_resources": [], "error": null, "finished_at": null, "importer": "http://pulp3.dev:8000/api/v3/importers/file/7a8866a4-f6f4-4ab3-ade6-678e2328b238/", "non_fatal_errors": [], "repository": "http://pulp3.dev:8000/api/v3/repositories/7e91a5c3-4a96-4d27-b5f2-99ccc563eaab/", "started_at": null, "state": "waiting", "worker": null } ] $ </pre> Retrieve Sync Task: <pre> ~ ❯ http http://pulp3.dev:8000/api/v3/tasks/core/updates/824f60ff-13a1-4de6-91a7-a3f6dc23707b/ http://pulp3.dev:8000/api/v3/tasks/file/syncs/9d78f7af-80e6-4b95-9a8c-f315eb6872cf/ HTTP/1.0 200 OK Allow: GET, HEAD, OPTIONS Content-Length: 348 624 Content-Type: application/json Date: Fri, 23 Mar 2018 16:41:08 16:47:54 GMT Server: WSGIServer/0.2 CPython/3.6.4 Vary: Accept, Cookie X-Frame-Options: SAMEORIGIN { "_href": "http://pulp3.dev:8000/api/v3/tasks/core/updates/824f60ff-13a1-4de6-91a7-a3f6dc23707b/", "http://pulp3.dev:8000/api/v3/tasks/file/syncs/9d78f7af80e64b959a8cf315eb6872cf/", "created_resources": [], [ "http://pulp3.dev:8000/api/v3/repositories/7e91a5c3-4a96-4d27-b5f2-99ccc563eaab/versions/6/" ], "error": null, "finished_at": "2018-03-23T16:41:03.452739Z", "2018-03-23T16:47:18.484046Z", "importer": "http://pulp3.dev:8000/api/v3/importers/file/7a8866a4-f6f4-4ab3-ade6-678e2328b238/", "non_fatal_errors": [], "repository": "http://pulp3.dev:8000/api/v3/repositories/7e91a5c3-4a96-4d27-b5f2-99ccc563eaab/", "started_at": "2018-03-23T16:41:03.408065Z", "2018-03-23T16:47:17.939148Z", "state": "completed", "worker": "http://pulp3.dev:8000/api/v3/workers/fd319bbb-789e-462b-b87a-83143543330a/" } </pre> To make sure this was feasible, feasable, I've created proof of concept PRs for pulpcore and pulp_file. All of the proposed changes are proofed, but not all work is done. For example, sync is implemented, but not publish. https://github.com/pulp/pulp/pull/3394 https://github.com/pulp/pulp_file/pull/61