Project

Profile

Help

Task #3522

Updated by amacdona@redhat.com about 6 years ago

h2. End to end design. 

 When the design reaches consensus, sub-tasks for core and plugins should be created. 

 h2. Motivations: 


 h3. Plugin involvement in add/remove Component 1. Make Tasks Master/Detail. 

 The original motivation * Change inheritance of Tasks 
 * Update TaskSerializer for master/detail 
 * override TaskSerializer.create to deploy tasks 
 * update apply_async_with_reservation to accept Tasks (the django model kind, "task_status" in this design change was a problem: code)  
 * Update OperationPostponed to work with Core cannot add/remove content units without plugin involvement.  

 We can solve this problem by removing the core endpoint Tasks 

 h3. Component 2. Tasks relating to create versions, and allow the plugin writers complete control. This approach is conceptually simple, and is used in this plan. There are Update/Delete of other ideas as well (hooks) but they require a much fuller discussion of objects 
 Some objects cause the complexity deployment of tasks for Update and Delete. (Importer, Publisher, Repository). From the issue than user's perspective, this approach. To keep this issue clear, that discussion should will be had on almost unchanged, except the issue describing href for the problem: https://pulp.plan.io/issues/3541 tasks will change: 

 h3. Design Pattern Consistency Old: (Updates and Deletes) v3/tasks/1234 
 New: v3/tasks/core/updates/1234 
          v3/tasks/core/deletes/1234 

 Pulp 3 is built on a simple pattern.  

 Django is the base. It provides a *framework* for development, but does not provide behavior. 
 Django Rest Framework is the next level. Built on Django, it fills out the *framework* but again does not provide behavior. 
 Pulpcore is the next level. Built on DRF, it provides a *framework* for plugin development. It also implements the basic objects that are used identically between all plugins. In this plan there are 2 objects provided by core: Repository This change will be in pulpcore, and Artifact. These objects are also managed by core, and do will not need plugin involvement. All other objects are the responsibility of the plugin. Pulpcore simplifies the plugin responsibilities by providing abstract base classes. affect plugins: 

 A generic simple plugin needs to: * create detail task for Update 
 1) Implement a ContentUnit (Model, ViewSet, Serializer) * create detail task for Delete 
 2) Implement a Remote (Model, ViewSet, Serializer) * update viewset mixin for AsyncUpdate 
 3) Implement a Publisher (Model, ViewSet, Serializer) 
 4) Implement an AddRemoveTask (Model, ViewSet, Serializer, optional function/task) 
 5) Implement a SyncTask (Model, ViewSet, Serializer, function/task) 
 6) Implement a PublishTask (Model, ViewSet, Serializer, function/task) * update viewset mixin for AsyncDelete 

 By following this pattern for **the whole plugin API**, we can make Allow the following guarantees direct creation of Master/Detail Tasks that are connected to the plugin writers: 
 1) If your problem domain is simple and standard, most of celery tasking system, replacing the work is cheap. Just implement your classes, Pulp will handle the rest. 
 2) If your problem domain is more complex, you can override parts as necessary. The concerns are well separated, so most validation can be done with minimal overrides. Tools for customization are identical for all objects, and they are **well documented by DRF**. 
 3) If your problem domain is eccentric, Pulp will stay out previous pattern of your way. You are in control from the request until the return, if you need to be.  

 h2. Example add/remove with pulp_file "action endpoints". 

 Create a AddRemoveTask(Model, ViewSet, Serializer) Old: POST /v3/importers/file/1234/sync/ repository=repohref 
 New POST v3/tasks/file/syncs/ importer=importerhref repository=repositoryhref 

 The ViewSet is just like all other ViewSets. Please be polite, namespace the endpoint by plugin. h3. Component 3: Plugins 

 <pre><code class="python"> * Create models for Detail Tasks 
 class FileAddRemoveTaskViewSet(TaskViewSet, mixins.CreateModelMixin): 


     endpoint_name = 'file/add-remove' 
     queryset = FileAddRemoveTask.objects.all() 
     model = FileAddRemoveTask 
     serializer_class = FileAddRemoveTaskSerializer 
 </code></pre> 

 The Model is just like other Models. The fields represent the parameters * Create serializers for the function/task. 

 <pre><code class="python"> Detail Tasks 
 ** 3 new class FileAddRemoveTask(FileTask): 


     TYPE = 'add-remove' 
     repository = models.ForeignKey(Repository) 
     add_content_units = models.ManyToManyField(FileContent, related_name="added") 
     remove_content_units = models.ManyToManyField(FileContent, related_name="removed") attrs: 
 </code></pre> 

 Serializers for tasks are also the same as all other serializers. I took the liberty of overriding create in the base TaskSerializer to auto-deploy tasks. This makes the plugin classes simple. This behavior is easily overridden by defining your own `create`.    In this case, the file plugin is deploying the add/remove task from pulpcore. They could implement their own task if the core task doesn't suit their needs. 

 <pre><code class="python"> *** reservation_structure 
 class FileAddRemoveTaskSerializer(TaskSerializer): 


     repository = serializers.HyperlinkedRelatedField( 
         view_name='repositories-detail', 
         queryset=Repository.objects.all(), 
     ) 


     add_content_units = DetailRelatedField( 
         queryset=FileContent.objects.all(), 
         many=True, 
     ) 
     remove_content_units = DetailRelatedField( 
         queryset=FileContent.objects.all(), 
         many=True, 
     ) 


     reservation_structure = ["repository"] 


     # If there is custom logic related *** task_kwarg_structure 
 *** celery_task  
 ** new fields corresponding to dependencies, validation, etc, the plugin could create 
     # their own task rather than using the general add/remove from pulpcore. 
     celery_task = core_tasks.add_and_remove 


     @property 
     def task_kwargs(self): 
         add_pks = [content_unit.pk arguments for content_unit in self.task.add_content_units.all()] 
         rm_pks = [content_unit.pk for content_unit in self.task.remove_content_units.all()] 
         return {'repository_pk': self.task.repository.pk, 
                 'add_content_units': add_pks, 
                 'remove_content_units': rm_pks} 


     # def validate(self, data): 
     #       """ 
     #       OPTIONAL! 
     #       Here, the plugin writer can provide **synchronous** validation. The plugin writer also has 
     #       the opporunity to alter/clean the data. 
     # 
     #       Warning: The content in a repository could change between request time and celery task time. 
     #       """ 
     #       
 * Create viewsets for content_unit in data['add_content_units']: 
     #           if content_unit in data['remove_content_units']: 
     #               raise serializers.ValidationError("Cannot add and Detail Tasks 
 * remove a single content unit") 
     #       return data 


     class Meta: 
         model = FileAddRemoveTask 
         fields = TaskSerializer.Meta.fields + ("add_content_units", "remove_content_units", 
                                                "repository") 
 </code></pre> "action" endpoints 

 Sync is exactly the same: 
 ViewSet: https://github.com/asmacdo/pulp_file/blob/bf4138957aa4ac94319bba00c88e0043f8b26a03/pulp_file/app/viewsets.py#L33-L38 
 Model: https://github.com/asmacdo/pulp_file/blob/bf4138957aa4ac94319bba00c88e0043f8b26a03/pulp_file/app/models.py#L67-L71 
 Serializer: https://github.com/asmacdo/pulp_file/blob/bf4138957aa4ac94319bba00c88e0043f8b26a03/pulp_file/app/serializers.py#L14-L31 
 function/task (no change): https://github.com/asmacdo/pulp_file/blob/bf4138957aa4ac94319bba00c88e0043f8b26a03/pulp_file/app/tasks/synchronizing.py 


 h2. Analysis Motivations 

 h3. User Perspective Some discussion on this: https://www.redhat.com/archives/pulp-dev/2018-March/msg00066.html 

 As a user, I * Plugin code is simplified. (Plugin writers no longer need to deploy tasks. Detail Task classes    require new attributes (which are easily explained) but do not need to interact with Pulp and plugins by RESTful CRUD of objects. the tasking system.) 
 * When I POST to v3/<object>, I get back a serialized <object> 

 As a user, I can tell Pulp to "do something" by creating a Task 
 * All tasks parameters are in specified by the same place `v3/tasks/ ...` detail Task model, so validation is handled by the Detail Task serializer (href validation is free!) 
 * Each Task has a separate endpoint 
 history becomes more valuable.  
 ** Each endpoint is autodocumented, including filter tasks based on task parameters like repository or importer 
 ** Each endpoint validates parameters filter tasks based on created_resource, and see which importer was used 
 ** Each endpoint can autogenerate a binding for clients Easy to repeat tasks? 
 * I can view Tasks will be viewable by plugin, and filter tasks task type. 
 * Replace "action endpoints" with a GET request standard DRF object CRUD 

 Summary: The plugin writers will write less complex code, which is replaced by normal DRF objects (which are very similar to all the **same endpoint I created other work of writing plugins). 


 This has a few advantages which are summarized under the task** 
 * motivation section. 

 # Task history includes parameters 
 ** As are specified by the Detail Task model. 

 ### Tasks are typed, so you can see whether a user, I can filter "sync" tasks by "remote/importer" version was created with a SyncTask or "repository" an AddRemoveTask 

 h2. User Facing Comparison 

 h3. Plugin writer perspective Object CRUD 

 I covered a lot Creation of this above, but I'll reiterate: Objects will be exactly the same. 

 As a plugin writer, I have a single learning curve: 
 * How to implement ViewSets 
 * How to implement Serializers 
 * How to implement Models 
 * How to customize each 
 * How to implement logic specific to my plugin (connected via ^ customization) Update/Delete of objects that require reservations will be identical except that the task href that is returned will be namespaced. 

 I can't stress One possible downside of this enough. If we make pattern is that "actions" are deployed to Pulp by creating Tasks, but Update/Delete tasks are side effects. Though this change, plugin writing could be confusing at first, I think it makes sense. When deploying an "action", the task creation is **literally** just Models, ViewSets, Serializers, primary and custom plugin-specific logic. This design will probably add more lines of code, in exchange for conceptual simplicity. 

 Here's the pulp_file app. 
 https://github.com/asmacdo/pulp_file/tree/bf4138957aa4ac94319bba00c88e0043f8b26a03/pulp_file/app 

 In particular, compare created_resources is secondary. When deploying and "update" or "delete", the custom code to dispatch sync (this design) to object being affected is primary, and the custom code task to dispatch publish (old design) 

 Sync: https://github.com/asmacdo/pulp_file/blob/bf4138957aa4ac94319bba00c88e0043f8b26a03/pulp_file/app/serializers.py#L14-L31 

 Publish:  
 https://github.com/asmacdo/pulp_file/blob/bf4138957aa4ac94319bba00c88e0043f8b26a03/pulp_file/app/viewsets.py#L88-L123 

 h2. Example user experience, sync 

 The user creates a repository and importer the same way they do today. 

 Deploy a sync task 

 it is secondary.  

 <pre> 
  
 $ http http://pulp3.dev:8000/api/v3/tasks/file/syncs/ importer=http://pulp3.dev:8000/api/v3/importers/file/7a8866a4-f6f4-4ab3-ade6-678e2328b238/ repository=http://pulp3.dev:8000/api/v3/repositories/7e91a5c3-4a96-4d27-b5f2-99ccc563eaab/ PATCH http://pulp3.dev:8000/api/v3/repositories/7e91a5c3-4a96-4d27-b5f2-99ccc563eaab/ name=r2 
 HTTP/1.0 201 Created 202 Accepted 
 Allow: GET, POST, PUT, PATCH, DELETE, HEAD, OPTIONS 
 Content-Length: 412 148 
 Content-Type: application/json 
 Date: Fri, 23 Mar 2018 16:47:17 16:41:02 GMT 
 Location: http://pulp3.dev:8000/api/v3/tasks/file/syncs/9d78f7af-80e6-4b95-9a8c-f315eb6872cf/ 
 Server: WSGIServer/0.2 CPython/3.6.4 
 Vary: Accept, Cookie 
 X-Frame-Options: SAMEORIGIN 

 [ 
     { 
     
         "_href": "http://pulp3.dev:8000/api/v3/tasks/file/syncs/9d78f7af-80e6-4b95-9a8c-f315eb6872cf/", "http://pulp3.dev:8000/api/v3/tasks/core/updates/824f60ff-13a1-4de6-91a7-a3f6dc23707b/", 
         "task_id": "824f60ff-13a1-4de6-91a7-a3f6dc23707b" 
     "created_resources": [], 
     "error": null, 
     "finished_at": null, 
     "importer": "http://pulp3.dev:8000/api/v3/importers/file/7a8866a4-f6f4-4ab3-ade6-678e2328b238/", 
     "non_fatal_errors": [], 
     "repository": "http://pulp3.dev:8000/api/v3/repositories/7e91a5c3-4a96-4d27-b5f2-99ccc563eaab/", 
     "started_at": null, 
     "state": "waiting", 
     "worker": null 
 } 
 </pre> ] 

 Retrieve Sync Task: 

 <pre> 
 ~ ❯ $ http http://pulp3.dev:8000/api/v3/tasks/file/syncs/9d78f7af-80e6-4b95-9a8c-f315eb6872cf/ http://pulp3.dev:8000/api/v3/tasks/core/updates/824f60ff-13a1-4de6-91a7-a3f6dc23707b/ 
 HTTP/1.0 200 OK 
 Allow: GET, HEAD, OPTIONS 
 Content-Length: 624 348 
 Content-Type: application/json 
 Date: Fri, 23 Mar 2018 16:47:54 16:41:08 GMT 
 Server: WSGIServer/0.2 CPython/3.6.4 
 Vary: Accept, Cookie 
 X-Frame-Options: SAMEORIGIN 

 { 
     "_href": "http://pulp3.dev:8000/api/v3/tasks/file/syncs/9d78f7af80e64b959a8cf315eb6872cf/", "http://pulp3.dev:8000/api/v3/tasks/core/updates/824f60ff-13a1-4de6-91a7-a3f6dc23707b/", 
     "created_resources": [ 
         "http://pulp3.dev:8000/api/v3/repositories/7e91a5c3-4a96-4d27-b5f2-99ccc563eaab/versions/6/" [], 
     ], 
     "error": null, 
     "finished_at": "2018-03-23T16:47:18.484046Z", "2018-03-23T16:41:03.452739Z", 
     "importer": "http://pulp3.dev:8000/api/v3/importers/file/7a8866a4-f6f4-4ab3-ade6-678e2328b238/", 
     "non_fatal_errors": [], 
     "repository": "http://pulp3.dev:8000/api/v3/repositories/7e91a5c3-4a96-4d27-b5f2-99ccc563eaab/", 
     "started_at": "2018-03-23T16:47:17.939148Z", "2018-03-23T16:41:03.408065Z", 
     "state": "completed", 
     "worker": "http://pulp3.dev:8000/api/v3/workers/fd319bbb-789e-462b-b87a-83143543330a/" 
 } 
 </pre> 

 </pre> 


 h2. Resultant REST API overview 

 This change would enocurage consistency between plugins by making an "easy obvious way" to deploy and view tasks. This would encourage (but not require) the following urls 

 v3/tasks/docker/add-removes/ 
 v3/tasks/docker/syncs/ 
 v3/tasks/docker/publishes/ 
 v3/tasks/file/add-removes/ 
 v3/tasks/file/syncs/ 
 v3/tasks/file/publishes/ 
 v3/tasks/core/updates/ 
 v3/tasks/core/deletes/ 

 It is trivial to create "tiered viewsets". 

 v3/tasks/file/    <---------    list all file tasks (add-remove, syncs, publishes) 
 v3/tasks/        <-----------    list all tasks, which are .cast() and serialized 


 h3. Object CRUD 

 Creation of Objects This will be exactly the same. 

 Update/Delete of objects much more obvious than how we are doing it now, because tasks are spread out. Sync has a place now, because a sync task is associated with an importer (v3/importers/file/1234/sync/). But what about plugins that require reservations will be identical except want to allow multiple importers in a single sync? And what about tasks that don't use a plugin-defined object (like an importer)?  

 For example, rich dependencies don't need an importer, so the task href that plugin writer is returned will be namespaced. (This made sense on their own to me, but it is not critical to the design) define everything, including dispatching their celery tasks and choosing and endpoint. 


 h3. Sync Task example 

 Deploy a sync task 
 <pre> 
 $ 
  http PATCH http://pulp3.dev:8000/api/v3/repositories/7e91a5c3-4a96-4d27-b5f2-99ccc563eaab/ name=r2 http://pulp3.dev:8000/api/v3/tasks/file/syncs/ importer=http://pulp3.dev:8000/api/v3/importers/file/7a8866a4-f6f4-4ab3-ade6-678e2328b238/ repository=http://pulp3.dev:8000/api/v3/repositories/7e91a5c3-4a96-4d27-b5f2-99ccc563eaab/ 
 HTTP/1.0 202 Accepted 201 Created 
 Allow: GET, PUT, PATCH, DELETE, POST, HEAD, OPTIONS 
 Content-Length: 148 412 
 Content-Type: application/json 
 Date: Fri, 23 Mar 2018 16:41:02 16:47:17 GMT 
 Location: http://pulp3.dev:8000/api/v3/tasks/file/syncs/9d78f7af-80e6-4b95-9a8c-f315eb6872cf/ 
 Server: WSGIServer/0.2 CPython/3.6.4 
 Vary: Accept, Cookie 
 X-Frame-Options: SAMEORIGIN 

 [ { 
     { 
         "_href": "http://pulp3.dev:8000/api/v3/tasks/core/updates/824f60ff-13a1-4de6-91a7-a3f6dc23707b/", 
         "task_id": "824f60ff-13a1-4de6-91a7-a3f6dc23707b" "http://pulp3.dev:8000/api/v3/tasks/file/syncs/9d78f7af-80e6-4b95-9a8c-f315eb6872cf/", 
     "created_resources": [], 
     "error": null, 
     "finished_at": null, 
     "importer": "http://pulp3.dev:8000/api/v3/importers/file/7a8866a4-f6f4-4ab3-ade6-678e2328b238/", 
     "non_fatal_errors": [], 
     "repository": "http://pulp3.dev:8000/api/v3/repositories/7e91a5c3-4a96-4d27-b5f2-99ccc563eaab/", 
     "started_at": null, 
     "state": "waiting", 
     "worker": null 
 } 
 ] 

 $ </pre> 
 Retrieve Sync Task: 
 <pre> 
 ~ ❯ http http://pulp3.dev:8000/api/v3/tasks/core/updates/824f60ff-13a1-4de6-91a7-a3f6dc23707b/ http://pulp3.dev:8000/api/v3/tasks/file/syncs/9d78f7af-80e6-4b95-9a8c-f315eb6872cf/ 
 HTTP/1.0 200 OK 
 Allow: GET, HEAD, OPTIONS 
 Content-Length: 348 624 
 Content-Type: application/json 
 Date: Fri, 23 Mar 2018 16:41:08 16:47:54 GMT 
 Server: WSGIServer/0.2 CPython/3.6.4 
 Vary: Accept, Cookie 
 X-Frame-Options: SAMEORIGIN 

 { 
     "_href": "http://pulp3.dev:8000/api/v3/tasks/core/updates/824f60ff-13a1-4de6-91a7-a3f6dc23707b/", "http://pulp3.dev:8000/api/v3/tasks/file/syncs/9d78f7af80e64b959a8cf315eb6872cf/", 
     "created_resources": [], [ 
         "http://pulp3.dev:8000/api/v3/repositories/7e91a5c3-4a96-4d27-b5f2-99ccc563eaab/versions/6/" 
     ], 
     "error": null, 
     "finished_at": "2018-03-23T16:41:03.452739Z", "2018-03-23T16:47:18.484046Z", 
     "importer": "http://pulp3.dev:8000/api/v3/importers/file/7a8866a4-f6f4-4ab3-ade6-678e2328b238/", 
     "non_fatal_errors": [], 
     "repository": "http://pulp3.dev:8000/api/v3/repositories/7e91a5c3-4a96-4d27-b5f2-99ccc563eaab/", 
     "started_at": "2018-03-23T16:41:03.408065Z", "2018-03-23T16:47:17.939148Z", 
     "state": "completed", 
     "worker": "http://pulp3.dev:8000/api/v3/workers/fd319bbb-789e-462b-b87a-83143543330a/" 
 } 
 

 </pre> 


 To make sure this was feasible, feasable, I've created proof of concept PRs for pulpcore and pulp_file. All of the proposed changes are proofed, but not all work is done. For example, sync is implemented, but not publish. 
 https://github.com/pulp/pulp/pull/3394 
 https://github.com/pulp/pulp_file/pull/61 

 

Back