Task #3522
Plan Master/Detail Tasks (closed)
Description
End to end design.
When the design reaches consensus, sub-tasks for core and plugins should be created.
Motivations:
Plugin involvement in add/remove
The original motivation for this design change was a specific problem: core cannot add or remove content units without plugin involvement.
We can solve this problem by removing the core endpoint that creates versions and giving plugin writers complete control. This approach is conceptually simple, and it is the one used in this plan. There are other ideas as well (hooks), but they require a much fuller discussion of the complexity of the issue than this approach does. To keep this issue clear, that discussion should happen on the issue describing the problem: https://pulp.plan.io/issues/3541
Design Pattern Consistency
Pulp 3 is built on a simple pattern.
Django is the base. It provides a framework for development, but does not provide behavior.
Django Rest Framework is the next level. Built on Django, it fills out the framework but again does not provide behavior.
Pulpcore is the next level. Built on DRF, it provides a framework for plugin development. It also implements the basic objects that are used identically between all plugins. In this plan there are 2 objects provided by core: Repository and Artifact. These objects are also managed by core, and do not need plugin involvement. All other objects are the responsibility of the plugin. Pulpcore simplifies the plugin responsibilities by providing abstract base classes.
A generic simple plugin needs to:
1) Implement a ContentUnit (Model, ViewSet, Serializer)
2) Implement a Remote (Model, ViewSet, Serializer)
3) Implement a Publisher (Model, ViewSet, Serializer)
4) Implement an AddRemoveTask (Model, ViewSet, Serializer, optional function/task)
5) Implement a SyncTask (Model, ViewSet, Serializer, function/task)
6) Implement a PublishTask (Model, ViewSet, Serializer, function/task)
By following this pattern for the whole plugin API, we can make the following guarantees to the plugin writers:
1) If your problem domain is simple and standard, most of the work is cheap. Just implement your classes and Pulp will handle the rest.
2) If your problem domain is more complex, you can override parts as necessary. The concerns are well separated, so most validation can be done with minimal overrides. Tools for customization are identical for all objects, and they are well documented by DRF.
3) If your problem domain is eccentric, Pulp will stay out of your way. You are in control from the request until the return, if you need to be.
Example add/remove with pulp_file
Create an AddRemoveTask (Model, ViewSet, Serializer)
The ViewSet is just like all other ViewSets. Please be polite and namespace the endpoint by plugin.
class FileAddRemoveTaskViewSet(TaskViewSet, mixins.CreateModelMixin):
    endpoint_name = 'file/add-remove'
    queryset = FileAddRemoveTask.objects.all()
    model = FileAddRemoveTask
    serializer_class = FileAddRemoveTaskSerializer
The Model is just like other Models. The fields represent the parameters for the function/task.
class FileAddRemoveTask(FileTask):
    TYPE = 'add-remove'
    repository = models.ForeignKey(Repository)
    add_content_units = models.ManyToManyField(FileContent, related_name="added")
    remove_content_units = models.ManyToManyField(FileContent, related_name="removed")
Serializers for tasks are also the same as all other serializers. I took the liberty of overriding `create` in the base TaskSerializer to auto-deploy tasks. This keeps the plugin classes simple, and the behavior is easily overridden by defining your own `create`. In this case, the file plugin deploys the add/remove task from pulpcore. A plugin could implement its own task if the core task doesn't suit its needs.
class FileAddRemoveTaskSerializer(TaskSerializer):
    repository = serializers.HyperlinkedRelatedField(
        view_name='repositories-detail',
        queryset=Repository.objects.all(),
    )
    add_content_units = DetailRelatedField(
        queryset=FileContent.objects.all(),
        many=True,
    )
    remove_content_units = DetailRelatedField(
        queryset=FileContent.objects.all(),
        many=True,
    )

    reservation_structure = ["repository"]
    # If there is custom logic related to dependencies, validation, etc, the plugin could create
    # their own task rather than using the general add/remove from pulpcore.
    celery_task = core_tasks.add_and_remove

    @property
    def task_kwargs(self):
        add_pks = [content_unit.pk for content_unit in self.task.add_content_units.all()]
        rm_pks = [content_unit.pk for content_unit in self.task.remove_content_units.all()]
        return {'repository_pk': self.task.repository.pk,
                'add_content_units': add_pks,
                'remove_content_units': rm_pks}

    # def validate(self, data):
    #     """
    #     OPTIONAL!
    #     Here, the plugin writer can provide **synchronous** validation. The plugin writer also has
    #     the opportunity to alter/clean the data.
    #
    #     Warning: The content in a repository could change between request time and task time.
    #     """
    #     for content_unit in data['add_content_units']:
    #         if content_unit in data['remove_content_units']:
    #             raise serializers.ValidationError("Cannot add and remove a single content unit")
    #     return data

    class Meta:
        model = FileAddRemoveTask
        fields = TaskSerializer.Meta.fields + ("add_content_units", "remove_content_units",
                                               "repository")
Sync is exactly the same:
ViewSet: https://github.com/asmacdo/pulp_file/blob/bf4138957aa4ac94319bba00c88e0043f8b26a03/pulp_file/app/viewsets.py#L33-L38
Model: https://github.com/asmacdo/pulp_file/blob/bf4138957aa4ac94319bba00c88e0043f8b26a03/pulp_file/app/models.py#L67-L71
Serializer: https://github.com/asmacdo/pulp_file/blob/bf4138957aa4ac94319bba00c88e0043f8b26a03/pulp_file/app/serializers.py#L14-L31
function/task (no change): https://github.com/asmacdo/pulp_file/blob/bf4138957aa4ac94319bba00c88e0043f8b26a03/pulp_file/app/tasks/synchronizing.py
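The auto-deploy behavior described for the base TaskSerializer (save the task record, then queue the work with its reservations) can be sketched without Django or DRF. This is only an illustration: `SketchTaskSerializer`, `fake_dispatch`, and the `save` helper are hypothetical stand-ins, and resolving reservations by looking up each name in `reservation_structure` on the task is an assumption, not something confirmed by the proof-of-concept code.

```python
from types import SimpleNamespace


class SketchTaskSerializer:
    """Framework-free stand-in for the base TaskSerializer (hypothetical)."""

    # Names of task attributes that must be reserved before the task runs.
    reservation_structure = []
    # Callable that queues the work; stands in for the Celery task.
    dispatch = None

    def __init__(self, save):
        # `save` persists validated data and returns the task record (assumed helper).
        self._save = save
        self.task = None

    @property
    def reservations(self):
        # Assumption: each name in reservation_structure is looked up on the task.
        return [getattr(self.task, name) for name in self.reservation_structure]

    @property
    def task_kwargs(self):
        raise NotImplementedError("subclasses describe the task parameters")

    def create(self, validated_data):
        # Step 1: persist the task record. Step 2: queue the work.
        self.task = self._save(validated_data)
        self.dispatch(self.reservations, self.task, self.task_kwargs)
        return self.task


queued = []


def fake_dispatch(reservations, task, kwargs):
    # Records the call instead of sending it to a worker.
    queued.append((reservations, task, kwargs))


class SketchAddRemoveSerializer(SketchTaskSerializer):
    reservation_structure = ["repository"]
    dispatch = staticmethod(fake_dispatch)

    @property
    def task_kwargs(self):
        return {
            "repository_pk": self.task.repository,
            "add_content_units": list(self.task.add_content_units),
            "remove_content_units": list(self.task.remove_content_units),
        }


serializer = SketchAddRemoveSerializer(save=lambda data: SimpleNamespace(**data))
task = serializer.create(
    {"repository": "repo-1", "add_content_units": ["a"], "remove_content_units": []}
)
```

The point of the shape is that a plugin subclass only declares what to reserve, what to run, and which parameters to pass; the save-then-dispatch sequence lives in one place and can still be replaced by overriding `create`.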
Analysis
User Perspective
As a user, I interact with Pulp and plugins by RESTful CRUD of objects.
- When I POST to v3/<object>, I get back a serialized <object>
As a user, I can tell Pulp to "do something" by creating a Task
- All tasks are in the same place `v3/tasks/ ...`
- Each Task has a separate endpoint
- Each endpoint is autodocumented, including parameters
- Each endpoint validates parameters
- Each endpoint can autogenerate a binding for clients
- I can view and filter tasks with a GET request to the same endpoint where I created the task
- Task history includes parameters
- As a user, I can filter "sync" tasks by "remote/importer" or "repository"
Plugin writer perspective
I covered a lot of this above, but I'll reiterate:
As a plugin writer, I have a single learning curve:
- How to implement ViewSets
- How to implement Serializers
- How to implement Models
- How to customize each
- How to implement logic specific to my plugin (connected via ^ customization)
I can't stress this enough. If we make this change, plugin writing is literally just Models, ViewSets, Serializers, and custom plugin-specific logic. This design will probably add more lines of code, in exchange for conceptual simplicity.
Here's the pulp_file app.
https://github.com/asmacdo/pulp_file/tree/bf4138957aa4ac94319bba00c88e0043f8b26a03/pulp_file/app
In particular, compare the custom code to dispatch sync (this design) with the custom code to dispatch publish (old design).
Example user experience, sync
The user creates a repository and importer the same way they do today.
Deploy a sync task
http http://pulp3.dev:8000/api/v3/tasks/file/syncs/ importer=http://pulp3.dev:8000/api/v3/importers/file/7a8866a4-f6f4-4ab3-ade6-678e2328b238/ repository=http://pulp3.dev:8000/api/v3/repositories/7e91a5c3-4a96-4d27-b5f2-99ccc563eaab/
HTTP/1.0 201 Created
Allow: GET, POST, HEAD, OPTIONS
Content-Length: 412
Content-Type: application/json
Date: Fri, 23 Mar 2018 16:47:17 GMT
Location: http://pulp3.dev:8000/api/v3/tasks/file/syncs/9d78f7af-80e6-4b95-9a8c-f315eb6872cf/
Server: WSGIServer/0.2 CPython/3.6.4
Vary: Accept, Cookie
X-Frame-Options: SAMEORIGIN
{
    "_href": "http://pulp3.dev:8000/api/v3/tasks/file/syncs/9d78f7af-80e6-4b95-9a8c-f315eb6872cf/",
    "created_resources": [],
    "error": null,
    "finished_at": null,
    "importer": "http://pulp3.dev:8000/api/v3/importers/file/7a8866a4-f6f4-4ab3-ade6-678e2328b238/",
    "non_fatal_errors": [],
    "repository": "http://pulp3.dev:8000/api/v3/repositories/7e91a5c3-4a96-4d27-b5f2-99ccc563eaab/",
    "started_at": null,
    "state": "waiting",
    "worker": null
}
Retrieve Sync Task:
~ ❯ http http://pulp3.dev:8000/api/v3/tasks/file/syncs/9d78f7af-80e6-4b95-9a8c-f315eb6872cf/
HTTP/1.0 200 OK
Allow: GET, HEAD, OPTIONS
Content-Length: 624
Content-Type: application/json
Date: Fri, 23 Mar 2018 16:47:54 GMT
Server: WSGIServer/0.2 CPython/3.6.4
Vary: Accept, Cookie
X-Frame-Options: SAMEORIGIN
{
    "_href": "http://pulp3.dev:8000/api/v3/tasks/file/syncs/9d78f7af80e64b959a8cf315eb6872cf/",
    "created_resources": [
        "http://pulp3.dev:8000/api/v3/repositories/7e91a5c3-4a96-4d27-b5f2-99ccc563eaab/versions/6/"
    ],
    "error": null,
    "finished_at": "2018-03-23T16:47:18.484046Z",
    "importer": "http://pulp3.dev:8000/api/v3/importers/file/7a8866a4-f6f4-4ab3-ade6-678e2328b238/",
    "non_fatal_errors": [],
    "repository": "http://pulp3.dev:8000/api/v3/repositories/7e91a5c3-4a96-4d27-b5f2-99ccc563eaab/",
    "started_at": "2018-03-23T16:47:17.939148Z",
    "state": "completed",
    "worker": "http://pulp3.dev:8000/api/v3/workers/fd319bbb-789e-462b-b87a-83143543330a/"
}
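From a client's point of view, the two requests above amount to "create the task, then poll its href until it finishes". A minimal sketch of the polling half follows. The `wait_for_task` helper and its parameters are hypothetical; `"waiting"` and `"completed"` are the states shown in the responses above, while `"failed"` and `"canceled"` are assumed terminal states for the purpose of the sketch.

```python
import time

# "completed" appears in the response above; the other terminal
# states are assumptions made for this sketch.
TERMINAL_STATES = {"completed", "failed", "canceled"}


def wait_for_task(fetch, task_href, interval=1.0, max_polls=60, sleep=time.sleep):
    """Poll `task_href` until the task reaches a terminal state.

    `fetch` is any callable that takes a URL and returns the decoded
    JSON body of a GET request as a dict.
    """
    for _ in range(max_polls):
        task = fetch(task_href)
        if task["state"] in TERMINAL_STATES:
            return task
        sleep(interval)
    raise TimeoutError(f"task {task_href} did not finish after {max_polls} polls")
```

Passing `fetch` in keeps the helper independent of any HTTP library. With the `requests` package, for example, `fetch=lambda url: requests.get(url).json()` would do.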
Resultant REST API overview
This change would encourage consistency between plugins by providing an "easy obvious way" to deploy and view tasks. This would encourage (but not require) the following URLs:
v3/tasks/docker/add-removes/
v3/tasks/docker/syncs/
v3/tasks/docker/publishes/
v3/tasks/file/add-removes/
v3/tasks/file/syncs/
v3/tasks/file/publishes/
v3/tasks/core/updates/
v3/tasks/core/deletes/
It is trivial to create "tiered viewsets".
v3/tasks/file/ <--------- list all file tasks (add-remove, syncs, publishes)
v3/tasks/ <----------- list all tasks, which are .cast() and serialized
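The tiering is just URL nesting: every task lives at a leaf endpoint, and each shorter prefix lists everything beneath it. The sketch below illustrates only that nesting with plain Python. `TaskRecord` and `list_tier` are hypothetical names, and the real design would do this with viewsets and `.cast()` rather than string prefixes.

```python
from dataclasses import dataclass


@dataclass
class TaskRecord:
    namespace: str  # plugin or "core", e.g. "file"
    kind: str       # task type, e.g. "syncs"
    pk: str

    @property
    def href(self):
        return f"v3/tasks/{self.namespace}/{self.kind}/{self.pk}/"


def list_tier(tasks, prefix="v3/tasks/"):
    """Return the hrefs of every task whose endpoint lives under `prefix`."""
    return [task.href for task in tasks if task.href.startswith(prefix)]


tasks = [
    TaskRecord("file", "syncs", "1"),
    TaskRecord("file", "publishes", "2"),
    TaskRecord("core", "updates", "3"),
]

file_tasks = list_tier(tasks, "v3/tasks/file/")  # only the file plugin's tasks
all_tasks = list_tier(tasks)                     # every task, regardless of plugin
```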
Object CRUD
Creation of Objects will be exactly the same.
Update/Delete of objects that require reservations will be identical except that the task href that is returned will be namespaced. (This made sense to me, but it is not critical to the design)
$ http PATCH http://pulp3.dev:8000/api/v3/repositories/7e91a5c3-4a96-4d27-b5f2-99ccc563eaab/ name=r2
HTTP/1.0 202 Accepted
Allow: GET, PUT, PATCH, DELETE, HEAD, OPTIONS
Content-Length: 148
Content-Type: application/json
Date: Fri, 23 Mar 2018 16:41:02 GMT
Server: WSGIServer/0.2 CPython/3.6.4
Vary: Accept, Cookie
X-Frame-Options: SAMEORIGIN
[
    {
        "_href": "http://pulp3.dev:8000/api/v3/tasks/core/updates/824f60ff-13a1-4de6-91a7-a3f6dc23707b/",
        "task_id": "824f60ff-13a1-4de6-91a7-a3f6dc23707b"
    }
]
$ http http://pulp3.dev:8000/api/v3/tasks/core/updates/824f60ff-13a1-4de6-91a7-a3f6dc23707b/
HTTP/1.0 200 OK
Allow: GET, HEAD, OPTIONS
Content-Length: 348
Content-Type: application/json
Date: Fri, 23 Mar 2018 16:41:08 GMT
Server: WSGIServer/0.2 CPython/3.6.4
Vary: Accept, Cookie
X-Frame-Options: SAMEORIGIN
{
    "_href": "http://pulp3.dev:8000/api/v3/tasks/core/updates/824f60ff-13a1-4de6-91a7-a3f6dc23707b/",
    "created_resources": [],
    "error": null,
    "finished_at": "2018-03-23T16:41:03.452739Z",
    "non_fatal_errors": [],
    "started_at": "2018-03-23T16:41:03.408065Z",
    "state": "completed",
    "worker": "http://pulp3.dev:8000/api/v3/workers/fd319bbb-789e-462b-b87a-83143543330a/"
}
To make sure this was feasible, I've created proof-of-concept PRs for pulpcore and pulp_file. All of the proposed changes are prototyped, but not all of the work is done. For example, sync is implemented, but publish is not.
https://github.com/pulp/pulp/pull/3394
https://github.com/pulp/pulp_file/pull/61
Related issues
Updated by amacdona@redhat.com over 6 years ago
- Related to Issue #3541: Core should not add/remove content to a repository or create a repository_version without plugin input added
Updated by milan over 6 years ago
I experimented a bit with the patches, but I can't make them work right now. I had to make some modifications in the DB:
CREATE TABLE IF NOT EXISTS "pulp_file_filesynctask" ("id" char(32) NOT NULL PRIMARY KEY, "created" datetime NOT NULL, "last_updated" datetime NULL, "state" text NOT NULL, "started_at" datetime NULL, "finished_at" datetime NULL, "non_fatal_errors" text NOT NULL, "error" text NULL, "worker_id" char(32) NULL REFERENCES "pulp_app_worker" ("id"), "parent_id" char(32) NULL REFERENCES "pulp_app_task" ("id"), "type" char(256), "importer_id" char(32), "repository_id" char(32), "filetask_ptr_id" char(32), FOREIGN KEY (importer_id) REFERENCES pulp_file_fileimporter("id"), FOREIGN KEY (repository_id) REFERENCES pulp_app_repository(id));
Now if I do phttp :8000/v3/tasks/file/syncs/:
(pulp) [vagrant@pulp3 pulpcore]$ phttp http://localhost:8000/api/v3/tasks/file/syncs/
HTTP/1.0 200 OK
Allow: GET, POST, HEAD, OPTIONS
Content-Length: 42
Content-Type: application/json
Date: Wed, 04 Apr 2018 10:43:48 GMT
Server: WSGIServer/0.2 CPython/3.6.4
Vary: Accept, Cookie
X-Frame-Options: SAMEORIGIN
{
    "next": null,
    "previous": null,
    "results": []
}
(pulp) [vagrant@pulp3 pulpcore]$
but phttp POST :8000/v3/tasks/file/syncs/ repository=<repo_url> importer=<importer_url>
still just causes a traceback:
Traceback (most recent call last):
File "/home/vagrant/.virtualenvs/pulp/lib64/python3.6/site-packages/django/core/handlers/exception.py", line 41, in inner
response = get_response(request)
File "/home/vagrant/.virtualenvs/pulp/lib64/python3.6/site-packages/django/core/handlers/base.py", line 249, in _legacy_get_response
response = self._get_response(request)
File "/home/vagrant/.virtualenvs/pulp/lib64/python3.6/site-packages/django/core/handlers/base.py", line 187, in _get_response
response = self.process_exception_by_middleware(e, request)
File "/home/vagrant/.virtualenvs/pulp/lib64/python3.6/site-packages/django/core/handlers/base.py", line 185, in _get_response
response = wrapped_callback(request, *callback_args, **callback_kwargs)
File "/home/vagrant/.virtualenvs/pulp/lib64/python3.6/site-packages/django/views/decorators/csrf.py", line 58, in wrapped_view
return view_func(*args, **kwargs)
File "/home/vagrant/.virtualenvs/pulp/lib64/python3.6/site-packages/rest_framework/viewsets.py", line 95, in view
return self.dispatch(request, *args, **kwargs)
File "/home/vagrant/.virtualenvs/pulp/lib64/python3.6/site-packages/rest_framework/views.py", line 494, in dispatch
response = self.handle_exception(exc)
File "/home/vagrant/.virtualenvs/pulp/lib64/python3.6/site-packages/rest_framework/views.py", line 454, in handle_exception
self.raise_uncaught_exception(exc)
File "/home/vagrant/.virtualenvs/pulp/lib64/python3.6/site-packages/rest_framework/views.py", line 491, in dispatch
response = handler(request, *args, **kwargs)
File "/home/vagrant/.virtualenvs/pulp/lib64/python3.6/site-packages/rest_framework/mixins.py", line 21, in create
self.perform_create(serializer)
File "/home/vagrant/.virtualenvs/pulp/lib64/python3.6/site-packages/rest_framework/mixins.py", line 26, in perform_create
serializer.save()
File "/home/vagrant/.virtualenvs/pulp/lib64/python3.6/site-packages/rest_framework/serializers.py", line 214, in save
self.instance = self.create(validated_data)
File "/home/vagrant/devel/pulp/pulpcore/pulpcore/app/serializers/task.py", line 101, in create
self.task = super().create(validated_data)
File "/home/vagrant/devel/pulp/pulpcore/pulpcore/app/serializers/base.py", line 123, in create
instance = super(ModelSerializer, self).create(validated_data)
File "/home/vagrant/.virtualenvs/pulp/lib64/python3.6/site-packages/rest_framework/serializers.py", line 917, in create
instance = ModelClass.objects.create(**validated_data)
File "/home/vagrant/.virtualenvs/pulp/lib64/python3.6/site-packages/django/db/models/manager.py", line 85, in manager_method
return getattr(self.get_queryset(), name)(*args, **kwargs)
File "/home/vagrant/.virtualenvs/pulp/lib64/python3.6/site-packages/django/db/models/query.py", line 394, in create
obj.save(force_insert=True, using=self.db)
File "/home/vagrant/devel/pulp/pulpcore/pulpcore/app/models/base.py", line 92, in save
return super(MasterModel, self).save(*args, **kwargs)
File "/home/vagrant/.virtualenvs/pulp/lib64/python3.6/site-packages/django/db/models/base.py", line 808, in save
force_update=force_update, update_fields=update_fields)
File "/home/vagrant/.virtualenvs/pulp/lib64/python3.6/site-packages/django/db/models/base.py", line 837, in save_base
self._save_parents(cls, using, update_fields)
File "/home/vagrant/.virtualenvs/pulp/lib64/python3.6/site-packages/django/db/models/base.py", line 864, in _save_parents
self._save_table(cls=parent, using=using, update_fields=update_fields)
File "/home/vagrant/.virtualenvs/pulp/lib64/python3.6/site-packages/django/db/models/base.py", line 924, in _save_table
result = self._do_insert(cls._base_manager, using, fields, update_pk, raw)
File "/home/vagrant/.virtualenvs/pulp/lib64/python3.6/site-packages/django/db/models/base.py", line 963, in _do_insert
using=using, raw=raw)
File "/home/vagrant/.virtualenvs/pulp/lib64/python3.6/site-packages/django/db/models/manager.py", line 85, in manager_method
return getattr(self.get_queryset(), name)(*args, **kwargs)
File "/home/vagrant/.virtualenvs/pulp/lib64/python3.6/site-packages/django/db/models/query.py", line 1076, in _insert
return query.get_compiler(using=using).execute_sql(return_id)
File "/home/vagrant/.virtualenvs/pulp/lib64/python3.6/site-packages/django/db/models/sql/compiler.py", line 1112, in execute_sql
cursor.execute(sql, params)
File "/home/vagrant/.virtualenvs/pulp/lib64/python3.6/site-packages/django/db/backends/utils.py", line 79, in execute
return super(CursorDebugWrapper, self).execute(sql, params)
File "/home/vagrant/.virtualenvs/pulp/lib64/python3.6/site-packages/django/db/backends/utils.py", line 64, in execute
return self.cursor.execute(sql, params)
File "/home/vagrant/.virtualenvs/pulp/lib64/python3.6/site-packages/django/db/utils.py", line 94, in __exit__
six.reraise(dj_exc_type, dj_exc_value, traceback)
File "/home/vagrant/.virtualenvs/pulp/lib64/python3.6/site-packages/django/utils/six.py", line 685, in reraise
raise value.with_traceback(tb)
File "/home/vagrant/.virtualenvs/pulp/lib64/python3.6/site-packages/django/db/backends/utils.py", line 64, in execute
return self.cursor.execute(sql, params)
File "/home/vagrant/.virtualenvs/pulp/lib64/python3.6/site-packages/django/db/backends/sqlite3/base.py", line 328, in execute
return Database.Cursor.execute(self, query, params)
django.db.utils.IntegrityError: NOT NULL constraint failed: pulp_file_filetask.id
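The final line of the traceback is the database rejecting a row whose primary key is NULL. The same class of error can be reproduced with the standard-library `sqlite3` module, as a sketch only: the `sketch_task` table is hypothetical and far simpler than the real schema, and this does not claim to identify why the id was missing in the proof of concept.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
# Same column style as the hand-written table above: a char(32) primary key
# declared NOT NULL, with no default value.
conn.execute(
    'CREATE TABLE "sketch_task" ("id" char(32) NOT NULL PRIMARY KEY, "state" text NOT NULL)'
)


def insert_task(pk):
    conn.execute('INSERT INTO "sketch_task" ("id", "state") VALUES (?, ?)', (pk, "waiting"))


# Inserting without an id fails with an IntegrityError.
try:
    insert_task(None)
    error = None
except sqlite3.IntegrityError as exc:
    error = str(exc)

# Supplying an explicit id lets the insert succeed.
insert_task(uuid.uuid4().hex)
row_count = conn.execute('SELECT COUNT(*) FROM "sketch_task"').fetchone()[0]
```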
So I tried to patch the Task viewset but still no luck:
jahoda:pulp mkovacik$ git diff
diff --git a/pulpcore/pulpcore/app/serializers/task.py b/pulpcore/pulpcore/app/serializers/task.py
index e75c4a2e7..e66b90e15 100755
--- a/pulpcore/pulpcore/app/serializers/task.py
+++ b/pulpcore/pulpcore/app/serializers/task.py
@@ -96,10 +96,12 @@ class TaskSerializer(MasterModelSerializer):
def create(self, validated_data):
+ from uuid import uuid4
+ validated_data['id'] = uuid4()
self.task = super().create(validated_data)
- self.celery_task.apply_async_with_reservation(
+ self.task = self.celery_task.apply_async_with_reservation(
self.reservations,
- self.task,
+ validated_data['id'],
kwargs=self.task_kwargs
)
return self.task
@@ -131,9 +133,9 @@ class TaskSerializer(MasterModelSerializer):
# Fields that serialize tasks are broken in this WIP
fields = ModelSerializer.Meta.fields + ('state', 'started_at', 'finished_at',
'non_fatal_errors', 'error', 'worker',
- # 'parent',
- # 'spawned_tasks',
- # 'progress_reports',
+ 'parent',
+ 'spawned_tasks',
+ 'progress_reports',
'created_resources')
diff --git a/pulpcore/pulpcore/app/viewsets/task.py b/pulpcore/pulpcore/app/viewsets/task.py
index 93c7a462e..13a9c326c 100755
--- a/pulpcore/pulpcore/app/viewsets/task.py
+++ b/pulpcore/pulpcore/app/viewsets/task.py
@@ -3,7 +3,7 @@ from django_filters.rest_framework import filters, filterset
from pulpcore.app.models import Task, Worker
from pulpcore.app.models.task import CoreUpdateTask, CoreDeleteTask
from pulpcore.app.serializers import TaskSerializer, WorkerSerializer
-from pulpcore.app.serializers.task import CoreUpdateTaskSerializer
+from pulpcore.app.serializers.task import CoreUpdateTaskSerializer, CoreDeleteTaskSerializer
from pulpcore.app.viewsets import NamedModelViewSet
from pulpcore.app.viewsets.base import GenericNamedModelViewSet
from pulpcore.app.viewsets.custom_filters import CharInFilter, HyperlinkRelatedFilter
@@ -59,7 +59,7 @@ class CoreDeleteTaskViewSet(TaskViewSet):
endpoint_name = 'core/deletes'
queryset = CoreDeleteTask.objects.all()
model = CoreDeleteTask
- serializer_class = CoreUpdateTaskSerializer
+ serializer_class = CoreDeleteTaskSerializer
class WorkerViewSet(NamedModelViewSet):
It feels like some sort of chicken-and-egg problem.
So I'm afraid no demo yet ;)
Updated by amacdona@redhat.com over 6 years ago
Updated by milan over 6 years ago
Austin,
thanks for the suggestions!
I did as suggested and got a successful sync.
What I miss is that the tasks endpoint doesn't show all the tasks:
phttp :8000/api/v3/tasks/
HTTP/1.0 404 Not Found
Content-Length: 21507
Content-Type: text/html
---------%<-------------
Updated by amacdona@redhat.com about 6 years ago
- Status changed from NEW to CLOSED - WONTFIX
We decided not to go in this direction.