Story #23
closedAs a user, I can rest easy in the knowledge that Pulp's Celery tasks are not serialized dangerously
100%
Description
Our tasks currently need to use pickle for serialization, because we pass objects as arguments to some of them. We likely also return objects from some tasks.We should either make our objects json serializable, or change our tasks to accept simple types that json can handle.Once this is complete, we should configure Celery to use json, and configure the workers not to accept pickle.
Related issues
Updated by bmbouter over 9 years ago
- Groomed set to No
- Sprint Candidate set to No
Updated by bmbouter about 9 years ago
The branch with the current work is here: https://github.com/bmbouter/pulp/tree/23-farmers-ale
It's currently branched from 2.6, and has 2 additional commits (50d785d and 8bf9633).
Here is what needs to be done:
1) First, rebase or cherry pick against master. The two existing commits will have their hashes changed in the process, but please leave them and their commit messages in place (for attribution). Once rebased into a branch that branches from master, make your commits on top of the existing 2.
2) Reproduce the known issue with this code by having a working pulp installation and running:
pulp-admin rpm repo create --repo-id bar
pulp-admin rpm repo create --repo-id foo
pulp-admin rpm repo copy rpm --from-repo-id=bar --to-repo-id=foo
3) The issue is that the UnitAssociationCriteria is arriving as a dict in the task code, when it needs to be a UnitAssociationCriteria (which inherits from dict). To resolve this, the UnitAssociationCriteria object needs to be explicitly serialized to a dictionary prior to it being sent as an argument of apply_async(). Add a to_dict() method on UnitAssociationCriteria() to return the dict. Find the place where the UnitAssociationCriteria is passed as an argument by looking at what API endpoint is being used by the repo copy rpm command.
4) Add a classmethod to UnitAssociationCriteria which can instantiate and return a UnitAssociationCriteria object from a dictionary that was produced by to_dict().
5) Use the classmethod above in the task code to produce the UnitAssociationCriteria from the dict passed in as an argument
6) Do some other general testing if you like (not required)
7) Write test code for everything
8) Write one specific test which looks through the celery registry and asserts that each entry in it is a subclass of PulpTask.
Updated by bmbouter about 9 years ago
Here is what I put together as a start for item #8 above.
from pulp.server.async import app
from pulp.server import tasks # you have to import this to discover all pulp tasks
from pulp.server.async.tasks import PulpTask
for task_name, task in app.celery.tasks.iteritems():
if task.startswith('pulp'):
if not isinstance(task, PulpTask):
self.fail('task named %s must inherit from %s) % (task_name, PulpTask)
Updated by ipanova@redhat.com about 9 years ago
- Status changed from NEW to ASSIGNED
- Assignee set to ipanova@redhat.com
Updated by ipanova@redhat.com about 9 years ago
- Status changed from ASSIGNED to MODIFIED
Updated by bmbouter about 9 years ago
It's not merged yet, so moving back to POST.
Added by Shubham Bhawsinka about 9 years ago
Added by bmbouter about 9 years ago
Revision 4ac8b815 | View on GitHub
Removes explicit type validation, and cleans up docstrings some.
Added by ipanova@redhat.com about 9 years ago
Revision ed367081 | View on GitHub
Introducing from_dict() and to_dict()
closes #23 https://pulp.plan.io/issues/23
UnitAssociationCriteria arrives as dict in the task code, when it needs to be UnitAssociationCriteria object. to_dict() method was introduced to serialize UnitAssociationCriteria to a dictionary prior to it being sent as an argument to apply_sync(). the from_dict() method was introduced to return UnitAssociationCriteria object from a dictionary that was produced by to_dict()
This commit also introduces a testcase that checks that all Celery tasks defined in Pulp inhherit from PulpTask and not from Celery directly.
There have been done also changes to criteria. Due to the introduction of de/sereliazions methods, criteria's value cannot be None. Code and tests were updated accordigly.
Added by ipanova@redhat.com about 9 years ago
Revision ed367081 | View on GitHub
Introducing from_dict() and to_dict()
closes #23 https://pulp.plan.io/issues/23
UnitAssociationCriteria arrives as dict in the task code, when it needs to be UnitAssociationCriteria object. to_dict() method was introduced to serialize UnitAssociationCriteria to a dictionary prior to it being sent as an argument to apply_sync(). the from_dict() method was introduced to return UnitAssociationCriteria object from a dictionary that was produced by to_dict()
This commit also introduces a testcase that checks that all Celery tasks defined in Pulp inhherit from PulpTask and not from Celery directly.
There have been done also changes to criteria. Due to the introduction of de/sereliazions methods, criteria's value cannot be None. Code and tests were updated accordigly.
Updated by ipanova@redhat.com about 9 years ago
- % Done changed from 0 to 100
Applied in changeset pulp|ed36708132e18bac4ab8b58d0b56cdb2ffe95685.
Updated by bmbouter almost 9 years ago
This merged change does switch our usage to use the JSON serializer, but we haven't actually disabled the other serializers. As such the following banner is still visible:
Jan 22 01:35:01 dev pulp[22849]: py.warnings:WARNING: (22849-36000) /usr/lib/python2.7/site-packages/celery/apps/worker.py:161: CDeprecationWarning:
Jan 22 01:35:01 dev pulp[22849]: py.warnings:WARNING: (22849-36000) Starting from version 3.2 Celery will refuse to accept pickle by default.
Jan 22 01:35:01 dev pulp[22849]: py.warnings:WARNING: (22849-36000)
Jan 22 01:35:01 dev pulp[22849]: py.warnings:WARNING: (22849-36000) The pickle serializer is a security concern as it may give attackers
Jan 22 01:35:01 dev pulp[22849]: py.warnings:WARNING: (22849-36000) the ability to execute any command. It's important to secure
Jan 22 01:35:01 dev pulp[22849]: py.warnings:WARNING: (22849-36000) your broker from unauthorized access when using pickle, so we think
Jan 22 01:35:01 dev pulp[22849]: py.warnings:WARNING: (22849-36000) that enabling pickle should require a deliberate action and not be
Jan 22 01:35:01 dev pulp[22849]: py.warnings:WARNING: (22849-36000) the default choice.
Jan 22 01:35:01 dev pulp[22849]: py.warnings:WARNING: (22849-36000)
Jan 22 01:35:01 dev pulp[22849]: py.warnings:WARNING: (22849-36000) If you depend on pickle then you should set a setting to disable this
Jan 22 01:35:01 dev pulp[22849]: py.warnings:WARNING: (22849-36000) warning and to be sure that everything will continue working
Jan 22 01:35:01 dev pulp[22849]: py.warnings:WARNING: (22849-36000) when you upgrade to Celery 3.2::
Jan 22 01:35:01 dev pulp[22849]: py.warnings:WARNING: (22849-36000)
Jan 22 01:35:01 dev pulp[22849]: py.warnings:WARNING: (22849-36000) CELERY_ACCEPT_CONTENT = ['pickle', 'json', 'msgpack', 'yaml']
Jan 22 01:35:01 dev pulp[22849]: py.warnings:WARNING: (22849-36000)
Jan 22 01:35:01 dev pulp[22849]: py.warnings:WARNING: (22849-36000) You must only enable the serializers that you will actually use.
Jan 22 01:35:01 dev pulp[22849]: py.warnings:WARNING: (22849-36000)
Jan 22 01:35:01 dev pulp[22849]: py.warnings:WARNING: (22849-36000)
Jan 22 01:35:01 dev pulp[22849]: py.warnings:WARNING: (22849-36000) warnings.warn(CDeprecationWarning(W_PICKLE_DEPRECATED))
Jan 22 01:35:01 dev pulp[22849]: py.warnings:WARNING: (22849-36000)
I am fixing this now with a small PR to disable the other serializers.
Added by bmbouter almost 9 years ago
Revision 478443a7 | View on GitHub
Disables all other celery serializers except JSON
This removes a scary, multiline warning from celery workers when they start up
Added by bmbouter almost 9 years ago
Revision 478443a7 | View on GitHub
Disables all other celery serializers except JSON
This removes a scary, multiline warning from celery workers when they start up
Updated by bmbouter almost 9 years ago
PR available at: https://github.com/pulp/pulp/pull/2346
Updated by bmbouter almost 9 years ago
- Has duplicate Issue #575: Warning in the log when starting celery services added
Updated by dkliban@redhat.com almost 9 years ago
- Status changed from MODIFIED to 5
Updated by dkliban@redhat.com almost 9 years ago
- Status changed from 5 to CLOSED - CURRENTRELEASE
Progress for story 23