Issue #956: Pulp's Celery result backend connection cannot use Mongo replica sets with automatic failover - Pulp

Issue #956

closed

Task #1014: Short Term Improvements for Pulp's use of MongoDB

Pulp's Celery result backend connection cannot use Mongo replica sets with automatic failover

Added by rbarlow about 9 years ago. Updated about 5 years ago.

Status: CLOSED - CURRENTRELEASE
Priority: High
Assignee: dkliban@redhat.com
Severity: 3. High
Version: 2.4.0
Platform Release: 2.7.0
Triaged: Yes
Tags: Pulp 2

Description

This tracks the issue, but the fix is refactor #1084. Anyone who assigns this issue to themselves must also assign themselves #1084, because the two go together.

We discovered today that Pulp cannot truly use Mongo replica sets with automatic failover. Pulp uses MongoDB as Celery's results backend, and this is the component that fails, with the following traceback:

pulp: celery.worker.strategy:INFO: Received task: pulp.server.async.tasks._reserve_resource[8c66b5b7-3236-4da0-9ab2-00c476e7196f]
pulp: celery.worker.job:CRITICAL: Task pulp.server.async.tasks._reserve_resource[8c66b5b7-3236-4da0-9ab2-00c476e7196f] INTERNAL ERROR: AutoReconnect('not master',)
pulp: celery.worker.job:CRITICAL: Traceback (most recent call last):
pulp: celery.worker.job:CRITICAL:   File "/usr/lib/python2.6/site-packages/celery/app/trace.py", line 283, in trace_task
pulp: celery.worker.job:CRITICAL:     uuid, retval, SUCCESS, request=task_request,
pulp: celery.worker.job:CRITICAL:   File "/usr/lib/python2.6/site-packages/celery/backends/base.py", line 254, in store_result
pulp: celery.worker.job:CRITICAL:     request=request, **kwargs)
pulp: celery.worker.job:CRITICAL:   File "/usr/lib/python2.6/site-packages/celery/backends/mongodb.py", line 145, in _store_result
pulp: celery.worker.job:CRITICAL:     self.collection.save(meta)
pulp: celery.worker.job:CRITICAL:   File "/usr/lib/python2.6/site-packages/kombu/utils/__init__.py", line 322, in __get__
pulp: celery.worker.job:CRITICAL:     value = obj.__dict__[self.__name__] = self.__get(obj)
pulp: celery.worker.job:CRITICAL:   File "/usr/lib/python2.6/site-packages/celery/backends/mongodb.py", line 240, in collection
pulp: celery.worker.job:CRITICAL:     collection.ensure_index('date_done', background='true')
pulp: celery.worker.job:CRITICAL:   File "/usr/lib64/python2.6/site-packages/pymongo/collection.py", line 916, in ensure_index
pulp: celery.worker.job:CRITICAL:     return self.create_index(key_or_list, cache_for, **kwargs)
pulp: celery.worker.job:CRITICAL:   File "/usr/lib64/python2.6/site-packages/pymongo/collection.py", line 823, in create_index
pulp: celery.worker.job:CRITICAL:     **self._get_wc_override())
pulp: celery.worker.job:CRITICAL:   File "/usr/lib64/python2.6/site-packages/pymongo/collection.py", line 357, in insert
pulp: celery.worker.job:CRITICAL:     continue_on_error, self.__uuid_subtype), safe)
pulp: celery.worker.job:CRITICAL:   File "/usr/lib64/python2.6/site-packages/pymongo/mongo_client.py", line 929, in _send_message
pulp: celery.worker.job:CRITICAL:     raise AutoReconnect(str(e))
pulp: celery.worker.job:CRITICAL: AutoReconnect: not master
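For context on the exception at the bottom of the traceback: pymongo documents that AutoReconnect is not retried automatically, so the caller has to catch it and repeat the operation. The following is only a minimal sketch of that pattern. `retry_on_reconnect` is a hypothetical helper, not something Pulp, Celery, or pymongo provides, and the local `AutoReconnect` class is a stand-in for `pymongo.errors.AutoReconnect` so the sketch runs without pymongo installed.

```python
import time


class AutoReconnect(Exception):
    """Stand-in for pymongo.errors.AutoReconnect (assumption: used only
    so this sketch runs without pymongo installed)."""


def retry_on_reconnect(operation, attempts=5, delay=0.5,
                       exc_type=AutoReconnect, sleep=time.sleep):
    """Call operation() and retry it when exc_type is raised.

    Hypothetical helper, not part of Pulp, Celery, or pymongo.
    """
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except exc_type:
            if attempt == attempts:
                raise  # out of retries, surface the original error
            # Linear backoff to give the replica set time to elect a primary.
            sleep(delay * attempt)
```

With real pymongo, one would pass `exc_type=pymongo.errors.AutoReconnect`. Retrying can only help if the client is replica-set aware and can find the new primary, which is exactly what is in question here. The failed write may or may not have been applied, so the retried operation should be idempotent.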

A comment above this code block [0] claims that Celery 3.1 does not support replica sets. I have not independently verified this claim, but we'll need to either fix Celery so that it does support them, or find some other way around this problem so that replica sets, including automatic failover, are fully supported by Pulp.

Steps to reproduce:

1) Deploy a pool of three mongod instances, configured as a replica set.
2) Deploy Pulp, and configure its database connection with the three mongo replicas. Put the current primary as the first seed in the list.
3) Perform a few actions to ensure everything is working correctly.
4) Now reconfigure Pulp's seed list so that one of the secondaries is the first in the list.
5) Perform an action that uses the results backend, such as a repository sync. This will fail with a traceback similar to the above.

Alternatively:

1) Deploy a pool of three mongod instances, configured as a replica set.
2) Deploy Pulp, and configure its database connection with the three mongo replicas. Put the current primary as the first seed in the list.
3) Perform a few actions to ensure everything is working correctly.
4) Kill the current Mongo primary.
5) Perform an action that uses the results backend, such as a repository sync. This will fail with a traceback similar to the above.

Expected behavior:

The order of the seeds in server.conf should not be important for Pulp to operate correctly. It should also be possible to kill the current Mongo primary, and Pulp should continue operating smoothly.
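To illustrate why seed order should not matter: in the standard MongoDB connection string format, all seeds are listed together with a replicaSet option, and a replica-set-aware driver uses the seeds only to discover the members and find the current primary. Below is a rough sketch that builds such a string. The function name, hostnames, replica set name, and database name are all illustrative assumptions, not Pulp's actual configuration keys or code.

```python
def build_replica_set_uri(seeds, replica_set, database):
    """Return a MongoDB connection string that lists every seed.

    seeds is an iterable of "host:port" strings. All values used here
    are illustrative, not taken from Pulp's server.conf.
    """
    seeds = list(seeds)
    if not seeds:
        raise ValueError("at least one seed is required")
    return "mongodb://{0}/{1}?replicaSet={2}".format(
        ",".join(seeds), database, replica_set)


uri = build_replica_set_uri(
    ["mongo1.example.com:27017",
     "mongo2.example.com:27017",
     "mongo3.example.com:27017"],
    replica_set="rs0",
    database="pulp_database",
)
print(uri)
# mongodb://mongo1.example.com:27017,mongo2.example.com:27017,mongo3.example.com:27017/pulp_database?replicaSet=rs0
```

Whether Celery 3.1's MongoDB result backend honors a connection string like this is the open question in this issue, so treat the sketch as a description of the desired behavior rather than a workaround.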

I've filed this against 2.4.0, as it affects every version of Pulp that has used Celery.

QE instructions

The changes to verify were made in #1080, but the verification is tracked on this issue.

Verify that the migration removes the celery_taskmeta collection
Verify the release notes
Verify that the fix, which includes refactor #1080, passes a full regression test
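The first check above could be scripted along these lines. This is only a sketch: `taskmeta_removed` is a hypothetical helper, and `db` is assumed to behave like a pymongo Database object. Newer pymongo releases expose `list_collection_names()` and older ones expose `collection_names()`, so the helper tries both.

```python
def taskmeta_removed(db, name="celery_taskmeta"):
    """Return True if the collection `name` is absent from `db`.

    Hypothetical helper. `db` is assumed to behave like a pymongo
    Database: list_collection_names() on newer pymongo releases,
    collection_names() on older ones.
    """
    lister = getattr(db, "list_collection_names", None)
    if lister is None:
        lister = db.collection_names  # older pymongo API
    return name not in lister()
```

Against a live deployment this would be called with the Pulp database object after running the migration; the database name to use depends on the deployment.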

[0] https://github.com/pulp/pulp/blob/01fcf261c38f9b4b057839980f892f85a8697a27/server/pulp/server/async/celery_instance.py#L48-L53
