Issue #3887
closed
Restarting pulp_celerybeat weekly or so causes pulp.server.maintenance.monthly to not get scheduled
Description
Symptoms
No signs of `pulp.server.maintenance.monthly` in the logs for the last 30 days.
If the task is triggered manually, it works properly and orphaned applicability profiles get cleaned up.
Impact
Having many orphaned applicability profiles triggers many no-op applicability tasks, which can slow down the whole system when there are too many of them. For example, a repo with many bound consumers (30K unique profiles) will trigger 3K tasks.
Root Cause
Pulp uses Celery's periodic tasks feature for its dispatching. The monthly maintenance is dispatched every 30 days. Unfortunately, Celery measures this as 30 days since the pulp_celerybeat process started (see the Celery docs). This means that if you restart pulp_celerybeat more often than once every 30 days, the interval never elapses, the task is never dispatched, and you end up with a lot of orphaned applicability profiles.
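To make the effect of restarts concrete, here is a minimal simulation of the behavior described above. It is a sketch, not Celery or Pulp code: `count_interval_runs` is a hypothetical helper that assumes the interval timer starts from zero every time the scheduler process starts, as the root cause states.

```python
from datetime import timedelta


def count_interval_runs(interval, observed, restart_every=None):
    """Count how often an interval-based job fires within `observed`.

    Assumption (taken from the root cause above): the interval is measured
    from the moment the scheduler process started, so every restart throws
    away the time that has already elapsed.
    """
    runs = 0
    next_run = interval            # first run is due one interval after start
    next_restart = restart_every   # None means the process is never restarted
    while True:
        restart_first = next_restart is not None and next_restart <= next_run
        event = next_restart if restart_first else next_run
        if event > observed:
            return runs
        if restart_first:
            # The restart resets the timer: the next run is a full interval away.
            next_run = next_restart + interval
            next_restart += restart_every
        else:
            runs += 1
            next_run += interval


year = timedelta(days=365)
monthly = timedelta(days=30)

# Scheduler never restarted: the job fires every 30 days.
print(count_interval_runs(monthly, year))  # 12

# Scheduler restarted weekly: the 30-day timer never completes.
print(count_interval_runs(monthly, year, restart_every=timedelta(days=7)))  # 0
```

Under this model, any restart cadence shorter than the 30-day interval yields zero runs, which matches the symptom of no `pulp.server.maintenance.monthly` entries in the logs.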
Resolution
The recommendation is to use cron to run a script monthly like:
from pulp.server.maintenance.monthly import queue_monthly_maintenance
queue_monthly_maintenance.apply_async()
Updated by ttereshc over 6 years ago
Numbers to show impact: https://bugzilla.redhat.com/show_bug.cgi?id=1609928#c5
Updated by CodeHeeler over 6 years ago
- Priority changed from Normal to High
- Severity changed from 2. Medium to 3. High
- Triaged changed from No to Yes
- Sprint set to Sprint 40
Updated by ttereshc over 6 years ago
A cron configuration that runs this task regularly is enough to fix this issue.
The suggestion is to put it into /etc/cron.monthly and ship this config with the rpm installation.
Updated by bmbouter over 6 years ago
ttereshc excellent! With that change, also remove this part of the code: https://github.com/pulp/pulp/blob/2-master/server/pulp/server/async/celery_instance.py#L30-L33
Updated by bmbouter over 6 years ago
- Status changed from NEW to ASSIGNED
- Assignee set to bmbouter
Updated by bmbouter over 6 years ago
This is working for me. I have the latest branch head of pulp/pulp:2-master and pulp/pulp_rpm:2-master. I applied the following diff to test it:
diff --git a/server/pulp/server/async/celery_instance.py b/server/pulp/server/async/celery_instance.py
index eab2ea3dd..9b486e8a5 100644
--- a/server/pulp/server/async/celery_instance.py
+++ b/server/pulp/server/async/celery_instance.py
@@ -29,7 +29,7 @@ CELERYBEAT_SCHEDULE = {
     },
     'monthly_maintenance': {
         'task': 'pulp.server.maintenance.monthly.queue_monthly_maintenance',
-        'schedule': timedelta(days=30),
+        'schedule': timedelta(seconds=60),
         'args': tuple(),
     },
     'download_deferred_content': {
diff --git a/server/pulp/server/maintenance/monthly.py b/server/pulp/server/maintenance/monthly.py
index 58e51f247..365496ea0 100644
--- a/server/pulp/server/maintenance/monthly.py
+++ b/server/pulp/server/maintenance/monthly.py
@@ -10,6 +10,7 @@ def queue_monthly_maintenance():
     """
     Create an itinerary for monthly task
     """
+    raise Exception('oh no!')
     tags = [action_tag('monthly')]
     monthly_maintenance.apply_async(tags=tags)
Then I got the following output upon startup:
kombu.transport.qpid:INFO: Connected to qpid with SASL mechanism ANONYMOUS
celery.beat:INFO: Scheduler: Sending due task monthly_maintenance (pulp.server.maintenance.monthly.queue_monthly_maintenance)
celery.worker.strategy:INFO: Received task: pulp.server.maintenance.monthly.queue_monthly_maintenance[e465c5d5-4385-4dcb-89ff-24013669c252]
celery.app.trace:ERROR: [e465c5d5] (18124-49472) Task pulp.server.maintenance.monthly.queue_monthly_maintenance[e465c5d5-4385-4dcb-89ff-24013669c252] raised unexpected: Exception('oh no!',)
celery.app.trace:ERROR: [e465c5d5] (18124-49472) Traceback (most recent call last):
celery.app.trace:ERROR: [e465c5d5] (18124-49472) File "/usr/lib/python2.7/site-packages/celery/app/trace.py", line 367, in trace_task
celery.app.trace:ERROR: [e465c5d5] (18124-49472) R = retval = fun(*args, **kwargs)
celery.app.trace:ERROR: [e465c5d5] (18124-49472) File "/home/vagrant/devel/pulp/server/pulp/server/async/tasks.py", line 107, in __call__
celery.app.trace:ERROR: [e465c5d5] (18124-49472) return super(PulpTask, self).__call__(*args, **kwargs)
celery.app.trace:ERROR: [e465c5d5] (18124-49472) File "/usr/lib/python2.7/site-packages/celery/app/trace.py", line 622, in __protected_call__
celery.app.trace:ERROR: [e465c5d5] (18124-49472) return self.run(*args, **kwargs)
celery.app.trace:ERROR: [e465c5d5] (18124-49472) File "/home/vagrant/devel/pulp/server/pulp/server/maintenance/monthly.py", line 13, in queue_monthly_maintenance
celery.app.trace:ERROR: [e465c5d5] (18124-49472) raise Exception('oh no!')
celery.app.trace:ERROR: [e465c5d5] (18124-49472) Exception: oh no!
I am using Celery 4.0.2-3 with python2-celery-4.0.2-3.fc27.noarch.
Updated by bmbouter over 6 years ago
- Subject changed from Periodic task pulp.server.maintenance.monthly doesn't seem to get scheduled to Restarting pulp_celerybeat weekly or so causes pulp.server.maintenance.monthly to not get scheduled
- Description updated (diff)
- Tags Documentation added
Rewriting based on root cause discovery. This will become a docs bug explaining the recommended workaround.
Updated by bmbouter over 6 years ago
- Status changed from ASSIGNED to POST
docs PR available at: https://github.com/pulp/pulp/pull/3585
Added by bmbouter over 6 years ago
Updated by bmbouter over 6 years ago
- Status changed from POST to MODIFIED
Applied in changeset pulp|97331ca0c398ff6371b03982260a3776a8d8ac80.
Added by bmbouter over 6 years ago
Revision 5af7d170
Use the correct command for maintenance dispatch
Updated by bmbouter over 6 years ago
- Related to Issue #3908: Add pulp-maintenance package added
Updated by dkliban@redhat.com over 6 years ago
- Status changed from MODIFIED to CLOSED - CURRENTRELEASE
- Platform Release set to 2.17.0
Add docs on maintenance jobs
This adds:
https://pulp.plan.io/issues/3887 closes #3887