Pulp - Issue #1801: Pulp celery_beat and resource_manager are running, but logs say they are not running
https://pulp.plan.io/issues/1801?journal_id=10305 2016-03-31T14:43:59Z mhrivnak mhrivnak@redhat.com
<ul><li><strong>Sprint/Milestone</strong> set to <i>19</i></li></ul>
https://pulp.plan.io/issues/1801?journal_id=10315 2016-03-31T16:55:30Z bmbouter bmbouter@redhat.com
<ul></ul><p>I reproduced this in my environment, and pulp_celerybeat appears to be deadlocking in the kombu transport. A gdb trace of a deadlocked pulp_celerybeat process shows that the thread that processes event callbacks for incoming heartbeat messages is halted at line 1438, marked with &gt; in the GDB py-list output:</p>
<pre><code>Thread 5 (Thread 0x7f737da33700 (LWP 6551)):
 1433                'The Python package "qpid.messaging" is missing. Install it '
 1434                'with your package manager. You can also try `pip install '
 1435                'qpid-python`.')
 1436
 1437    def _qpid_message_ready_handler(self, session):
>1438        os.write(self._w, '0')
 1439
 1440    def _qpid_async_exception_notify_handler(self, obj_with_exception, exc):
 1441        os.write(self._w, 'e')
 1442
 1443    def on_readable(self, connection, loop):
</code></pre>
<p>That line corresponds with this line in the kombu code: <a href="https://github.com/celery/kombu/blob/93f6606e0a758c9cffb9b3c2ef6a239ed7027309/kombu/transport/qpid.py#L1474" class="external">https://github.com/celery/kombu/blob/93f6606e0a758c9cffb9b3c2ef6a239ed7027309/kombu/transport/qpid.py#L1474</a></p>
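<p>For background on how a call like os.write(self._w, '0') can stall at all: a write to a pipe whose descriptor is in blocking mode waits once the kernel's pipe buffer is full and nothing is reading the other end. The sketch below only illustrates that general behaviour, assuming a POSIX system and Python 3.5 or newer. It is not kombu's code, fill_pipe is a hypothetical helper, and nothing in this thread confirms that a full pipe is what happened here.</p>

```python
import os


def fill_pipe(write_fd):
    """Write single bytes until the pipe is full; return how many were accepted."""
    os.set_blocking(write_fd, False)  # non-blocking, so this demo cannot hang
    written = 0
    try:
        while True:
            written += os.write(write_fd, b"0")
    except BlockingIOError:
        # The pipe buffer is full. A blocking descriptor would stall here
        # until another thread or process reads from the other end.
        pass
    return written


read_fd, write_fd = os.pipe()
capacity = fill_pipe(write_fd)
print("pipe accepted", capacity, "bytes before a write would block")

# Draining the read end frees space, so a further write succeeds again.
drained = len(os.read(read_fd, capacity))
accepted = os.write(write_fd, b"0")
print("drained", drained, "bytes; next write accepted", accepted, "byte")

os.close(read_fd)
os.close(write_fd)
```

<p>The demo uses a non-blocking descriptor only so that it reports the limit instead of hanging. The capacity it prints depends on the operating system.</p>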
<p>That os.write call is the point of deadlock. I don't yet understand why it is deadlocking, but it is likely a thread safety issue around that pipe. The investigation continues.</p>
https://pulp.plan.io/issues/1801?journal_id=10321 2016-03-31T19:17:01Z bmbouter bmbouter@redhat.com
<ul></ul><p>The root cause is identified, and I filed it in the Kombu upstream issue tracker. <a href="https://github.com/celery/kombu/issues/577" class="external">https://github.com/celery/kombu/issues/577</a></p>
<p>I'll be fixing it upstream and then we'll cherry pick that commit as a patch to the version of python-kombu that Pulp carries along with the version in Rawhide.</p>
https://pulp.plan.io/issues/1801?journal_id=10327 2016-04-01T13:08:27Z rbarlow
<ul></ul><p>On Thursday, March 31, 2016 9:17:01 PM EDT you wrote:</p>
<blockquote>
<p>I'll be fixing it upstream and then we'll cherry pick that commit as a patch to the version of python-kombu that Pulp carries along with the version in Rawhide.</p>
</blockquote>
<p>Consider trying to get the patch into Fedora 24 as well so we don't have this problem there. Thanks!</p>
https://pulp.plan.io/issues/1801?journal_id=10329 2016-04-01T13:47:49Z bmbouter bmbouter@redhat.com
<ul></ul><p>rbarlow wrote:</p>
<blockquote>
<p>On Thursday, March 31, 2016 9:17:01 PM EDT you wrote:</p>
<blockquote>
<p>I'll be fixing it upstream and then we'll cherry pick that commit as a patch to the version of python-kombu that Pulp carries along with the version in Rawhide.</p>
</blockquote>
<p>Consider trying to get the patch into Fedora 24 as well so we don't have this problem there. Thanks!</p>
</blockquote>
<p>Oh yes, I will do this. I forgot that Fedora 24 had branched. I'll submit the update to both Rawhide and F24.</p>
https://pulp.plan.io/issues/1801?journal_id=10337 2016-04-01T14:49:09Z mhrivnak mhrivnak@redhat.com
<ul><li><strong>Triaged</strong> changed from <i>No</i> to <i>Yes</i></li></ul>
https://pulp.plan.io/issues/1801?journal_id=10343 2016-04-01T19:09:22Z bmbouter bmbouter@redhat.com
<ul></ul><p>This commit needs to be cherry picked into the version we carry <a href="https://github.com/celery/kombu/commit/277309f47a713a31885248b78df45e41d8d5e490" class="external">https://github.com/celery/kombu/commit/277309f47a713a31885248b78df45e41d8d5e490</a>.</p>
<p>This regression was introduced with kombu 3.0.33. This fix needs to be on pulp-dev and newer branches. No existing 2.7 users use 3.0.33 so we can fix it in 2.7-dev and not have to make a new 2.7 release to make the fix available to existing users. The fix will be included with 2.8.2 from the merge forward to master.</p>
https://pulp.plan.io/issues/1801?journal_id=10345 2016-04-01T21:07:56Z bmbouter bmbouter@redhat.com
<ul><li><strong>Status</strong> changed from <i>ASSIGNED</i> to <i>POST</i></li></ul><p><a href="https://github.com/pulp/pulp/pull/2507" class="external">https://github.com/pulp/pulp/pull/2507</a></p>
https://pulp.plan.io/issues/1801?journal_id=10347 2016-04-03T19:04:40Z dgregor@redhat.com dgregor@redhat.com
<ul><li><strong>Version</strong> set to <i>2.8.0</i></li></ul>
https://pulp.plan.io/issues/1801?journal_id=10349 2016-04-04T11:50:42Z bmbouter bmbouter@redhat.com
<ul><li><strong>Private</strong> changed from <i>No</i> to <i>Yes</i></li></ul>
https://pulp.plan.io/issues/1801?journal_id=10350 2016-04-04T11:53:37Z bmbouter bmbouter@redhat.com
<ul><li><strong>Private</strong> changed from <i>Yes</i> to <i>No</i></li></ul>
https://pulp.plan.io/issues/1801?journal_id=10374 2016-04-05T13:03:53Z pthomas@redhat.com
<ul></ul><p>Before updating kombu</p>
<pre><code>
[root@ibm-x3550m3-12 ~]# rpm -qa |grep kombu
python-kombu-3.0.33-4.pulp.el7.noarch
[root@ibm-x3550m3-12 ~]#
[root@ibm-x3550m3-12 ~]# sudo qpid-stat -q |grep celeryev
celeryev.223a4cfb-e1bd-4f6e-b146-0198d295e33a Y 20.4k 86.0k 65.5k 18.0m 75.5m 57.6m 1 2
[root@ibm-x3550m3-12 ~]# journalctl -f -l
-- Logs begin at Mon 2016-04-04 21:51:34 CEST. --
Apr 05 13:55:02 ibm-x3550m3-12.lab.eng.brq.redhat.com pulp[32000]: pulp.server.async.scheduler:ERROR: There are 0 pulp_resource_manager processes running. Pulp will not operate correctly without at least one pulp_resource_mananger process running.
Apr 05 13:55:02 ibm-x3550m3-12.lab.eng.brq.redhat.com pulp[32000]: pulp.server.async.scheduler:ERROR: There are 0 pulp_celerybeat processes running. Pulp will not operate correctly without at least one pulp_celerybeat process running.
</code></pre>
https://pulp.plan.io/issues/1801?journal_id=10375 2016-04-05T13:21:07Z bmbouter bmbouter@redhat.com
<ul><li><strong>Status</strong> changed from <i>POST</i> to <i>MODIFIED</i></li><li><strong>% Done</strong> changed from <i>0</i> to <i>100</i></li></ul><p>Applied in changeset <a class="changeset" title="Adds patch to python-kombu to fix pulp_celerybeat deadlock closes #1801 https://pulp.plan.io/iss..." href="https://pulp.plan.io/projects/pulp/repository/pulp/revisions/c54adba554d157a3ad6fef9ba11d2c0e01595ac7">pulp|c54adba554d157a3ad6fef9ba11d2c0e01595ac7</a>.</p>
https://pulp.plan.io/issues/1801?journal_id=10401 2016-04-06T12:23:30Z pthomas@redhat.com
<ul></ul><p>Verified that msgIn and msgOut are the same and that msgOut no longer stops after 65k.</p>
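<p>The msgIn and msgOut comparison can also be scripted. This is a minimal sketch, assuming that every queue row printed by qpid-stat -q ends with the eight columns msg, msgIn, msgOut, bytes, bytesIn, bytesOut, cons and bind, as in the listings in this issue. The helper names parse_queue_row and is_drained are hypothetical, and the two sample rows are copied from this issue. The displayed counters are rounded, so equal strings mean only that the counts match to the displayed precision.</p>

```python
# Names of the trailing columns of a `qpid-stat -q` queue row (assumed layout).
TRAILING = ["msg", "msgIn", "msgOut", "bytes", "bytesIn", "bytesOut", "cons", "bind"]


def parse_queue_row(line):
    """Return the queue name and the eight trailing columns as strings."""
    tokens = line.split()
    if len(tokens) < len(TRAILING) + 1:
        raise ValueError("not a queue row: %r" % line)
    # Flag columns (dur, autoDel, excl) may be blank, so count from the right.
    row = dict(zip(TRAILING, tokens[-len(TRAILING):]))
    row["queue"] = tokens[0]
    return row


def is_drained(row):
    """True when the displayed msgIn and msgOut counters are identical."""
    return row["msgIn"] == row["msgOut"]


before = parse_queue_row(
    "celeryev.223a4cfb-e1bd-4f6e-b146-0198d295e33a Y 20.4k 86.0k 65.5k 18.0m 75.5m 57.6m 1 2"
)
after = parse_queue_row(
    "celeryev.9631492f-a29e-4bdc-b843-23911d505f2d Y 0 145k 145k 0 128m 128m 1 2"
)
print(is_drained(before), is_drained(after))  # prints: False True
```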
<pre><code>[root@pulp-el7 ~]# rpm -qa |grep kombu
python-kombu-3.0.33-5.pulp.el7.noarch
[root@pulp-el7 ~]# sudo qpid-stat -q |grep celeryev
Queues
queue dur autoDel excl msg msgIn msgOut bytes bytesIn bytesOut cons bind
=========================================================================================================================================================
celeryev.9631492f-a29e-4bdc-b843-23911d505f2d Y 0 145k 145k 0 128m 128m 1 2
[root@pulp-el6 ~]# rpm -qa |grep kombu
python-kombu-3.0.33-5.pulp.el6.noarch
[root@pulp-el6 ~]#
Queues
queue dur autoDel excl msg msgIn msgOut bytes bytesIn bytesOut cons bind
=========================================================================================================================================================
celeryev.0caa15b8-8829-441f-8ed2-231cd34a94dd Y 0 156k 156k 0 142m 142m 1 2
</code></pre>
https://pulp.plan.io/issues/1801?journal_id=10408 2016-04-06T17:12:56Z bmbouter bmbouter@redhat.com
<ul></ul><p>The patch has been applied in Rawhide and is now available.<br>
I've also submitted an update for F24: <a href="https://bodhi.fedoraproject.org/updates/FEDORA-2016-ec038bbf19" class="external">https://bodhi.fedoraproject.org/updates/FEDORA-2016-ec038bbf19</a></p>
https://pulp.plan.io/issues/1801?journal_id=10433 2016-04-06T20:24:29Z semyers sean.myers@redhat.com
<ul><li><strong>Platform Release</strong> changed from <i>2.8.2</i> to <i>2.8.3</i></li></ul>
https://pulp.plan.io/issues/1801?journal_id=11031 2016-04-26T22:38:48Z semyers sean.myers@redhat.com
<ul><li><strong>Status</strong> changed from <i>MODIFIED</i> to <i>5</i></li></ul>
https://pulp.plan.io/issues/1801?journal_id=11200 2016-05-03T13:28:56Z bmbouter bmbouter@redhat.com
<ul></ul><p>pulp-list e-mail about the issue: <a href="https://www.redhat.com/archives/pulp-list/2016-April/msg00020.html" class="external">https://www.redhat.com/archives/pulp-list/2016-April/msg00020.html</a></p>
https://pulp.plan.io/issues/1801?journal_id=11360 2016-05-06T16:38:03Z pthomas@redhat.com
<ul><li><strong>Status</strong> changed from <i>5</i> to <i>6</i></li></ul><p>verified</p>
<p><a href="https://pulp.plan.io/issues/1801#note-16" class="external">https://pulp.plan.io/issues/1801#note-16</a></p>
https://pulp.plan.io/issues/1801?journal_id=11535 2016-05-17T19:31:44Z semyers sean.myers@redhat.com
<ul><li><strong>Status</strong> changed from <i>6</i> to <i>CLOSED - CURRENTRELEASE</i></li></ul>
https://pulp.plan.io/issues/1801?journal_id=25216 2018-03-08T18:21:04Z bmbouter bmbouter@redhat.com
<ul><li><strong>Sprint</strong> set to <i>Sprint 1</i></li></ul>
https://pulp.plan.io/issues/1801?journal_id=25241 2018-03-08T18:21:47Z bmbouter bmbouter@redhat.com
<ul><li><strong>Sprint/Milestone</strong> deleted (<del><i>19</i></del>)</li></ul>
https://pulp.plan.io/issues/1801?journal_id=39030 2019-04-15T20:32:47Z bmbouter bmbouter@redhat.com
<ul><li><strong>Tags</strong> <i>Pulp 2</i> added</li></ul>