Issue #2124

Pulp status check leaves open Pipes

Added by jsherril@redhat.com about 5 years ago. Updated over 1 year ago.

Status:
CLOSED - CURRENTRELEASE
Priority:
Urgent
Assignee:
Category:
-
Sprint/Milestone:
-
Start date:
Due date:
Estimated time:
Severity:
2. Medium
Version:
2.8.0
Platform Release:
2.8.7
OS:
Triaged:
No
Groomed:
No
Sprint Candidate:
No
Tags:
Pulp 2
Sprint:
Sprint 6
Quarter:

Description

First, monitor the number of open files for a wsgi:pulp process; here we just pick the first one:

watch "lsof | grep `ps aux | grep wsgi:pulp\) | grep -v grep | awk '{print $2}' | head -n 1` | wc -l"

Then hit the status api 1000 times:

for i in `seq 1000`; do curl https://`hostname`/pulp/api/v2/status/ &> /dev/null; done

You should see that number increase steadily.
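
As an alternative to the lsof pipeline above, the open descriptors of a single process can be counted on Linux by listing /proc/<pid>/fd. This is a minimal sketch, assuming a Linux /proc filesystem and permission to read the target process's entry; count_open_fds is a hypothetical helper name, not part of Pulp:

```python
import os


def count_open_fds(pid):
    """Return the number of open file descriptors of `pid` (Linux only).

    Each entry in /proc/<pid>/fd is one open descriptor.
    """
    return len(os.listdir("/proc/%d/fd" % pid))


if __name__ == "__main__":
    # Demonstrate on the current process; substitute the wsgi:pulp pid
    # when checking for the leak described in this issue.
    print(count_open_fds(os.getpid()))
```

The absolute number will differ from the lsof count, which also lists entries that are not file descriptors (such as memory-mapped files), but a leak should show up as steady growth in either measurement.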

This was tested on RHEL 6 running:

libqpid-dispatch-0.4-13.el6sat.x86_64
pulp-admin-client-2.8.3.3-1.el6sat.noarch
pulp-client-1.0-1.noarch
pulp-docker-plugins-2.0.1.1-1.el6sat.noarch
pulp-katello-1.0.1-1.el6sat.noarch
pulp-puppet-plugins-2.8.3.3-1.el6sat.noarch
pulp-puppet-tools-2.8.3.3-1.el6sat.noarch
pulp-rpm-admin-extensions-2.8.3.5-1.el6sat.noarch
pulp-rpm-plugins-2.8.3.5-1.el6sat.noarch
pulp-selinux-2.8.3.3-1.el6sat.noarch
pulp-server-2.8.3.3-1.el6sat.noarch
python-gofer-qpid-2.7.6-1.el6sat.noarch
python-pulp-bindings-2.8.3.3-1.el6sat.noarch
python-pulp-client-lib-2.8.3.3-1.el6sat.noarch
python-pulp-common-2.8.3.3-1.el6sat.noarch
python-pulp-docker-common-2.0.1.1-1.el6sat.noarch
python-pulp-oid_validation-2.8.3.3-1.el6sat.noarch
python-pulp-puppet-common-2.8.3.3-1.el6sat.noarch
python-pulp-repoauth-2.8.3.3-1.el6sat.noarch
python-pulp-rpm-common-2.8.3.5-1.el6sat.noarch
python-pulp-streamer-2.8.3.3-1.el6sat.noarch
python-qpid-0.30-9.el6sat.noarch
python-qpid-qmf-0.30-5.el6.x86_64
qpid-cpp-client-0.30-11.el6.x86_64
qpid-cpp-client-devel-0.30-11.el6.x86_64
qpid-cpp-server-0.30-11.el6.x86_64
qpid-cpp-server-linearstore-0.30-11.el6.x86_64
qpid-dispatch-router-0.4-13.el6sat.x86_64
qpid-proton-c-0.9-16.el6.x86_64
qpid-qmf-0.30-5.el6.x86_64
qpid-tools-0.30-4.el6.noarch

Associated revisions

Revision af30b8e7 View on GitHub
Added by bmbouter about 5 years ago

Fixes Qpid file descriptor leak

This fix is already in upstream Kombu so this ports the important part of the fix to downstream

This commit bumps python-kombu to 3.0.33-6

https://pulp.plan.io/issues/2124 closes #2124

Revision 3a2cb8df View on GitHub
Added by bmbouter about 5 years ago

Updates python-kombu in external_deps.json

For Pulp builds to include the 3.0.33-6 version of python-kombu this file needs to have the new version named.

https://pulp.plan.io/issues/2124 re #2124

History

#1 Updated by jsherril@redhat.com about 5 years ago

In addition, this seems to sometimes cause a segfault:

Aug  2 16:16:26 sat-rhel6 qpidd[14602]: 2016-08-02 16:16:26 [System] error Error reading socket: Success(0)
Aug  2 16:16:26 sat-rhel6 kernel: httpd[15386]: segfault at 28 ip 00007f3f3154411b sp 00007f3f097facc0 error 4 in libapr-1.so.0.3.9[7f3f31525000+2b000]

#2 Updated by bmbouter about 5 years ago

  • Description updated (diff)

#4 Updated by bmbouter about 5 years ago

I reproduced this easily on my f24 machine. I did not see the segfaults, but I can confirm that wsgi processes become unusable: requests are handled by fewer and fewer pids over time. The pids that have run out of file descriptors aren't dead, but they do show this in the logs:

Aug 02 17:42:28 dev audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=systemd-hostnamed comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Aug 02 17:42:28 dev audit[11606]: AVC avc:  denied  { getattr } for  pid=11606 comm="httpd" path="/home/vagrant/devel/pulp/server/usr/share/pulp/wsgi/webservices.wsgi" dev="0:43" ino=8653255 scontext=system_u:system_r:httpd_t:s0 tcontext=system_u:object_r:nfs_t:s0 tclass=file permissive=1
Aug 02 17:42:28 dev audit[17011]: AVC avc:  denied  { name_connect } for  pid=17011 comm="httpd" dest=5672 scontext=system_u:system_r:httpd_t:s0 tcontext=system_u:object_r:amqp_port_t:s0 tclass=tcp_socket permissive=1

#6 Updated by bmbouter about 5 years ago

  • Status changed from NEW to ASSIGNED
  • Assignee set to bmbouter
  • Priority changed from Normal to High
  • Sprint/Milestone set to 24

Moving onto Sprint 6 and upgrading Prio to High.

#8 Updated by bmbouter about 5 years ago

  • Priority changed from High to Urgent

Raising priority to URGENT per convo w/ smyers and jalberts

#9 Updated by bmbouter about 5 years ago

The root cause is that even though the kombu.transport.qpid.Transport [0] object implements __del__ [1] to release the object's file descriptors, __del__ is never called. The kombu.Connection object is closed/released and it removes the reference from kombu.Transport to kombu.transport.qpid.Transport. The file descriptors are stored as plain integers, so if __del__ is never called they are never closed and the file descriptor leak occurs.

I've been discussing the issue with @asksol in #celery, but for now I have a working workaround by calling __del__ explicitly. I'm putting a PR in for this while we figure out what to do in upstream Kombu.

[0]: https://github.com/celery/kombu/blob/f3b0fe1d8f0c6d2a61b1457f649d370455fa13fc/kombu/transport/qpid.py#L1360
[1]: https://github.com/celery/kombu/blob/f3b0fe1d8f0c6d2a61b1457f649d370455fa13fc/kombu/transport/qpid.py#L1732-L1740
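
The general mechanism can be illustrated outside Kombu. The sketch below is not Kombu's code: PipeHolder is a hypothetical stand-in for an object that stores pipe descriptors as plain integers and relies on __del__ to close them. It assumes, purely for illustration, that the finalizer is held back by a lingering reference; this is not a claim about why __del__ is skipped in Kombu. Releasing the descriptors explicitly avoids depending on the finalizer at all.

```python
import os


class PipeHolder(object):
    """Hypothetical stand-in: keeps pipe fds as plain integers."""

    def __init__(self):
        self.r, self.w = os.pipe()
        self.closed = False

    def close(self):
        # Explicit, idempotent release of both descriptors.
        if not self.closed:
            os.close(self.r)
            os.close(self.w)
            self.closed = True

    def __del__(self):
        # Only runs if the object is actually finalized.
        self.close()


def fd_is_open(fd):
    """Return True if `fd` refers to an open descriptor in this process."""
    try:
        os.fstat(fd)
    except OSError:
        return False
    return True


registry = []            # simulates a reference that is never dropped
holder = PipeHolder()
registry.append(holder)
r, w = holder.r, holder.w

del holder               # __del__ does not run: registry still holds the object
leaked = fd_is_open(r) and fd_is_open(w)

registry[0].close()      # explicit release, independent of __del__
released = not fd_is_open(r) and not fd_is_open(w)
```

Because the integers themselves carry no ownership, nothing closes the pipes until close() (or the finalizer) runs, which is why each skipped finalizer costs the process two descriptors in this sketch.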

#10 Updated by bmbouter about 5 years ago

  • Status changed from ASSIGNED to POST
  • Platform Release set to 2.8.7

#11 Updated by bmbouter about 5 years ago

  • Status changed from POST to MODIFIED
  • % Done changed from 0 to 100

#12 Updated by semyers about 5 years ago

  • Status changed from MODIFIED to 5

#13 Updated by mhrivnak about 5 years ago

Here is the PR that was merged:

https://github.com/pulp/pulp/pull/2679

#14 Updated by pthomas@redhat.com about 5 years ago

  • Status changed from 5 to 6

Verified

Followed the verification steps from the description.

[root@ibm-x3550m3-10 ~]# rpm -qa |grep pulp
pulp-admin-client-2.8.7-0.3.beta.el7.noarch
python-kombu-3.0.33-6.pulp.el7.noarch
pulp-rpm-plugins-2.8.7-0.2.beta.el7.noarch
python-pulp-bindings-2.8.7-0.3.beta.el7.noarch
python-pulp-rpm-common-2.8.7-0.2.beta.el7.noarch
pulp-puppet-plugins-2.8.7-0.2.beta.el7.noarch
pulp-docker-admin-extensions-2.0.3-1.el7.noarch
pulp-ostree-plugins-1.1.3-1.el7.noarch
pulp-docker-plugins-2.0.3-1.el7.noarch
pulp-rpm-admin-extensions-2.8.7-0.2.beta.el7.noarch
pulp-python-admin-extensions-1.1.3-1.el7.noarch
python-pulp-common-2.8.7-0.3.beta.el7.noarch
python-pulp-docker-common-2.0.3-1.el7.noarch
pulp-puppet-admin-extensions-2.8.7-0.2.beta.el7.noarch
python-pulp-python-common-1.1.3-1.el7.noarch
pulp-selinux-2.8.7-0.3.beta.el7.noarch
python-pulp-streamer-2.8.7-0.3.beta.el7.noarch
python-pulp-repoauth-2.8.7-0.3.beta.el7.noarch
pulp-server-2.8.7-0.3.beta.el7.noarch
python-pulp-oid_validation-2.8.7-0.3.beta.el7.noarch
python-pulp-puppet-common-2.8.7-0.2.beta.el7.noarch
python-pulp-ostree-common-1.1.3-1.el7.noarch
pulp-ostree-admin-extensions-1.1.3-1.el7.noarch
python-isodate-0.5.0-4.pulp.el7.noarch
pulp-python-plugins-1.1.3-1.el7.noarch
python-pulp-client-lib-2.8.7-0.3.beta.el7.noarch
[root@ibm-x3550m3-10 ~]# 

#15 Updated by semyers about 5 years ago

  • Status changed from 6 to CLOSED - CURRENTRELEASE

#20 Updated by bmbouter over 3 years ago

  • Sprint set to Sprint 6

#21 Updated by bmbouter over 3 years ago

  • Sprint/Milestone deleted (24)

#22 Updated by bmbouter over 2 years ago

  • Tags Pulp 2 added

#23 Updated by bmbouter over 1 year ago

  • Category deleted (14)

We are removing the 'API' category per open floor discussion June 16, 2020.
