Issue #1788
closedThe pulp_streamer is subject to requests issue #2863
Description
Here is a broad overview of the issue [0] [1] in question:
When the streamer downloads a file, it does so with the Python `requests` library. The `requests` library provides connection pooling by making use of the `urllib3` library. This library has a bug. The bug is that each connection has TLS context associated with it, and the way the connections are pooled does not take this into account. To fully understand how and why this is the case I recommend reading the `urllib3` source, specifically the `PoolManager` class.
The correct place to fix this is in the `urllib3` and `requests` libraries. There is a pull request[2] open on `urllib3` which introduces a way to consider the TLS context of a connection when determining the pool it is placed in. I believe once this is merged, a pull request will need to be made against `requests` that makes use of this new feature. Once that is done, these patches must be backported to the EL6 and EL7 packages. `python-urllib3` is in the base OS repository so, from what I've been told, we shouldn't package a version ourselves. An errata would need to be issued.
The above solution will not be trivial. It has been suggested that we attempt to work around the issue in the streamer until such time as the fixes land in `urllib3`, `requests`, and the downstream packages of those projects. The following is a list of suggested solutions:
- Stop connection pooling in the streamer. The problem with this is that we will create a large number of TCP connections with the upstream servers (one per file). It is likely the servers will throttle us by either returning HTTP 503s or refusing the TCP connection. This will work slightly better than not at all, but it is not something that should be used in production (in my opinion).
- Stop using Nectar in the streamer. Nectar is a wrapper around `requests`, so if we switched to a different library we might avoid this problem. The only other serious candidate is the Twisted web client[3]. This has two downsides. The first is that we would no longer support alternate content sources with lazy sync. The second is that there may be problems with the Twisted web client we are currently unaware of and we will find ourselves in a similar situation after doing this. We also have a large range of versions to support (8 through 15) whereas the version of `requests` in RHEL is quite new due to a recent errata (2.6).
- Try to manage a 'pool' of sessions in the streamer. I investigated this and decided it would be very difficult to get working at all, and if we did manage to get it functional it would take much longer. It would also be very risky. The reason for this is that we need to use locking because all this is happening inside of Twisted, and when this is combined with the need to clean up our 'pool' it is difficult to avoid either deadlocking or killing connections suddenly.
Most of these have a significant amount of work associated with them.
[0] https://github.com/kennethreitz/requests/issues/3058
[1] https://github.com/kennethreitz/requests/issues/2863
[2] https://github.com/shazow/urllib3/pull/751
[3] https://twistedmatrix.com/documents/current/web/howto/client.html
Updated by jcline@redhat.com about 7 years ago
- Status changed from NEW to ASSIGNED
- Assignee set to jcline@redhat.com
- Priority changed from Normal to High
- Severity changed from 2. Medium to 3. High
- Version set to Master
- Platform Release set to 2.8.1
- Triaged changed from No to Yes
Updated by semyers about 7 years ago
- Platform Release changed from 2.8.1 to 2.8.2
Updated by semyers almost 7 years ago
- Platform Release changed from 2.8.2 to 2.8.3
Updated by Ichimonji10 almost 7 years ago
Are there steps for reproducing this issue as an end user?
Updated by jcline@redhat.com almost 7 years ago
As a Pulp end user?
1. Set up two repositories, each with an entitlement certificate that is not entitled to the other repository's content.
2. Lazy sync both.
3. Request a package from one repository that isn't in the other
4. Request a package from the other repository that isn't in the first repository.
That should do it.
Updated by jcline@redhat.com almost 7 years ago
The fix for urllib3 has been merged[0]. Issues have been filed against EL6 and EL7 to backport the patch[1][2]. However, changes need to be made to requests to use this new functionality. A patch has been proposed[3], but unfortunately it breaks the API for requests so it will not be accepted in the 2.Y series (unless somehow it can be done without breaking the API - I'm not sure that it can).
We can work around this issue by sub-classing the HTTPAdapter in the 2.Y series and overriding a few methods. This will mean that if fixes are introduced in those various methods we won't pick them up, but we can keep an eye on what upstream is up to and since 3.0 isn't too far away (it seems) we shouldn't have to do this for long. We also have to make sure our adapter is compatible with 2.6 through 2.9+ to work with RHEL6 through Fedora 2[3-9].
One last thing to note - for this fix to be in Fedora, it must have urllib3-1.16 or greater. That may not happen too quickly, since they unbundle the requests library and probably want to align the urllib3 version to the one the version of requests they are shipping uses. All that is to say, this probably won't land in Fedora until requests pulls in urllib3-1.16 (which has to be released before that even happens).
[0] https://github.com/shazow/urllib3/pull/830
[1] https://bugzilla.redhat.com/show_bug.cgi?id=1329382
[2] https://bugzilla.redhat.com/show_bug.cgi?id=1329395
[3] https://github.com/kennethreitz/requests/pull/3109
Updated by semyers almost 7 years ago
- Platform Release changed from 2.8.3 to 2.8.4
Updated by semyers almost 7 years ago
- Platform Release changed from 2.8.4 to 2.8.5
Updated by semyers almost 7 years ago
- Platform Release changed from 2.8.5 to 2.8.6
Updated by jcline@redhat.com over 6 years ago
- Status changed from ASSIGNED to POST
Updated by semyers over 6 years ago
- Priority changed from High to Urgent
- Platform Release set to 2.8.7
Blocking 2.8.7 beta
Added by Jeremy Cline over 6 years ago
Added by Jeremy Cline over 6 years ago
Backport proposed requests HTTPAdapter
This contains the changes to the default HTTP adapter for the requests library proposed in GitHub pull request #3109. Note that at this time, that pull request has not been accepted and is subject to change. In addition to the modified adapter, the Pulp streamer (and only the Pulp streamer) has been changed to use this adapter.
Note this fix only works if urllib-1.16+ is used in conjunction with requests.
closes #1788
Updated by bmbouter over 6 years ago
This was rebased back to 2.8-dev so the PR which will introduce the commit is: https://github.com/pulp/pulp/pull/2667
Updated by Anonymous over 6 years ago
- Status changed from POST to MODIFIED
- % Done changed from 0 to 100
Applied in changeset pulp|9db514741c2cda126c049505f98ddd3b4f2b895e.
Updated by bmbouter over 6 years ago
It would be great to have QE manually verify this issue.
Updated by jcline@redhat.com over 6 years ago
Updated by jcline@redhat.com over 6 years ago
- Status changed from ASSIGNED to POST
Updated by semyers over 6 years ago
- Status changed from POST to MODIFIED
This issue should be MODIFIED, https://github.com/pulp/pulp/pull/2696 is merged to 2.8-dev.
Updated by pthomas@redhat.com over 6 years ago
- Status changed from 5 to 6
verified
yum install python-urllib3-1.10.2-2.katello.el7.noarch.rpm
Loaded plugins: product-id, search-disabled-repos, subscription-manager
Examining python-urllib3-1.10.2-2.katello.el7.noarch.rpm: python-urllib3-1.10.2-2.katello.el7.noarch
Marking python-urllib3-1.10.2-2.katello.el7.noarch.rpm as an update to python-urllib3-1.10.2-2.el7_1.noarch
Resolving Dependencies
--> Running transaction check
---> Package python-urllib3.noarch 0:1.10.2-2.el7_1 will be updated
---> Package python-urllib3.noarch 0:1.10.2-2.katello.el7 will be an update
--> Finished Dependency Resolution
Dependencies Resolved
===========================================================================================================
Package Arch Version Repository Size
===========================================================================================================
Updating:
python-urllib3 noarch 1.10.2-2.katello.el7 /python-urllib3-1.10.2-2.katello.el7.noarch 377 k
Transaction Summary
===========================================================================================================
Upgrade 1 Package
Total size: 377 k
Is this ok [y/d/N]: y
Downloading packages:
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
Updating : python-urllib3-1.10.2-2.katello.el7.noarch 1/2
Cleanup : python-urllib3-1.10.2-2.el7_1.noarch 2/2
Verifying : python-urllib3-1.10.2-2.katello.el7.noarch 1/2
Verifying : python-urllib3-1.10.2-2.el7_1.noarch 2/2
Updated:
python-urllib3.noarch 0:1.10.2-2.katello.el7
Complete!
# pulp-admin rpm repo create --repo-id capsule --feed https://cdn.redhat.com/content/dist/rhel/server/6/6Server/x86_64/sat-capsule/6.1/os --download-policy on_demand --feed-ca-cert cdn.redhat.com-chain.crt --feed-cert 5200954048155109175.pem --feed-key 5200954048155109175-key.pem
Successfully created repository [capsule]
pulp-admin rpm repo create --repo-id jws --feed https://cdn.redhat.com/content/dist/rhel/server/7/7.2/x86_64/jws/3/os --download-policy on_demand --feed-ca-cert cdn.redhat.com-chain.crt --feed-cert 7560728174757006536.pem --feed-key 7560728174757006536-key.pem
Successfully created repository [jws]
# pulp-admin rpm repo sync run --repo-id jws
+----------------------------------------------------------------------+
Synchronizing Repository [jws]
+----------------------------------------------------------------------+
This command may be exited via ctrl+c without affecting the request.
Downloading metadata...
[|]
... completed
Downloading repository content...
[-]
[==================================================] 100%
RPMs: 156/156 items
Delta RPMs: 0/0 items
... completed
Downloading distribution files...
[==================================================] 100%
Distributions: 0/0 items
... completed
Importing errata...
[\]
... completed
Importing package groups/categories...
[-]
... completed
Cleaning duplicate packages...
[-]
... completed
Task Succeeded
Initializing repo metadata
[-]
... completed
Publishing Distribution files
[-]
... completed
Publishing RPMs
[==================================================] 100%
156 of 156 items
... completed
Publishing Delta RPMs
... skipped
Publishing Errata
[==================================================] 100%
3 of 3 items
... completed
Publishing Comps file
[==================================================] 100%
2 of 2 items
... completed
Publishing Metadata.
[-]
... completed
Closing repo metadata
[-]
... completed
Generating sqlite files
... skipped
Publishing files to web
[-]
... completed
Writing Listings File
[-]
... completed
Task Succeeded
[root@qe-blade-07 ~]# pulp-admin rpm repo sync run --repo-id capsule
+----------------------------------------------------------------------+
Synchronizing Repository [capsule]
+----------------------------------------------------------------------+
This command may be exited via ctrl+c without affecting the request.
Downloading metadata...
[\]
... completed
Downloading repository content...
[-]
[==================================================] 100%
RPMs: 287/287 items
Delta RPMs: 0/0 items
... completed
Downloading distribution files...
[==================================================] 100%
Distributions: 0/0 items
... completed
Importing errata...
[\]
... completed
Importing package groups/categories...
[-]
... completed
Cleaning duplicate packages...
[-]
... completed
Task Succeeded
Initializing repo metadata
[-]
... completed
Publishing Distribution files
[-]
... completed
Publishing RPMs
[==================================================] 100%
287 of 287 items
... completed
Publishing Delta RPMs
... skipped
Publishing Errata
[==================================================] 100%
12 of 12 items
... completed
Publishing Comps file
[-]
... completed
Publishing Metadata.
[-]
... completed
Closing repo metadata
[-]
... completed
Generating sqlite files
... skipped
Publishing files to web
[-]
... completed
Writing Listings File
[-]
... completed
Task Succeeded
[root@qe-blade-07 ~]# pulp-admin rpm repo list
+----------------------------------------------------------------------+
RPM Repositories
+----------------------------------------------------------------------+
Id: capsule
Display Name: None
Description: None
Content Unit Counts:
Erratum: 12
Package Group: 1
Rpm: 287
Yum Repo Metadata File: 1
Id: jws
Display Name: None
Description: None
Content Unit Counts:
Erratum: 3
Package Group: 2
Rpm: 156
Yum Repo Metadata File: 1
[root@qe-blade-07 ~]# systemctl start goferd
[root@qe-blade-07 ~]#
[root@qe-blade-07 ~]#
[root@qe-blade-07 ~]#
[root@qe-blade-07 ~]# systemctl restart pulp_streamer
[root@qe-blade-07 ~]#
[root@qe-blade-07 ~]#
[root@qe-blade-07 ~]#
[root@qe-blade-07 ~]# pulp-consumer -u admin -p admin register --consumer-id goldfish
Consumer [goldfish] successfully registered
[root@qe-blade-07 ~]# pulp-admin rpm consumer bind --repo-id capsule --consumer-id goldfish
This command may be exited via ctrl+c without affecting the request.
[\]
Running...
Task Succeeded
[root@qe-blade-07 ~]# pulp-admin rpm consumer bind --repo-id jws --consumer-id goldfish
This command may be exited via ctrl+c without affecting the request.
[\]
Running...
Task Succeeded
[root@qe-blade-07 ~]# pulp-admin rpm consumer package install run --consumer-id goldfish -n foreman-debug
Install task created with id [ 02f875eb-541b-4aab-bd3c-d6610739c693 ]
This command may be exited via ctrl+c without affecting the request.
Refresh Repository Metadata [ OK ]
Downloading Packages [ OK ]
Check Package Signatures [ OK ]
Running Test Transaction [ OK ]
Running Transaction [ OK ]
Install Succeeded
+----------------------------------------------------------------------+
Installed
+----------------------------------------------------------------------+
Name: foreman-debug
Version: 1.7.2.56
Arch: noarch
Repoid: capsule
[root@qe-blade-07 ~]# pulp-admin rpm consumer package install run --consumer-id goldfish -n mod_cluster-demo
Install task created with id [ 2cab5daa-bf06-4dfd-9e65-2f5bd0c7465d ]
This command may be exited via ctrl+c without affecting the request.
Refresh Repository Metadata [ OK ]
Downloading Packages [ OK ]
Check Package Signatures [ OK ]
Running Test Transaction [ OK ]
Running Transaction [ OK ]
Install Succeeded
+----------------------------------------------------------------------+
Installed
+----------------------------------------------------------------------+
Name: mod_cluster
Version: 1.3.1
Arch: noarch
Repoid: jws
+----------------------------------------------------------------------+
Installed for Dependencies
+----------------------------------------------------------------------+
Name: python-javapackages
Version: 3.4.1
Arch: noarch
Repoid: beaker-Server
Name: javapackages-tools
Version: 3.4.1
Arch: noarch
Repoid: beaker-Server
Updated by semyers over 6 years ago
- Status changed from 6 to CLOSED - CURRENTRELEASE
Backport proposed requests HTTPAdapter
This contains the changes to the default HTTP adapter for the requests library proposed in GitHub pull request #3109. Note that at this time, that pull request has not been accepted and is subject to change. In addition to the modified adapter, the Pulp streamer (and only the Pulp streamer) has been changed to use this adapter.
Note this fix only works if urllib-1.16+ is used in conjunction with requests.
closes #1788