Project

Profile

Help

Issue #1788

The pulp_streamer is subject to requests issue #2863

Added by jcline@redhat.com over 4 years ago. Updated over 1 year ago.

Status:
CLOSED - CURRENTRELEASE
Priority:
Urgent
Category:
-
Sprint/Milestone:
-
Start date:
Due date:
Estimated time:
Severity:
3. High
Version:
Master
Platform Release:
2.8.7
OS:
Triaged:
Yes
Groomed:
No
Sprint Candidate:
No
Tags:
Pulp 2
Sprint:
Sprint 5
Quarter:

Description

Here is a broad overview of the issue [0] [1] in question:

When the streamer downloads a file, it does so with the Python `requests` library. The `requests` library provides connection pooling by making use of the `urllib3` library. This library has a bug. The bug is that each connection has TLS context associated with it, and the way the connections are pooled does not take this into account. To fully understand how and why this is the case I recommend reading the `urllib3` source, specifically the `PoolManager` class.

The correct place to fix this is in the `urllib3` and `requests` libraries. There is a pull request[2] open on `urllib3` which introduces a way to consider the TLS context of a connection when determining the pool it is placed in. I believe once this is merged, a pull request will need to be made against `requests` that makes use of this new feature. Once that is done, these patches must be backported to the EL6 and EL7 packages. `python-urllib3` is in the base OS repository so, from what I've been told, we shouldn't package a version ourselves. An errata would need to be issued.

The above solution will not be trivial. It has been suggested that we attempt to work around the issue in the streamer until such time as the fixes land in `urllib3`, `requests`, and the downstream packages of those projects. The following is a list of suggested solutions:

  • Stop connection pooling in the streamer. The problem with this is that we will create a large number of TCP connections with the upstream servers (one per file). It is likely the servers will throttle us by either returning HTTP 503s or refusing the TCP connection. This will work slightly better than not at all, but it is not something that should be used in production (in my opinion).
  • Stop using Nectar in the streamer. Nectar is a wrapper around `requests`, so if we switched to a different library we might avoid this problem. The only other serious candidate is the Twisted web client[3]. This has two downsides. The first is that we would no longer support alternate content sources with lazy sync. The second is that there may be problems with the Twisted web client we are currently unaware of and we will find ourselves in a similar situation after doing this. We also have a large range of versions to support (8 through 15) whereas the version of `requests` in RHEL is quite new due to a recent errata (2.6).
  • Try to manage a 'pool' of sessions in the streamer. I investigated this and decided it would be very difficult to get working at all, and if we did manage to get it functional it would take much longer. It would also be very risky. The reason for this is that we need to use locking because all this is happening inside of Twisted, and when this is combined with the need to clean up our 'pool' it is difficult to avoid either deadlocking or killing connections suddenly.

Most of these have a significant amount of work associated with them.

[0] https://github.com/kennethreitz/requests/issues/3058
[1] https://github.com/kennethreitz/requests/issues/2863
[2] https://github.com/shazow/urllib3/pull/751
[3] https://twistedmatrix.com/documents/current/web/howto/client.html

Associated revisions

Revision 9db51474 View on GitHub
Added by Jeremy Cline about 4 years ago

Backport proposed requests HTTPAdapter

This contains the changes to the default HTTP adapter for the requests library proposed in GitHub pull request #3109. Note that at this time, that pull request has not been accepted and is subject to change. In addition to the modified adapter, the Pulp streamer (and only the Pulp streamer) has been changed to use this adapter.

Note this fix only works if urllib-1.16+ is used in conjunction with requests.

closes #1788

Revision 9db51474 View on GitHub
Added by Jeremy Cline about 4 years ago

Backport proposed requests HTTPAdapter

This contains the changes to the default HTTP adapter for the requests library proposed in GitHub pull request #3109. Note that at this time, that pull request has not been accepted and is subject to change. In addition to the modified adapter, the Pulp streamer (and only the Pulp streamer) has been changed to use this adapter.

Note this fix only works if urllib-1.16+ is used in conjunction with requests.

closes #1788

History

#1 Updated by jcline@redhat.com over 4 years ago

  • Status changed from NEW to ASSIGNED
  • Assignee set to jcline@redhat.com
  • Priority changed from Normal to High
  • Severity changed from 2. Medium to 3. High
  • Version set to Master
  • Platform Release set to 2.8.1
  • Triaged changed from No to Yes

#2 Updated by jcline@redhat.com over 4 years ago

  • Description updated (diff)

#3 Updated by semyers over 4 years ago

  • Platform Release changed from 2.8.1 to 2.8.2

#4 Updated by mhrivnak over 4 years ago

  • Sprint/Milestone set to 19

#5 Updated by semyers over 4 years ago

  • Platform Release changed from 2.8.2 to 2.8.3

#6 Updated by Ichimonji10 over 4 years ago

Are there steps for reproducing this issue as an end user?

#7 Updated by jcline@redhat.com over 4 years ago

As a Pulp end user?

1. Set up two repositories, each with an entitlement certificate that is not entitled to the other repository's content.
2. Lazy sync both.
3. Request a package from one repository that isn't in the other
4. Request a package from the other repository that isn't in the first repository.

That should do it.

#8 Updated by jcline@redhat.com over 4 years ago

The fix for urllib3 has been merged[0]. Issues have been filed against EL6 and EL7 to backport the patch[1][2]. However, changes need to be made to requests to use this new functionality. A patch has been proposed[3], but unfortunately it breaks the API for requests so it will not be accepted in the 2.Y series (unless somehow it can be done without breaking the API - I'm not sure that it can).

We can work around this issue by sub-classing the HTTPAdapter in the 2.Y series and overriding a few methods. This will mean that if fixes are introduced in those various methods we won't pick them up, but we can keep an eye on what upstream is up to and since 3.0 isn't too far away (it seems) we shouldn't have to do this for long. We also have to make sure our adapter is compatible with 2.6 through 2.9+ to work with RHEL6 through Fedora 2[3-9].

One last thing to note - for this fix to be in Fedora, it must have urllib3-1.16 or greater. That may not happen too quickly, since they unbundle the requests library and probably want to align the urllib3 version to the one the version of requests they are shipping uses. All that is to say, this probably won't land in Fedora until requests pulls in urllib3-1.16 (which has to be released before that even happens).

[0] https://github.com/shazow/urllib3/pull/830
[1] https://bugzilla.redhat.com/show_bug.cgi?id=1329382
[2] https://bugzilla.redhat.com/show_bug.cgi?id=1329395
[3] https://github.com/kennethreitz/requests/pull/3109

#9 Updated by semyers over 4 years ago

  • Platform Release changed from 2.8.3 to 2.8.4

#10 Updated by mhrivnak over 4 years ago

  • Sprint/Milestone changed from 19 to 20

#11 Updated by mhrivnak over 4 years ago

  • Sprint/Milestone changed from 20 to 21

#12 Updated by semyers over 4 years ago

  • Platform Release changed from 2.8.4 to 2.8.5

#14 Updated by mhrivnak over 4 years ago

  • Sprint/Milestone changed from 21 to 22

#15 Updated by semyers over 4 years ago

  • Platform Release changed from 2.8.5 to 2.8.6

#16 Updated by semyers over 4 years ago

  • Platform Release deleted (2.8.6)

#17 Updated by mhrivnak about 4 years ago

  • Sprint/Milestone changed from 22 to 23

#18 Updated by jcline@redhat.com about 4 years ago

  • Status changed from ASSIGNED to POST

#19 Updated by semyers about 4 years ago

  • Priority changed from High to Urgent
  • Platform Release set to 2.8.7

Blocking 2.8.7 beta

#20 Updated by bmbouter about 4 years ago

This was rebased back to 2.8-dev so the PR which will introduce the commit is: https://github.com/pulp/pulp/pull/2667

#21 Updated by Anonymous about 4 years ago

  • Status changed from POST to MODIFIED
  • % Done changed from 0 to 100

#22 Updated by bmbouter about 4 years ago

It would be great to have QE manually verify this issue.

#23 Updated by semyers about 4 years ago

  • Status changed from MODIFIED to ASSIGNED

#27 Updated by jcline@redhat.com about 4 years ago

  • Status changed from ASSIGNED to POST

#28 Updated by semyers about 4 years ago

  • Status changed from POST to MODIFIED

This issue should be MODIFIED, https://github.com/pulp/pulp/pull/2696 is merged to 2.8-dev.

#29 Updated by semyers about 4 years ago

  • Status changed from MODIFIED to 5

#30 Updated by pthomas@redhat.com about 4 years ago

  • Status changed from 5 to 6

verified


 yum install python-urllib3-1.10.2-2.katello.el7.noarch.rpm 
Loaded plugins: product-id, search-disabled-repos, subscription-manager
Examining python-urllib3-1.10.2-2.katello.el7.noarch.rpm: python-urllib3-1.10.2-2.katello.el7.noarch
Marking python-urllib3-1.10.2-2.katello.el7.noarch.rpm as an update to python-urllib3-1.10.2-2.el7_1.noarch
Resolving Dependencies
--> Running transaction check
---> Package python-urllib3.noarch 0:1.10.2-2.el7_1 will be updated
---> Package python-urllib3.noarch 0:1.10.2-2.katello.el7 will be an update
--> Finished Dependency Resolution

Dependencies Resolved

===========================================================================================================
 Package           Arch      Version                  Repository                                      Size
===========================================================================================================
Updating:
 python-urllib3    noarch    1.10.2-2.katello.el7     /python-urllib3-1.10.2-2.katello.el7.noarch    377 k

Transaction Summary
===========================================================================================================
Upgrade  1 Package

Total size: 377 k
Is this ok [y/d/N]: y
Downloading packages:
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
  Updating   : python-urllib3-1.10.2-2.katello.el7.noarch                                              1/2 
  Cleanup    : python-urllib3-1.10.2-2.el7_1.noarch                                                    2/2 
  Verifying  : python-urllib3-1.10.2-2.katello.el7.noarch                                              1/2 
  Verifying  : python-urllib3-1.10.2-2.el7_1.noarch                                                    2/2 

Updated:
  python-urllib3.noarch 0:1.10.2-2.katello.el7                                                             

Complete!

# pulp-admin rpm repo create   --repo-id capsule  --feed https://cdn.redhat.com/content/dist/rhel/server/6/6Server/x86_64/sat-capsule/6.1/os  --download-policy on_demand --feed-ca-cert cdn.redhat.com-chain.crt --feed-cert 5200954048155109175.pem --feed-key 5200954048155109175-key.pem
Successfully created repository [capsule]

  pulp-admin rpm repo create   --repo-id jws  --feed https://cdn.redhat.com/content/dist/rhel/server/7/7.2/x86_64/jws/3/os  --download-policy on_demand --feed-ca-cert cdn.redhat.com-chain.crt --feed-cert 7560728174757006536.pem --feed-key 7560728174757006536-key.pem 
Successfully created repository [jws]
# pulp-admin rpm repo sync run --repo-id jws
+----------------------------------------------------------------------+
                     Synchronizing Repository [jws]
+----------------------------------------------------------------------+

This command may be exited via ctrl+c without affecting the request.

Downloading metadata...
[|]
... completed

Downloading repository content...
[-]
[==================================================] 100%
RPMs:       156/156 items
Delta RPMs: 0/0 items

... completed

Downloading distribution files...
[==================================================] 100%
Distributions: 0/0 items
... completed

Importing errata...
[\]
... completed

Importing package groups/categories...
[-]
... completed

Cleaning duplicate packages...
[-]
... completed

Task Succeeded

Initializing repo metadata
[-]
... completed

Publishing Distribution files
[-]
... completed

Publishing RPMs
[==================================================] 100%
156 of 156 items
... completed

Publishing Delta RPMs
... skipped

Publishing Errata
[==================================================] 100%
3 of 3 items
... completed

Publishing Comps file
[==================================================] 100%
2 of 2 items
... completed

Publishing Metadata.
[-]
... completed

Closing repo metadata
[-]
... completed

Generating sqlite files
... skipped

Publishing files to web
[-]
... completed

Writing Listings File
[-]
... completed

Task Succeeded

[root@qe-blade-07 ~]# pulp-admin rpm repo sync run --repo-id capsule
+----------------------------------------------------------------------+
                   Synchronizing Repository [capsule]
+----------------------------------------------------------------------+

This command may be exited via ctrl+c without affecting the request.

Downloading metadata...
[\]
... completed

Downloading repository content...
[-]
[==================================================] 100%
RPMs:       287/287 items
Delta RPMs: 0/0 items

... completed

Downloading distribution files...
[==================================================] 100%
Distributions: 0/0 items
... completed

Importing errata...
[\]
... completed

Importing package groups/categories...
[-]
... completed

Cleaning duplicate packages...
[-]
... completed

Task Succeeded

Initializing repo metadata
[-]
... completed

Publishing Distribution files
[-]
... completed

Publishing RPMs
[==================================================] 100%
287 of 287 items
... completed

Publishing Delta RPMs
... skipped

Publishing Errata
[==================================================] 100%
12 of 12 items
... completed

Publishing Comps file
[-]
... completed

Publishing Metadata.
[-]
... completed

Closing repo metadata
[-]
... completed

Generating sqlite files
... skipped

Publishing files to web
[-]
... completed

Writing Listings File
[-]
... completed

Task Succeeded

[root@qe-blade-07 ~]# pulp-admin rpm repo list
+----------------------------------------------------------------------+
                            RPM Repositories
+----------------------------------------------------------------------+

Id:                  capsule
Display Name:        None
Description:         None
Content Unit Counts: 
  Erratum:                12
  Package Group:          1
  Rpm:                    287
  Yum Repo Metadata File: 1

Id:                  jws
Display Name:        None
Description:         None
Content Unit Counts: 
  Erratum:                3
  Package Group:          2
  Rpm:                    156
  Yum Repo Metadata File: 1

[root@qe-blade-07 ~]# systemctl start goferd
[root@qe-blade-07 ~]# 
[root@qe-blade-07 ~]# 
[root@qe-blade-07 ~]# 
[root@qe-blade-07 ~]# systemctl restart pulp_streamer
[root@qe-blade-07 ~]# 
[root@qe-blade-07 ~]# 
[root@qe-blade-07 ~]# 
[root@qe-blade-07 ~]# pulp-consumer -u admin -p admin register --consumer-id goldfish
Consumer [goldfish] successfully registered

[root@qe-blade-07 ~]#  pulp-admin rpm consumer bind --repo-id capsule --consumer-id goldfish
This command may be exited via ctrl+c without affecting the request.

[\]
Running...

Task Succeeded

[root@qe-blade-07 ~]# pulp-admin rpm consumer bind --repo-id jws --consumer-id goldfish
This command may be exited via ctrl+c without affecting the request.

[\]
Running...

Task Succeeded

[root@qe-blade-07 ~]#  pulp-admin rpm consumer  package install  run --consumer-id goldfish -n foreman-debug
Install task created with id [ 02f875eb-541b-4aab-bd3c-d6610739c693 ]

This command may be exited via ctrl+c without affecting the request.

Refresh Repository Metadata             [ OK ]
Downloading Packages                    [ OK ]
Check Package Signatures                [ OK ]
Running Test Transaction                [ OK ]
Running Transaction                     [ OK ]

Install Succeeded

+----------------------------------------------------------------------+
                               Installed
+----------------------------------------------------------------------+

Name:    foreman-debug
Version: 1.7.2.56
Arch:    noarch
Repoid:  capsule

[root@qe-blade-07 ~]#  pulp-admin rpm consumer  package install  run --consumer-id goldfish -n  mod_cluster-demo
Install task created with id [ 2cab5daa-bf06-4dfd-9e65-2f5bd0c7465d ]

This command may be exited via ctrl+c without affecting the request.

Refresh Repository Metadata             [ OK ]
Downloading Packages                    [ OK ]
Check Package Signatures                [ OK ]
Running Test Transaction                [ OK ]
Running Transaction                     [ OK ]

Install Succeeded

+----------------------------------------------------------------------+
                               Installed
+----------------------------------------------------------------------+

Name:    mod_cluster
Version: 1.3.1
Arch:    noarch
Repoid:  jws

+----------------------------------------------------------------------+
                       Installed for Dependencies
+----------------------------------------------------------------------+

Name:    python-javapackages
Version: 3.4.1
Arch:    noarch
Repoid:  beaker-Server

Name:    javapackages-tools
Version: 3.4.1
Arch:    noarch
Repoid:  beaker-Server

#31 Updated by semyers about 4 years ago

  • Status changed from 6 to CLOSED - CURRENTRELEASE

#33 Updated by bmbouter over 2 years ago

  • Sprint set to Sprint 5

#34 Updated by bmbouter over 2 years ago

  • Sprint/Milestone deleted (23)

#35 Updated by bmbouter over 1 year ago

  • Tags Pulp 2 added

Please register to edit this issue

Also available in: Atom PDF