Pulp streamer not 'streaming' content to squid
I'm trying to get on_demand repos working.
I've got a very slow internet connection (2Mbit shared), which isn't helping, but I think pulp streamer isn't actually streaming data to squid. Instead, it appears to send all the data to squid after it's fetched it all itself.
When a client tries to download a large file (e.g. squashfs.img), the first problem it hits is the 1 minute default proxy read timeout in apache. After increasing that (ProxySet connectiontimeout=60 timeout=3600), the next thing to time out waiting for data is squid. Its default read_timeout is 10 minutes.
My understanding is that pulp_streamer should be streaming data to squid and, as long as bytes are being received, neither squid nor apache should abort with read timeouts.
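To illustrate the reasoning above, here is a minimal sketch of how an idle read timeout behaves. It is a simplified model, not apache's or squid's actual implementation: it assumes the timer restarts after every successful read, and the function name, the arrival times and the 20 minute download are all made up for illustration.

```python
def survives_read_timeout(arrival_times, read_timeout):
    """Return True if no gap between consecutive reads exceeds read_timeout.

    arrival_times: ascending times (seconds since the request was sent)
    at which each chunk of the response arrived. Simplified model: the
    timer is assumed to restart after every successful read.
    """
    last = 0.0
    for t in arrival_times:
        if t - last > read_timeout:
            return False  # the reader would have given up before this chunk
        last = t
    return True


# Hypothetical 20 minute download, seen from the reading side.
streamed = [float(s) for s in range(5, 1201, 5)]    # a chunk every 5 seconds
buffered = [1200.0 + 0.01 * i for i in range(240)]  # silence, then everything at once

print(survives_read_timeout(streamed, 600))  # streamed case, 10 minute timeout
print(survives_read_timeout(buffered, 600))  # buffered case, 10 minute timeout
```

In this model the streamed transfer survives even a 60 second timeout, while the buffered one fails any timeout shorter than the full download time, which matches the symptoms described here.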
#1 Updated by alexjfisher over 4 years ago
Sorry, was in a big rush earlier. A few more details...
Pulp version 2.12.1 on CentOS 7. Installed using https://github.com/Katello/puppet-pulp. Puppet's configuration of apache and squid looks reasonable. It matches the documentation anyway!
Upstream feed URL is https://www.mirrorservice.org/sites/mirror.centos.org/7.3.1611/os/x86_64/ and all internet access is via a corporate squid (non-caching, whitelist only) proxy.
Upstream proxy seems to be well behaved (manual downloads through it work fine, if slowly). I can't bypass it, so ruling it out 100% isn't going to be easy.
I've been using curl to test fetching on_demand (and not yet downloaded!) files. Using tcpdump, I can see data flowing from the corporate proxy into pulp_streamer, but I can't see anything between pulp_streamer and pulp's squid instance. Curl sits there receiving 0bps until it either suddenly starts to download the file very quickly, or a read timeout occurs (in apache or squid, depending on how I've been fiddling with their read timeout settings).
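For anyone trying to reproduce the curl observation, something like the following could record when bytes actually arrive at the client. It is only a sketch, assuming Python 3 and the standard library; `chunk_timings` and `fetch_timings` are hypothetical helper names, and the URL has to be supplied by the caller (no Pulp-specific path is assumed).

```python
import time
import urllib.request


def chunk_timings(chunks, start=None, clock=time.monotonic):
    """Return a list of (seconds_since_start, total_bytes_so_far) per chunk."""
    if start is None:
        start = clock()
    timings = []
    total = 0
    for chunk in chunks:
        total += len(chunk)
        timings.append((clock() - start, total))
    return timings


def fetch_timings(url, chunk_size=65536, timeout=3600):
    """Download url and record when each chunk of the body arrives.

    The clock starts before the request is sent, so the first entry
    approximates the time to the first body bytes.
    """
    start = time.monotonic()
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        body = iter(lambda: resp.read(chunk_size), b"")
        return chunk_timings(body, start=start)
```

If content is really being streamed, the first entry should appear within seconds and the rest should be spread over the whole download. If it is being buffered somewhere, all entries will be clustered at the very end.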
Meanwhile (and maybe not that important), every 10 minutes (default setting), the deferred download task thing triggers and tries to download the files (expecting that squid will probably have cached them). It hasn't, so the task also waits for squid to (hopefully) eventually get given the content from pulp_streamer. Unlike curl, the worker times out after 27 seconds. It tries a few times before giving up. I'm not sure, but I'm hoping each of these tries hasn't actually triggered pulp_streamer into opening another connection to www.mirrorservice.org.
I also played with the download_concurrency setting.
Lowering the server.conf lazy download_concurrency from 5 to 2 seemed to help a bit. I guess I made a bit more out of my limited bandwidth.
Hopefully I've provided enough that someone might be able to reproduce some of these issues? Let me know if there's anything else I can provide or check. I'm afisher on freenode IRC.
#3 Updated by alexjfisher over 4 years ago
I have also tried each of the following.
- Using an http feed URL.
- Using an internal feed URL and turning off use of the upstream corporate proxy.
- Using katello's version of python-urllib3 https://github.com/Katello/katello-packaging/commit/b60f4e8362a66c7864d01c2c10c0c4ff77f30dbe
- Upgrading python-requests and python-urllib3 packages to fedora 24 versions.
- Replacing squid with varnish.
No improvement. pulp-streamer still doesn't stream :(
#4 Updated by alexjfisher over 4 years ago
Regression is in python-nectar I think.
and pulp_streamer streams again.
#8 Updated by alexjfisher over 4 years ago
Other relevant versions are...
#13 Updated by bmbouter over 4 years ago
- Status changed from MODIFIED to CLOSED - DUPLICATE
Since this does seem like a duplicate, I'm going to close it as such. Once the fix for Issue 2696 is into a build, if the reporter (or anyone) can reproduce this issue, please post a comment and we can reopen.
#15 Updated by alexjfisher over 4 years ago
Tested fix by installing latest nectar rpm (on a pulp 2.12.2 system)
yum install https://repos.fedorapeople.org/repos/pulp/pulp/beta/2.13/7/x86_64/python-nectar-1.5.4-1.el7.noarch.rpm
New build contains fix and streaming works fine.