Story #4040

Updated by bmbouter over 4 years ago

Description before this update:

h2. Problem

When syncing from the url below I get a ServerDisconnectedError.

Here is how I reproduce:

<pre>
http POST http://localhost:8000/pulp/api/v3/repositories/ name=foo
export REPO_HREF=$(http :8000/pulp/api/v3/repositories/ | jq -r '.results[] | select(.name == "foo") | ._href')
http POST http://localhost:8000/pulp/api/v3/remotes/file/ name='bar' url=''
export REMOTE_HREF=$(http :8000/pulp/api/v3/remotes/file/ | jq -r '.results[] | select(.name == "bar") | ._href')
http POST ':8000'$REMOTE_HREF'sync/' repository=$REPO_HREF
</pre>

I get this error:

<pre>
Sep 26 20:28:31 rq[834]: aiohttp.client_exceptions.ServerDisconnectedError: None
</pre>

h2. Root Cause

The default concurrency level of the ArtifactDownloader stage is too high. That is here:

It's kind of crazy, but it would only work if I set max_concurrent_downloads=1.

h2. Solution

Add <code>max_concurrent_downloads</code> to FileRemote as an optional parameter. If unset, the default value of core is used. If set, the ArtifactDownloader needs to use that value instead. There are a few options to do that:

(a) subclass DeclarativeVersion and override the pipeline_stages method
(b) use a custom pipeline
(c) implement issue 4039.

Description after this update:

h2. Problem

The ArtifactDownloader has a max_downloads restrictor. This is an identical feature to the connection limiting offered with aiohttp itself. Specifically aiohttp offers two features with its TCPConnector:

From "their docs":

* limit (int) – total number simultaneous connections. If limit is None the connector has no limit (default: 100).
* limit_per_host (int) – limit simultaneous connections to the same endpoint. Endpoints are the same if they are have equal (host, port, is_ssl) triple. If limit is 0 the connector has no limit (default: 0).

h2. Is this really a duplication?

Yes, it is verified via wireshark analysis. A max_downloader=20 and a limit=20 produce the same number of TCP connections to the server.

h2. Solution

Add connection_limit and connection_limit_per_host to the Remote like the rest of the attributes, e.g. proxy settings.

connection_limit in Pulp maps to TCPConnector.limit in aiohttp
connection_limit_per_host in Pulp maps to TCPConnector.limit_per_host in aiohttp

h2. What are the right defaults for Pulp?

Like Pulp2, we should default connection_limit=5. We should default connection_limit_per_host to 0 so it's disabled by default.

If unset on a remote, a user will receive these default values.

h2. How to Implement

This configuration should be read from the Remote and applied to the session by the Factory just like how the other settings are done. Since a remote with no value set should still get the defaults, the defaults should actually live in the Factory code. That is "here":
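The mapping from connection_limit and connection_limit_per_host to the TCPConnector arguments, with the defaults living next to the factory code, could look roughly like the sketch below. This is only an illustration under assumptions: <code>RemoteSettings</code>, <code>connector_kwargs</code>, <code>make_session</code>, and the two constant names are hypothetical and are not Pulp's actual Remote model or Factory. Only <code>aiohttp.TCPConnector(limit=..., limit_per_host=...)</code> and <code>aiohttp.ClientSession(connector=...)</code> are real aiohttp calls.

```python
from dataclasses import dataclass
from typing import Optional

# Defaults proposed in this story; they live with the factory, not on the remote.
DEFAULT_CONNECTION_LIMIT = 5
DEFAULT_CONNECTION_LIMIT_PER_HOST = 0  # 0 disables the per-host limit in aiohttp


@dataclass
class RemoteSettings:
    """Hypothetical stand-in for the two proposed Remote fields (None = unset)."""
    connection_limit: Optional[int] = None
    connection_limit_per_host: Optional[int] = None


def connector_kwargs(remote: RemoteSettings) -> dict:
    """Translate the Remote fields into aiohttp.TCPConnector keyword arguments."""
    limit = remote.connection_limit
    per_host = remote.connection_limit_per_host
    return {
        # Fall back to the factory defaults only when the remote leaves a value unset.
        "limit": DEFAULT_CONNECTION_LIMIT if limit is None else limit,
        "limit_per_host": (
            DEFAULT_CONNECTION_LIMIT_PER_HOST if per_host is None else per_host
        ),
    }


async def make_session(remote: RemoteSettings):
    """Build a session whose connector enforces the limits.

    Written as a coroutine so the connector is created inside a running
    event loop.
    """
    import aiohttp  # third-party; imported lazily so the mapping above has no dependency

    connector = aiohttp.TCPConnector(**connector_kwargs(remote))
    return aiohttp.ClientSession(connector=connector)
```

With this shape, a remote that sets nothing gets limit=5 and limit_per_host=0, while a remote that sets connection_limit=20 overrides only the total limit. Comparing against None rather than relying on truthiness keeps an explicit 0 on the remote from being replaced by the default.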