Updated by bmbouter over 4 years ago
Previous description:

h2. Problem

The ArtifactDownloader has a max_downloads restrictor. This is an identical feature to the connection limiting offered with aiohttp itself. Specifically, aiohttp offers two features with its TCPConnector:

From "their docs":https://docs.aiohttp.org/en/stable/client_reference.html#tcpconnector

<pre>
* limit (int) – total number simultaneous connections. If limit is None the connector has no limit (default: 100).
* limit_per_host (int) – limit simultaneous connections to the same endpoint. Endpoints are the same if they are have equal (host, port, is_ssl) triple. If limit is 0 the connector has no limit (default: 0).
</pre>

h2. Is this really a duplication?

Yes, it is verified via wireshark analysis. A max_downloader=20 and a limit=20 result in the same number of TCP connections to the server.

h2. Solution

Add connection_limit and connection_limit_per_host to the Remote like the rest of the attributes, e.g. proxy settings.

connection_limit in Pulp maps to TCPConnector.limit in aiohttp
connection_limit_per_host in Pulp maps to TCPConnector.limit_per_host in aiohttp

h2. What are the right defaults for Pulp

Like Pulp2, we should default connection_limit=5. We should default connection_limit_per_host to 0 so it's disabled by default. If unset on a remote, a user will receive these default values.

h2. How to Implement

This configuration should be read from the Remote and applied to the session by the Factory just like how the other settings are done. Since a value should be provided even if unset, the default should actually live in the Factory code. That is "here":https://github.com/pulp/pulp/blob/master/plugin/pulpcore/plugin/download/factory.py#L62-L103

Updated description:

h2. Problem

When syncing from the url below I get a ServerDisconnectedError

https://repos.fedorapeople.org/pulp/pulp/demo_repos/file-example/PULP_MANIFEST_500

Here is how I reproduce:

<pre>
http POST http://localhost:8000/pulp/api/v3/repositories/ name=foo
export REPO_HREF=$(http :8000/pulp/api/v3/repositories/ | jq -r '.results | select(.name == "foo") | ._href')
http POST http://localhost:8000/pulp/api/v3/remotes/file/ name='bar' url='https://repos.fedorapeople.org/pulp/pulp/demo_repos/file-example/PULP_MANIFEST_500'
export REMOTE_HREF=$(http :8000/pulp/api/v3/remotes/file/ | jq -r '.results | select(.name == "bar") | ._href')
http POST ':8000'$REMOTE_HREF'sync/' repository=$REPO_HREF
</pre>

I get this error:

<pre>
Sep 26 20:28:31 pulp3.dev rq: aiohttp.client_exceptions.ServerDisconnectedError: None
</pre>

h2. Root Cause

The default concurrency level of the ArtifactDownloader stage is too high. That is here: https://github.com/pulp/pulp/blob/85eae21c9c0c99e7b62e6c8dcfd7b3de8d522226/plugin/pulpcore/plugin/stages/artifact_stages.py#L258

It's kind of crazy, but it would only work if I set max_concurrent_downloads=1.

h2. Solution

Add <code>max_concurrent_downloads</code> to the FileRemote as an optional parameter. If unset, the default value of core is used. If set, the ArtifactDownloader needs to use that value instead. There are a few options to do that:

* (a) subclass DeclarativeVersion and override the pipeline_stages method
* (b) use a custom pipeline
* (c) implement issue 4039.
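
To illustrate the intended behaviour of an optional <code>max_concurrent_downloads</code> value, here is a minimal sketch using only the Python standard library. It is not Pulp's ArtifactDownloader code: <code>DEFAULT_MAX_CONCURRENT_DOWNLOADS</code>, <code>effective_limit</code>, and <code>download_all</code> are hypothetical names, the default of 100 is a placeholder rather than the real core default, and the HTTP request is replaced by a short sleep. It only shows the fallback rule (unset means core default) and how a semaphore caps the number of downloads in flight.

```python
import asyncio

# Hypothetical placeholder; the real core default lives in pulpcore and may differ.
DEFAULT_MAX_CONCURRENT_DOWNLOADS = 100


def effective_limit(remote_value):
    """Return the remote's limit if it is set, otherwise the core default."""
    if remote_value is None:
        return DEFAULT_MAX_CONCURRENT_DOWNLOADS
    return remote_value


async def download_all(urls, max_concurrent_downloads=None):
    """Simulate downloads while capping how many run at the same time.

    Returns the results in input order and the peak number of
    simultaneous downloads that was observed.
    """
    semaphore = asyncio.Semaphore(effective_limit(max_concurrent_downloads))
    in_flight = 0
    peak = 0

    async def fake_download(url):
        nonlocal in_flight, peak
        async with semaphore:  # blocks once the limit is reached
            in_flight += 1
            peak = max(peak, in_flight)
            await asyncio.sleep(0.01)  # stand-in for the real HTTP request
            in_flight -= 1
            return url

    results = await asyncio.gather(*(fake_download(u) for u in urls))
    return list(results), peak


if __name__ == "__main__":
    urls = [f"file-{i}.iso" for i in range(50)]
    _, peak = asyncio.run(download_all(urls, max_concurrent_downloads=1))
    print("peak concurrent downloads:", peak)
```

With <code>max_concurrent_downloads=1</code> the simulated downloads run strictly one at a time, which mirrors the only setting reported to work against the server above.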