Issue #3149 (Closed): pulpcore.plugin.download.asyncio has surprising timeout handling
Start date:
Due date:
Estimated time:
Severity: 2. Medium
Version: Master
Platform Release:
OS:
Triaged: Yes
Groomed: No
Sprint Candidate: Yes
Tags:
Sprint:
Quarter:
Description
aiohttp defines two timeouts on the ClientSession: read_timeout and conn_timeout. The read_timeout is cumulative for the entire connection, i.e. all data must be read within this timeout.
For longer connections used for downloads (i.e. connections using the streaming API to fetch the data in smaller chunks), this can cause timeouts on a connection that is perfectly healthy, because the timeout is global rather than per read operation.
For example:
import asyncio

import aiohttp
# HttpDownloader lives under pulpcore.plugin.download.asyncio
# (see the module paths in the traceback below)
from pulpcore.plugin.download.asyncio import HttpDownloader

async def single_download():
    async with aiohttp.ClientSession(read_timeout=3, conn_timeout=3) as session:
        downloader = HttpDownloader('http://127.0.0.1:8000/large_file.txt', session=session)
        return await downloader.run()

loop = asyncio.get_event_loop()
result = loop.run_until_complete(single_download())
Results in:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib64/python3.6/asyncio/base_events.py", line 467, in run_until_complete
return future.result()
File "<stdin>", line 4, in single_download
File "/home/user/github/pulp-3-from-source/venv/src/pulpcore/plugin/pulpcore/plugin/download/asyncio/base.py", line 50, in wrapper
raise error
File "/home/user/github/pulp-3-from-source/venv/src/pulpcore/plugin/pulpcore/plugin/download/asyncio/base.py", line 47, in wrapper
return await func(downloader)
File "/home/user/github/pulp-3-from-source/venv/src/pulpcore/plugin/pulpcore/plugin/download/asyncio/http.py", line 130, in run
to_return = await self._handle_response(response)
File "/home/user/github/pulp-3-from-source/venv/src/pulpcore/plugin/pulpcore/plugin/download/asyncio/http.py", line 112, in _handle_response
chunk = await response.content.read(1024 * 1024)
File "/home/user/github/pulp-3-from-source/venv/lib64/python3.6/site-packages/aiohttp/streams.py", line 607, in read
return (yield from super().read(n))
File "/home/user/github/pulp-3-from-source/venv/lib64/python3.6/site-packages/aiohttp/streams.py", line 330, in read
yield from self._wait('read')
File "/home/user/github/pulp-3-from-source/venv/lib64/python3.6/site-packages/aiohttp/streams.py", line 259, in _wait
yield from waiter
File "/home/user/github/pulp-3-from-source/venv/lib64/python3.6/site-packages/aiohttp/helpers.py", line 727, in __exit__
raise asyncio.TimeoutError from None
concurrent.futures._base.TimeoutError
after approx. 3 seconds.
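The cumulative-versus-per-read distinction can be reproduced with plain asyncio, without any network involved. The following is a self-contained sketch (slow_chunks and the timeout values are illustrative, not part of Pulp or aiohttp; asyncio.run requires Python >= 3.7):

import asyncio

async def slow_chunks(n, delay):
    # Each chunk arrives promptly; only the transfer as a whole is long.
    for _ in range(n):
        await asyncio.sleep(delay)
        yield b"x" * 1024

async def global_timeout():
    # One cumulative budget for the whole body, like read_timeout:
    # 5 chunks x 0.1 s = 0.5 s total blows through the 0.25 s budget,
    # even though every single chunk arrives quickly.
    async def consume():
        async for _ in slow_chunks(5, 0.1):
            pass
    try:
        await asyncio.wait_for(consume(), timeout=0.25)
        return "ok"
    except asyncio.TimeoutError:
        return "timed out"

async def per_read_timeout():
    # The budget restarts on every read: each 0.1 s chunk is well
    # within 0.5 s, so the slow-but-healthy transfer completes.
    chunks = slow_chunks(5, 0.1)
    try:
        while True:
            await asyncio.wait_for(chunks.__anext__(), timeout=0.5)
    except StopAsyncIteration:
        return "ok"
    except asyncio.TimeoutError:
        return "timed out"

print(asyncio.run(global_timeout()))    # timed out
print(asyncio.run(per_read_timeout()))  # ok

This is exactly the behavior seen above: the same stream that survives a per-read deadline is killed by a cumulative one.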
One could use a high per-connection timeout plus an additional timeout around each read() for long-running connections. Something along these lines:
async with client.get(download_url, timeout=long_timeout) as resp:
    try:
        ...
        new_data = await asyncio.wait_for(resp.content.read(self.CHUNK_SIZE),
                                          per_read_timeout)
        ...
    except asyncio.TimeoutError:
        ...
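Packaged as a reusable helper, that pattern might look like the sketch below. This is not Pulp's actual implementation; read_stream and its defaults are hypothetical. It works with anything exposing a read(n) coroutine, such as aiohttp's response.content or asyncio's StreamReader:

import asyncio

async def read_stream(content, chunk_size=1024 * 1024, per_read_timeout=300.0):
    # Bound each read() rather than the whole transfer: a stalled peer
    # still trips the timeout, but a slow-and-steady download does not.
    # (A real downloader would write chunks out instead of buffering them.)
    chunks = []
    while True:
        chunk = await asyncio.wait_for(content.read(chunk_size), per_read_timeout)
        if not chunk:  # b'' signals EOF
            break
        chunks.append(chunk)
    return b"".join(chunks)

async def demo():
    # Exercise the helper against an in-memory StreamReader.
    reader = asyncio.StreamReader()
    reader.feed_data(b"hello world")
    reader.feed_eof()
    return await read_stream(reader, chunk_size=4, per_read_timeout=1.0)

print(asyncio.run(demo()))  # b'hello world'

Because asyncio.wait_for cancels the inner read on expiry, a timeout here cleanly aborts only the stalled operation, and the caller can decide whether to retry or give up on the connection.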