Issue #3149

closed

pulpcore.plugin.download.asyncio has surprising timeout handling

Added by gmbnomis over 6 years ago. Updated about 5 years ago.

Status: CLOSED - WONTFIX
Priority: Normal
Assignee: -
Category: -
Severity: 2. Medium
Version: Master
Triaged: Yes
Groomed: No
Sprint Candidate: Yes

Description

aiohttp defines two timeouts on the ClientSession: read_timeout and conn_timeout. The read_timeout is cumulative for the entire connection, i.e. all data must be read within this timeout.

For long-lived connections used for downloads (i.e. connections that use the streaming API to fetch the data in smaller chunks), this can cause a timeout on a perfectly healthy connection, because the timeout is global rather than per read operation.
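The effect can be reproduced without aiohttp. The following is only a sketch under stated assumptions: read_chunk() is a hypothetical stand-in for one streaming read, and the constants are made up so that every individual read is fast while the whole transfer exceeds the cumulative budget.

```python
import asyncio

CHUNK_DELAY = 0.05   # seconds between chunks on a slow but healthy stream
NUM_CHUNKS = 10      # a full download therefore needs about 0.5 s
TOTAL_TIMEOUT = 0.2  # cumulative budget, playing the role of read_timeout


async def read_chunk():
    """Hypothetical stand-in for one streaming read (not an aiohttp call)."""
    await asyncio.sleep(CHUNK_DELAY)
    return b"x" * 1024


async def download():
    """Read every chunk and return the number of bytes received."""
    total = 0
    for _ in range(NUM_CHUNKS):
        total += len(await read_chunk())
    return total


async def download_with_cumulative_timeout():
    """Apply a single deadline to the whole transfer."""
    try:
        return await asyncio.wait_for(download(), TOTAL_TIMEOUT)
    except asyncio.TimeoutError:
        # The deadline was hit although no single read ever stalled.
        return None


def run_demo():
    return asyncio.run(download_with_cumulative_timeout())
```

Here run_demo() returns None: the transfer is cut off by the global deadline even though each read completes in a small fraction of it.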

For example:

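import asyncio

import aiohttp
# Import path assumed from the module named in the issue title.
from pulpcore.plugin.download.asyncio import HttpDownloader

async def single_download():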
    async with aiohttp.ClientSession(read_timeout=3, conn_timeout=3) as session:
        downloader = HttpDownloader('http://127.0.0.1:8000/large_file.txt', session=session)
        return await downloader.run()

loop = asyncio.get_event_loop()
result = loop.run_until_complete(single_download())

Results in:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib64/python3.6/asyncio/base_events.py", line 467, in run_until_complete
    return future.result()
  File "<stdin>", line 4, in single_download
  File "/home/user/github/pulp-3-from-source/venv/src/pulpcore/plugin/pulpcore/plugin/download/asyncio/base.py", line 50, in wrapper
    raise error
  File "/home/user/github/pulp-3-from-source/venv/src/pulpcore/plugin/pulpcore/plugin/download/asyncio/base.py", line 47, in wrapper
    return await func(downloader)
  File "/home/user/github/pulp-3-from-source/venv/src/pulpcore/plugin/pulpcore/plugin/download/asyncio/http.py", line 130, in run
    to_return = await self._handle_response(response)
  File "/home/user/github/pulp-3-from-source/venv/src/pulpcore/plugin/pulpcore/plugin/download/asyncio/http.py", line 112, in _handle_response
    chunk = await response.content.read(1024 * 1024)
  File "/home/user/github/pulp-3-from-source/venv/lib64/python3.6/site-packages/aiohttp/streams.py", line 607, in read
    return (yield from super().read(n))
  File "/home/user/github/pulp-3-from-source/venv/lib64/python3.6/site-packages/aiohttp/streams.py", line 330, in read
    yield from self._wait('read')
  File "/home/user/github/pulp-3-from-source/venv/lib64/python3.6/site-packages/aiohttp/streams.py", line 259, in _wait
    yield from waiter
  File "/home/user/github/pulp-3-from-source/venv/lib64/python3.6/site-packages/aiohttp/helpers.py", line 727, in __exit__
    raise asyncio.TimeoutError from None
concurrent.futures._base.TimeoutError

after approx. 3 seconds.

One could use a high per-connection timeout plus an additional timeout on each read() for long-running connections. Something along these lines:

        async with client.get(download_url, timeout=long_timeout) as resp:
            try:
...
                        new_data = await asyncio.wait_for(resp.content.read(self.CHUNK_SIZE),
                                                          per_read_timeout)
...
            except asyncio.TimeoutError:
...
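As a self-contained illustration of the per-read idea, here is a minimal sketch that uses only asyncio. The helpers read_all(), make_reader() and try_download() are hypothetical names introduced for this example; they are not part of pulpcore or aiohttp, and the delays are arbitrary.

```python
import asyncio


async def read_all(read_chunk, per_read_timeout):
    """Collect chunks until read_chunk() returns b"", timing each read separately."""
    chunks = []
    while True:
        # The timeout restarts for every read, so only a stalled read can trip it.
        chunk = await asyncio.wait_for(read_chunk(), per_read_timeout)
        if not chunk:
            return b"".join(chunks)
        chunks.append(chunk)


def make_reader(delays):
    """Hypothetical stream: one 4-byte chunk per entry in delays, then EOF."""
    pending = list(delays)

    async def read_chunk():
        if not pending:
            return b""
        await asyncio.sleep(pending.pop(0))
        return b"data"

    return read_chunk


async def try_download(delays, per_read_timeout):
    """Return the downloaded bytes, or None if a single read timed out."""
    try:
        return await read_all(make_reader(delays), per_read_timeout)
    except asyncio.TimeoutError:
        return None


def run_examples():
    # Healthy: about 0.5 s in total, but every read is well under 0.25 s.
    healthy = asyncio.run(try_download([0.05] * 10, 0.25))
    # Stalled: the second read hangs for longer than the per-read timeout.
    stalled = asyncio.run(try_download([0.05, 1.0], 0.25))
    return healthy, stalled
```

With these values the healthy stream completes although its total duration exceeds the per-read timeout, while the stalled stream is aborted on its second read. A separate, generous per-connection timeout would still be needed to bound the overall transfer.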

Related issues

Related to Pulp - Issue #3918: DeclarativeVersion cannot sync longer than 5 minutes or a timeout error is raised (CLOSED - CURRENTRELEASE, dalley)
