jortel@redhat.com, 08/29/2017 04:32 PM
Downloading¶
In Pulp 3, there are two competing technologies and designs being considered. For the purposes of this discussion we'll name them Jupiter and Saturn. The Jupiter solution is based on concurrent.futures and the Saturn solution is based on asyncio. In addition to the underlying technology difference, the solutions meet the requirements in different ways. The Jupiter solution includes more classes, provides more abstraction, and supports extension through object composition. The Saturn solution meets the requirements with the fewest classes possible and minimum abstraction.
The three actors for our use cases are the Importer, the Streamer, and the Plugin Writer. The ChangeSet shares a subset of the Streamer requirements but is not included in this discussion.
Use Cases¶
Importer¶
As an importer, I need to download single files.
jupiter
download = HttpDownload(url, FileWriter(path))
try:
    download()
except DownloadError:
    # An error occurred.
    pass
else:
    # Go read the downloaded file \o/
    pass
saturn
import asyncio
import aiohttp

session = aiohttp.ClientSession()
downloader_obj = HttpDownloader(session, url)
downloader_coroutine = downloader_obj.run()
loop = asyncio.get_event_loop()
done, not_done = loop.run_until_complete(asyncio.wait([downloader_coroutine]))
for task in done:
    try:
        result = task.result()  # This is a DownloadResult
    except aiohttp.ClientError:
        # An error occurred.
        pass
---
As an importer, I need to download files concurrently.
As an importer, I want to validate downloaded files.
As an importer, I am not required to keep all content (units) and artifacts in memory to support concurrent downloading.
As an importer, I need a way to link a downloaded file to an artifact without keeping all content units and artifacts in memory.
As an importer, I can perform concurrent downloading using a synchronous pattern.
As an importer, concurrent downloads must share resources such as sessions, connection pools, and auth tokens across individual downloads.
As an importer, I can customize how downloading is performed. For example, to support mirror lists.
As an importer, concurrent downloading must limit the number of simultaneous connections. Downloading 5k artifacts cannot open 5k connections.
As an importer, I can terminate concurrent downloading at any point and not leak resources.
As an importer, I can download using any protocol. Starting with HTTP/HTTPS and FTP.
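The concurrency, memory, and connection-limit requirements above can be sketched with plain asyncio. This is an illustrative sketch only, not the Pulp API: `download`, `download_all`, and the URLs are assumed names, and the network I/O is simulated.

```python
import asyncio

MAX_CONNECTIONS = 5  # cap simultaneous transfers; downloading 5k artifacts must not open 5k connections


async def download(url, semaphore):
    # Hypothetical downloader: acquire a slot before opening a connection,
    # so at most MAX_CONNECTIONS transfers are in flight at once.
    async with semaphore:
        await asyncio.sleep(0)  # stand-in for the actual network I/O
        return (url, 'OK')


async def download_all(urls):
    semaphore = asyncio.Semaphore(MAX_CONNECTIONS)
    tasks = [asyncio.ensure_future(download(u, semaphore)) for u in urls]
    results = []
    # as_completed yields each download as it finishes, so the caller can
    # link a file to its artifact without holding everything in memory.
    for finished in asyncio.as_completed(tasks):
        results.append(await finished)
    return results


urls = ['http://example.com/artifact/%d' % n for n in range(20)]
loop = asyncio.new_event_loop()
results = loop.run_until_complete(download_all(urls))
loop.close()
```

The semaphore addresses the connection-limit use case directly; cancelling the outstanding tasks before closing the loop would address the clean-termination use case.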
Streamer¶
As the streamer, I need to download files related to published artifacts and metadata but delegate the implementation (protocol, settings, credentials) to the importer. The implementation must be a black-box.
As the streamer, I can download using any protocol supported by the importer.
As the streamer, I want to validate downloaded files.
As the streamer, concurrent downloads must share resources such as sessions, connection pools, and auth tokens across individual downloads without having knowledge of such things.
As the streamer, I need to support complex downloading such as mirror lists. This complexity must be delegated to the importer.
As the streamer, I need to bridge the downloaded bit stream to the Twisted response. The file is not written to disk.
As the streamer, I need to forward HTTP headers from the download response to the twisted response.
As the streamer, I can download using (the same) custom logic as the importer, such as supporting mirror lists.
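One way to satisfy the black-box requirement above is for the importer to hand the streamer an opaque downloader factory: the streamer asks for a downloader by URL and never sees the protocol, settings, or credentials behind it. A minimal sketch, with assumed names (`get_downloader`, `fetch`) that are not the actual Pulp interfaces:

```python
class HttpDownloader:
    # Stand-in for an importer-configured downloader; the real one would
    # carry the session, connection pool, and credentials.
    def __init__(self, url, credentials=None):
        self.url = url
        self.credentials = credentials

    def fetch(self):
        # Stand-in for the actual transfer.
        return 'bits-from:' + self.url


class Importer:
    credentials = 'secret-token'

    def get_downloader(self, url):
        # The importer alone decides protocol, settings, and credentials.
        return HttpDownloader(url, credentials=self.credentials)


class Streamer:
    def __init__(self, importer):
        self.importer = importer

    def stream(self, url):
        # Black box: the streamer only knows the downloader interface,
        # never its implementation.
        downloader = self.importer.get_downloader(url)
        return downloader.fetch()


streamer = Streamer(Importer())
content = streamer.stream('http://example.com/a')
```

Because the streamer holds only the factory, any custom importer logic (mirror lists, auth tokens, shared sessions) comes along for free.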
Plugin Writer¶
As a plugin writer, I can add custom behavior to downloaders to support Mirror Lists.
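For example, a plugin writer might subclass a downloader and override its call logic to walk a mirror list, falling back to the next mirror on failure. A sketch under assumed names (`HttpDownload` and `_fetch` here are stand-ins, not the real classes):

```python
class HttpDownload:
    def __init__(self, url):
        self.url = url

    def _fetch(self, url):
        # Stand-in for the real transfer.
        if 'bad-mirror' in url:
            raise IOError('connection refused')
        return 'content-of:' + url

    def __call__(self):
        return self._fetch(self.url)


class MirrorListDownload(HttpDownload):
    """Custom behavior: try each mirror in turn until one succeeds."""

    def __init__(self, mirrors):
        super().__init__(mirrors[0])
        self.mirrors = mirrors

    def __call__(self):
        error = None
        for mirror in self.mirrors:
            try:
                return self._fetch(mirror)
            except IOError as e:
                error = e  # remember the failure, try the next mirror
        raise error  # every mirror failed


download = MirrorListDownload(['http://bad-mirror/x', 'http://good/x'])
result = download()
```

The subclass changes only how the download is driven; validation, writing, and resource sharing stay in the base class, which is the extension-through-composition style the Jupiter design emphasizes.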