As a plugin writer, I can use an HttpDownloader that provides auto_decompress=True like aiohttp
By default, aiohttp decompresses everything it downloads, whether gzip or another compression type. This is great for typical clients, which would otherwise have to decompress the downloaded data themselves. For example, when downloading a foo.tar.gz file, the data you read from aiohttp is the actual contents of foo, not the compressed foo.tar.gz data.
Pulp needs to store data exactly as it was stored remotely. Decompressing the data and saving it in Pulp's backend is incorrect because Pulp thinks of that file as foo.tar.gz, not foo, and presents it to clients as foo.tar.gz. If we decompress it but then (incorrectly) save it as foo.tar.gz, clients will receive foo.tar.gz, try to decompress it a second time, and fail.
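The double-decompression failure described above can be reproduced with the standard library alone. This is only a sketch: the payload is a made-up stand-in for the remote foo.tar.gz bytes, and client_can_unpack is a hypothetical helper, not part of Pulp or aiohttp.

```python
import gzip

# Hypothetical payload standing in for the uncompressed contents of foo.
original = b"contents of foo.tar " * 100
# The bytes as they exist on the remote, i.e. foo.tar.gz.
remote_file = gzip.compress(original)

# Correct: store the bytes exactly as they were stored remotely.
stored_ok = remote_file

# Incorrect: the data was decompressed during download,
# but is still saved and served under the name foo.tar.gz.
stored_bad = gzip.decompress(remote_file)


def client_can_unpack(data: bytes) -> bool:
    """Return True if a client could gunzip the served bytes."""
    try:
        gzip.decompress(data)
    except OSError:  # gzip.BadGzipFile is a subclass of OSError
        return False
    return True
```

With these definitions, a client can unpack stored_ok but not stored_bad, because the second copy is no longer gzip data even though its name says it is.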
There are other situations, though, where we do want auto-decompression. Remote metadata is often compressed, for example, so when downloading temporary metadata we need to download it, decompress it, and then have our code parse it. So auto-decompression is really useful in some cases.
Enable Pulp's downloaders to auto-decompress by default, but provide an option to turn it off and download the raw binary data.
Updated by bmbouter over 5 years ago
- Status changed from MODIFIED to ASSIGNED
Actually, after investigating the original symptoms that motivated this issue, I believe we do not need to offer this feature. The Content-Encoding response header indicates the encoding applied to the response by the web server just before sending it. In all cases we want aiohttp to decompress this data automatically.
I'm going to negative commit this feature.
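To illustrate the distinction this comment relies on, here is a rough standard-library model of decoding driven only by the Content-Encoding header. It is a sketch under assumptions, not aiohttp's actual implementation: decode_body and the sample headers are hypothetical, and only gzip is handled.

```python
import gzip


def decode_body(headers: dict, body: bytes) -> bytes:
    """Undo only the encoding the server declares via Content-Encoding."""
    encoding = headers.get("Content-Encoding", "identity").lower()
    if encoding == "gzip":
        return gzip.decompress(body)
    # No declared encoding: the body is already the stored file, byte for byte.
    return body


# Stand-in for the foo.tar.gz file as stored on the remote.
file_bytes = gzip.compress(b"tar archive bytes")

# Case 1: the server sends the file as-is, with no Content-Encoding header.
plain = decode_body({"Content-Type": "application/gzip"}, file_bytes)

# Case 2: the server additionally gzips the response just before sending it.
wire = decode_body({"Content-Encoding": "gzip"}, gzip.compress(file_bytes))
```

In both cases the result equals file_bytes, so in this model the file is still stored exactly as it was on the remote, and removing the encoding applied in transit does not touch the file's own compression.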