Task #2168


Nectar is a wrapper around requests that attempts to provide an asynchronous API while masking the true requests API. Its aim is to make downloading easier, but it causes many problems in both ease of use and error handling/reporting. Furthermore, because it attempts to be general, it offers very few features that are useful to Pulp developers.

Below is a rough plan for a download API. It needs to be fleshed out and a few choices need to be made, which is what this task is about. It then needs to be implemented as part of a plugin API.

h2. Feature Set

* An easy-to-use synchronous API that can handle HTTP/HTTPS and provides features like connection pooling, granular TLS configuration, etc.

* An easy-to-use asynchronous API that can handle HTTP/HTTPS and provides features like connection pooling, granular TLS configuration, a callback system, etc.

* Automatic handling of content validation (checksums, size, etc.)

* Automatic handling of storage to an appropriate backend (local storage, some object store, a temporary file, etc).

* Automatic creation of unit files that track where each unit is stored, how to validate it (in case of bitrot), all the places it can come from with optional priority weighting (this replaces both the lazy catalog and the alternate content source catalog), network authentication information, etc.

* Automatic progress reporting

* Optional creation of the content units (maybe even associating them with a provided repository?)

* Shared connection pooling across all repositories and plugins

* Optional global concurrent connection configuration
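
To make the unit-file idea above more concrete, here is a minimal sketch of what such a record might look like. The names @Source@, @UnitRecord@, @sources_by_priority@ and @is_intact@ are hypothetical, as is the convention that a higher weight is tried first; none of this is an existing Pulp API.

```python
import hashlib
import os
from dataclasses import dataclass, field


@dataclass
class Source:
    """One place a unit can be fetched from (hypothetical structure)."""
    url: str
    weight: int = 0  # assumed convention: higher weight is tried first


@dataclass
class UnitRecord:
    """Tracks where a downloaded unit lives and how to re-validate it."""
    storage_path: str
    size: int
    sha256: str
    sources: list = field(default_factory=list)

    def sources_by_priority(self):
        # Descending weight; sorted() is stable, so ties keep insertion order.
        return sorted(self.sources, key=lambda s: -s.weight)

    def is_intact(self, chunk_size=65536):
        # Re-check the stored file's size and digest to detect bitrot.
        if os.path.getsize(self.storage_path) != self.size:
            return False
        digest = hashlib.sha256()
        with open(self.storage_path, "rb") as f:
            for chunk in iter(lambda: f.read(chunk_size), b""):
                digest.update(chunk)
        return digest.hexdigest() == self.sha256
```

A single record of this kind could serve the role of both catalogs mentioned above, since every known source for a unit is kept next to its validation data.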

h2. Synchronous API

I think that this should simply be requests, with some wrapping code to handle where downloads are stored, post-download validation, and model creation. We should expose the session API as-is to the user.
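
As a rough illustration of that wrapping code, the sketch below streams a response to disk and validates it, while leaving the session entirely in the caller's hands. The function @download@ and the exception @ValidationError@ are hypothetical names; the only requests calls assumed are @Session.get(url, stream=True)@, @Response.raise_for_status()@ and @Response.iter_content()@. Model creation is left out.

```python
import hashlib


class ValidationError(Exception):
    """Raised when a download does not match its expected size or digest."""


def download(session, url, dest_path, expected_sha256=None, expected_size=None):
    """Stream `url` to `dest_path` through `session`, then validate it.

    `session` is expected to behave like requests.Session. It is passed in
    rather than hidden, so callers keep full control of TLS, pooling, etc.
    """
    digest = hashlib.sha256()
    size = 0
    response = session.get(url, stream=True)
    response.raise_for_status()
    with open(dest_path, "wb") as f:
        # Hash and count while writing so the file is only read once.
        for chunk in response.iter_content(chunk_size=65536):
            f.write(chunk)
            digest.update(chunk)
            size += len(chunk)
    if expected_size is not None and size != expected_size:
        raise ValidationError("size mismatch: got %d, expected %d" % (size, expected_size))
    if expected_sha256 is not None and digest.hexdigest() != expected_sha256:
        raise ValidationError("sha256 mismatch for %s" % url)
    return {"path": dest_path, "size": size, "sha256": digest.hexdigest()}
```

A plugin writer would create a @requests.Session()@, configure it however they like (for example via @session.verify@ and @session.cert@), and pass it in unchanged.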

h2. Asynchronous API

This is the API that requires some research and choices. I am aware of several asynchronous HTTP clients, and there may be others:

# grequests
# requests-futures (Python 3 only)
# Twisted's web client

Of the three, I think Twisted is probably the most robust. It provides a well-documented callback system (which is already used in the Pulp streamer) and it looks like it offers all the features we need, although the configuration of timeouts and retries looks light currently (they, of course, accept pull requests!).

Again, I recommend exposing whatever client library we choose to the user, with wrappers that provide additional functionality for plugin writers (unit storage and database record creation, handling non-unit temporary downloads, automatic validation, etc).
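
Whichever client is chosen, the plugin-facing wrapper would probably pair a callback interface with a cap on concurrency. The sketch below shows that shape using only the standard library's @concurrent.futures@ (a thread pool, not Twisted and not any of the libraries listed above). @AsyncDownloader@, its @fetch@ hook and the callback signatures are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


class AsyncDownloader:
    """Runs a blocking fetch function in a bounded pool and fires callbacks."""

    def __init__(self, fetch, max_concurrent=4):
        self._fetch = fetch  # hypothetical hook: callable(url) -> result
        # max_workers caps how many downloads run at the same time.
        self._pool = ThreadPoolExecutor(max_workers=max_concurrent)

    def submit(self, url, on_success, on_failure):
        future = self._pool.submit(self._fetch, url)

        def _done(fut):
            # Runs once the fetch has finished, normally in the worker thread.
            error = fut.exception()
            if error is not None:
                on_failure(url, error)
            else:
                on_success(url, fut.result())

        future.add_done_callback(_done)
        return future

    def close(self):
        # Wait for outstanding downloads, then release the worker threads.
        self._pool.shutdown(wait=True)
```

The @fetch@ hook is where a real client call would go, and storage, validation and record creation could be layered on in @on_success@. Because callbacks normally run in worker threads, any shared state they touch needs to be thread-safe.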