Task #1195
Story #1150 (closed): As a user, I can lazily fetch repositories
Develop the pulp-streamer
100%
Description
For lazy-sync to work, Squid needs another software component to act as a client that presents the correct certificate to the feed URL. That streaming component is called the "pulp-streamer" for now. It is also responsible for dispatching the Celery task defined in task #1181, which causes Pulp to download and save a copy of the unit that the streamer just fetched. The dispatch will need to use apply_async_with_reservation and lock on the unit id and path, which together guarantee uniqueness in the catalog. See task #1181 for more details on how and why.
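Why the unit id and path together make a usable lock can be illustrated with a minimal sketch. The names below (`reservation_key`, `PendingDownloads`) are hypothetical and are not part of Pulp. The real dispatch would go through apply_async_with_reservation, which is not called here because it needs a running Pulp server. This sketch simply refuses a duplicate reservation, which is a simplification of whatever the real reservation mechanism does.

```python
import json


def reservation_key(unit_id, path):
    """Build one lock key from the unit id and path (hypothetical format)."""
    # JSON-encoding the pair keeps ("a", "b/c") and ("a/b", "c") distinct.
    return json.dumps([unit_id, path])


class PendingDownloads:
    """In-memory stand-in for a reservation system; illustration only."""

    def __init__(self):
        self._reserved = set()

    def try_reserve(self, unit_id, path):
        """Return True if this call took the reservation, False if it is held."""
        key = reservation_key(unit_id, path)
        if key in self._reserved:
            return False
        self._reserved.add(key)
        return True

    def release(self, unit_id, path):
        """Drop the reservation once the download task has finished."""
        self._reserved.discard(reservation_key(unit_id, path))


pending = PendingDownloads()
print(pending.try_reserve("unit-1", "Packages/a.rpm"))  # first request
print(pending.try_reserve("unit-1", "Packages/a.rpm"))  # duplicate request
print(pending.try_reserve("unit-1", "Packages/b.rpm"))  # same unit, other path
```

Two requests for the same unit and path map to the same key, so only one download would be in flight for that catalog entry at a time.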
Configuration
The configuration will come from the server.conf file. This is appropriate because the streamer will need many things from server.conf (e.g., the database settings).
- Have the streamer read its port from an entry named 'streamer_port' in the [lazy] section of server.conf. This needs a default too; I suggest 8751, which is unused according to an IANA page I looked at.
- Have the streamer read an entry named 'streamer_interface' in the [lazy] section of server.conf. This field should default to 'lo', which causes the streamer to listen on localhost only. The field accepts a comma-separated list that limits which interfaces the streamer listens on.
- The streamer will use a header to tell Squid how long to cache the content it delivers. This should be configurable using the streamer_cache_timeout setting in the [lazy] section. The value is expressed in seconds and defaults to 86400 (the number of seconds in one day).
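The three settings and their proposed defaults can be sketched as follows. This is only an illustration using Python 3's standard `configparser`, not Pulp's own config loader; the function name `load_lazy_settings` and the returned dictionary keys are assumptions made for the example.

```python
from configparser import ConfigParser

# Defaults as proposed in this task; values are strings, as in an INI file.
LAZY_DEFAULTS = {
    "streamer_port": "8751",
    "streamer_interface": "lo",
    "streamer_cache_timeout": "86400",
}


def load_lazy_settings(conf_text):
    """Parse the [lazy] section of server.conf-style text, applying defaults."""
    parser = ConfigParser()
    parser.read_dict({"lazy": LAZY_DEFAULTS})  # load defaults first
    parser.read_string(conf_text)              # then let the file override them
    section = parser["lazy"]
    interfaces = [
        name.strip()
        for name in section.get("streamer_interface").split(",")
        if name.strip()
    ]
    return {
        "port": section.getint("streamer_port"),
        "interfaces": interfaces,
        "cache_timeout": section.getint("streamer_cache_timeout"),
    }


EXAMPLE_CONF = """
[lazy]
streamer_interface = lo, eth0
"""

print(load_lazy_settings(EXAMPLE_CONF))
```

Only the interface list is overridden in the example, so the port and cache timeout fall back to 8751 and 86400.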
Requirements
- Use the "unit catalog" to determine which hostname and URL the incoming request should be translated to
- Make a request to the URL determined by the "unit catalog", presenting the correct SSL client certificate for that request
- Have the streamer verify the identity of the server side of the SSL connection, consistent with Pulp's existing functionality
- Pass through the headers from the server as-is. This requires #1179 to be fixed first
- Overwrite the "Cache-Control" header, setting it to the streamer_cache_timeout value specified in server.conf and appending "public". The "public" part is not configurable.
- Headers must be delivered to the client before any data.
- Stream the data to the client as the streamer receives it. This is not store-and-forward software; it should stream in chunks.
- Handle multiple concurrent downloads efficiently.
- Dispatch a Celery task (from task #1181) that causes Pulp to download and save a copy of the unit.
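The header and streaming requirements above can be sketched roughly as below. This is not the streamer's actual code: the function names and the `write_headers`/`write_data` callbacks are hypothetical, and expressing the timeout as a `max-age` directive is an assumption, since the task only says the header is set from streamer_cache_timeout with "public" appended.

```python
import io


def build_response_headers(upstream_headers, cache_timeout):
    """Pass upstream headers through, replacing any Cache-Control value."""
    headers = {
        name: value
        for name, value in upstream_headers.items()
        if name.lower() != "cache-control"
    }
    # ASSUMPTION: the timeout is sent as max-age; "public" is always appended.
    headers["Cache-Control"] = "max-age={}, public".format(cache_timeout)
    return headers


def iter_chunks(source, chunk_size=8192):
    """Yield data from a file-like object piece by piece, never all at once."""
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            break
        yield chunk


def stream(upstream_headers, body, write_headers, write_data, cache_timeout=86400):
    """Send headers first, then forward each chunk as soon as it is read."""
    write_headers(build_response_headers(upstream_headers, cache_timeout))
    for chunk in iter_chunks(body):
        write_data(chunk)


# Usage: record the order in which headers and data reach the "client".
events = []
stream(
    {"Content-Type": "application/x-rpm", "cache-control": "no-cache"},
    io.BytesIO(b"x" * 20000),
    write_headers=lambda h: events.append(("headers", h)),
    write_data=lambda c: events.append(("data", len(c))),
)
print(events)
```

The upstream Cache-Control value is dropped regardless of its capitalization, and the headers event is always recorded before the first data chunk.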
Alternate Content Sources
The streamer needs to be integrated with alternate content sources. When Pulp has an alternate content source configured, a lazy repo can receive content from it: the streamer reads the bits from disk instead of from the upstream --feed location.
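One way to picture this fallback is the sketch below. It is only an illustration under the assumption that the alternate content source resolves to a local file path; `open_content` and `fake_upstream` are hypothetical names, not Pulp APIs.

```python
import io
import os
import tempfile


def open_content(local_path, open_upstream):
    """Prefer a copy on disk; otherwise fall back to the upstream feed."""
    if local_path and os.path.isfile(local_path):
        return open(local_path, "rb")
    return open_upstream()


def fake_upstream():
    """Stand-in for a request to the upstream --feed location."""
    return io.BytesIO(b"from feed")


# A temporary file stands in for content held by an alternate content source.
with tempfile.NamedTemporaryFile(delete=False) as handle:
    handle.write(b"from disk")
    disk_path = handle.name

with open_content(disk_path, fake_upstream) as source:
    print(source.read())  # content read from the local file
with open_content(None, fake_upstream) as source:
    print(source.read())  # no local copy, so the feed is used

os.remove(disk_path)
```

Either way the caller gets a file-like object, so the same chunked streaming loop can serve both cases.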
A stream-based proof of concept was developed (see attachment); use that as a starting point. This story does not include any RPM packaging, init script work, or systemd unit work; that is all part of another story.
Files
Related issues
Updated by bmbouter over 9 years ago
- Blocked by Story #1179: As a developer I can receive headers while using download_one() added
Updated by bmbouter about 9 years ago
Updated server.py example to incorporate example code for integration with Nectar events from nectar PR 29 and also added several TODO outlines of remaining streamer development.
Updated by jcline@redhat.com about 9 years ago
- Status changed from NEW to ASSIGNED
- Assignee set to jcline@redhat.com
Added by Jeremy Cline about 9 years ago
Revision 8c303f54 | View on GitHub
ref #1195 - Implement the pulp-streamer.
This commit adds the 'lazy' module inside of pulp.server. It also adds streamer settings to Pulp's server.conf.
Updated by jcline@redhat.com about 9 years ago
- Status changed from ASSIGNED to POST
- % Done changed from 70 to 100
PR against the feature branch: https://github.com/pulp/pulp/pull/2065
Updated by jcline@redhat.com almost 9 years ago
- Status changed from POST to MODIFIED
Updated by rbarlow almost 9 years ago
- Status changed from MODIFIED to 5
- Platform Release set to 2.8.0
Updated by dkliban@redhat.com over 8 years ago
- Status changed from 5 to CLOSED - CURRENTRELEASE