Issue #1551
Story #1150 (closed): As a user, I can lazily fetch repositories
Requests for files not in a repository are being forwarded to the streamer.
Description
I noticed some requests are being forwarded to the pulp_streamer even though the requested files do not exist in the published repository:
Jan 19 16:04:35 dev pulp_streamer[22239]: [-] "127.0.0.1" - - [19/Jan/2016:16:04:34 +0000] "GET /var/lib/pulp/published/yum/master/yum_distributor/el7-ks/1453218564.99/.treeinfo HTTP/1.1" 404 - "-" "python-requests/2.9.1"
[vagrant@dev nectar]$ cd /var/lib/pulp/published/yum/master/yum_distributor/el7-ks/1453218564.99/
[vagrant@dev 1453218564.99]$ ls -lah | grep tree
lrwxrwxrwx. 1 apache apache 122 Jan 19 15:49 texlive-pst-tree-svn24142.1.12-32.el7.noarch.rpm -> /var/lib/pulp/content/units/rpm/c6e3/c6e3aeb9-727a-4beb-bcc9-1cdb1640cc65/texlive-pst-tree-svn24142.1.12-32.el7.noarch.rpm
lrwxrwxrwx. 1 apache apache 102 Jan 19 15:49 tree-1.6.0-10.el7.x86_64.rpm -> /var/lib/pulp/content/units/rpm/5799/5799bd27-581e-4c94-9852-2cf13f4c41ce/tree-1.6.0-10.el7.x86_64.rpm
lrwxrwxrwx. 1 apache apache 91 Jan 19 15:49 treeinfo -> /var/lib/pulp/content/units/distribution/4507/4507be24-9ca6-432a-bad7-6aeb2c998498/treeinfo
[vagrant@dev 1453218564.99]$
The streamer does the right thing and returns a 404, but doing so involves a database query to look for a catalog entry. It would be much better for both the client (which has to deal with a 302) and the server (which has to sign and verify those redirects, handle them, query the DB, etc.) if the content WSGI app recognized that these files don't exist and returned a 404 itself.
Updated by jortel@redhat.com almost 9 years ago
Let's discuss the most efficient approach here.
Updated by jcline@redhat.com almost 9 years ago
- Status changed from NEW to POST
- Assignee set to jcline@redhat.com
- Triaged changed from No to Yes
Updated by jcline@redhat.com almost 9 years ago
I talked with Brian and he said the concern is an additional check on the filesystem. While this is true and it does happen for every request, the alternative (redirecting the request) involves:
- A response to the client via Apache (after signing the URL - admittedly quick since we shouldn't be hitting the disk)
- A new request from the client to the reverse proxy
- Passage through the WSGI authentication script
- A new request to Squid
- Squid has to perform a lookup on its cache
- A new request to the Twisted streamer
- A database query to look up the catalog entry
- A 404 response from the streamer back through Squid and Apache
Added by Jeremy Cline almost 9 years ago
Revision cca7a778 | View on GitHub
Don't forward requests for files that don't exist to the streamer.
Requests for directories or files that don't exist in the repository should not be forwarded to the streamer. This PR updates the view for the content WSGI app to check for the link in the published repository before forwarding a request.
closes #1551
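For readers following along, the check described in the commit message could look roughly like the sketch below. This is not the code from revision cca7a778; the function name `classify_request` and its return values are made up for illustration. It assumes a published repository is a directory of symlinks into content storage, that a link whose target is missing means the unit has not been downloaded yet, and that no link at all means the file is not part of the repository.

```python
import os


def classify_request(published_root, request_path):
    """Decide how to answer a request for a file under a published repository.

    Hypothetical helper, not Pulp's actual view code. Returns one of:
      'serve'     - the link exists and its target is on disk
      'redirect'  - the link exists but its target is missing (not yet
                    downloaded), so forwarding to the streamer makes sense
      'not_found' - no link exists, so answer 404 without forwarding
    """
    root = os.path.normpath(published_root)
    full_path = os.path.normpath(os.path.join(root, request_path.lstrip('/')))

    # Refuse paths that escape the published directory (e.g. via '..').
    if full_path != root and not full_path.startswith(root + os.sep):
        return 'not_found'

    # os.path.exists follows symlinks: True only if the target is present.
    if os.path.exists(full_path):
        return 'serve'

    # os.path.islink does not follow the link: True for a dangling symlink.
    if os.path.islink(full_path):
        return 'redirect'

    # Neither a file nor a link: the repository never published this path.
    return 'not_found'
```

With the directory listing from the description, a request for `treeinfo` would hit one of the first two branches, while `.treeinfo` would fall through to `'not_found'` at the cost of a couple of filesystem lookups instead of the full redirect chain.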
Updated by jcline@redhat.com almost 9 years ago
- Status changed from POST to MODIFIED
Updated by rbarlow almost 9 years ago
- Status changed from MODIFIED to 5
- Platform Release set to 2.8.0
Updated by dkliban@redhat.com over 8 years ago
- Status changed from 5 to CLOSED - CURRENTRELEASE