Story #1908: As a user, I can rely on crane to check the destination target before redirecting - Crane - Pulp

Added by jluza almost 8 years ago. Updated about 5 years ago.

Status: CLOSED - WONTFIX
Priority: Normal
Assignee:
Start date:
Due date:
% Done:
Estimated time:
Platform Release:
Target Release - Crane: master
Groomed:
Sprint Candidate:
Tags: Pulp 2
Sprint:
Quarter:

Description

Consider the following situation:

repo1 - layers: [a,b,c], redirect url: https://foo.bar/repo1/layers/
repo2 - layers: [a,b,d], redirect url: https://foo.bar/repo2/layers/

Let's assume https://foo.bar/repo2/layers/ is unreachable or broken for some reason. When a user runs
docker pull repo1
it fails because repo1 and repo2 share layer a, so crane redirects GET /v1/images/088b4505aa3adc3d35e79c031fa126b403200f02f51920fbd9b7c503e87c7a2c/layer to
https://foo.bar/repo2/layers/, which is not reachable.

Better solution: crane sends an HTTP HEAD request to the destination first and redirects only if the destination is reachable; otherwise it chooses the next destination from the list.
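
The proposed behaviour could look roughly like the sketch below. It is not crane's code: the function names `head_ok` and `choose_redirect`, the 2-second timeout, and the rule that any status below 400 counts as reachable are all assumptions made for illustration. The probe is passed in as a parameter so the selection logic can be exercised without network access.

```python
import urllib.error
import urllib.request
from typing import Callable, Iterable, Optional


def head_ok(url: str, timeout: float = 2.0) -> bool:
    """Hypothetical probe: True if an HTTP HEAD to `url` answers with a status below 400."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status < 400
    except (urllib.error.URLError, OSError, ValueError):
        # Covers HTTP errors, DNS failures, refused connections, timeouts and bad URLs.
        return False


def choose_redirect(candidates: Iterable[str],
                    probe: Callable[[str], bool] = head_ok) -> Optional[str]:
    """Return the first candidate URL the probe accepts, or None if none is reachable."""
    for url in candidates:
        if probe(url):
            return url
    return None
```

With the example above, a probe that reports https://foo.bar/repo2/layers/ as down would make `choose_redirect` skip it and return https://foo.bar/repo1/layers/ for layer a.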

Updated by rbarlow almost 8 years ago

Do we still need to support the Docker v1 protocol? Docker v2 content won't do this with the current implementation.

Updated by bmbouter almost 8 years ago

Tracker changed from Issue to Story
Subject changed from [RFE] crane checks destination target before redirect to As a user, I can rely on crane to check the destination target before redirecting
Groomed set to No
Sprint Candidate set to No

Since this is an RFE, I'm rewriting the subject as a user story title and updating the tracker to be type Story.

Updated by jluza almost 8 years ago

If you are really sure that v2 content doesn't suffer from this issue, then I think we can probably drop this issue, because support of v1 will be dropped in late July this year.

Updated by rbarlow almost 8 years ago

Status changed from NEW to CLOSED - WONTFIX

Hello jluza!

Crane doesn't know about the blobs (layers) for Docker v2 content like it did for Docker v1 content, so it can only redirect repositories. Thus, I'll close this as wontfix since we don't plan to do more work on Docker v1 at this time.

Thanks for the report!

Updated by mhrivnak almost 8 years ago

There are some important factors to consider for the general feature being requested, which is for crane to make sure a resource really is available before redirecting to it. I see the issue was closed while I was writing this, but I'll add it anyway for posterity.

In a case where the redirect source requires authentication, we would need to teach crane how to authenticate and give it credentials. For example, consider content that requires an entitlement certificate for access. Crane's HEAD request would need to use a valid entitlement cert. Someone would have to make sure the entitlement cert gets replaced when it expires. Or for any other kind of auth, credentials will change at some point, and someone needs to handle that. This also increases the complexity of deploying crane, because now the deployer would have to inject and maintain secrets.

Crane currently does not require outbound network access of any kind. This feature would not only add a dependency on outbound access (which to be fair often isn't a big deal), but would require that crane has access to the same network resources as the client it is responding to. For most deployments that is probably not a big deal, but it is a valuable point of flexibility that would be lost.

Related to needing access to the same network resources, consider that thanks to things like DNS and network routing, crane's view of https://foo.bar may be very different from the client's.

Making requests before responding with redirects would increase crane's response time by a huge factor. That may be acceptable, but it's a consideration. For a high-traffic deployment, doubling the response time might mean doubling the amount of infrastructure running crane.

Making requests also adds a big new opportunity for failure. Right now, crane is basically bulletproof. It's very simple. Everything it needs to generate a response is in memory. Depending on remote requests brings up many new edge cases and failure scenarios that need to be handled. For example, how long should it wait for a response? Consider what would happen if one server for one repo stopped responding or became very slow. On a busy crane deployment, you could quickly end up with all of your available request handlers waiting on a response or timeout from that one server, and no handlers available to respond to client requests for other resources. Guarding against that would be difficult.
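
To put a rough number on the handler-exhaustion concern, Little's law says the average number of busy handlers equals the arrival rate multiplied by the time each request holds a handler. The sketch below applies it; the function names and the figures used in the note after it are illustrative assumptions, not measurements from any crane deployment.

```python
def handlers_tied_up(requests_per_second: float, wait_seconds: float) -> float:
    """Average number of handlers blocked on one slow backend (Little's law: L = rate * wait)."""
    return requests_per_second * wait_seconds


def is_saturated(pool_size: int, requests_per_second: float, wait_seconds: float) -> bool:
    """True if requests for the slow backend alone would occupy the whole handler pool."""
    return handlers_tied_up(requests_per_second, wait_seconds) >= pool_size
```

For instance, 5 requests per second for layers behind an unresponsive server, each waiting out a 10-second timeout, would hold 50 handlers on average, more than a hypothetical pool of 32.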

Depending on what scenario you are concerned about, a better option may be to use a real monitoring tool to monitor resource availability, and disable/enable repositories in crane as appropriate. That could be an interesting point of integration to explore, and it is definitely a good way to guard against the timeout issue above.
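
One way to picture that integration is sketched below: an external monitor flips a per-repository flag, and the redirect decision reads only in-memory state, so no request handler ever waits on a remote server. The class `AvailabilityRegistry`, its methods, and the `(repo_id, url)` candidate format are hypothetical; crane does not expose such an interface as far as this thread shows.

```python
import threading
from typing import Iterable, Optional, Set, Tuple


class AvailabilityRegistry:
    """Hypothetical in-memory record of repositories that a monitor has disabled."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._disabled: Set[str] = set()

    def set_available(self, repo_id: str, available: bool) -> None:
        """Called by the monitoring integration when a repository goes up or down."""
        with self._lock:
            if available:
                self._disabled.discard(repo_id)
            else:
                self._disabled.add(repo_id)

    def choose_redirect(self, candidates: Iterable[Tuple[str, str]]) -> Optional[str]:
        """Return the URL of the first candidate whose repository is not disabled."""
        with self._lock:
            for repo_id, url in candidates:
                if repo_id not in self._disabled:
                    return url
        return None
```

Because the lookup never leaves the process, its cost stays constant even while a backend is down; how quickly an outage is noticed then depends on the monitor's check interval rather than on crane.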

Updated by bmbouter about 5 years ago

Tags Pulp 2 added
