Issue #2542
closed
Streamer needs to try all available catalog entries.
Status:
CLOSED - CURRENTRELEASE
Description
Streamer needs to try all available catalog entries. This addresses use cases where multiple catalog entries exist for the same path (file) but some of them are no longer valid. It also supports cases where a path (file) is available from multiple remotes but one or more of them are temporarily unavailable.
The streamer should fetch all entries sorted by (created(date), revision) to ensure it's trying the newest entry first.
https://github.com/pulp/pulp/blob/master/streamer/pulp/streamer/server.py#L172
This is related to the work done here: https://pulp.plan.io/issues/2503, because that bug happened to result in catalog entries where the remote files were no longer available.
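A minimal sketch of the ordering described above, not the actual streamer code. The `Entry` type and its field names are hypothetical stand-ins for a catalog entry; only the (created, revision) sort key comes from this issue.

```python
from collections import namedtuple
from datetime import datetime

# Hypothetical stand-in for a catalog entry; the field names are assumptions.
Entry = namedtuple("Entry", ["path", "url", "created", "revision"])


def newest_first(entries):
    """Return entries ordered by (created, revision), newest first."""
    return sorted(entries, key=lambda e: (e.created, e.revision), reverse=True)


entries = [
    Entry("a.rpm", "http://remote-1/a.rpm", datetime(2016, 12, 1), 5),
    Entry("a.rpm", "http://remote-2/a.rpm", datetime(2017, 1, 1), 1),
    Entry("a.rpm", "http://remote-3/a.rpm", datetime(2017, 1, 1), 2),
]

# The streamer would then try these URLs in this order.
ordered_urls = [e.url for e in newest_first(entries)]
```

Sorting on the tuple means revision only breaks ties between entries with the same created date.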
- Description updated (diff)
- Status changed from NEW to ASSIGNED
- Assignee set to jortel@redhat.com
- Sprint/Milestone set to 32
- Triaged changed from No to Yes
Which http result code should be returned when the streamer tries multiple catalog entries? The choices seem to be:
- always return 404
- return the http code for the last entry tried.
Either could be a little misleading, and the user might have to look at the log to determine what happened. I'm leaning toward returning the last code because it will likely be the most accurate in more cases, especially for cases where only 1 catalog entry exists.
Thoughts?
- Status changed from ASSIGNED to POST
I think the response code should be chosen within the context of the current requester and responder, and should not consider context of 3rd-party interactions. A client requests a file. The streamer tries to return it, but can't find it. The client doesn't care why. Filesystem permission error, selinux, file isn't on disk where it should be, auth error to a remote source, etc. Regardless of the reason, the streamer wasn't able to find the file. It should return a 404.
Here is the official definition, which I think fits perfectly: https://tools.ietf.org/html/rfc7231#section-6.5.4
Logging reasons for failures of each catalog entry would be a great idea though.
mhrivnak wrote:
I think the response code should be chosen within the context of the current requester and responder, and should not consider context of 3rd-party interactions. A client requests a file. The streamer tries to return it, but can't find it. The client doesn't care why. Filesystem permission error, selinux, file isn't on disk where it should be, auth error to a remote source, etc. Regardless of the reason, the streamer wasn't able to find the file. It should return a 404.
I do like the low-complexity and deterministic nature of this approach.
I do agree that we need to return 404 in the situations mentioned above.
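For illustration, the behavior agreed on in this thread could be sketched roughly as follows. This is only a sketch under assumptions: `stream_first_available` and the `download` callable are hypothetical names, not part of the streamer's API. Each failure is logged with its reason, and the client gets a 404 only after every entry has been tried.

```python
import logging

log = logging.getLogger("streamer_sketch")  # hypothetical logger name

OK = 200
NOT_FOUND = 404


def stream_first_available(entries, download):
    """Try each catalog entry in order and return (status, content).

    `download` is a hypothetical callable that returns the file content
    for an entry or raises an exception when the entry cannot be fetched.
    """
    for entry in entries:
        try:
            return OK, download(entry)
        except Exception as reason:
            # Record why this entry failed, then move on to the next one.
            log.warning("catalog entry %r failed: %s", entry, reason)
    # Every entry failed (or none existed): the client only sees a 404.
    return NOT_FOUND, None
```

The per-entry reasons stay in the log, so the response code remains deterministic regardless of which remote failed or why.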
- Sprint/Milestone changed from 32 to 33
- Status changed from POST to MODIFIED
- Platform Release set to 2.12.1
- Status changed from MODIFIED to 5
- Status changed from 5 to CLOSED - CURRENTRELEASE
- Sprint/Milestone deleted (33)
Streamer tries all catalog entries. closes #2542