Changing a repository's name changes its URI
If I change the name of a Pulp 3 repository, then the URI to that repository changes. Though I haven't checked, there's a reasonable chance that this behaviour is also true for other types of resources too.
This is bad! Why? Because cool URIs don't change. Say it with me: Cool URIs don't change. There's a couple good reasons for URIs to be immutable.
One of the biggest benefits of RESTful HTTP APIs is that they're cache-friendly. (If you're not aiming for a high degree of cache-ability, then why are you even trying to make a RESTful HTTP API? Just make an RPC HTTP API, or something else.) When a resource is changed with an HTTP PUT, PATCH, or POST, then that resource isn't supposed to move. It's merely been updated. Caches count on this. If an HTTP PUT, PATCH or POST changes the URI of a resource, then caches can't do their job. A cache that tries really hard to do its job might break a client, and a client that works around caches with cache-busting headers will quickly drown the backing application.
There's other issues, too. For example, how can a client deterministically differentiate resources? For example, let's say that one client asks Pulp to asynchronously rename repository "bar" to "biz" and repository "foo" to "bar," and a second client asks for repository "bar" to be published. Which repository is published? There's absolutely no way to know. Mutable URIs are a recipe for race conditions!
"But that's a multi-user example, Jeremy! We're only serving single-client use cases."
Sure. That's unfortunate. But let's think about how this might affect a single client. Let's say that a client wants to rename repository "foo" to "bar" and "bar" to "foo." In what order should the client do this? It can't, at least not without using a temporary name. Let's hope that that other name isn't already used. This introduces unnecessary complexity and fragility into all clients. And what happens if any of these operations are done in the wrong order, or if the server (which is presumably multi-threaded) does things in a different order? The application breaks.
A single client might have many objects in-memory at a given time, and therefore several references to a repository. Imagine that a repository is renamed. How can all of these objects be updated? Can they all just issue an HTTP GET request to the href that they have? Nope. They'll get information about the wrong repository, or a 404. The client has no good way to update its state. Let's hope that the client doesn't do anything wild like cache information on disk for re-use between sessions. one of the big benefits of RESTful HTTP APIs is that they're stateless. Mutable URIs break statelessness.
One of the other big benefits of a RESTful HTTP API is that it promotes client-server decoupling. A good RESTful API gives opaque, fully-built URIs to clients. Client's don't need to construct URIs - they just ask for them. This makes client logic considerably simpler. A client that has to construct URIs by interpolating keys into a templated URI is much more complex. It needs to know which pieces of information that a server application provides might be used as keys, where those keys need to be placed into a URI, and the fact that when a property of a resource changes, existing cached URIs might become out of date and need to be re-built. And as discussed in the previous paragraph, there's already no good way to do this. So now clients need to do the nigh-impossible task of updating in-memory or on-disk objects whose URI might have changed, aaaand they need to perform URI templating with possibly out-of-date information. Why in the world should clients be forced to eat this complexity?
#9 Updated by Ichimonji10 over 2 years ago
The following isn't a part of the bug, but just a recommendation.
I'd recommend that no natural keys be used in URLs at all. I was reviewing some code yesterday, and the code interpolated a repository name directly into a URL. That kind of just-plain-bad client code design is enabled by using natural keys. If synthetic keys are used in URLs, then clients don't even have the opportunity to make mistakes like that. On the server side, code becomes simpler when synthetic keys are used. The server no longer needs to say "try to use this natural key if there's no conflict, and otherwise, find a synthetic key that's not used. Instead, it just needs to say "find a synthetic key that's not used."
#10 Updated by mhrivnak over 2 years ago
Repository names should be immutable. The same should probably go for all natural keys on other models. This was likely a simple oversight.
I think it is still ok and valuable to have natural keys in URLs. It could enable someone to poorly design their client, but I don't think we should actively prevent that by obscuring URLs. Readable URLs can be helpful in lots of situations, not the least of which is troubleshooting. But in any case, this issue is not the best place for this discussion. Feel free to resume it in a new issue if you like.
#13 Updated by email@example.com over 2 years ago
I don't think the correct answer is to make natural keys immutable. Instead, URIs should only contain synthetic keys. One reason that good database design includes synthetic primary keys is that natural keys are typically mutable. Natural keys are meaningful to users and they expect to be able to rename things. I don't see how URIs are any different. The URI is the immutable canonical reference for a REST resource so it should not contain natural keys either. The only reason (I can think of) to include natural keys in URIs is to make them human readable which I don't find a compelling reason to do so. Flows starting with users who know the natural key should do a lookup to get the URI and then proceed.
Please register to edit this issue