The Distribution ViewSet needs to prevent base_path overlap.
The Distribution ViewSet needs to prevent base_path overlap. The model (DB) ensures base_path uniqueness but does not prevent one base_path from being nested within another. The look-and-see check that pulp2 uses for this is subject to race conditions, so this needs to be solved using the DB.
base_path = a/b/c/d needs to reserve the entire directory tree. Another base_path cannot begin with a/b/c/d.
#7 Updated by mhrivnak about 22 hours ago
Thinking of this as two problems, I have a pattern to solve one of them. Maybe it'll inspire someone to come up with another idea that solves both. The problems are:
1. make sure the new base_path is not contained by some existing base_path
2. make sure the new base_path does not contain some existing base_path
Problem 1 is very similar to a problem faced by the app that serves content. Given a full URL, it needs to determine which distribution's base_path is somewhere in that URL.
Given a candidate new base_path or a full path to content, for example "a/b/c/d", you could split that into each of these paths:
a
a/b
a/b/c
a/b/c/d
And you could do a query for any Distribution where the base_path is in that list of potential paths.
For use case 1, if a Distribution is found, then the new base_path is not allowed. For the content serving use case, if a Distribution is found, then that is the one which should be further queried to find the requested file.
But as you can see, this does nothing for use case 2.
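The prefix idea above can be sketched in a few lines of Python. This is only an illustration, not Pulp code: `candidate_prefixes` and `find_containing` are hypothetical names, and a plain set stands in for the Distribution table, where the real check would presumably be a single query of the form base_path IN (...).

```python
def candidate_prefixes(path):
    """Return every leading sub-path of `path`, shortest first."""
    segments = [s for s in path.split("/") if s]
    return ["/".join(segments[: i + 1]) for i in range(len(segments))]


def find_containing(path, existing_base_paths):
    """Return the first existing base_path that is a prefix of `path`, or None.

    `existing_base_paths` is a stand-in (assumption) for the Distribution table.
    """
    for prefix in candidate_prefixes(path):
        if prefix in existing_base_paths:
            return prefix
    return None


# Problem 1: "a/b" already exists, so "a/b/c/d" would be nested inside it.
print(find_containing("a/b/c/d", {"a/b", "x/y"}))
# Problem 2 is not detected: "a/b/c/d/e" is not a prefix of "a/b/c/d".
print(find_containing("a/b/c/d", {"a/b/c/d/e"}))
```

The same function serves the content app: pass the full request path and the returned base_path identifies the Distribution to query further.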
#8 Updated by firstname.lastname@example.org about 21 hours ago
An additional consideration is race conditions, which can be handled in a few ways.
1. The check is enforced with constraints so the DB will prevent race conditions. Not sure how this can be done yet.
2. The check is enforced with queries (as done in pulp2 and suggested in #note-7), in which case a common approach is to lock the table. Best I can tell, explicit table locking is not directly supported by Django in a DB-agnostic way. However, this can be solved easily if we're willing to write a little bit of Postgres-specific SQL.
As for case 2, there won't be many Distributions, and I think we could fetch them all (just base_path) into memory and apply the case 1 algorithm in reverse.
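A minimal sketch of that in-memory check, assuming the existing base_path values have already been fetched into a list. The function names are hypothetical, and the comparison is done segment by segment so that "a/b/c/dd" is not mistaken for a child of "a/b/c/d". On its own this does nothing about the race condition; it would still have to run under whatever lock or transaction is chosen.

```python
def is_nested(inner, outer):
    """True if `inner` equals `outer` or lies underneath it, compared per segment."""
    inner_parts = inner.strip("/").split("/")
    outer_parts = outer.strip("/").split("/")
    return inner_parts[: len(outer_parts)] == outer_parts


def conflicting_base_paths(new_path, existing_base_paths):
    """Return existing paths that contain `new_path` or are contained by it."""
    return [
        path
        for path in existing_base_paths
        if is_nested(new_path, path) or is_nested(path, new_path)
    ]


# Hypothetical data standing in for the fetched base_path values.
existing = ["a/b", "a/b/c/d/e", "a/b/c/dd", "x/y"]
print(conflicting_base_paths("a/b/c/d", existing))
```

Checking both directions in one pass covers case 1 and case 2 together, at the cost of loading every base_path.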
#10 Updated by email@example.com about 2 hours ago
@mhrivnak, I think you meant that searching for sub-paths of a path would help with use case 2. For use case number 1, a simple LIKE with a left-anchored pattern would be sufficient. An index is used for such a lookup in both PostgreSQL and MariaDB. Continuing with your example, the query would look like this:
SELECT * FROM distributions WHERE base_path LIKE 'a/b/c/d%'
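To try the left-anchored LIKE outside Pulp, here is a small sketch using Python's built-in sqlite3 module as a stand-in for PostgreSQL or MariaDB. The single-column table layout and the function names are assumptions for illustration. One caveat it demonstrates: the bare pattern 'a/b/c/d%' also matches a sibling such as a/b/c/dd, so the second function matches the exact path or 'a/b/c/d/%' instead. Paths containing % or _ would additionally need escaping.

```python
import sqlite3

# In-memory table standing in for the real distributions table (assumed layout).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE distributions (base_path TEXT UNIQUE)")
conn.executemany(
    "INSERT INTO distributions (base_path) VALUES (?)",
    [("a/b",), ("a/b/c/d/e",), ("a/b/c/dd",), ("x/y",)],
)


def naive_like(conn, new_path):
    """Left-anchored LIKE exactly as in the query above."""
    rows = conn.execute(
        "SELECT base_path FROM distributions "
        "WHERE base_path LIKE ? ORDER BY base_path",
        (new_path + "%",),
    )
    return [row[0] for row in rows]


def paths_under(conn, new_path):
    """Match the path itself or anything below it at a '/' boundary."""
    rows = conn.execute(
        "SELECT base_path FROM distributions "
        "WHERE base_path = ? OR base_path LIKE ? ORDER BY base_path",
        (new_path, new_path + "/%"),
    )
    return [row[0] for row in rows]


print(naive_like(conn, "a/b/c/d"))   # includes the sibling a/b/c/dd
print(paths_under(conn, "a/b/c/d"))  # only true descendants
```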