Task #2987

The Distribution ViewSet needs to prevent base_path overlap.

Added by jortel@redhat.com about 1 month ago. Updated about 2 hours ago.

Status: NEW
Priority: Normal
Assignee: -
Category: -
Sprint/Milestone: -
% Done: 0%
Platform Release:
Blocks Release:
Backwards Incompatible: No
Groomed: No
Sprint Candidate: Yes
Tags: Pulp 3
QA Contact:
Complexity:
Smash Test:
Verified: No
Verification Required: No

Description

The Distribution ViewSet needs to prevent base_path overlap. The model (DB) ensures base_path uniqueness but does not prevent one base_path from being nested within another. The look-and-see check that pulp2 uses for this is subject to race conditions, so this needs to be solved using the DB.

For example:

base_path = a/b/c/d needs to reserve the entire directory tree. Another base_path cannot begin with a/b/c/d.

History

#1 Updated by jortel@redhat.com about 1 month ago

  • Description updated (diff)

#2 Updated by jortel@redhat.com about 1 month ago

  • Subject changed from The Distribution serializer needs to previent base_path overlap. to The Distribution ViewSet needs to previent base_path overlap.
  • Description updated (diff)

#3 Updated by amacdona@redhat.com 5 days ago

  • Groomed changed from No to Yes
  • Sprint Candidate changed from No to Yes

#4 Updated by mhrivnak 5 days ago

  • Sprint/Milestone set to Sprint 26

#5 Updated by jortel@redhat.com 4 days ago

  • Groomed changed from Yes to No

Ungrooming and removing checklist items per discussion in retrospective.

#6 Updated by jortel@redhat.com about 22 hours ago

  • Sprint/Milestone deleted (Sprint 26)

Let's document and discuss potential solutions here.

#7 Updated by mhrivnak about 22 hours ago

Thinking of this as two problems, I have a pattern to solve one of them. Maybe it'll inspire someone to come up with another idea that solves both. The problems are:

1. make sure the new base_path is not contained by some existing base_path
2. make sure the new base_path does not contain some existing base_path

Problem 1 is very similar to a problem faced by the app that serves content. Given a full URL, it needs to determine which distribution's base_path is somewhere in that URL.

Given a candidate new base_path or a full path to content, for example "a/b/c/d", you could split that into each of these paths:

  • a
  • a/b
  • a/b/c
  • a/b/c/d

And you could do a query for any Distribution where the base_path is in that list of potential paths.

For use case 1, if a Distribution is found, then the new base_path is not allowed. For the content serving use case, if a Distribution is found, then that is the one which should be further queried to find the requested file.

But as you can see, this does nothing for use case 2.
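A minimal sketch of the prefix-splitting idea above, in plain Python. The function names are hypothetical, and a set of strings stands in for the Distribution table; with the Django ORM the lookup could presumably be a single `filter(base_path__in=...)` query.

```python
def candidate_prefixes(path):
    """Return every leading sub-path of `path`, shortest first."""
    segments = [s for s in path.strip("/").split("/") if s]
    return ["/".join(segments[:i]) for i in range(1, len(segments) + 1)]


def find_containing(path, existing_base_paths):
    """Return the existing base_paths that equal `path` or contain it."""
    # Assumption: `existing_base_paths` stands in for the DB table. In Django
    # this could be Distribution.objects.filter(base_path__in=prefixes).
    existing = set(existing_base_paths)
    return [p for p in candidate_prefixes(path) if p in existing]


# Use case 1: "a/b" already exists, so "a/b/c/d" is rejected.
print(find_containing("a/b/c/d", {"a/b", "x/y"}))
# Use case 2 is not caught: "a/b" would swallow the existing "a/b/c/d".
print(find_containing("a/b", {"a/b/c/d"}))
```

The same helper covers the content-serving case: pass the full request path and use the returned base_path to pick the Distribution.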

#8 Updated by jortel@redhat.com about 21 hours ago

An additional consideration is race conditions, which can be solved in a few ways.

1. The check is enforced with constraints so the DB will prevent race conditions. Not sure how this can be done yet.
2. The check is enforced with queries (as done in pulp2 and suggested in #note-7), in which case a common approach is to lock the table. As best I can tell, explicit table locking is not directly supported by Django in a DB-agnostic way. However, this can be solved easily if we're willing to write a little bit of PostgreSQL-specific SQL.

As for case 2, there won't be many Distributions and I think we could fetch them all (just base_path) into memory and apply the case 1 algorithm in reverse.
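A rough sketch of the in-memory check, assuming all existing base_paths have already been fetched into a list. The helper names are hypothetical. Comparing path segments is equivalent to splitting each path into its prefixes, and it handles both directions in one pass.

```python
def is_nested(inner, outer):
    """True if `inner` equals `outer` or lies underneath it."""
    inner_parts = inner.strip("/").split("/")
    outer_parts = outer.strip("/").split("/")
    # Compare whole segments so that "a/bc" is not treated as under "a/b".
    return inner_parts[:len(outer_parts)] == outer_parts


def overlapping(new_path, existing_base_paths):
    """Return existing base_paths that contain, or are contained by, new_path."""
    return sorted(
        p for p in existing_base_paths
        if is_nested(new_path, p) or is_nested(p, new_path)
    )


print(overlapping("a/b", ["a/b/c/d", "a/bc", "x"]))
print(overlapping("a/b/c/d", ["a", "x"]))
```

This does nothing for the race condition on its own; the fetch and the insert would still need to happen under a lock or inside whatever constraint the DB can enforce.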

#9 Updated by bmbouter about 20 hours ago

Can some example cases that we want to prevent be written out? Having some common examples would be helpful I think in evaluating algorithms.

#10 Updated by dkliban@redhat.com about 2 hours ago

@mhrivnak, I think you meant that searching for sub-paths of a path would help with use case 2. For use case number 1, a simple LIKE with a left-anchored pattern would be sufficient. An index is used for such a lookup in both PostgreSQL and MariaDB. Continuing with your example, the query would look like this:

SELECT * FROM distributions
WHERE base_path LIKE 'a/b/c/d%'  
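Two details of the LIKE approach may be worth checking. The pattern 'a/b/c/d%' also matches a sibling such as a/b/c/dd, and any % or _ inside a base_path acts as a wildcard. Also note that a left-anchored LIKE finds existing base_paths that begin with the candidate, which is problem 2 in the numbering of #note-7. The sketch below uses Python's sqlite3 module purely as a stand-in for PostgreSQL or MariaDB; the table layout and the helper name are assumptions.

```python
import sqlite3


def nested_under(conn, base_path):
    """Return stored base_paths that equal `base_path` or sit below it."""
    # Escape LIKE wildcards so "%" or "_" in a path are matched literally.
    escaped = (
        base_path.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    )
    rows = conn.execute(
        "SELECT base_path FROM distributions "
        "WHERE base_path = ? OR base_path LIKE ? ESCAPE '\\' "
        "ORDER BY base_path",
        (base_path, escaped + "/%"),  # trailing "/" anchors on a whole segment
    )
    return [row[0] for row in rows]


# Hypothetical table with three existing distributions.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE distributions (base_path TEXT UNIQUE)")
conn.executemany(
    "INSERT INTO distributions VALUES (?)",
    [("a/b/c/d/e",), ("a/b/c/dd",), ("a/b",)],
)

print(nested_under(conn, "a/b/c/d"))
```

Whether the index is still used once the equality test and ESCAPE clause are added would need to be confirmed with the query planner of the target database.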
