Story #3167

Updated by mihai.ibanescu@gmail.com over 6 years ago

I believe that, at least for Docker v2, a careful layout of the JSON files generated by pulp_docker would make obsolete the in-memory database that Crane currently builds by polling the filesystem and loading .json files. 

 I have not looked into v1, but my understanding is it won't be supported anymore. 

 h2. Details 

 The V2 REST API is documented here: 

 https://docs.docker.com/registry/spec/api/#detail 

 In that document, *<name>* refers to a repository+image name. In pulp_docker, this represents the *repo-registry-id* setting on the distributor's config, and if unset, it defaults to *<pulp_repo_id>*. In Crane's v2 view, this is referred to as *name_component*. 

 Right now, v2 json files are produced under /var/lib/pulp/published/docker/v2/app/ (assuming *data_dir* is /var/lib/pulp/published/docker/ in /etc/crane.conf). In that directory there is one json file per Pulp repository, and it is named *<pulp_repo_id>.json*. 

 If pulp_docker instead wrote each redirect file to a (potentially deeper) path of the form *<repo-registry-id>.json*, then Crane could simply look for *<name>.json* after splitting the *<name>* portion out of the request URL. This lookup could be performed in _repository.get_schema2_data_for_repo(name_component)_, which is called from _crane/views/v2.py:name_redirect_. 

 h2. Example 

 * create pulp repository with id *my-lamp*, with *repo-registry-id=mibanescu/lamp*; upload a v2 image and tag it as *latest* 
 * publishing the pulp repository creates the redirect file at /var/lib/pulp/published/docker/v2/app/mibanescu/lamp.json 
 * crane is set up with *data_dir=/var/lib/pulp/published/docker* in /etc/crane.conf 
 * crane receives request from docker client: GET https://registry.example.com/v2/mibanescu/lamp/manifests/latest 
 * crane extracts *name_component=mibanescu/lamp* 
 * crane looks for a file named *<name_component>.json* under the v2/app/ subdirectory of *data_dir*, which expands to /var/lib/pulp/published/docker/v2/app/mibanescu/lamp.json - without the need to have a database, in memory or otherwise, to tell it the repo has been published 
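
 The lookup in the last step could be sketched roughly as follows. This is only an illustration under assumptions, not Crane's actual code: the function name @redirect_file_for@ is hypothetical, Python 3 is assumed, and the check that rejects names escaping the v2/app/ directory is an addition that the proposal above does not discuss.

```python
import os


def redirect_file_for(data_dir, name_component):
    """Return the path of the redirect file for name_component, or None.

    Hypothetical helper: maps <name_component> straight onto
    <data_dir>/v2/app/<name_component>.json without any in-memory index.
    """
    app_dir = os.path.realpath(os.path.join(data_dir, "v2", "app"))
    candidate = os.path.realpath(
        os.path.join(app_dir, name_component + ".json")
    )
    # Assumption: refuse names such as "../../etc/passwd" that would
    # resolve to a path outside of app_dir.
    if os.path.commonpath([app_dir, candidate]) != app_dir:
        return None
    # The file exists only if the repository has been published.
    return candidate if os.path.isfile(candidate) else None
```

 With the layout from the example, @redirect_file_for("/var/lib/pulp/published/docker", "mibanescu/lamp")@ would return the path of lamp.json if the repository has been published, and @None@ otherwise.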

 h2. Limitations 

 The search catalog cannot be generated without walking the filesystem.
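
 For reference, the filesystem walk mentioned here might look like the sketch below. It assumes the proposed layout (one *<repo-registry-id>.json* file per repository under v2/app/) and Python 3; the function name @list_published_names@ is hypothetical and not part of Crane.

```python
import os


def list_published_names(data_dir):
    """Walk <data_dir>/v2/app and return every published <name>.

    Hypothetical helper: each <name>.json file found below v2/app/ is
    reported as <name>, using "/" as the separator.
    """
    app_dir = os.path.join(data_dir, "v2", "app")
    names = []
    for root, _dirs, files in os.walk(app_dir):
        for filename in files:
            if not filename.endswith(".json"):
                continue
            rel = os.path.relpath(os.path.join(root, filename), app_dir)
            # Strip the ".json" suffix and normalise the path separator.
            names.append(rel[: -len(".json")].replace(os.sep, "/"))
    return sorted(names)
```

 The cost of this walk grows with the number of published repositories, which is why it remains a limitation of the approach.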
