Story #3167
Updated by mihai.ibanescu@gmail.com over 6 years ago
I believe that, at least for Docker v2, a careful layout of the JSON files generated by pulp_docker would make obsolete the in-memory database that Crane currently builds by polling the filesystem and loading .json files.
I have not looked into v1, but my understanding is that it will no longer be supported.
h2. Details
The V2 REST API is documented here:
https://docs.docker.com/registry/spec/api/#detail
In that document, *<name>* refers to a repository+image name. In pulp_docker, this represents the *repo-registry-id* setting on the distributor's config, and if unset, it defaults to *<pulp_repo_id>*. In Crane's v2 view, this is referred to as *name_component*.
Right now, v2 json files are produced under /var/lib/pulp/published/docker/v2/app/ (assuming *data_dir* is /var/lib/pulp/published/docker/ in /etc/crane.conf). In that directory there is one json file per Pulp repository, and it is named *<pulp_repo_id>.json*.
If we moved to a (potentially) deeper directory structure, with files named *<repo-registry-id>.json*, Crane could simply look for the redirect file at *<name>.json* after splitting the *<name>* portion out of the request URL. This could be done in _repository.get_schema2_data_for_repo(name_component)_, which is called from _crane/views/v2.py:name_redirect_.
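A minimal sketch of what such a lookup could look like, assuming the redirect files live under the v2/app subdirectory of *data_dir* as described above. The function names _redirect_file_path_ and _load_redirect_data_ are hypothetical and are not part of Crane. The path-escape check is an extra precaution that this story does not discuss.

```python
import json
import os

# Layout assumed from the paths mentioned in this story: <data_dir>/v2/app/
V2_APP_SUBDIR = os.path.join("v2", "app")


def redirect_file_path(data_dir, name_component):
    """Return the path where the redirect file for a repository is expected."""
    base = os.path.abspath(os.path.join(data_dir, V2_APP_SUBDIR))
    candidate = os.path.abspath(os.path.join(base, name_component + ".json"))
    # Refuse names such as "../../etc/passwd" that would escape the publish directory.
    if not candidate.startswith(base + os.sep):
        raise ValueError("invalid repository name: %r" % (name_component,))
    return candidate


def load_redirect_data(data_dir, name_component):
    """Load the redirect file for a repository, or return None if it is absent."""
    path = redirect_file_path(data_dir, name_component)
    try:
        with open(path) as f:
            return json.load(f)
    except OSError:
        # A missing file (or missing parent directory) means "not published".
        return None
```

With this approach the filesystem itself answers whether a repository has been published, so nothing needs to be cached between requests.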
h2. Example
* create pulp repository with id *my-lamp*, with *repo-registry-id=mibanescu/lamp*; upload a v2 image and tag it as *latest*
* publishing the pulp repository creates the redirect file at /var/lib/pulp/published/docker/v2/app/mibanescu/lamp.json
* crane is set up with *data_dir=/var/lib/pulp/published/docker* in /etc/crane.conf
* crane receives request from docker client: GET https://registry.example.com/v2/mibanescu/lamp/manifests/latest
* crane extracts *name_component=mibanescu/lamp*
* crane looks for a file named *<name_component>.json* under the v2/app subdirectory of *data_dir*, which expands to /var/lib/pulp/published/docker/v2/app/mibanescu/lamp.json. No database, in memory or otherwise, is needed to tell it that the repo has been published.
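The last three steps of the example can be sketched as follows. This is only an illustration: the regular expression and the helper names _split_manifest_request_ and _expected_redirect_file_ are assumptions made here, and Crane's actual routing may split the URL differently.

```python
import posixpath
import re

# The repository name may itself contain slashes, so capture everything
# between "/v2/" and the final "/manifests/<reference>" segment.
MANIFEST_URL = re.compile(r"^/v2/(?P<name>.+)/manifests/(?P<reference>[^/]+)$")


def split_manifest_request(url_path):
    """Return (name_component, reference), or None if the path does not match."""
    match = MANIFEST_URL.match(url_path)
    if match is None:
        return None
    return match.group("name"), match.group("reference")


def expected_redirect_file(data_dir, url_path):
    """Map a manifest request path to the redirect file it would be served from."""
    parsed = split_manifest_request(url_path)
    if parsed is None:
        return None
    name_component, _reference = parsed
    return posixpath.join(data_dir, "v2", "app", name_component + ".json")


print(expected_redirect_file(
    "/var/lib/pulp/published/docker",
    "/v2/mibanescu/lamp/manifests/latest",
))
# -> /var/lib/pulp/published/docker/v2/app/mibanescu/lamp.json
```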
h2. Limitations
The search catalog cannot be generated without walking the filesystem.
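For illustration, the walk needed to enumerate the published repositories could look roughly like this under the proposed layout. The function name _list_published_names_ is hypothetical, and the sketch only recovers repository names from file paths. It does not assume anything about the contents of the .json files.

```python
import os


def list_published_names(data_dir):
    """Return the repository names implied by the .json files under v2/app."""
    app_dir = os.path.join(data_dir, "v2", "app")
    names = []
    for dirpath, _dirnames, filenames in os.walk(app_dir):
        for filename in filenames:
            if not filename.endswith(".json"):
                continue
            relative = os.path.relpath(os.path.join(dirpath, filename), app_dir)
            # Strip the ".json" suffix and use "/" as in registry names.
            names.append(relative[:-len(".json")].replace(os.sep, "/"))
    return sorted(names)
```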