Story #3167
Eliminate the need for Crane's in-memory database of images
Status: closed
Description
I believe that, at least for Docker v2, careful layout of the json files generated by pulp_docker will render obsolete the in-memory database that Crane has to generate by polling the filesystem and loading .json files.
I have not looked into v1, but my understanding is it won't be supported anymore.
Details
The V2 REST API is documented here:
https://docs.docker.com/registry/spec/api/#detail
In that document, <name> refers to a repository+image name. In pulp_docker, this represents the repo-registry-id setting on the distributor's config, and if unset, it defaults to <pulp_repo_id>. In Crane's v2 view, this is referred to as name_component.
Right now, v2 json files are produced under /var/lib/pulp/published/docker/v2/app/ (assuming data_dir is /var/lib/pulp/published/docker/ in /etc/crane.conf). In that directory there is one json file per Pulp repository, and it is named <pulp_repo_id>.json.
If we went to a (potentially) deeper directory structure like <repo-registry-id>.json, then Crane could just try to find the redirect file in <name>.json after it splits out the <name> portion from the request URL. This could be performed in repository.get_schema2_data_for_repo(name_component), which is being called from crane/views/v2.py:name_redirect.
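The lookup proposed above could be sketched roughly as follows. This is not Crane's actual code: the function name load_redirect_data is hypothetical, and the check against path escapes is an added assumption, since <name> comes straight from the request URL.

```python
import json
import os


def load_redirect_data(data_dir, name_component):
    """Return the parsed redirect file for name_component, or None if absent.

    Hypothetical helper: assumes redirect files live at
    <data_dir>/v2/app/<name_component>.json, as proposed in this story.
    """
    app_dir = os.path.realpath(os.path.join(data_dir, "v2", "app"))
    candidate = os.path.realpath(
        os.path.join(app_dir, name_component + ".json"))
    # Refuse names that would resolve outside the publish directory,
    # for example "../../etc/something".
    if os.path.commonpath([app_dir, candidate]) != app_dir:
        return None
    try:
        with open(candidate) as f:
            return json.load(f)
    except (FileNotFoundError, NotADirectoryError):
        # No file on disk means the repository has not been published.
        return None
```

With this shape, a missing file is the only signal needed that a repository is unpublished, so nothing has to be polled or cached ahead of time.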
Example
- create pulp repository with id my-lamp, with repo-registry-id=mibanescu/lamp; upload a v2 image and tag it as latest
- publishing the pulp repository creates the redirect file at /var/lib/pulp/published/docker/v2/app/mibanescu/lamp.json
- crane is set up with data_dir=/var/lib/pulp/published/docker in /etc/crane.conf
- crane receives request from docker client: GET https://registry.example.com/v2/mibanescu/lamp/manifests/latest
- crane extracts name_component=mibanescu/lamp
- crane looks for a file named <name_component>.json under data_dir, which expands to /var/lib/pulp/published/docker/v2/app/mibanescu/lamp.json - without the need to have a database, in memory or otherwise, to tell it the repo has been published
- crane reads the url in the json file and issues the redirect, just like it currently does
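The extraction and path-expansion steps in this example can be illustrated with a small sketch. It is not how Crane's v2 view is actually written: split_v2_path and redirect_file_for are hypothetical names, and the regular expression only covers the manifests, blobs and tags endpoints of the V2 API.

```python
import posixpath
import re

# <name> may itself contain slashes, so match it greedily and anchor on
# the last "/manifests/", "/blobs/" or "/tags/" segment.
_V2_PATH = re.compile(
    r"^/v2/(?P<name>.+)/(?P<kind>manifests|blobs|tags)/(?P<rest>[^/]+)$")


def split_v2_path(path):
    """Split a V2 request path into (name_component, kind, rest), or None."""
    match = _V2_PATH.match(path)
    if match is None:
        return None
    return match.group("name"), match.group("kind"), match.group("rest")


def redirect_file_for(data_dir, name_component):
    """Build the redirect file path under the proposed layout."""
    return posixpath.join(data_dir, "v2", "app", name_component + ".json")


name, kind, rest = split_v2_path("/v2/mibanescu/lamp/manifests/latest")
print(name)  # mibanescu/lamp
print(redirect_file_for("/var/lib/pulp/published/docker", name))
# /var/lib/pulp/published/docker/v2/app/mibanescu/lamp.json
```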
Limitations
The search catalog cannot be generated without walking the filesystem.
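For a sense of what that walk involves, here is a minimal sketch that recovers the published names from the proposed layout. The function name list_published_names is hypothetical, and it assumes every .json file under the v2/app directory is a redirect file.

```python
import os


def list_published_names(app_dir):
    """Return the sorted <name> values implied by the .json files in app_dir."""
    suffix = ".json"
    names = []
    for root, _dirs, files in os.walk(app_dir):
        for filename in files:
            if not filename.endswith(suffix):
                continue
            relative = os.path.relpath(os.path.join(root, filename), app_dir)
            # Drop the suffix and normalise separators so that
            # "mibanescu/lamp.json" becomes "mibanescu/lamp".
            names.append(relative[:-len(suffix)].replace(os.sep, "/"))
    return sorted(names)
```

The cost of this walk grows with the number of published repositories, which is why it cannot be folded into the per-request lookup.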