Story #3167

closed

Eliminate the need for Crane's in-memory database of images

Added by mihai.ibanescu@gmail.com over 6 years ago. Updated about 5 years ago.

Status: CLOSED - WONTFIX
Priority: Normal
Assignee: -
Start date:
Due date:
% Done: 0%
Estimated time:
Platform Release:
Target Release - Docker:
Groomed: No
Sprint Candidate: No
Tags: Pulp 2
Sprint:
Quarter:

Description

I believe that, at least for Docker v2, a careful layout of the JSON files generated by pulp_docker would make obsolete the in-memory database that Crane currently has to build by polling the filesystem and loading .json files.

I have not looked into v1, but my understanding is that it will no longer be supported.

Details

The V2 REST API is documented here:

https://docs.docker.com/registry/spec/api/#detail

In that document, <name> refers to a repository+image name. In pulp_docker, this corresponds to the repo-registry-id setting in the distributor's config; if unset, it defaults to <pulp_repo_id>. In Crane's v2 view, it is referred to as name_component.

Currently, v2 JSON files are produced under /var/lib/pulp/published/docker/v2/app/ (assuming data_dir is /var/lib/pulp/published/docker/ in /etc/crane.conf). That directory contains one JSON file per Pulp repository, named <pulp_repo_id>.json.

If we instead named each file <repo-registry-id>.json, which potentially means a deeper directory structure because the registry id can contain slashes, then Crane could split the <name> portion out of the request URL and look for the redirect file at <name>.json directly. This lookup could be performed in repository.get_schema2_data_for_repo(name_component), which is called from crane/views/v2.py:name_redirect.

Example

  • create pulp repository with id my-lamp, with repo-registry-id=mibanescu/lamp; upload a v2 image and tag it as latest
  • publishing the pulp repository creates the redirect file at /var/lib/pulp/published/docker/v2/app/mibanescu/lamp.json
  • crane is set up with data_dir=/var/lib/pulp/published/docker in /etc/crane.conf
  • crane receives request from docker client: GET https://registry.example.com/v2/mibanescu/lamp/manifests/latest
  • crane extracts name_component=mibanescu/lamp
  • crane looks for a file named <name_component>.json under <data_dir>/v2/app, which expands to /var/lib/pulp/published/docker/v2/app/mibanescu/lamp.json; no database, in memory or otherwise, is needed to tell it the repo has been published
  • crane reads the url in the json file and issues the redirect, just like it currently does
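
The lookup steps above could be sketched roughly as follows. This is only an illustration of the proposal, not Crane's actual code: find_redirect_file and load_redirect_data are hypothetical helper names, the sketch assumes Python 3 (for os.path.commonpath), and it makes no assumption about the keys inside the JSON file.

```python
import json
import os


def find_redirect_file(data_dir, name_component):
    """Return the path of the redirect file for name_component, or None.

    Hypothetical helper: the <data_dir>/v2/app/<name>.json layout follows
    the proposal in this ticket, not Crane's existing implementation.
    """
    app_dir = os.path.realpath(os.path.join(data_dir, "v2", "app"))
    candidate = os.path.realpath(
        os.path.join(app_dir, name_component + ".json"))
    # Refuse names such as "../../etc/passwd" that resolve outside app_dir.
    if os.path.commonpath([app_dir, candidate]) != app_dir:
        return None
    return candidate if os.path.isfile(candidate) else None


def load_redirect_data(data_dir, name_component):
    """Load the published JSON for a repository straight from disk."""
    path = find_redirect_file(data_dir, name_component)
    if path is None:
        # Nothing published under this name; the view would answer 404.
        return None
    with open(path) as handle:
        return json.load(handle)
```

With data_dir=/var/lib/pulp/published/docker, load_redirect_data(data_dir, "mibanescu/lamp") would read /var/lib/pulp/published/docker/v2/app/mibanescu/lamp.json on each request, so a newly published repository becomes visible without any polling.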

Limitations

The search catalog cannot be generated without walking the filesystem.
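
For reference, the filesystem walk that the search catalog would still need might look something like this. It is a minimal sketch under the layout proposed above; list_published_names is a hypothetical name, and how Crane would cache or refresh the result is left open.

```python
import os


def list_published_names(data_dir):
    """Yield every <name> that has a redirect file under <data_dir>/v2/app.

    Hypothetical helper illustrating the walk; not part of Crane.
    """
    app_dir = os.path.join(data_dir, "v2", "app")
    for dirpath, _dirnames, filenames in os.walk(app_dir):
        for filename in filenames:
            if not filename.endswith(".json"):
                continue
            full_path = os.path.join(dirpath, filename)
            relative = os.path.relpath(full_path, app_dir)
            # Strip ".json" and use "/" so the result matches the URL form.
            yield relative[:-len(".json")].replace(os.sep, "/")
```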
