Task #2868

Updated by jortel@redhat.com over 7 years ago

WORK IN PROGRESS 

 h1. Overview 

 The platform needs to support the composition and inventory of publications. Each _publication_ is a representation of a repository's content that can be consumed by a specific technology, such as YUM. A publication is created by a _Publisher_, which is associated with the repository and provided by a plugin. A repository may have multiple publishers. A publication is composed of two types of files: content Artifacts, and _metadata_ generated by the publisher when the publication is created. Once created, a publication may be distributed for consumption in many ways. The platform will support common online distribution methods such as HTTP and HTTPS, and offline distribution methods such as creating an ISO. 

 Additional Goals 
 * Clean separation between publishing and how a publication is distributed. 
 * Eliminate the use of symlinks as the primary method of publishing. 
 * Eliminate the need for each plugin to provide an Apache .conf file for distributing via HTTP/HTTPS. 
 * Prevent content from being deleted as orphaned while it is still published. 

 h1. Design 

 h3. Tables 

 The _Publication_ table contains publications; each publication is associated with the Publisher that created it. A Distribution defines a method for making publications available for consumption. Attributes currently modeled on _Publisher_ that pertain to distribution, such as Auth, would be moved to ????. 

 *Publication* 
 * [pk] *id* - The primary key. 
 * [fk] *publisher_id* - The publisher that created the publication. A constraint ensures the publication is deleted when the publisher is deleted. 
 * *created* - When the publication was created. 


 The _PublishedArtifact_ table contains linkage to both content Artifacts and generated metadata files. 


 *PublishedArtifact* 
 * [pk] *id* - The primary key. 
 * [fk] *publication_id* - A publication. A constraint ensures the row is deleted when the publication is deleted. 
 * [fk] *artifact_id* - An (optional) associated content artifact. 
 * *relative_path* - The relative path component within the URL. It is relative to the Distribution.base_path. 


 *PublishedMetadata* 
 * [pk] *id* - The primary key. 
 * [fk] *publication_id* - A publication. A constraint ensures the row is deleted when the publication is deleted. 
 * *file* - An (optional) absolute path to a metadata file. Stored in: /var/lib/pulp/published/metadata/<id>/<name> 
 * *relative_path* - The relative path component within the URL. It is relative to the Distribution.base_path. 
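
 The tables above can be sketched as a schema. The following is only an illustration, assuming SQLite and using hypothetical table names, column types, and helper functions (@open_db@, @count@); the actual platform models may differ. It shows the cascade from Publication to its published rows, and how a plain foreign key on *artifact_id* would keep a published Artifact from being deleted.

```python
import sqlite3

# Hypothetical SQLite rendering of the tables above; names and types are assumptions.
SCHEMA = """
CREATE TABLE publisher (id INTEGER PRIMARY KEY);
CREATE TABLE artifact (id INTEGER PRIMARY KEY);
CREATE TABLE publication (
    id INTEGER PRIMARY KEY,
    publisher_id INTEGER NOT NULL REFERENCES publisher(id) ON DELETE CASCADE,
    created TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE published_artifact (
    id INTEGER PRIMARY KEY,
    publication_id INTEGER NOT NULL REFERENCES publication(id) ON DELETE CASCADE,
    artifact_id INTEGER REFERENCES artifact(id),  -- no cascade: blocks artifact deletion
    relative_path TEXT NOT NULL
);
CREATE TABLE published_metadata (
    id INTEGER PRIMARY KEY,
    publication_id INTEGER NOT NULL REFERENCES publication(id) ON DELETE CASCADE,
    file TEXT,
    relative_path TEXT NOT NULL
);
"""


def open_db():
    """Create an in-memory database with foreign key enforcement enabled."""
    db = sqlite3.connect(":memory:")
    db.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
    db.executescript(SCHEMA)
    return db


def count(db, table):
    """Return the number of rows in one of the sketch tables."""
    return db.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
```

 With this layout, deleting a publication removes its PublishedArtifact and PublishedMetadata rows but leaves the Artifact itself in place.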


 h1. Distribution 


 A _Distribution_ defines the paths under which a publication is distributed. 

 *Distribution* 
 * [pk] *id* - The primary key. 
 * [fk] *publisher_id* - A publisher. 
 * [fk] *publication_id* - An (optional) publication (mutable). 
 * *name* - The distribution name (e.g. rawhide, stable). 
 * *base_path* - The base path for the publication. This is the root of the path component of URLs. 
 * *policy* - Update policy (auto|manual). The _auto_ policy means the publication_id is updated automatically when a new publication is created. The _manual_ policy is designed to support promotion workflows. 


 A _Distributor_ defines a method of generating a static file tree for a Distribution. Some distributors are built in and others are contributed by plugins. Examples include rsync to a static CDN and crane support. 

 *Distributor* 
 * [pk] *id* - The primary key. 
 * [fk] *distribution_id* - A related distribution. 
 * *policy* - Trigger policy (auto|manual). The _auto_ policy means the distributor is triggered automatically when a new publication is created. 


 h1. Sample Data 

 <pre> 
 Publisher 
 ------------------------------- 
 publisher-1, ...  
 </pre> 

 <pre> 
 Publication 
 ------------------------------- 
 publication-1, publisher-1, ... 
 publication-2, publisher-1, ... 
 </pre> 

 <pre> 
 PublishedMetadata 
 ------------------------------- 
 <id>, publication-1, /var/lib/pulp/published/../repodata/repomd.xml 
 <id>, publication-1, /var/lib/pulp/published/../repodata/primary.xml 
 </pre> 

 <pre> 
 PublishedArtifact 
 ------------------------------- 
 <id>, publication-1, artifact-1, packages/dog.rpm 
 <id>, publication-1, artifact-2, packages/cat.rpm 
 </pre> 

 <pre> 
 Distribution 
 ------------------------------- 
 <id>, publisher-1, publication-1, rawhide, f25/rawhide/x86_64, auto 
 <id>, publisher-1, publication-2, stable, f25/stable/x86_64, manual 
 </pre> 


 h1. General Flows 

 h2. Create A Repository 

 <pre> 
 1. Create a repository. 
 2. Create a publisher associated with the repository. 
 3. Create desired distributors associated with the publisher such as http and/or https. 
 </pre> 


 h2. Publishing:  

 _"The publisher will compose a publication"_ 

 <pre> 
 1. Publisher creates a publication using the plugin API. 
 2. Publisher adds content artifacts to the publication. 
 3. Publisher generates some metadata files in the working dir. 
 4. Publisher adds the metadata files to the publication using the plugin API.  
 5. Publisher commits (publishes) the publication.    The plugin API ensures this is atomic. 
 6. Distributions with policy=auto are updated with new publication_id. 
 </pre> 
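
 The publishing steps above can be sketched in a few lines. This is a minimal in-memory illustration, not the plugin API: the @Publication@ and @Distribution@ classes and the @commit@ function are hypothetical names, and the sketch does not model the atomicity that the plugin API is expected to guarantee. It only shows step 6, where distributions with policy=auto are repointed and those with policy=manual are left alone.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import Dict, List, Optional

_ids = count(1)  # hypothetical id generator standing in for the primary key


@dataclass
class Publication:
    publisher_id: str
    artifacts: Dict[str, str] = field(default_factory=dict)  # relative_path -> artifact id
    metadata: Dict[str, str] = field(default_factory=dict)   # relative_path -> file path
    id: int = field(default_factory=lambda: next(_ids))


@dataclass
class Distribution:
    name: str
    publisher_id: str
    base_path: str
    policy: str  # "auto" or "manual"
    publication_id: Optional[int] = None


def commit(publication: Publication,
           published: List[Publication],
           distributions: List[Distribution]) -> None:
    """Record the publication, then repoint auto distributions of its publisher."""
    published.append(publication)
    for dist in distributions:
        if dist.publisher_id == publication.publisher_id and dist.policy == "auto":
            dist.publication_id = publication.id
```

 A distribution with policy=manual keeps its current publication_id until it is changed explicitly, which is what makes promotion possible.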

 h2. Client makes a GET request for content (or metadata): 

 <pre> 
 1. Request is routed to the content (WSGI) application (just like in pulp2 for RPM). 
 2. Query to get the Distribution and Publication.
 3. Query the PublishedArtifact and PublishedMetadata tables by URL path component to get the artifact or the metadata. 
 4. Find distributor based on URL protocol.    404 when not found. 
 5. Forward the artifact storage path (or metadata path) to:
    <not stored locally> 
        streamer 
    <stored locally> 
        x-send (or stream using django in dev environments) 
 6. Done. 
 </pre> 
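
 Steps 2 and 3 of the request flow amount to a two-stage lookup: match the URL path against a Distribution base_path, then look up the remainder as a relative_path within that distribution's publication. The sketch below illustrates this with dictionaries standing in for the tables. The @resolve@ function, the dictionary layout, the @metadata-1@ identifier, and the longest-match rule for base paths are assumptions made for the example, not part of the design.

```python
from typing import Dict, Optional, Tuple

# Hypothetical in-memory stand-ins for the Distribution and Published* tables.
DISTRIBUTIONS: Dict[str, str] = {  # base_path -> publication id
    "f25/rawhide/x86_64": "publication-1",
}
PUBLISHED: Dict[Tuple[str, str], str] = {  # (publication id, relative_path) -> stored item
    ("publication-1", "packages/dog.rpm"): "artifact-1",
    ("publication-1", "packages/cat.rpm"): "artifact-2",
    ("publication-1", "repodata/repomd.xml"): "metadata-1",
}


def resolve(url_path: str) -> Optional[str]:
    """Map the path after /pulp/published/ to a stored item, or None (a 404)."""
    path = url_path.strip("/")
    # Assumption: prefer the longest matching base_path so nested paths resolve predictably.
    for base_path in sorted(DISTRIBUTIONS, key=len, reverse=True):
        if path.startswith(base_path + "/"):
            relative_path = path[len(base_path) + 1:]
            return PUBLISHED.get((DISTRIBUTIONS[base_path], relative_path))
    return None
```

 A result of None corresponds to a 404; any other result would be handed to the streamer or to x-send as described in step 5.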

 h1. Apache Configuration 

 The platform will provide an /etc/httpd/conf.d/pulp.conf that configures support for HTTP and HTTPS. Published content would be consumed using URLs with a base of: 

 <pre> 
 /pulp/published/<path> 
 </pre> 

 where _path_ is <Distribution.base_path>/(<PublishedArtifact.relative_path>|<PublishedMetadata.relative_path>) 
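
 As a small worked example of this URL layout, the helper below joins the fixed prefix, a base_path, and a relative_path. The function name @published_url@ is hypothetical, and stripping stray slashes is an assumption about how the inputs would be normalized.

```python
import posixpath

URL_PREFIX = "/pulp/published"  # base of published URLs, as given above


def published_url(base_path: str, relative_path: str) -> str:
    """Build the URL path for one published file."""
    # Strip slashes so a leading "/" in either part cannot reset the join.
    return posixpath.join(URL_PREFIX, base_path.strip("/"), relative_path.lstrip("/"))
```

 For the sample data, @published_url("f25/rawhide/x86_64", "packages/dog.rpm")@ gives @/pulp/published/f25/rawhide/x86_64/packages/dog.rpm@.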

 h1. Exporting 

 The platform would provide a _resource_ for exporting a _Publication_ in different formats such as ISO and static file trees.
