Story #3982

Updated by ttereshc over 5 years ago

h3. Motivation 

 * Currently there is no way to copy modules with all their artifacts without bringing in all dependencies. 
 * Internally Pulp parses modulemd artifact names (specified in metadata) and looks for each RPM by NEVRA every time it needs to know which RPM units belong to a module. 

 The modulemd fetch content units call returns a list of entries like this:
 <pre> 
 {"metadata"=> 
   {"_storage_path"=> 
     "/var/lib/pulp/content/units/modulemd/d6/3abf7a6a5638d4aeb257aea68e09f8ea39b017aed73d196b7f9a6bd9d1ecfd", 
    "name"=>"django", 
    "stream"=>"1.6", 
    "artifacts"=> 
     ["python-django-bash-completion-0:1.6.11.7-1.module_1560+089ce146.noarch", 
      "python2-django-0:1.6.11.7-1.module_1560+089ce146.noarch"], 
    "checksum"=> 
     "5c6054966a7981e48e2e8b2b7f9e2a33fc58ae36cb7aeab9a0cb096b16739f50", 
    "_last_updated"=>1533230776, 
    "_content_type_id"=>"modulemd", 
    "profiles"=> 
     {"default"=>["python2-django"], "python2_development"=>["python2-django"]}, 
    "summary"=>"A high-level Python Web framework", 
    "downloaded"=>true, 
    "version"=>20180307130104, 
    "pulp_user_metadata"=>{}, 
    "context"=>"c2c572ec", 
    "_ns"=>"units_modulemd", 
    "_id"=>"33d9aff8-2c70-42ac-b0fc-0a8eef87f266", 
    "arch"=>"noarch", 
    "description"=> 
     "Django is a high-level Python Web framework that encourages rapid development and a clean, pragmatic design. It focuses on automating as much as possible and adhering to the DRY (Don't Repeat Yourself) principle."}, 
  "updated"=>"2018-08-02T17:26:16Z", 
  "repo_id"=>"311e01ab-29b7-4b3c-90f4-29b17480b22e", 
  "created"=>"2018-08-02T17:26:16Z", 
  "unit_type_id"=>"modulemd", 
  "unit_id"=>"33d9aff8-2c70-42ac-b0fc-0a8eef87f266", 
  "_id"=>{"$oid"=>"5b633eb8cc36bbe621415477"}} 

 </pre> 

 Note the @artifacts@ field:
 <pre> 
    "artifacts"=> 
     ["python-django-bash-completion-0:1.6.11.7-1.module_1560+089ce146.noarch", 
      "python2-django-0:1.6.11.7-1.module_1560+089ce146.noarch"], 
 </pre> 

 For the publish operation, Katello would like an RPM UUID/unit id mapping for each of these RPMs. This will help Katello account for the RPMs that were copied over and hence determine which modules to copy over.

 h3. Suggested API change

 The @artifacts@ field can't be modified for semver reasons.
 Add a new field to the output of a module: 
 <pre> 
    "pulp_artifacts_map"=> 
     [{"filename": "python-django-bash-completion-1.6.11.7-1.module_1560+089ce146.noarch",  
        "unit_id": <uuid>}, 
      {"filename":"python2-django-1.6.11.7-1.module_1560+089ce146.noarch", 
       "unit_id": <uuid>}] 
 </pre> 

 @filename@ should correspond to the filename in Pulp and can/will be different from the one mentioned in the @artifacts@ field. 
 @unit_id@ is a UUID of an RPM related to the module of interest. 
 Note: at the moment only RPMs can be present in the artifacts, so it's not necessary to have @type@ for each entry of the @pulp_artifacts_map@. At any point later it can be added if needed, since it's an additive change. 
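
 As a rough illustration of how @pulp_artifacts_map@ could be derived from @artifacts@, here is a minimal Python sketch. The helper names (@parse_artifact@, @build_artifacts_map@) and the @rpm_index@ lookup are hypothetical and not part of Pulp; the sketch assumes each artifact string has the form name-epoch:version-release.arch, as in the example above, and that the RPMs of the repository are already indexed by NEVRA.

```python
import re

# Assumed artifact format: name-epoch:version-release.arch
NEVRA_RE = re.compile(
    r"^(?P<name>.+)-(?P<epoch>\d+):(?P<version>[^-]+)-(?P<release>[^-]+)\.(?P<arch>[^.]+)$"
)


def parse_artifact(artifact):
    """Split a modulemd artifact string into a (name, epoch, version, release, arch) tuple."""
    match = NEVRA_RE.match(artifact)
    if match is None:
        raise ValueError("not a NEVRA string: %r" % artifact)
    return (
        match.group("name"),
        match.group("epoch"),
        match.group("version"),
        match.group("release"),
        match.group("arch"),
    )


def build_artifacts_map(artifacts, rpm_index):
    """Build pulp_artifacts_map entries for the artifacts found in rpm_index.

    rpm_index is a hypothetical dict: NEVRA tuple -> {"filename": ..., "unit_id": ...}.
    Artifacts with no matching RPM are skipped (an assumption of this sketch).
    """
    result = []
    for artifact in artifacts:
        rpm = rpm_index.get(parse_artifact(artifact))
        if rpm is not None:
            result.append({"filename": rpm["filename"], "unit_id": rpm["unit_id"]})
    return result
```

 With such a map in the module output, a client would not need to parse NEVRAs itself in order to find the RPM units of a module.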

 h3. Suggested solution 

 Create a separate collection which maps a module to an RPM, @modulemd_artifact_map@ 
 Two fields: @modulemd_id@ and @artifact_id@, and they are unique together (an RPM can belong to multiple modules) 
 Question: Do we want it to be a more generic map? parent_id and child_id? E.g. is there a need to map modulemd_defaults to modulemds in the future? 

 Records are created at sync or upload time. 
 Records are removed in a post_delete hook of the Modulemd model.
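
 The behaviour of the proposed collection can be sketched as follows. This is only an illustration: it uses an in-memory SQLite table as a stand-in for the real collection, and the function names (@create_map@, @add_records@, @remove_module@, @artifacts_of@) are hypothetical. Only the collection name, the two fields and the unique-together constraint come from the proposal above.

```python
import sqlite3


def create_map(conn):
    """Create the mapping table; (modulemd_id, artifact_id) pairs are unique together."""
    conn.execute(
        """CREATE TABLE modulemd_artifact_map (
               modulemd_id TEXT NOT NULL,
               artifact_id TEXT NOT NULL,
               UNIQUE (modulemd_id, artifact_id)
           )"""
    )


def add_records(conn, modulemd_id, artifact_ids):
    """Called at sync or upload time; pairs that already exist are ignored."""
    conn.executemany(
        "INSERT OR IGNORE INTO modulemd_artifact_map VALUES (?, ?)",
        [(modulemd_id, artifact_id) for artifact_id in artifact_ids],
    )


def remove_module(conn, modulemd_id):
    """Stand-in for the post_delete hook: drop all records of a deleted module."""
    conn.execute(
        "DELETE FROM modulemd_artifact_map WHERE modulemd_id = ?", (modulemd_id,)
    )


def artifacts_of(conn, modulemd_id):
    """Return the sorted RPM unit ids mapped to a module."""
    rows = conn.execute(
        "SELECT artifact_id FROM modulemd_artifact_map "
        "WHERE modulemd_id = ? ORDER BY artifact_id",
        (modulemd_id,),
    )
    return [row[0] for row in rows]
```

 Since the same RPM can belong to several modules, removing one module only deletes that module's own records and leaves the records of other modules untouched.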

 <unfinished> 
