Story #6375

Updated by ttereshc almost 4 years ago

## Motivation 

 Currently, Remotes are recreated on every run. When Remotes are removed, RemoteArtifacts are removed as well.
 As a result, all content is sent through the content migration pipeline on every run only to recreate RemoteArtifacts.
 We need a way to minimise the amount of content that goes through the pipeline during a migration re-run.
 If there are no changes in Pulp 2, almost no content should go through the migration pipeline.

 ## Proposal 

 Track whether Remotes have changed.  
 Track whether the lazy catalog has been updated.  
 Recreate Remotes only if needed.  
 Recreate RemoteArtifacts only for the affected content.


 ##### Track Remotes, recreate Remotes only if needed. 

 [This commit](https://github.com/pulp/pulp-2to3-migration/commit/809715843fe093b614e9b0cef9a09a60e7a47ebc) removed some initial attempts to track the Remotes, see the deletion in `mark_removed_resources` and `pre_migrate_*`. 

  - Pulp2Importer model for pre-migration needs a new field `not_in_plan` 
  - Update `not_in_plan` accordingly in the `mark_removed_resources` step
  - When pre-migrating an importer, determine whether anything changed and, if so, mark it by setting `is_migrated` to False
  - When pre-migrating an importer, if there are changes to the feed or to any configuration which is available on RemoteArtifact, remove that Remote from Pulp 3
  - At the migration step, update the Remote for each Pulp2Importer which has a `pulp3_remote` but `is_migrated=False`
  - At the migration step, create a Remote for each Pulp2Importer which has no `pulp3_remote`
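
The steps above could be sketched in plain Python roughly as follows. This is only an illustration under stated assumptions, not the plugin's code: the `Importer` dataclass stands in for the Pulp2Importer model, `pulp3_remote` is a plain string rather than a relation to a Remote, and `REMOTE_CRITICAL_KEYS`, `pre_migrate_importer` and `migrate_importer` are hypothetical names. Which configuration keys force a Remote to be removed, and the choice to skip importers flagged `not_in_plan`, are assumptions too.

```python
from dataclasses import dataclass, field
from typing import Optional

# Assumption: config keys whose change forces the Remote to be removed.
REMOTE_CRITICAL_KEYS = ("feed",)


@dataclass
class Importer:
    """Hypothetical stand-in for the Pulp2Importer pre-migration record."""
    pulp2_config: dict = field(default_factory=dict)
    pulp3_remote: Optional[str] = None  # stand-in for a link to a Pulp 3 Remote
    is_migrated: bool = False
    not_in_plan: bool = False


def pre_migrate_importer(record: Importer, new_config: dict) -> None:
    """Compare the stored config with the current Pulp 2 config and flag changes."""
    if new_config == record.pulp2_config:
        return  # nothing changed: keep the Remote and the is_migrated flag
    critical = any(
        record.pulp2_config.get(key) != new_config.get(key)
        for key in REMOTE_CRITICAL_KEYS
    )
    record.pulp2_config = dict(new_config)
    record.is_migrated = False
    if critical:
        # Removing the Remote is what invalidates its RemoteArtifacts.
        record.pulp3_remote = None


def migrate_importer(record: Importer) -> str:
    """Return the action taken for this importer at the migration step."""
    if record.not_in_plan or record.is_migrated:
        return "skip"  # assumption: importers outside the plan are left alone
    if record.pulp3_remote is not None:
        action = "update"
    else:
        action = "create"
        record.pulp3_remote = "remote-for-" + str(record.pulp2_config.get("feed"))
    record.is_migrated = True
    return action
```

With this shape, an unchanged importer is skipped, a non-critical change updates the existing Remote in place, and a feed change drops the Remote so that a new one is created.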


 ##### Track any lazy catalog changes 

  - Move `pre_migrate_all_content` step to happen before any importers are migrated to Pulp 3 
  - Pulp2LazyCatalog needs a new field `is_migrated` 
  - At pre-migration time, the `is_migrated` field should be set
  - At pre-migration time, check whether the lazy catalog entry belongs to a Pulp2Importer which has a `pulp3_remote`.
       - If a Remote exists, it has not changed or the changes were not critical enough to recreate it, so the lazy catalog for this importer is already migrated: set `is_migrated=True`.
  - At content migration time, `is_migrated` will help to identify whether we need to create RemoteArtifacts for content or not
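
A minimal sketch of the flagging rule described above, in plain Python. The `LazyCatalogEntry` dataclass, its field names other than `is_migrated`, the `flag_lazy_catalog` helper and the `remotes_by_importer` mapping are all hypothetical stand-ins chosen for illustration; the real Pulp2LazyCatalog model and queries may look different.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass
class LazyCatalogEntry:
    """Hypothetical stand-in for a pre-migrated Pulp2LazyCatalog record."""
    importer_id: str
    unit_id: str
    is_migrated: bool = False


def flag_lazy_catalog(
    entries: Iterable[LazyCatalogEntry],
    remotes_by_importer: Dict[str, Optional[str]],
) -> None:
    """Set is_migrated on each entry based on whether its importer kept a Remote.

    remotes_by_importer maps an importer id to its Remote, or to None when the
    Remote was removed (or never created) during importer pre-migration.
    """
    for entry in entries:
        # A surviving Remote means its RemoteArtifacts are still in place.
        entry.is_migrated = remotes_by_importer.get(entry.importer_id) is not None
```

This only works if importers are pre-migrated before the lazy catalog is flagged, since the rule reads the state the importer pre-migration leaves behind.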

 ##### Recreate RemoteArtifacts only for the affected content 

 - At the content migration stage, determine which migrated Pulp2Content (i.e. content which has `pulp3content`) needs RemoteArtifact creation by querying pre-migrated lazy catalog entries with `is_migrated=False`
 - Create RemoteArtifacts for those, either by sending the content through the content migration pipeline or by explicitly creating RemoteArtifacts for them
 - Pulp2Content which has no `pulp3content` needs to be migrated fully
 - I *think* mutable content will be handled properly because, if it changes, a new Pulp2Content is created and the old records are removed
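
The selection logic in this list could look something like the sketch below. It is an illustration only: `Content` stands in for Pulp2Content, `pulp2_id` and `plan_content_migration` are hypothetical names, and `unmigrated_unit_ids` is assumed to be the set of unit ids taken from lazy catalog entries with `is_migrated=False`.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set, Tuple


@dataclass
class Content:
    """Hypothetical stand-in for a pre-migrated Pulp2Content record."""
    pulp2_id: str
    pulp3content: Optional[str] = None  # set once the unit exists in Pulp 3


def plan_content_migration(
    content: Iterable[Content],
    unmigrated_unit_ids: Set[str],
) -> Tuple[List[str], List[str]]:
    """Split content into units to migrate fully and units needing only RemoteArtifacts."""
    full: List[str] = []
    remote_artifacts_only: List[str] = []
    for unit in content:
        if unit.pulp3content is None:
            full.append(unit.pulp2_id)  # never migrated: goes through the pipeline
        elif unit.pulp2_id in unmigrated_unit_ids:
            remote_artifacts_only.append(unit.pulp2_id)
        # Otherwise the unit is migrated and its catalog is unchanged: skip it.
    return full, remote_artifacts_only
```

When nothing changed in Pulp 2, both lists come back empty, which matches the goal of sending almost no content through the pipeline on a re-run.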



