Story #5067

closed

As a user, multiple source/target repositories can be used for recursive copy

Added by ttereshc about 5 years ago. Updated almost 5 years ago.

Status:
CLOSED - CURRENTRELEASE
Priority:
High
Assignee:
Sprint/Milestone:
-
Start date:
Due date:
% Done:

0%

Estimated time:
Platform Release:
2.21.0
Groomed:
No
Sprint Candidate:
No
Tags:
Pulp 2
Sprint:
Sprint 58
Quarter:

Description

Motivation

Dependencies for content can be present in multiple repositories.
The problem existed before but became more apparent with the introduction of modularity.
Sometimes repositories are split into two, modular and non-modular, instead of being a single hybrid repo.
For such cases it's essential to be able to specify multiple pairs of source/target repositories; otherwise dependency solving can't be performed correctly.

Suggestion

Modify the unit association API endpoint [0] to accept a new additional argument. Edit (dalley): instead of adding a new parameter, we pass it through the override config. This prevents other plugins from being impacted by a change to the method signatures. While it's not "ideal", there is precedent for it because that's how the recursive flags work.

POST to /pulp/api/v2/repositories/<destination_repo_id>/actions/associate/ will accept an optional additional_repos entry in override_config. This will be a dictionary where each key is a source repository id and the value is the corresponding destination repository id. The additional repositories are used only during recursive copy. The dispatched task will need to lock on all the destination repositories.

Sample request:

{
  "source_repo_id": "pulp-f17",
  "criteria": {
    "type_ids": ["rpm"],
    "filters": {
      "unit": {
        "$and": [{"name": {"$regex": "p.*"}}, {"version": {"$gt": "1.0"}}]
      }
    }
  },
  "override_config": {
    "recursive": true,
    "additional_repos": {"source2": "destination2", "source3": "destination3"}
  }
}
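
For illustration, here is a minimal Python sketch of how a client might assemble this request body. The helper name build_associate_body is hypothetical and not part of Pulp; only the field names are taken from the sample request. Sending the request is left out.

```python
import json


def build_associate_body(source_repo_id, criteria, additional_repos=None):
    """Build the JSON body for the associate call (hypothetical helper).

    additional_repos maps each extra source repo id to its destination repo id.
    """
    override_config = {"recursive": True}
    if additional_repos:
        # Copy so later changes by the caller do not leak into the body.
        override_config["additional_repos"] = dict(additional_repos)
    return {
        "source_repo_id": source_repo_id,
        "criteria": criteria,
        "override_config": override_config,
    }


body = build_associate_body(
    "pulp-f17",
    {"type_ids": ["rpm"]},
    additional_repos={"source2": "destination2", "source3": "destination3"},
)
print(json.dumps(body, indent=2, sort_keys=True))
```

When additional_repos is omitted, the sketch produces a body with only the recursive flag set, which corresponds to a copy with a single source/destination pair.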

The response format will stay the same as it is now:

"result": {
  "units_successful": [
    {
      "unit_key": {
        "name": "whale",
        "checksum": "3b34234afc8b8931d627f8466f0e4fd352145a2512681ec29db0a051a0c9d893",
        "epoch": "0",
        "version": "0.2",
        "release": "1",
        "arch": "noarch",
        "checksumtype": "sha256"
      },
      "type_id": "rpm"
    }
  ]
}

[0] https://docs.pulpproject.org/dev-guide/integration/rest-api/content/associate.html#copying-units-between-repositories


Related issues

Related to RPM Support - Task #5237: Add CLI support for "additional repos" api (CLOSED - WONTFIX)
Related to Pulp - Test #5242: Test copy using "additional_repos" to provide multiple source/destination repos via override (CLOSED - COMPLETE, kersom)
Related to RPM Support - Issue #5449: Multiple source repos copy of errata produces different results (CLOSED - WONTFIX, ggainey)
Blocked by Pulp - Story #5108: As a user, a task can reserve multiple resources (CLOSED - CURRENTRELEASE, ggainey)

Actions #1

Updated by ttereshc about 5 years ago

  • Description updated (diff)
Actions #2

Updated by ttereshc about 5 years ago

  • Tags Pulp 2 added
Actions #4

Updated by dkliban@redhat.com about 5 years ago

  • Description updated (diff)
Actions #5

Updated by ttereshc about 5 years ago

  • Blocked by Story #5108: As a user, a task can reserve multiple resources added
Actions #6

Updated by dalley about 5 years ago

Support added to the depsolver library in this PR: https://github.com/pulp/pulp_rpm/pull/1407

This PR does not add support for copying from multiple source repos as a whole. The work that remains:

  • Add the API to pulp_rpm
  • Pass the new data from the API into the associate and copy_rpms functions
  • Pass the new data into the solver inside the copy_rpms function, which now provides support for passing that information in
  • Inside the copy_rpms function, the find_dependent_rpms function returns a dictionary of repos and the units to copy from each; copy those units to the correct destinations based on the repo they came from and the additional_repos data from the API
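
The last step above could be sketched roughly as follows. This is not the pulp_rpm implementation: plan_copies and its arguments are hypothetical names, and the only things assumed from this story are the {'repo_id': set(unit_ids)} result shape and the source-to-destination mapping in additional_repos.

```python
def plan_copies(units_by_source, source_repo_id, dest_repo_id, additional_repos):
    """Map each destination repo id to the set of unit ids to copy into it.

    units_by_source has the shape {'repo_id': set(unit_ids)}, keyed by the
    repo each unit was found in.
    """
    # The primary source/destination pair plus the extra pairs from the API.
    routes = dict(additional_repos)
    routes[source_repo_id] = dest_repo_id

    plan = {}
    for repo_id, unit_ids in units_by_source.items():
        if repo_id not in routes:
            raise ValueError("no destination configured for source repo %r" % repo_id)
        # Several sources may share one destination, so merge the sets.
        plan.setdefault(routes[repo_id], set()).update(unit_ids)
    return plan


plan = plan_copies(
    {"pulp-f17": {"u1", "u2"}, "source2": {"u3"}},
    "pulp-f17",
    "destination1",
    {"source2": "destination2"},
)
```

Raising on an unknown source repo is a design choice of the sketch; it makes a unit that the solver found in an unmapped repo fail loudly instead of being silently dropped.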

Added by dalley about 5 years ago

Revision 87585b7a | View on GitHub

Add support for multiple input repositories to the solver

  • Add a new "additional_repos" kwarg to the Solver class
  • Make provisions for loading all of the new additional repos
  • Make all target repositories be loaded into one repo, since libsolv only supports a single "installed" repo
  • Make the "find_dependent_rpms" method return a dict of {'repo_id': set(<unit_ids>)} so that the upper-level copy code knows where the units came from, and hence, can determine where they need to go.

re: #5067 https://pulp.plan.io/issues/5067

Actions #7

Updated by dalley almost 5 years ago

  • Status changed from NEW to ASSIGNED
Actions #8

Updated by dalley almost 5 years ago

  • Assignee set to dalley
Actions #9

Updated by dalley almost 5 years ago

  • Priority changed from Normal to High
  • Sprint set to Sprint 57

Added by dalley almost 5 years ago

Revision e9f8d257 | View on GitHub

Use multi-resource locking when additional_repos override is passed

The RPM plugin needs to accept more than one source and destination repository for copy operations. The "additional_repos" flag is being added to deal with this. Core needs to look for this flag in the override config and then, when present, lock on all the repo ids present within as well as the directly-specified source and destination repo ids.

re: #5067 https://pulp.plan.io/issues/5067

Added by dalley almost 5 years ago

Revision 45373820 | View on GitHub

Only lock destination repos, not source

Locking all source repos as well may degrade performance. Leaving them unlocked carries a minimal risk, but the tradeoff is likely worth it, and it matches the current behavior.

re: #5067 https://pulp.plan.io/issues/5067
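
As a rough sketch of the locking rule described in these two commits, the set of repos to reserve could be derived from the override config as shown here. The function name repos_to_lock is hypothetical and this is not the code in Pulp core; it only assumes that additional_repos maps source repo ids to destination repo ids.

```python
def repos_to_lock(dest_repo_id, override_config=None):
    """Return the sorted destination repo ids a copy task should reserve."""
    additional_repos = (override_config or {}).get("additional_repos") or {}
    # Values are destination repo ids; the keys (sources) are deliberately not locked.
    return sorted({dest_repo_id} | set(additional_repos.values()))


locks = repos_to_lock(
    "destination1",
    {"recursive": True, "additional_repos": {"source2": "destination2", "source3": "destination3"}},
)
```

Using a set before sorting means a destination that appears more than once is reserved only once.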

Actions #10

Updated by dalley almost 5 years ago

  • Related to Task #5237: Add CLI support for "additional repos" api added
Actions #11

Updated by dalley almost 5 years ago

  • Related to Test #5242: Test copy using "additional_repos" to provide multiple source/destination repos via override added
Actions #12

Updated by dalley almost 5 years ago

  • Description updated (diff)
Actions #13

Updated by dalley almost 5 years ago

  • Platform Release set to 2.21.0
Actions #14

Updated by dalley almost 5 years ago

  • Status changed from ASSIGNED to POST
Actions #15

Updated by rchan almost 5 years ago

  • Sprint changed from Sprint 57 to Sprint 58

Added by dalley almost 5 years ago

Revision 1aef2cbc | View on GitHub

Refactor the entire pipeline so that depsolving comes at the very end

Depsolving has to happen after all the direct child units are discovered. It also can't happen inside copy_rpms() because that conflates depsolving and copying, which is a problem. We need these to be separate operations.

re: #5067 https://pulp.plan.io/issues/5067

Actions #16

Updated by dalley almost 5 years ago

  • Status changed from POST to MODIFIED
Actions #17

Updated by kersom almost 5 years ago

  • Related to Issue #5449: Multiple source repos copy of errata produces different results added
Actions #18

Updated by dalley almost 5 years ago

  • Status changed from MODIFIED to CLOSED - CURRENTRELEASE
