Story #5067

As a user, multiple source/target repositories can be used for recursive copy

Added by ttereshc 5 months ago. Updated about 2 months ago.

Status: CLOSED - CURRENTRELEASE
Priority: High
Assignee:
Category: -
Sprint/Milestone: -
Start date:
Due date:
% Done: 0%
Platform Release: 2.21.0
Blocks Release:
Backwards Incompatible: No
Groomed: No
Sprint Candidate: No
Tags: Pulp 2
QA Contact:
Complexity:
Smash Test:
Verified: No
Verification Required: No
Sprint: Sprint 58

Description

Motivation

Dependencies for content can be present in multiple repositories.
The problem existed before but became more apparent with the introduction of modularity.
Sometimes repositories are split into two, modular and non-modular, instead of being a single hybrid repo.
For such cases it is essential to be able to specify multiple pairs of source/target repositories; otherwise dependency solving can't be performed correctly.

Suggestion

Modify the unit association API endpoint [0] to accept a new additional argument. Edit (dalley): instead of a new parameter, we pass it through the override config, which prevents other plugins from being impacted by a change to the method signatures. While it's not "ideal", there is precedent for it, because that is how the recursive flags work.

POST to /pulp/api/v2/repositories/<destination_repo_id>/actions/associate/ will accept an optional additional_repos entry in override_config. It is a dictionary in which each key is a source repository id and each value is the corresponding destination repository id. The additional repositories are used only during recursive copy. The dispatched task will need to lock on all the destination repositories.

Sample request:

{
  "source_repo_id": "pulp-f17",
  "criteria": {
    "type_ids": ["rpm"],
    "filters": {
      "unit": {
        "$and": [{"name": {"$regex": "p.*"}}, {"version": {"$gt": "1.0"}}]
      }
    }
  },
  "override_config": {
    "recursive": true,
    "additional_repos": {"source2": "destination2", "source3": "destination3"}
  }
}
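
As a minimal sketch, the request above could be assembled in Python as follows. The helper name build_associate_request and the repository id "destination1" are hypothetical and used only for illustration; the endpoint path and the source_repo_id, criteria, override_config, recursive, and additional_repos keys come from this story. The helper only builds the payload and does not contact a Pulp server.

```python
import json


def build_associate_request(destination_repo_id, source_repo_id,
                            additional_repos=None, criteria=None):
    """Return (path, JSON body) for a recursive copy request.

    Hypothetical helper, not part of Pulp.
    """
    path = "/pulp/api/v2/repositories/%s/actions/associate/" % destination_repo_id
    override_config = {"recursive": True}
    if additional_repos:
        # Each key is a source repo id, each value its destination repo id.
        override_config["additional_repos"] = dict(additional_repos)
    body = {"source_repo_id": source_repo_id, "override_config": override_config}
    if criteria is not None:
        body["criteria"] = criteria
    return path, json.dumps(body, sort_keys=True)


path, body = build_associate_request(
    "destination1",  # hypothetical destination repo id
    "pulp-f17",
    additional_repos={"source2": "destination2", "source3": "destination3"},
    criteria={"type_ids": ["rpm"]},
)
print(path)
print(body)
```

Keeping additional_repos inside override_config mirrors the approach chosen in this story, so the top-level request shape stays unchanged.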

The response format will stay the same as it is now:

"result": {
  "units_successful": [
    {
      "unit_key": {
        "name": "whale",
        "checksum": "3b34234afc8b8931d627f8466f0e4fd352145a2512681ec29db0a051a0c9d893",
        "epoch": "0",
        "version": "0.2",
        "release": "1",
        "arch": "noarch",
        "checksumtype": "sha256" 
      },
      "type_id": "rpm" 
    }
  ]
}

[0] https://docs.pulpproject.org/dev-guide/integration/rest-api/content/associate.html#copying-units-between-repositories


Related issues

Related to RPM Support - Task #5237: Add CLI support for "additional repos" api (CLOSED - WONTFIX)
Related to Pulp - Test #5242: Test copy using "additional_repos" to provide multiple source/destination repos via override (CLOSED - COMPLETE)
Related to RPM Support - Issue #5449: Multiple source repos copy of errata produces different results (NEW)
Blocked by Pulp - Story #5108: As a user, a task can reserve multiple resources (CLOSED - CURRENTRELEASE)

Associated revisions

Revision 87585b7a
Added by dalley 4 months ago

Add support for multiple input repositories to the solver

  • Add a new "additional_repos" kwarg to the Solver class
  • Make provisions for loading all of the new additional repos
  • Make all target repositories be loaded into one repo, since libsolv
    only supports a single "installed" repo
  • Make the "find_dependent_rpms" method return a dict of {'repo_id':
    set(<unit_ids>)} so that the upper-level copy code knows where the units
    came from, and hence, can determine where they need to go.

re: #5067
https://pulp.plan.io/issues/5067

Revision e9f8d257
Added by dalley 3 months ago

Use multi-resource locking when additional_repos override is passed

The RPM plugin needs to accept more than one source and destination
repository for copy operations. The "additional_repos" flag is being
added to deal with this. Core needs to look for this flag in the
override config and then, when present, lock on all the repo ids present
within as well as the directly-specified source and destination repo
ids.

re: #5067
https://pulp.plan.io/issues/5067

Revision 45373820
Added by dalley 3 months ago

Only lock destination repos, not source

Locking all source repos as well may degrade performance. Not locking
them carries a minimal risk, but the tradeoff is likely worth it. It is
also the current behavior.

re: #5067
https://pulp.plan.io/issues/5067
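
The locking rule that results from the two revisions above (lock every destination repo, but no source repo) can be sketched as follows. The function name repos_to_lock is hypothetical; Pulp's actual resource naming and reservation calls are not shown in this story and are not reproduced here.

```python
def repos_to_lock(destination_repo_id, override_config=None):
    """Return the sorted repo ids a copy task would lock.

    Hypothetical helper, assuming additional_repos maps
    source repo id -> destination repo id.
    """
    additional = (override_config or {}).get("additional_repos") or {}
    # Only destinations are locked; source repos (the dict keys) are not.
    locked = {destination_repo_id}
    locked.update(additional.values())
    return sorted(locked)


print(repos_to_lock(
    "destination1",  # hypothetical destination repo id
    {"additional_repos": {"source2": "destination2", "source3": "destination3"}},
))
# prints ['destination1', 'destination2', 'destination3']
```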

Revision 1aef2cbc
Added by dalley 2 months ago

Refactor the entire pipeline so that depsolving comes at the very end

Depsolving has to happen after all the direct child units are discovered. It
also can't happen inside copy_rpms() because that conflates depsolving
and copying, which is a problem. We need these to be separate
operations.

re: #5067
https://pulp.plan.io/issues/5067

History

#1 Updated by ttereshc 5 months ago

  • Description updated (diff)

#2 Updated by ttereshc 5 months ago

  • Tags Pulp 2 added

#4 Updated by dkliban@redhat.com 4 months ago

  • Description updated (diff)

#5 Updated by ttereshc 4 months ago

  • Blocked by Story #5108: As a user, a task can reserve multiple resources added

#6 Updated by dalley 4 months ago

Support added to the depsolver library in this PR: https://github.com/pulp/pulp_rpm/pull/1407

This PR does not add support for copying from multiple source repos as a whole. The work that remains:

  • Add the API to pulp_rpm
  • Pass the new data from the API into the associate and copy_rpms functions
  • Pass the new data into the solver inside the copy_rpms function, which now provides support for passing that information in
  • Inside the copy_rpms function, after receiving a dictionary of repos and the units to copy from those repos from the find_dependent_rpms function, copy those units to the correct places by taking into account the repo they came from and the additional_repo data from the API
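
The last step can be sketched as follows, assuming find_dependent_rpms returns a dict of the form {'repo_id': set(<unit_ids>)} as described in revision 87585b7a. The function plan_copies and its arguments are hypothetical and do not reflect the actual pulp_rpm code; the sketch only computes which unit ids go to which destination.

```python
def plan_copies(units_by_source, source_repo_id, destination_repo_id,
                additional_repos=None):
    """Map each destination repo id to the set of unit ids to copy there.

    Hypothetical helper: units_by_source has the shape
    {source_repo_id: set(unit_ids)}.
    """
    # Full source -> destination mapping: the primary pair plus the extras.
    destinations = dict(additional_repos or {})
    destinations[source_repo_id] = destination_repo_id

    plan = {}
    for source_id, unit_ids in units_by_source.items():
        if source_id not in destinations:
            raise ValueError("no destination configured for %r" % source_id)
        plan.setdefault(destinations[source_id], set()).update(unit_ids)
    return plan


plan = plan_copies(
    {"pulp-f17": {"unit-1"}, "source2": {"unit-2", "unit-3"}},
    source_repo_id="pulp-f17",
    destination_repo_id="destination1",  # hypothetical destination repo id
    additional_repos={"source2": "destination2"},
)
print(plan)
```

Raising an error for an unknown source repo is a design choice of this sketch; the real code may handle that case differently.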

#7 Updated by dalley 3 months ago

  • Status changed from NEW to ASSIGNED

#8 Updated by dalley 3 months ago

  • Assignee set to dalley

#9 Updated by dalley 3 months ago

  • Priority changed from Normal to High
  • Sprint set to Sprint 57

#10 Updated by dalley 3 months ago

  • Related to Task #5237: Add CLI support for "additional repos" api added

#11 Updated by dalley 3 months ago

  • Related to Test #5242: Test copy using "additional_repos" to provide multiple source/destination repos via override added

#12 Updated by dalley 3 months ago

  • Description updated (diff)

#13 Updated by dalley 3 months ago

  • Platform Release set to 2.21.0

#14 Updated by dalley 3 months ago

  • Status changed from ASSIGNED to POST

#15 Updated by rchan 3 months ago

  • Sprint changed from Sprint 57 to Sprint 58

#16 Updated by dalley 2 months ago

  • Status changed from POST to MODIFIED

#17 Updated by kersom 2 months ago

  • Related to Issue #5449: Multiple source repos copy of errata produces different results added

#18 Updated by dalley about 2 months ago

  • Status changed from MODIFIED to CLOSED - CURRENTRELEASE
