Story #5202

closed

Story #5110: [Epic] As a user, I can manage Kickstart repositories

As a user, I can sync distribution trees

Added by daviddavis over 5 years ago. Updated almost 5 years ago.

Status: CLOSED - CURRENTRELEASE
Priority: Normal
Assignee:
Start date:
Due date:
% Done: 100%
Estimated time:
Platform Release:
Groomed: Yes
Sprint Candidate: No
Tags:
Sprint: Sprint 58
Quarter:

Description

For now, mirror all images and a main yum repo from a kickstart repo.

Design

Have the FirstStage of pulp_rpm try to download and parse the treeinfo file. From this file, create the declarative content unit for the DistributionTree model (the actual content unit). Have the Stages API process this unit like any other.
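
As a rough illustration of the parsing step, here is a minimal sketch that reads an INI-style treeinfo file with the standard library. The sample contents, the section and key names, the rule that a repository of "." means the main repo, and the parse_treeinfo helper are all assumptions made for this example; it does not use pulp_rpm or the Stages API.

```python
import configparser

# Illustrative treeinfo contents (assumed layout, not taken from a real tree).
SAMPLE_TREEINFO = """\
[general]
family = Example Linux
version = 7
arch = x86_64

[images-x86_64]
kernel = images/pxeboot/vmlinuz
initrd = images/pxeboot/initrd.img

[variant-Server]
id = Server
repository = .

[variant-HighAvailability]
id = HighAvailability
repository = addons/HighAvailability
"""


def parse_treeinfo(text):
    """Return the image paths and sub-repo paths found in a treeinfo file.

    Hypothetical helper: a repository of "." is treated as the main repo,
    and every other repository path is treated as a sub-repo.
    """
    parser = configparser.ConfigParser()
    parser.read_string(text)

    images = []
    sub_repos = {}
    for section in parser.sections():
        if section.startswith("images-"):
            # Every value in an images section is a path to an image file.
            images.extend(parser[section].values())
        elif section.startswith("variant-"):
            repository = parser[section].get("repository", ".")
            if repository != ".":
                sub_repos[parser[section]["id"]] = repository
    return {"images": sorted(images), "sub_repos": sub_repos}


tree = parse_treeinfo(SAMPLE_TREEINFO)
```

The resulting dictionary is the kind of data a DistributionTree declarative content unit could be built from.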

When the DistributionTree content unit is actually saved in the ContentUnitSaver stage, create any additional models at that point.

Syncing Sub Repos

Some entries in the kickstart tree identify a "sub-repo". That is not an official term; it is intended to mean another valid repodata folder, outside of the main /repodata/ folder, that identifies another set of RPMs. One main challenge is syncing these sub-repos efficiently.

One easy solution we came up with is to instantiate multiple RpmDeclarativeVersion instances, one for each sub-repo and one for the main repo. This allows each one to work out of a different repository. The trick is to never have more than one instantiated at a time, so that rate limiting, etc. won't be an issue.
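
The one-at-a-time idea can be sketched with plain asyncio. FakePipeline is a hypothetical stand-in for RpmDeclarativeVersion (its create() coroutine and the repo names are assumptions for this example); the point is only that each pipeline is awaited to completion before the next one is instantiated.

```python
import asyncio


class FakePipeline:
    """Hypothetical stand-in for one sync pipeline (not the pulp_rpm class)."""

    active = 0  # pipelines currently running
    peak = 0    # highest number of pipelines seen running at once

    def __init__(self, repo_name):
        self.repo_name = repo_name

    async def create(self):
        FakePipeline.active += 1
        FakePipeline.peak = max(FakePipeline.peak, FakePipeline.active)
        await asyncio.sleep(0)  # pretend to download and save content
        FakePipeline.active -= 1
        return self.repo_name


async def sync_all(main_repo, sub_repos):
    """Sync the main repo, then each sub-repo, strictly one after another."""
    finished = []
    for repo_name in [main_repo, *sub_repos]:
        # The next pipeline is only created after this await returns.
        finished.append(await FakePipeline(repo_name).create())
    return finished


finished = asyncio.run(
    sync_all("main", ["addons/HighAvailability", "addons/ResilientStorage"])
)
```

Because only one pipeline exists at any moment, whatever limits that pipeline enforces apply to the sync as a whole.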

What about remotes for sub-repos?

We should create an in-memory Remote for each sub-repo. The real Remote should not be modified. The in-memory Remote needs all the settings from the original Remote (e.g. proxy settings, SSL), so those should all be copied. The one difference is the URL: the sub-repo's sub-path from the treeinfo file needs to be appended to it.
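
A minimal sketch of that copy, using a frozen dataclass in place of the real Remote model. The Remote class here, its field names, and the sub_repo_remote helper are hypothetical; only the idea (copy every setting, change just the URL, leave the original alone) comes from the description above.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Remote:
    """Hypothetical stand-in for a Remote; field names are assumptions."""

    name: str
    url: str
    proxy_url: Optional[str] = None
    tls_validation: bool = True


def sub_repo_remote(remote, sub_path):
    """Return an in-memory copy of remote that points at a sub-repo.

    All settings are copied unchanged; only the url gets sub_path appended.
    The original remote object is never modified.
    """
    base = remote.url.rstrip("/")
    path = sub_path.strip("/")
    return replace(remote, url=f"{base}/{path}/")


main_remote = Remote(
    name="kickstart",
    url="https://example.com/ks/",
    proxy_url="http://proxy.example.com:3128",
)
sub_remote = sub_repo_remote(main_remote, "addons/HighAvailability")
```

With a database-backed model, the equivalent would presumably be an unsaved copy of the instance, so nothing is written back for the sub-repo Remote.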

Actions #1

Updated by bmbouter over 5 years ago

  • Groomed changed from No to Yes
  • Sprint set to Sprint 57

This makes sense to me if the models were to exist.

Actions #2

Updated by bmbouter over 5 years ago

  • Description updated (diff)
Actions #3

Updated by bmbouter over 5 years ago

  • Description updated (diff)
Actions #4

Updated by bmbouter over 5 years ago

If we don't want to extend the Stages API immediately with a "multi-sync" capability, we could instantiate N copies of the RpmDeclarativeVersion pipeline and run N of them on the asyncio scheduler concurrently.

The downside of this approach is that it interferes with the various limiting features of the Stages API itself, which is OK in the short term but not in the long term. Specifically, download concurrency would be N times what the user expects, and the memory restriction features would also allow N times more memory.
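
The N-times effect can be seen in a small asyncio sketch. Everything here is a stand-in: each fake pipeline owns its own semaphore as a per-pipeline download limit, and the constants are arbitrary values chosen for the example.

```python
import asyncio

PIPELINES = 3   # N pipelines running concurrently
LIMIT = 2       # download concurrency each pipeline believes it enforces
DOWNLOADS = 6   # downloads per pipeline

active = 0  # downloads in flight across all pipelines
peak = 0    # highest value of active observed


async def download(limiter):
    """Simulate one download, guarded by its own pipeline's limiter."""
    global active, peak
    async with limiter:
        active += 1
        peak = max(peak, active)
        await asyncio.sleep(0.01)
        active -= 1


async def pipeline():
    # Each pipeline creates its own limiter, so limits are not shared.
    limiter = asyncio.Semaphore(LIMIT)
    await asyncio.gather(*(download(limiter) for _ in range(DOWNLOADS)))


async def main():
    await asyncio.gather(*(pipeline() for _ in range(PIPELINES)))
    return peak


observed_peak = asyncio.run(main())
```

Each pipeline stays within its own limit of 2, yet the process as a whole reaches 3 x 2 = 6 downloads in flight, which is the short-term trade-off described above.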

Actions #5

Updated by bmbouter over 5 years ago

  • Description updated (diff)
Actions #6

Updated by rchan over 5 years ago

  • Sprint changed from Sprint 57 to Sprint 58
Actions #7

Updated by fao89 over 5 years ago

  • Status changed from NEW to POST
Actions #8

Updated by fao89 over 5 years ago

  • Assignee set to fao89
Actions #9

Updated by fao89 over 5 years ago

  • Assignee deleted (fao89)
Actions #10

Updated by ipanova@redhat.com over 5 years ago

  • Assignee set to fao89
Actions #11

Updated by Anonymous over 5 years ago

  • Status changed from POST to MODIFIED
  • % Done changed from 0 to 100
Actions #12

Updated by ttereshc almost 5 years ago

  • Status changed from MODIFIED to CLOSED - CURRENTRELEASE
Actions #13

Updated by ttereshc almost 5 years ago

  • Sprint/Milestone set to Pulp 3.x RPM (Katello 3.16)
