Story #5202

Story #5110: [Epic] As a user, I can manage Kickstart repositories

As a user, I can sync distribution trees

Added by daviddavis almost 2 years ago. Updated over 1 year ago.

Status: CLOSED - CURRENTRELEASE
Priority: Normal
Assignee:
Start date:
Due date:
% Done: 100%
Estimated time:
Platform Release:
Groomed: Yes
Sprint Candidate: No
Tags:
Sprint: Sprint 58
Quarter:

Description

For now, mirror all images and a main yum repo from a kickstart repo.

Design

Have the FirstStage of pulp_rpm try to download and parse the treeinfo file. From this file, create the declarative content unit for the DistributionTree model (the actual content unit). Have the Stages API process this unit like any other.
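To illustrate the parsing step, here is a minimal sketch that reads an INI-style treeinfo file with the standard library and collects the image paths and sub repo paths. The sample file contents, the `parse_treeinfo` helper, and the returned dictionary shape are all assumptions made for this example, not the pulp_rpm implementation.

```python
import configparser

# Hypothetical sample; section and key names are assumed for illustration.
SAMPLE_TREEINFO = """\
[general]
family = Example Linux
version = 8
arch = x86_64
variants = BaseOS,AppStream

[variant-BaseOS]
id = BaseOS
repository = BaseOS

[variant-AppStream]
id = AppStream
repository = AppStream

[images-x86_64]
boot.iso = images/boot.iso
"""


def parse_treeinfo(text):
    """Return the sub repo paths and image paths found in a treeinfo string."""
    # Only "=" separates keys from values, and interpolation is off so that
    # values containing "%" are read literally.
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    parser.optionxform = str  # keep key case as written in the file
    parser.read_string(text)

    sub_repos = {}
    images = []
    for section in parser.sections():
        if section.startswith("variant-") and parser.has_option(section, "repository"):
            default_id = section[len("variant-"):]
            variant_id = parser.get(section, "id", fallback=default_id)
            sub_repos[variant_id] = parser.get(section, "repository")
        elif section.startswith("images-"):
            images.extend(path for _, path in parser.items(section))
    return {"sub_repos": sub_repos, "images": sorted(set(images))}


tree = parse_treeinfo(SAMPLE_TREEINFO)
```

The resulting dictionary stands in for the data a FirstStage would use to build the DistributionTree declarative content unit.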

When actually saving the DistributionTree content unit in the ContentUnitSaver stage, have any additional model creation happen then.

Syncing Sub Repos

Some entries in the kickstart tree identify a "sub repo". That is not an official term; it means another valid repodata folder, outside of the main /repodata/ folder, that identifies another set of RPMs. One main challenge is syncing these "sub repos" efficiently.

One easy solution we came up with is to instantiate multiple RpmDeclarativeVersion instances, one for each sub repo and one for the main repo. This allows each one to work out of a different repository. The trick is to never instantiate more than one at a time, so that rate limiting and similar features are not affected.
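The one-at-a-time idea can be sketched with plain asyncio. `SketchPipeline` is a hypothetical stand-in for a per-repository pipeline, and the `active` and `peak` counters exist only to show that no two pipelines overlap; none of this is the real Stages API.

```python
import asyncio


class SketchPipeline:
    """Hypothetical stand-in for one per-repository sync pipeline."""

    active = 0  # pipelines currently running
    peak = 0    # highest number observed running at once

    def __init__(self, repo_path):
        self.repo_path = repo_path

    async def run(self):
        cls = type(self)
        cls.active += 1
        cls.peak = max(cls.peak, cls.active)
        await asyncio.sleep(0)  # placeholder for the real download work
        cls.active -= 1
        return self.repo_path


async def sync_all(repo_paths):
    synced = []
    for path in repo_paths:
        # Each pipeline is created only when its turn comes and is awaited
        # to completion before the next one is instantiated.
        pipeline = SketchPipeline(path)
        synced.append(await pipeline.run())
    return synced


synced = asyncio.run(sync_all([".", "BaseOS", "AppStream"]))
```

Because each pipeline finishes before the next is created, any per-pipeline download limit remains the effective overall limit.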

What about remotes for sub Repos?

We should create an in-memory Remote for each sub repo. It must not modify the real Remote, because the user's saved configuration should stay as it is. It also needs all the settings from the original Remote (e.g. proxy settings, SSL), so those should all be copied. The one difference is the url: the sub repo's path from the treeinfo file needs to be appended to it.
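A minimal sketch of the copy-and-append step follows, using a frozen dataclass in place of the real Remote model. `RemoteSketch`, its field names, and `sub_repo_remote` are hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class RemoteSketch:
    """Hypothetical stand-in for a Remote; field names are assumptions."""

    url: str
    proxy_url: Optional[str] = None
    ssl_validation: bool = True
    ssl_ca_certificate: Optional[str] = None


def sub_repo_remote(remote, sub_path):
    """Return an in-memory copy of `remote` that points at the sub repo."""
    base = remote.url.rstrip("/")
    # replace() builds a new object, so the original remote is untouched
    # and every other setting is carried over unchanged.
    return replace(remote, url=f"{base}/{sub_path.strip('/')}/")


main_remote = RemoteSketch(
    url="https://example.com/kickstart",
    proxy_url="http://proxy.example.com:3128",
)
sub_remote = sub_repo_remote(main_remote, "AppStream")
```

The url is built by string concatenation rather than `urllib.parse.urljoin`, because `urljoin` drops the last path segment of a base url that has no trailing slash.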

History

#1 Updated by bmbouter almost 2 years ago

  • Groomed changed from No to Yes
  • Sprint set to Sprint 57

This makes sense to me if the models were to exist.

#2 Updated by bmbouter almost 2 years ago

  • Description updated (diff)

#3 Updated by bmbouter almost 2 years ago

  • Description updated (diff)

#4 Updated by bmbouter almost 2 years ago

If we don't want to extend the Stages API immediately with a "multi-sync" capability, we could instantiate N copies of the RpmDeclarativeVersion pipeline and run N of them on the asyncio scheduler concurrently.

The downside of this approach is that it interferes with the various limiting features of the Stages API itself, which is OK in the short term but not in the long term. Specifically, download concurrency would be N times what the user expects, and the limits enforced by the memory restriction features would also be N times larger.
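The N-times effect can be shown with a small asyncio sketch in which each pipeline enforces its own download limit. The `download` and `pipeline` coroutines and the `stats` counters are hypothetical stand-ins, not the Stages API.

```python
import asyncio

PER_PIPELINE_LIMIT = 2  # the limit a user would expect to apply overall
stats = {"active": 0, "peak": 0}


async def download(limit):
    # Each download only respects the semaphore of its own pipeline.
    async with limit:
        stats["active"] += 1
        stats["peak"] = max(stats["peak"], stats["active"])
        await asyncio.sleep(0.01)  # placeholder for network I/O
        stats["active"] -= 1


async def pipeline(downloads):
    limit = asyncio.Semaphore(PER_PIPELINE_LIMIT)
    await asyncio.gather(*(download(limit) for _ in range(downloads)))


async def main(n):
    # N pipelines run concurrently, each with an independent limit.
    await asyncio.gather(*(pipeline(4) for _ in range(n)))


asyncio.run(main(3))
```

With three pipelines and a per-pipeline limit of 2, the observed peak is 6 simultaneous downloads, three times the intended limit.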

#5 Updated by bmbouter almost 2 years ago

  • Description updated (diff)

#6 Updated by rchan almost 2 years ago

  • Sprint changed from Sprint 57 to Sprint 58

#7 Updated by fao89 almost 2 years ago

  • Status changed from NEW to POST

#8 Updated by fao89 almost 2 years ago

  • Assignee set to fao89

#9 Updated by fao89 almost 2 years ago

  • Assignee deleted (fao89)

#10 Updated by ipanova@redhat.com almost 2 years ago

  • Assignee set to fao89

#11 Updated by Anonymous almost 2 years ago

  • Status changed from POST to MODIFIED
  • % Done changed from 0 to 100

#12 Updated by ttereshc over 1 year ago

  • Status changed from MODIFIED to CLOSED - CURRENTRELEASE

#13 Updated by ttereshc over 1 year ago

  • Sprint/Milestone set to Pulp 3.x RPM (Katello 3.16)
