Story #5202
Story #5110 (closed): [Epic] As a user, I can manage Kickstart repositories
As a user, I can sync distribution trees
100%
Description
For now, mirror all images and a main yum repo from a kickstart repo.
Design
Have the FirstStage of pulp_rpm try to download and parse the treeinfo file. From this file, create the declarative content unit for the DistributionTree model (the actual content unit). Have the Stages API process this unit like any other.
When the ContentUnitSaver stage actually saves the DistributionTree content unit, have any additional model creation happen at that point.
Syncing Sub Repos
Some entries in the kickstart tree identify a "sub repo". That is not an official term; it is intended to mean another valid repodata folder, outside of the main /repodata/ folder, that identifies another set of RPMs. One main challenge is syncing these "sub repos" efficiently.
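As a rough illustration of finding images and "sub repos" in a treeinfo file, here is a minimal sketch using only the standard library. The treeinfo file is INI-formatted; the sample content, the `variant-`/`images-` section names, the `repository` key and the `parse_treeinfo` helper are assumptions made for this example, not something specified in this story.

```python
import configparser
import posixpath

# Illustrative treeinfo content. Section and key names are assumptions
# for this sketch, not taken from the story.
SAMPLE_TREEINFO = """
[general]
family = Example Linux
version = 8
arch = x86_64

[variant-BaseOS]
id = BaseOS
repository = .

[variant-AppStream]
id = AppStream
repository = AppStream

[images-x86_64]
kernel = images/pxeboot/vmlinuz
initrd = images/pxeboot/initrd.img
"""


def parse_treeinfo(text):
    """Return image paths and sub-repo paths found in treeinfo text."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep the original case of keys
    parser.read_string(text)

    images = set()
    sub_repos = {}
    for section in parser.sections():
        if section.startswith("images-"):
            images.update(parser[section].values())
        elif section.startswith("variant-"):
            repo = parser[section].get("repository")
            # "." is the main repo; anything else is treated as a sub repo.
            if repo and posixpath.normpath(repo) != ".":
                sub_repos[section[len("variant-"):]] = repo
    return {"images": sorted(images), "sub_repos": sub_repos}


result = parse_treeinfo(SAMPLE_TREEINFO)
```

Here `result["sub_repos"]` maps each variant name to the relative path of its repodata folder, which is what the sub-repo sync would need.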
One easy solution we came up with is to instantiate multiple RpmDeclarativeVersion instances, one for each sub-repo and one for the main repo. This allows each one to work on a different repository. The trick is to never have more than one instantiated at a time; that way rate limiting, etc. won't be an issue.
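The one-at-a-time idea could look roughly like the sketch below. `FakePipeline` is a hypothetical stand-in for a declarative-version pipeline (it is not the pulp_rpm class), and its `create()` coroutine and the repo names are assumptions for illustration only.

```python
import asyncio


class FakePipeline:
    """Hypothetical stand-in for one declarative-version pipeline."""

    active = 0  # pipelines currently running
    peak = 0    # highest number of pipelines seen running at once

    def __init__(self, name):
        self.name = name

    async def create(self):
        FakePipeline.active += 1
        FakePipeline.peak = max(FakePipeline.peak, FakePipeline.active)
        await asyncio.sleep(0)  # pretend to download and save content
        FakePipeline.active -= 1
        return self.name


async def sync_all(repo_names):
    finished = []
    for name in repo_names:
        # Instantiate and run one pipeline at a time, so any per-pipeline
        # limits are never multiplied.
        finished.append(await FakePipeline(name).create())
    return finished


finished = asyncio.run(sync_all(["main", "AppStream", "HighAvailability"]))
```

Because each pipeline is awaited before the next is created, `FakePipeline.peak` stays at 1.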
What about remotes for sub Repos?
We should create an in-memory Remote for the sub-repos. The real Remote must not be modified. The in-memory Remote needs all the settings from the original Remote (e.g. proxy settings, SSL, etc.), so all of those should be copied. The one difference is the url: the sub-path of the sub-repo from the treeinfo file needs to be appended to it.
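A minimal sketch of the copy-and-append idea, using a plain dataclass in place of the real Remote model. `FakeRemote`, its field names and `sub_repo_remote` are hypothetical and only illustrate the shape of the operation.

```python
from dataclasses import dataclass, replace
from typing import Optional
from urllib.parse import urljoin


@dataclass(frozen=True)
class FakeRemote:
    """Hypothetical stand-in for a Remote; field names are assumptions."""

    url: str
    proxy_url: Optional[str] = None
    tls_validation: bool = True
    client_cert: Optional[str] = None


def sub_repo_remote(remote, sub_path):
    """Return an in-memory copy of `remote` whose url points at the sub repo."""
    # urljoin only appends when the base ends with a slash.
    base = remote.url if remote.url.endswith("/") else remote.url + "/"
    # replace() copies every other setting and leaves `remote` untouched.
    return replace(remote, url=urljoin(base, sub_path.strip("/") + "/"))


original = FakeRemote(
    url="https://example.com/os",
    proxy_url="http://proxy.example.com:3128",
)
sub = sub_repo_remote(original, "AppStream")
```

The copy is never saved anywhere, so the original keeps its url while `sub.url` becomes `https://example.com/os/AppStream/`.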
Updated by bmbouter over 5 years ago
- Groomed changed from No to Yes
- Sprint set to Sprint 57
This makes sense to me if the models were to exist.
Updated by bmbouter about 5 years ago
If we don't want to extend the Stages API immediately with a "multi-sync" capability, we could instantiate N copies of the RpmDeclarativeVersion pipeline and run N of them on the asyncio scheduler concurrently.
The downside of this approach is that it interferes with the various limiting features of the Stages API itself, which is OK in the short term but not in the long term. Specifically, download concurrency would be N times what the user expects, and the effective memory limit would also be N times larger.
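To make the "N times" effect concrete, here is a toy model in which every pipeline owns its own download limiter. `DOWNLOAD_LIMIT`, `download`, `pipeline` and `run_concurrently` are hypothetical names for this sketch and do not reflect the actual Stages API.

```python
import asyncio

DOWNLOAD_LIMIT = 2  # hypothetical per-pipeline download concurrency


async def download(limiter, state):
    async with limiter:
        state["active"] += 1
        state["peak"] = max(state["peak"], state["active"])
        await asyncio.sleep(0.01)  # pretend to fetch a file
        state["active"] -= 1


async def pipeline(state, n_downloads=6):
    # Each pipeline has its own limiter, so limits are not shared.
    limiter = asyncio.Semaphore(DOWNLOAD_LIMIT)
    await asyncio.gather(*(download(limiter, state) for _ in range(n_downloads)))


async def run_concurrently(n_pipelines):
    state = {"active": 0, "peak": 0}
    await asyncio.gather(*(pipeline(state) for _ in range(n_pipelines)))
    return state["peak"]


peak = asyncio.run(run_concurrently(3))
```

With three pipelines the overall peak reaches three times `DOWNLOAD_LIMIT`, even though each pipeline respects its own limit.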
Updated by fao89 about 5 years ago
- Assignee deleted (fao89)
Added by Fabricio Aguiar about 5 years ago
Updated by Anonymous about 5 years ago
- Status changed from POST to MODIFIED
- % Done changed from 0 to 100
Applied in changeset 0057cab262fa5afa0ee9164f74b1ce60c8fdcc87.
Updated by ttereshc almost 5 years ago
- Status changed from MODIFIED to CLOSED - CURRENTRELEASE
Updated by ttereshc over 4 years ago
- Sprint/Milestone set to Pulp 3.x RPM (Katello 3.16)
kickstart syncing
closes #5202 https://pulp.plan.io/issues/5202 Required PR: https://github.com/pulp/pulp_rpm/pull/1418