Reduce sync time spent processing metadata up-front
The rpm sync workflow spends a lot of time early on parsing metadata (filelists.xml and other.xml) and writing it to disk in an indexed data structure. This indexing step can take multiple minutes, and no files are downloaded until it has finished.
There are several options for improving this workflow.
- Start downloading before indexing is complete. This would require a new queue of completed downloads, and an additional thread that finishes processing and saving them once the indexing is complete.
- Only index the rpms that will actually be downloaded. This would not help on the first sync (except in rare cases).
- Use the sqlite file (if available) to get this metadata instead of indexing the data ourselves. The downside is that we would have to convert it to XML.
- There may be other options.
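As a rough illustration of the first option, the sketch below starts downloading immediately and parks each finished download on a queue until the index is ready. It is not taken from the existing sync code: `sync`, `build_index`, `download` and `save` are hypothetical names chosen for this example, and error handling is left out.

```python
import queue
import threading


def sync(units, build_index, download, save):
    """Download units while the metadata index is still being built.

    All four arguments are hypothetical stand-ins for this sketch:
    build_index() returns a dict mapping unit name to its metadata,
    download(unit) fetches one file, save(unit, metadata) stores it.
    """
    completed = queue.Queue()        # finished downloads waiting for the index
    index_ready = threading.Event()
    index = {}
    saved = []

    def indexer():
        index.update(build_index())  # the slow metadata parsing step
        index_ready.set()

    def finisher():
        index_ready.wait()           # units cannot be saved without metadata
        while True:
            unit = completed.get()
            if unit is None:         # sentinel: no more downloads will arrive
                break
            save(unit, index[unit])
            saved.append(unit)

    threads = [threading.Thread(target=indexer),
               threading.Thread(target=finisher)]
    for thread in threads:
        thread.start()

    for unit in units:               # downloading starts right away
        download(unit)
        completed.put(unit)
    completed.put(None)

    for thread in threads:
        thread.join()
    return saved
```

The queue is unbounded here, so a slow index simply lets completed downloads accumulate; a real implementation would probably also need to handle a failed index build so the finishing thread does not wait forever.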
The goal of this refactor is to apply whichever of these improvements are reasonable, so that less time is spent processing metadata before downloading starts.
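The option of indexing only the rpms that will be downloaded could look roughly like the following streaming parse of filelists.xml. This is a minimal sketch, not the current implementation: `index_wanted` and `wanted_pkgids` are hypothetical names, and it assumes the set of wanted package ids has already been worked out from primary.xml.

```python
import io
import xml.etree.ElementTree as ET

# XML namespace used by filelists.xml in rpm repository metadata
NS = "{http://linux.duke.edu/metadata/filelists}"


def index_wanted(xml_stream, wanted_pkgids):
    """Return {pkgid: [file paths]} for the wanted packages only."""
    index = {}
    for _event, elem in ET.iterparse(xml_stream, events=("end",)):
        if elem.tag != NS + "package":
            continue
        pkgid = elem.get("pkgid")
        if pkgid in wanted_pkgids:
            index[pkgid] = [f.text for f in elem.findall(NS + "file")]
        elem.clear()  # drop this package's children once it has been handled
    return index


# Small made-up document standing in for a real filelists.xml
sample = (
    b'<filelists xmlns="http://linux.duke.edu/metadata/filelists" packages="2">'
    b'<package pkgid="aaa" name="foo" arch="noarch">'
    b'<version epoch="0" ver="1" rel="1"/><file>/usr/bin/foo</file></package>'
    b'<package pkgid="bbb" name="bar" arch="noarch">'
    b'<version epoch="0" ver="1" rel="1"/><file>/usr/bin/bar</file></package>'
    b'</filelists>'
)
wanted_index = index_wanted(io.BytesIO(sample), {"bbb"})
```

The whole file is still parsed, so the saving comes only from skipping the index writes for unwanted packages. On a first sync the wanted set is usually every package, which matches the caveat that this option would rarely help there.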