Issue #9567

closed

More fault-tolerant metadata parsing

Added by holger.hees over 2 years ago. Updated over 2 years ago.

Status:
CLOSED - DUPLICATE
Priority:
Normal
Assignee:
-
Sprint/Milestone:
-
Start date:
11/12/2021
Due date:
Estimated time:
Severity:
2. Medium
Version:
Platform Release:
OS:
Triaged:
No
Groomed:
No
Sprint Candidate:
No
Tags:
Sprint:
Quarter:

Description

This fixes metadata parsing for repositories whose filelists document does not contain any package entries, as is the case for GitLab repos.

See https://packages.gitlab.com/gitlab/gitlab-ce/scientific/7/x86_64/repodata/repomd.xml

The filelists document referenced there contains only:

<?xml version="1.0" encoding="UTF-8"?>
<filelists xmlns="http://linux.duke.edu/metadata/filelists" packages="0"/>
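As a rough illustration of what "fault tolerant" could mean here, the sketch below parses a filelists document with Python's standard library and treats a missing package entry as an empty file list. It is not the code from the linked PR or from pulp_rpm; the function names `parse_filelists` and `files_for` are hypothetical, and it assumes the usual `<package pkgid="...">` / `<file>` layout of filelists.xml.

```python
import xml.etree.ElementTree as ET

# Namespace taken from the document quoted in this issue.
FILELISTS_NS = "http://linux.duke.edu/metadata/filelists"

# The empty filelists document served by the GitLab repository.
EMPTY_FILELISTS = (
    b'<?xml version="1.0" encoding="UTF-8"?>'
    b'<filelists xmlns="http://linux.duke.edu/metadata/filelists" packages="0"/>'
)


def parse_filelists(xml_bytes):
    """Map each pkgid to its list of file paths (hypothetical helper).

    A document without <package> children simply yields an empty dict.
    """
    root = ET.fromstring(xml_bytes)
    files_by_pkgid = {}
    for pkg in root.findall(f"{{{FILELISTS_NS}}}package"):
        paths = [f.text for f in pkg.findall(f"{{{FILELISTS_NS}}}file")]
        files_by_pkgid[pkg.get("pkgid")] = paths
    return files_by_pkgid


def files_for(pkgid, files_by_pkgid):
    """Return the files for a package, or [] if filelists has no entry for it."""
    return files_by_pkgid.get(pkgid, [])


if __name__ == "__main__":
    index = parse_filelists(EMPTY_FILELISTS)
    # A package known from primary.xml still resolves, just without files.
    print(files_for("some-pkgid-from-primary", index))
```

The design choice is to look packages up by pkgid rather than to assume that primary.xml and filelists.xml list the same packages in the same order, so an empty or partial filelists document does not abort the parse.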

https://github.com/pulp/pulp_rpm/pull/2174


Related issues

Is duplicate of RPM Support - Refactor #9309: Add support for new memory-efficient createrepo_c parsing method (CLOSED - DUPLICATE, dalley)

Actions #1

Updated by pulpbot over 2 years ago

  • Status changed from NEW to POST
Actions #3

Updated by dalley over 2 years ago

  • Version deleted (3.3.1)

I'm closing the PR since we'd like to get rid of the current implementation of "iterative" parsing entirely as soon as we are able to do so.

Regarding the issue: DNF works with these repos, which makes this kind of metadata de facto valid, although I've never seen any repository tool create metadata that looks like this before.

My best guess is that packages.gitlab.com is using Artifactory for this, but again, I'm not totally sure. Their docs say:

Indexing the File List

The filelists.xml metadata file of an RPM repository contains a list of all the files in each package hosted in the repository. When the repository contains many packages, reindexing this file as a result of interactions with the YUM client can be resource intensive causing a degradation of performance. Therefore, from version 5.4, reindexing this file is initially disabled when an RPM repository is created. To enable indexing filelists.xml, set the Enable File List Indexing checkbox.

Note that the filelists.xml metadata file for a virtual repository may not be complete (i.e. it may not actually list all the files it aggregates) if any of the repositories it aggregates do not have file listing enabled. Note that if indexing of the filelists.xml file is disabled, it is not possible to search for a file using the YUM client to determine which package wrote the queried file to the filesystem.

I actually don't think this is universally true; it seems to have come out of some more recent developments and isn't applicable to older versions of yum. Nonetheless, we probably do want to attempt to support it.

A little more info: following some of the links here, it looks like downloading filelists was made optional a few years ago: https://github.com/coreos/rpm-ostree/issues/1127#issue-278715219

Actions #4

Updated by dalley over 2 years ago

  • Status changed from POST to CLOSED - DUPLICATE
Actions #5

Updated by dalley over 2 years ago

  • Is duplicate of Refactor #9309: Add support for new memory-efficient createrepo_c parsing method added
