Issue #8720
Updated by dalley over 3 years ago
RPM metadata should be published in-order, which helps with compression efficiency (see associated BZ). createrepo_c does this, but not via the library itself, so Pulp is still publishing unordered metadata.
Note that the metadata is "fine", it works, it's just inefficient to compress.
The appropriate sort key is the filename / location_href, as this is what createrepo_c uses. When we iterate over the package queryset, we should order_by
Problem: Pulp mixes location_hrefs from many different repositories together, and because that makes them meaningless, it basically ignores them. So we store useless data in the database.
We should remove the location_href and location_base fields (the latter is entirely unused), and replace them with just a filename, which we can possibly use to reconstruct a location_href if we need to keep it for backwards compatibility. Then we can properly sort by it, and we can use it directly in various places without needing to rewrite the value constantly.
location_href is not a "real" part of the RPM package metadata, only a value that createrepo_c happens to provide on the objects we copied over. Copying it over probably shouldn't have been done.
We should therefore sort by the filename.
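To illustrate why sorting by filename differs from sorting by the stored location_href, here is a minimal plain-Python sketch. It does not use Pulp's models or the createrepo_c bindings; the helper names `filename_from_href` and `sort_for_publication` and the sample hrefs are hypothetical, and it assumes the filename is simply the last path component of the location_href.

```python
import posixpath


def filename_from_href(location_href):
    """Return the last path component of a location_href (assumed to be the filename)."""
    return posixpath.basename(location_href)


def sort_for_publication(location_hrefs):
    """Order hrefs by filename only, ignoring whatever directory prefix they carry."""
    return sorted(location_hrefs, key=filename_from_href)


# Hypothetical hrefs, mixed together from repositories with different layouts.
hrefs = [
    "Packages/z/zsh-5.8-1.x86_64.rpm",
    "bash-5.1-2.x86_64.rpm",
    "Packages/a/acl-2.3.1-1.x86_64.rpm",
]

by_href = sorted(hrefs)                   # the directory prefix influences the order
by_filename = sort_for_publication(hrefs)  # order depends on the filename alone
```

With these sample values, sorting the raw hrefs places `bash` last because of the `Packages/` prefix on the other entries, while sorting by filename gives acl, bash, zsh. If a dedicated filename field were stored on the model, the same ordering could presumably be pushed into the database query instead of being computed in Python.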