Story #4866
closed
allow fetching obsolete artifacts from snapshot.debian.org
Description
Ticket moved to GitHub: "pulp/pulp_deb/382":https://github.com/pulp/pulp_deb/issues/382
Debian removes old versions of packages from its mirrors once they have been updated. Because of that, lazy sync against official Debian mirrors can end up with incomplete repos: Pulp syncs metadata on day 1, a package gets updated/replaced on day 2, and on day 3 Pulp tries to actually download that package, which is no longer on the mirror.
Debian offers a snapshot service at https://snapshot.debian.org/, where you can get almost every file ever seen in the Debian archive by accessing https://snapshot.debian.org/file/HASH (where HASH is currently the SHA1 hash of the file). The API docs of snapshot can be found at https://salsa.debian.org/snapshot-team/snapshot/raw/master/API
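To make the lookup concrete, here is a minimal sketch of how a snapshot URL could be derived from a SHA1 digest. It only uses the `https://snapshot.debian.org/file/HASH` pattern mentioned above; the function names and the validation are made up for illustration and are not part of pulp_deb or of the snapshot API.

```python
import hashlib

# URL pattern taken from the description above; HASH is the SHA1 hex digest.
SNAPSHOT_FILE_URL = "https://snapshot.debian.org/file/{sha1}"


def snapshot_url_for_sha1(sha1_hex: str) -> str:
    """Return the snapshot URL for a file identified by its SHA1 hex digest.

    Hypothetical helper: rejects anything that is not 40 hex characters.
    """
    sha1_hex = sha1_hex.strip().lower()
    if len(sha1_hex) != 40 or any(c not in "0123456789abcdef" for c in sha1_hex):
        raise ValueError(f"not a SHA1 hex digest: {sha1_hex!r}")
    return SNAPSHOT_FILE_URL.format(sha1=sha1_hex)


def snapshot_url_for_bytes(data: bytes) -> str:
    """Hypothetical convenience wrapper: hash the content, then build the URL."""
    return snapshot_url_for_sha1(hashlib.sha1(data).hexdigest())
```

Since the package indices already list a SHA1 for each file, only the first helper would be needed in practice; the second is there to show where the digest comes from.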
The idea would be to allow a secondary source in pulp_deb that is tried when the primary mirror replies with a 404.
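A rough sketch of that fallback behaviour, assuming plain standard-library downloads rather than Pulp's actual downloader classes: the sources are tried in order, and only a 404 moves on to the next one. `fetch_with_fallback` and `http_get` are hypothetical names, and the `fetch` callable is injected so the logic can be exercised without network access.

```python
import urllib.request
from typing import Callable, Iterable
from urllib.error import HTTPError


def http_get(url: str) -> bytes:
    """Download a URL with the standard library (raises HTTPError on 4xx/5xx)."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def fetch_with_fallback(
    urls: Iterable[str], fetch: Callable[[str], bytes] = http_get
) -> bytes:
    """Try each URL in order; fall through to the next one only on HTTP 404.

    Any other error is re-raised immediately, so a broken primary mirror
    is not silently masked by the fallback source.
    """
    last_error = None
    for url in urls:
        try:
            return fetch(url)
        except HTTPError as exc:
            if exc.code != 404:
                raise
            last_error = exc  # remember the 404 and try the next source
    if last_error is None:
        raise ValueError("no URLs given")
    raise last_error  # every source answered 404
```

A caller would pass the primary mirror URL first and the snapshot URL second, for example `fetch_with_fallback([mirror_url, snapshot_url])`, where both variables are placeholders.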
This is probably also useful for other backends, not just Debian.
Snapshot currently only supports SHA1 hashes, but the authors promised to accept patches for SHA256 if anyone writes one (the source of the service is in the same repo as the API doc linked above).
Updated by mdellweg over 5 years ago
As we save all the hashes, SHA1 should work.
Updated by quba42 over 4 years ago
Can somebody explain to me how exactly lazy sync works?
In particular, how exactly is the repository associated with the relevant remote? (If I am only actually downloading packages when they are finally needed, then I guess I need to find a path of associations from a finished distribution back to the remote?)
Would the "secondary/fallback" content source be stored in a remote or elsewhere?
Updated by pulpbot about 3 years ago
- Description updated (diff)
- Status changed from NEW to CLOSED - DUPLICATE