Project

Profile

Help

Task #4697

Current implementation of digest specification has many issues - remove it

Added by dalley 6 months ago. Updated 6 months ago.

Status:
MODIFIED
Priority:
Normal
Assignee:
-
Category:
-
Sprint/Milestone:
-
Start date:
Due date:
% Done:

100%

Platform Release:
Blocks Release:
Target Release - Python:
Backwards Incompatible:
No
Groomed:
No
Sprint Candidate:
No
Tags:
QA Contact:
Complexity:
Smash Test:
Verified:
No
Verification Required:
No
Sprint:

Description

Currently our "includes" field looks like this:

includes:='[
        { "name": "django",
          "version_specifier": "~=2.0,!=2.0.1",
          "digests":[
                {"type": "sha256",
                 "digest": "3d9916515599f757043c690ae2b5ea28666afa09779636351da505396cbb2f19"}
          ]
        },
        {"name": "pip-tools",
         "version_specifier": ">=1.12,<=2.0"},
        {"name": "scipy",
         "digests":[
            {"type": "md5",
            "digest": "044af71389ac2ad3d3ece24d0baf4c07"},
            {"type": "sha256",
            "digest": "18b572502ce0b17e3b4bfe50dcaea414a98290358a2fa080c36066ba0651ec14"}]
        },
        {"name": "shelf-reader"}
    ]'

These "digests" are a bit of an ergonomic problem, and the tooling to make them useful as something the user manually inputs doesn't exist. Further, the way Pulp works is inconsistent with the tooling that does exist.

Ergonomic problem:

While the requirements.txt format does support specifying hashes, there's no easy way to get them, since no tools give you a way to output them. Pip freeze unfortunately doesn't support it, so you have to either parse the API output, or hash the files yourself, or use a tool like pipenv or poetry to get them. So while we might eventually want to support this, I don't think it makes any sense to expose it to the user manually. It can be a feature that only gets used when ingesting a pipenv.lock or similar kind of file.

Existing behavior is inconsistent with how other tooling treats them

For all the tools which do support digests, they are only defined for pinned versions. E.g. it is completely invalid to do this:

shelf-reader>=0.1 --hash=sha256:04cfd8bb4f843e35d51bfdef2035109bdea831b55a57c3e6a154d14be116398c
--hash=sha256:2eceb1643c10c5e4a65970baf63bde43b79cbdac7de81dae853ce47ab05197e9

It will fail with the following error:

In --require-hashes mode, all requirements must have their versions pinned with ==. These do not:
shelf-reader>=0.1 from https://files.pythonhosted.org/packages/3a/e3/a6954c4134a899c0006515fbd40208922572947e960b35d0d19fd5a1b3d0/shelf-reader-0.1.tar.gz#sha256=04cfd8bb4f843e35d51bfdef2035109bdea831b55a57c3e6a154d14be116398c (from -r requirements.txt (line 1))

We currently allow adding digests to any specifier, which is inconsistent with the broader ecosystem. No tooling will output a specifier like this, and I don't see a reason for a user to do so manually.

Similarly, pip will not allow you to specify hashes for some packages and not for others, including for their dependencies. In other words, it is invalid to do something like:

Django==2.1 --hash=sha256:04cfd8bb4f843e35d51bfdef2035109bdea831b55a57c3e6a154d14be116398c
--hash=sha256:2eceb1643c10c5e4a65970baf63bde43b79cbdac7de81dae853ce47ab05197e9

This is invalid because, as the Pip docs describe, enabling hash-checking mode is an all-or-nothing proposition. Django pulls in dependencies such as pytz, and because those hashes are not also defined, installing that requirements file will fail. In this sense, Pulp is also acting inconsistently with the broader ecosystem by allowing digests to be specified for some packages but not for others.

This makes complete sense when you think about it. We don't want to provide anything less than a full guarantee to users who want the reproducability use case. There's little use in specifying one single package by digest when all of it's dependencies could be different. The all-or-nothing nature of the tooling likely better matches the desires of those users moreso than what Pulp is currently doing.

Proposal:

Since the way this feature is currently implemented is not going to work out long-term, I think we should remove all the code related to digests in the remote and in the sync task. Once we settle on a better plan, we can consider bringing some of it back.

Associated revisions

Revision 8fdc635b View on GitHub
Added by dalley 6 months ago

Remove digest specification feature for now

closes #4697
https://pulp.plan.io/issues/4697

History

#1 Updated by dalley 6 months ago

  • Status changed from NEW to MODIFIED
  • % Done changed from 0 to 100

#2 Updated by bmbouter 6 months ago

  • Tags deleted (Pulp 3)

Please register to edit this issue

Also available in: Atom PDF