Produce a pure-Python verification API #770

Open
di opened this issue Sep 12, 2023 · 6 comments
Labels
enhancement New feature or request

Comments

@di
Member

di commented Sep 12, 2023

Description

Some installers that may want to eventually perform signature verification have a hard requirement that all their dependencies are pure-Python (pip is the predominant example, because it vendors all its dependencies into a single pure-Python wheel).

Because sigstore-python has sub-dependencies that ship non-pure Python wheels, it's not immediately usable from these installers. However, installers will specifically only use a subset of our overall API (presumably just verification) and might not have a need for all the dependencies we have with native code.

Given that, we should:

  • identify how, where and why native code is used as a sub-dependency of this project
  • identify whether any of those uses are dependencies of verification
  • split our verification logic out into a separate, pure-Python library, with our existing verification API
  • take a dependency on that library to provide the same API here
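As a starting point for the first two steps, here is a minimal sketch that lists which installed distributions ship files that look like native code. It relies only on the standard library's `importlib.metadata`; the suffix list and the helper names `native_paths` and `audit_installed` are assumptions made for illustration, not part of sigstore-python.

```python
from importlib import metadata

# Suffixes that usually indicate compiled extension modules or shared
# libraries. This list is a heuristic chosen for illustration.
NATIVE_SUFFIXES = (".so", ".pyd", ".dylib", ".dll")


def native_paths(paths):
    """Return the sorted subset of paths that look like native code."""
    return sorted(p for p in paths if p.lower().endswith(NATIVE_SUFFIXES))


def audit_installed():
    """Map each installed distribution name to its native-looking files."""
    report = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name", "<unknown>")
        # dist.files is None when the distribution has no RECORD file.
        hits = native_paths(str(f) for f in (dist.files or []))
        if hits:
            report[name] = hits
    return report


if __name__ == "__main__":
    for name, files in sorted(audit_installed().items()):
        print(f"{name}: {len(files)} native file(s)")
```

Running this inside an environment with sigstore installed would show which distributions contain compiled artifacts, but not whether the verification code paths actually import them; that still needs a manual look.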

At a high level, looking at current sub-dependencies that ship non-pure Python wheels or have sub-dependencies that ship non-pure Python wheels shows the following:

  • cffi==1.15.1 (impure)
    • via cryptography==41.0.3 (impure)
      • via pyopenssl==23.2.0 (pure)
  • cryptography==41.0.3 (impure)
  • charset-normalizer==3.2.0 (impure)
    • via requests==2.31.0 (pure)
  • multidict==6.0.4 (impure)
    • via grpclib==0.4.5 (pure)
      • via betterproto==2.0.0b5 (pure)
        • via sigstore-protobuf-specs==0.1.0 (pure)
  • pydantic==1.10.12 (impure) (this will be resolved in our 2.0 release when we upgrade to pydantic >= 2,< 3)
    • via id==1.1.0 (pure)
@di di added the enhancement New feature or request label Sep 12, 2023
@jku
Member

jku commented Sep 12, 2023

> charset-normalizer==3.2.0 (impure)

charset-normalizer has a universal wheel too

@jku
Member

jku commented Sep 12, 2023

> multidict==6.0.4 (impure)

Multidict claims that the library has optional C Extensions for speed. There's no universal wheel though, this will need a closer look.
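For reference, whether a given release file is a universal (pure-Python) wheel can be read off its filename: the last three dash-separated fields are the Python, ABI and platform tags, and a pure wheel has ABI `none` and platform `any`. A rough sketch, where the helper names `wheel_tags` and `is_pure_wheel` and the example filenames are illustrative assumptions:

```python
def wheel_tags(filename: str) -> tuple[str, str, str]:
    """Return (python_tag, abi_tag, platform_tag) from a wheel filename."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    # name-version[-build]-python-abi-platform: at least five fields.
    if len(parts) < 5:
        raise ValueError(f"malformed wheel filename: {filename!r}")
    python_tag, abi_tag, platform_tag = parts[-3:]
    return python_tag, abi_tag, platform_tag


def is_pure_wheel(filename: str) -> bool:
    """True when the wheel declares no ABI and no platform requirement."""
    _, abi_tag, platform_tag = wheel_tags(filename)
    return abi_tag == "none" and platform_tag == "any"


if __name__ == "__main__":
    # Illustrative filenames in the standard wheel naming format.
    for name in (
        "charset_normalizer-3.2.0-py3-none-any.whl",
        "multidict-6.0.4-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
    ):
        print(name, "pure" if is_pure_wheel(name) else "impure")
```

A project counts as usable here if at least one of its published wheels for a release passes `is_pure_wheel`, even when platform-specific wheels are published alongside it.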

@di
Member Author

di commented Sep 12, 2023

Interesting, I wonder why they ship impure wheels as well.

@woodruffw
Member

To address cryptography and friends, the elephants in the room 🙂

  1. X.509 certificate parsing is currently done via cryptography, which implements it in pure Rust (subsequent chain building is done via pyOpenSSL, which uses C to call into an OpenSSL or OpenSSL-like backend)
  2. Signature verification (SET, SCT, certificate) is similar (calls into C via cffi in cryptography)
  3. Small associated bits are also written in Rust internally (SCT parsing)
  4. Transitively, we also depend on things like PEM parsing (since we accept certificates/chains in PEM format)

On that front, there's currently an effort (which I'm working on with others at ToB) to support X.509 path building in cryptography with a pure Rust implementation (pyca/cryptography#9405, pyca/cryptography#8873), meaning that a future version of sigstore-python hopefully won't need pyOpenSSL at all, which will also remove the cffi dep. However, that just exchanges one native dep (C) for another (Rust), so that is potentially not immediately useful here, besides reducing the overall total number of native deps 🙂

TL;DR: When path validation is merged, it should be possible to eliminate pyOpenSSL and cffi as dependencies, although cryptography will continue to be an impure dep (and we will further rely on its native bits).

Removing cryptography outright is a bigger challenge, and I can see two (non-exhaustive) possibilities:

  • "The hard way": reimplement the parts we care about (PEM parsing, DER parsing, X.509, path validation) in pure Python. This will be a significant effort, and (I believe) the cryptography maintainers probably won't want to upstream it (since they're all in on Rust for both performance and reliability reasons).
  • "The cheating way": convince CPython (and other major Python distributions?) to bundle cryptography, either as a public API (probably a hard sell) or an implementation detail. pip could then depend on cryptography's native bits without actually vendoring it. This entails more or less binding CPython's supported architectures list to Rust's, which may or may not be a "pro" in the eyes of the maintainers 🙂
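To give a sense of scale for "the hard way", the smallest piece (PEM parsing) is easy to sketch with the standard library alone. The function name `pem_certificates_to_der` is hypothetical, and this is only an illustration of the simplest step: it does no DER or X.509 parsing, and certainly no path validation, which is where nearly all of the effort would go.

```python
import base64
import re

# Matches each CERTIFICATE block in a PEM bundle and captures its base64 body.
_PEM_CERT = re.compile(
    r"-----BEGIN CERTIFICATE-----\s*(.+?)\s*-----END CERTIFICATE-----",
    re.DOTALL,
)


def pem_certificates_to_der(pem_text: str) -> list[bytes]:
    """Decode every CERTIFICATE block in pem_text to raw DER bytes."""
    ders = []
    for match in _PEM_CERT.finditer(pem_text):
        # Drop line breaks and other whitespace before strict decoding.
        body = "".join(match.group(1).split())
        ders.append(base64.b64decode(body, validate=True))
    return ders
```

The output is just the DER bytes of each certificate in the chain, in order; everything that gives those bytes meaning would still have to be reimplemented.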

@jku
Member

jku commented Sep 12, 2023

Documenting the native code requirements is a very good idea, but for the end goal we'll also want to look at the dependency tree as a whole: if the subset of the dependency tree (that is not part of e.g. pip dependency tree already) is too large, then pip maintainers might not be enthusiastic about vendoring attempts.

The point I'm making is that putting a lot of effort into fixing the native code situation is not useful if the end result will still be unacceptable for vendoring because of the size of the dependency tree...
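To make that concrete, here is a small sketch of the measurement: take the transitive dependency set and subtract whatever the installer already vendors. The graph below is only a partial one, inferred from the "via" chains in the issue description, and the `already_vendored` set is a hypothetical placeholder rather than pip's real vendored list.

```python
def transitive_deps(root, graph):
    """Return every package reachable from root in the dependency graph."""
    seen = set()
    stack = [root]
    while stack:
        node = stack.pop()
        for dep in graph.get(node, ()):
            if dep not in seen:
                seen.add(dep)
                stack.append(dep)
    return seen


def new_to_vendor(root, graph, already_vendored):
    """Return the dependencies the installer would have to newly vendor."""
    return transitive_deps(root, graph) - set(already_vendored)


# Partial graph, assumed from the listing in the issue description.
GRAPH = {
    "sigstore": [
        "cryptography", "pyopenssl", "requests",
        "sigstore-protobuf-specs", "id",
    ],
    "pyopenssl": ["cryptography"],
    "cryptography": ["cffi"],
    "requests": ["charset-normalizer"],
    "sigstore-protobuf-specs": ["betterproto"],
    "betterproto": ["grpclib"],
    "grpclib": ["multidict"],
    "id": ["pydantic"],
}

if __name__ == "__main__":
    # Hypothetical: pretend only requests is already vendored.
    extra = new_to_vendor("sigstore", GRAPH, {"requests"})
    print(len(extra), "additional packages:", sorted(extra))
```

The size of that difference, rather than the count of native dependencies alone, is the number the installer maintainers would be weighing.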

@jku
Member

jku commented Sep 12, 2023

> multidict==6.0.4 (impure)
>
> Multidict claims that the library has optional C Extensions for speed. There's no universal wheel though, this will need a closer look.

This looks like a build system issue: a pure-Python build is supported, but the CD builder just doesn't build the universal wheel.
