Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix JWS verify #189

Closed
wants to merge 9 commits into from
Closed

Fix JWS verify #189

wants to merge 9 commits into from

Conversation

blag
Copy link
Contributor

@blag blag commented Sep 16, 2020

Note: This PR description is not yet finished, and I may rewrite some of these commits, so if you pull this branch, beware that you may need to rebase any changes you make to it!

Python 2 treated bytestrings and character strings as the same thing, a mistake that was thankfully fixed in Python 3. However, this did create a dilemma when updating this codebase to run on Python 3 - should module functions return bytestrings or character strings? Many, if not all, encryption packages sidestep this issue by foisting the distinction and conversion processes onto users, however, that makes implementing robust and correct applications more difficult.

Users of this package who have been running on Python 3 have come across many sharp edges of the bytes/str divide and reported them across multiple issues (#51, #153, #183, #184), and so with the update to Python 3, it has become imperative for this package to "take a stand" on the basic data structure it expects to handle.

I searched through RFC 7520 for "byte", "bytes", and "binary" and found only one instance:

All instances of binary octet strings are represented using base64url [RFC4648] encoding.

This indicates to me that the governing standard for this project does not want to deal with bytestrings directly, instead dealing with them indirectly via base64-urlencoding.

Python 2.7 makes base64-encoding bytestrings and strings easy, and both are base64-encoded to str:

Python 2.7.16 (default, Jan 27 2020, 04:46:15)
[GCC 4.2.1 Compatible Apple LLVM 10.0.1 (clang-1001.0.37.14)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import base64
>>> base64.b64encode(b'Hello Binary World')
'SGVsbG8gQmluYXJ5IFdvcmxk'
>>> type(base64.b64encode(b'Hello Binary World'))
<type 'str'>
>>> 
>>> base64.b64encode('Hello String World')
'SGVsbG8gU3RyaW5nIFdvcmxk'
>>> type(base64.b64encode('Hello String World'))
<type 'str'>

To summarize the types that are expected and returned:

  • bytes -> str
  • str -> str

Python 3.6 makes things more complex - users have to manually encode strings to bytes before base64-encoding them, and :

Python 3.6.4 (v3.6.4:d48ecebad5, Dec 18 2017, 21:07:28)
[GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import base64
>>> base64.b64encode(b'Hello Binary World')
b'SGVsbG8gQmluYXJ5IFdvcmxk'
>>> type(base64.b64encode(b'Hello Binary World'))
<class 'bytes'>
>>> 
>>> base64.b64encode('Hello String World')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File ".../base64.py", line 58, in b64encode
    encoded = binascii.b2a_base64(s, newline=False)
TypeError: a bytes-like object is required, not 'str'
>>> type(base64.b64encode('Hello String World'))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File ".../base64.py", line 58, in b64encode
    encoded = binascii.b2a_base64(s, newline=False)
TypeError: a bytes-like object is required, not 'str'

Summarizing those type expectations:

  • bytes -> bytes
  • str -> (exception)

So Python 2.7 will accept bytes and str and returns a str, but Python 3 only accepts bytes and returns bytes. This impacts everybody who is using Python 3, not just users of this library. Unfortunately, it's not the behavior we would like.

@blag blag changed the title Fix jws verify Fix JWS verify Sep 16, 2020
@codecov
Copy link

codecov bot commented Sep 23, 2020

Codecov Report

Merging #189 into master will not change coverage.
The diff coverage is 100.00%.

Impacted file tree graph

@@           Coverage Diff           @@
##           master     #189   +/-   ##
=======================================
  Coverage   93.51%   93.51%           
=======================================
  Files          16       16           
  Lines        1728     1728           
=======================================
  Hits         1616     1616           
  Misses        112      112           
Impacted Files Coverage Δ
jose/backends/cryptography_backend.py 93.20% <ø> (ø)
jose/backends/ecdsa_backend.py 98.66% <ø> (ø)
jose/backends/native.py 97.61% <ø> (ø)
jose/backends/pycrypto_backend.py 93.03% <ø> (ø)
jose/backends/rsa_backend.py 95.65% <ø> (ø)
jose/jwe.py 82.46% <100.00%> (ø)
jose/jws.py 96.85% <100.00%> (ø)
jose/jwt.py 100.00% <100.00%> (ø)
jose/utils.py 100.00% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 82bd8aa...d073ad2. Read the comment docs.

@Colin-b
Copy link

Colin-b commented Nov 30, 2020

Hello @blag , if this issue is the remaining blocker for a new release of python-jose, can you please tell a concrete test case you would expect?

It would ease validation of this feature (or at least the impact of this change on our codebase) if you were to release a .dev0 version on pypi.org

thanks again

@hawzie197
Copy link

+1 on a .dev0 release

@jwilliams-ocient
Copy link

Any update on a .dev0 release?

@twwildey
Copy link
Collaborator

Can you rebase your changes onto the latest master branch and force-update your branch for this PR?

@blag blag closed this May 31, 2024
@blag
Copy link
Contributor Author

blag commented May 31, 2024

I don't have the time to work on this anymore.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants