r.text and r.json() return different results *in some cases* #5667
Comments
I looked into this some more and the reason for the difference is that I propose that, based on RFC 4627, if the response's content type is set to I'm happy to submit a PR for this if you want. RFC 4627: https://www.ietf.org/rfc/rfc4627.txt -- Ctrl+f for
So there's a semantic difference between Also, given the fact that All this said,
…pe is set to application/json, following RFC 4627. fixes psf#5667
That's too bad about the issues with I just made a small PR based on what you said, it's #5673. I'm happy to make any changes you want.
New encoding priority
1. Response.encoding from charset
2. Response.apparent_encoding
   1. using guess strategy for specific content-type
   2. general guess strategy by chardet.detect
another fix for psf#5667
New encoding priority
1. Response.encoding from charset
2. Response.guess_and_decode()
   1. using guess strategy for specific content-type
   2. general guess strategy by chardet.detect
another fix for psf#5667
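For the "guess strategy for specific content-type" step, a JSON-specific guess could follow the null-byte pattern described in RFC 4627, section 3. This is only a sketch using the standard library; `guess_json_encoding` is a hypothetical name, not the function these commits add, and byte order marks are deliberately ignored.

```python
def guess_json_encoding(data: bytes) -> str:
    """Guess the Unicode encoding of JSON text from its first four bytes.

    RFC 4627 notes that JSON text starts with two ASCII characters, so the
    positions of null bytes in the first four octets reveal the encoding.
    """
    head = data[:4]
    if len(head) < 4:
        # Too short to inspect; assume the RFC's default encoding.
        return "utf-8"
    nulls = [byte == 0 for byte in head]
    if nulls == [True, True, True, False]:
        return "utf-32-be"   # 00 00 00 xx
    if nulls == [False, True, True, True]:
        return "utf-32-le"   # xx 00 00 00
    if nulls == [True, False, True, False]:
        return "utf-16-be"   # 00 xx 00 xx
    if nulls == [False, True, False, True]:
        return "utf-16-le"   # xx 00 xx 00
    return "utf-8"           # xx xx xx xx


sample = b'{"name":"rd\xce\xba","uuid":"1234"}'
print(guess_json_encoding(sample))
```

A guess like this never consults a statistical detector, so adding or removing fields in the body cannot change the result.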
This might be an interesting one... I found that `r.text` and `r.json()` can return different results in some specific cases. I don't understand why the difference between the two cases changes the result of `r.text`.
Maybe `r.text` should always default to `utf-8` for decoding if `application/json` is set as the response's content type, following https://www.ietf.org/rfc/rfc4627.txt (Ctrl+F for `JSON text SHALL be encoded in Unicode.`).
I'm using the latest version of requests.
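One way to read this proposal, as a rough sketch and not what requests does today: honor an explicit charset first, fall back to UTF-8 for `application/json`, and only then use some other default. `decode_body` and its `fallback` parameter are hypothetical names introduced for illustration, and the header parsing relies on the standard library's `email.message.Message` rather than anything in requests.

```python
from email.message import Message


def decode_body(content: bytes, content_type: str, fallback: str = "latin-1") -> str:
    """Hypothetical helper: decode a response body using a fixed priority."""
    header = Message()
    header["Content-Type"] = content_type

    # 1. An explicit charset in the header always wins.
    charset = header.get_param("charset")
    if charset:
        return content.decode(charset, errors="replace")

    # 2. No charset, but the body is declared as JSON: assume UTF-8.
    if header.get_content_type() == "application/json":
        return content.decode("utf-8", errors="replace")

    # 3. Otherwise use the caller's fallback (an assumption of this sketch).
    return content.decode(fallback, errors="replace")


print(decode_body(b'{"name":"rd\xce\xba"}', "application/json"))
```

With this ordering the decoded text for a JSON body depends only on the headers, not on the contents of the payload.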
Expected Result
I would expect `r.text` and `r.json()` to return ~ the same thing. More specifically, I would expect `json.loads(r.text)` and `r.json()` to return the same thing, but the issue seems to be with `r.text`'s decoding specifically.
Actual Result
In the following code, I am making a sample request and replacing the request's response with custom content so we have full control over it. The custom content is utf-8 encoded. In the next version of this code, you'll see the `name` change when it shouldn't.
The above code prints:
which is fantastic.
Replacing the request's content with `b'{"name":"rd\xce\xba","uuid":"1234"}'`, which simply adds a `uuid` field to the JSON, and running the code again prints:
The `name` is different even though it did not change at all! The existence of `"uuid":"1234"` in the response's contents somehow changes the decoding. I have no clue why.
Reproduction Steps
Run this code:
The issue should be fixed when the two print statements match... I think.
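The kind of mismatch described here can be illustrated with the standard library alone, without requests or a network call. The sketch below decodes the same bytes twice: once as UTF-8, and once as Latin-1, which stands in for a wrong guess by an encoding detector. Latin-1 is an assumption made for illustration; the codec that requests actually guessed is not stated in this report.

```python
import json

# The exact bytes from the report: UTF-8 encoded JSON.
raw = b'{"name":"rd\xce\xba","uuid":"1234"}'

# Decoding as UTF-8 treats \xce\xba as one character.
as_utf8 = json.loads(raw.decode("utf-8"))

# Assumption: a detector guessed a single-byte codec instead.
# Latin-1 maps each byte to its own character, so the name gains a character.
as_latin1 = json.loads(raw.decode("latin-1"))

print(repr(as_utf8["name"]))
print(repr(as_latin1["name"]))
print(as_utf8["name"] == as_latin1["name"])
```

The ASCII-only `uuid` value survives either decoding unchanged, which is why only `name` appears to differ.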
System Information