Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid json_encode when serializing non-UTF8 literals #7791

Merged
merged 3 commits into from Mar 16, 2022

Conversation

ohader
Copy link
Contributor

@ohader ohader commented Mar 15, 2022

\Psalm\Internal\Clause::$hash basically holds a hash on arbitrary input literals, used for later comparison. Using json_encode fails when dealing with non-UTF8 literals, which has been replaced by plain PHP serialize.

Resolves: #7771

`\Psalm\Internal\Clause::$hash` basically holds a hash on
arbitrary input literals, used for later comparison. Using
`json_encode` fails when dealing with non-UTF8 literals,
which has been replaced by plain PHP `serialize`.

Resolves: vimeo#7771
@ohader
Copy link
Contributor Author

ohader commented Mar 15, 2022

This change does not actually crash in the 4.x branch - errors in json_encode are just ignored, results casted to string and "hashed" - always to d41d8cd98f00b204e9800998ecf8427e, which is md5(''). Thus, in 4.x it is a fix, but not having the same severity as in master branch.

@orklah
Copy link
Collaborator

orklah commented Mar 15, 2022

This change does not actually crash in the 4.x branch

This is interesting because it means those hash are returned as false on 4.x. Meaning whatever use we make of those hash could be completely bogus (for example when two weird encoded string need to be hashed)

This is a potential bug revealed by @weirdan applying a rector rule here: #7303

/cc @TomasVotruba :)

@TomasVotruba
Copy link

I'd go with Nette\Utils\Json. The native function cannot be trusted :)

@ohader
Copy link
Contributor Author

ohader commented Mar 16, 2022

Unfortunately Nette\Utils\Json does not provide the require magic, it's also just json_encode in the end.. 😉
https://github.com/nette/utils/blob/master/src/Utils/Json.php#L48

Seems like I need to check JSON specs carefully on that topic, since in JavaScript the following seems to be fine...

JSON.stringify(String.fromCharCode(255))

@orklah
Copy link
Collaborator

orklah commented Mar 16, 2022

If serialize is working fine, I'm good with it :)

@AndrolGenhald
Copy link
Collaborator

AndrolGenhald commented Mar 16, 2022

There is JSON_INVALID_UTF8_IGNORE, but it appears to just be stripping invalid UTF-8 characters.

Seems like I need to check JSON specs carefully on that topic, since in JavaScript the following seems to be fine...

The JavaScript implementation is wrong, it must be UTF-8:

JSON text exchanged between systems that are not part of a closed ecosystem MUST be encoded using UTF-8

Although technically the various JS implementations probably predate that RFC. RFC 7159 says:

JSON text SHALL be encoded in UTF-8, UTF-16, or UTF-32. The default encoding is UTF-8, and JSON texts that are encoded in UTF-8 are interoperable in the sense that they will be read successfully by the maximum number of implementations; there are many implementations that cannot successfully read texts in other encodings (such as UTF-16 and UTF-32).

So it's less "wrong" and more "not fully up to date with the latest RFC", and unfortunately it will probably always be that way because JS is nearly impossible to change without breaking compatibility with half the internet.

@orklah orklah added the release:fix The PR will be included in 'Fixes' section of the release notes label Mar 16, 2022
@orklah orklah merged commit 80f972c into vimeo:4.x Mar 16, 2022
@orklah
Copy link
Collaborator

orklah commented Mar 16, 2022

Thanks!

@ohader ohader deleted the issue-7771-4x branch March 17, 2022 11:34
muglug added a commit that referenced this pull request Apr 27, 2022
Avoid json_encode when serializing non-UTF8 literals
@ohader
Copy link
Contributor Author

ohader commented Jun 7, 2022

ℹ️ Merged to master branch in 20351c6

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
release:fix The PR will be included in 'Fixes' section of the release notes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

JsonException Psalm\Internal\Clause
4 participants