Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Incorrect parsing of JSON numbers in v0.4 (Regression) #113

Open
probablykasper opened this issue Aug 20, 2023 · 2 comments · May be fixed by #119
Open

Incorrect parsing of JSON numbers in v0.4 (Regression) #113

probablykasper opened this issue Aug 20, 2023 · 2 comments · May be fixed by #119

Comments

@probablykasper
Copy link

In a JSON file, I have numbers that add up to exactly 50:

"splits": [
  { "share": 16.7 },
  { "share": 16.7 },
  { "share": 16.7 },
],
// ...

Thankfully, when parsing them I verify that they add up.

In v0.3 these numbers were correctly parsed as 16.7, but in v0.4.1 they parse to 16.699999999999999289457264239899814128875732421875.

@akubera
Copy link
Owner

akubera commented Nov 2, 2023

A solution of this is being discussed in #117. But in general I think JSON specifies that numbers are f64, so using those as storage is not considered safe, and should probably be strings.

For example, in Python:

import json, decimal

j = json.loads("""
{"splits": [
   { "share": 16.7 },
   { "share": 16.7 },
   { "share": 16.7 }
]}
""")
shares = [decimal.Decimal(split['share']) for split in j['splits']]
sum(shares)  # Decimal('50.09999999999999786837179272')

and more to the point:

>>> 16.7*3
50.099999999999994

But with strings, keeping the digits explicit:

j = json.loads("""
{"splits": [
   { "share": "16.7" },
   { "share": "16.7" },
   { "share": "16.7" }
]}
""")
sum(Decimal(split['share']) for split in j['splits'])  # Decimal('50.1')

But I understand: if you trust the text of JSON anyways, the standards really don't matter and you want to deserialize to the objects you expect.

Are you parsing with serde into your own rust structs? Because if I understand correctly, issue #117 would allow you mark your fields with a custom parsing function that parses the JSON fields directly.

@probablykasper
Copy link
Author

But in general I think JSON specifies that numbers are f64, so using those as storage is not considered safe, and should probably be strings.

That's not the case, the JSON spec says that any sequence of digits is equally valid:

JSON is agnostic about the semantics of numbers. In any programming language, there can be a variety of number types of various capacities and complements, fixed or floating, binary or decimal. That can make interchange between different programming languages difficult. JSON instead offers only the representation of numbers that humans use: a sequence of digits. All programming languages know how to make sense of digit sequences even if they disagree on internal representations. That is enough to allow interchange.

So it's no different than if you were to intermediately parse as f16. #117 sounds like it would solve it though, so that's great

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants