Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improved Long-Double Number Policy #2674

Merged

Conversation

ctasada
Copy link
Contributor

@ctasada ctasada commented May 9, 2024

Purpose

Fixes a performance issue while using the ToNumberPolicy.LONG_OR_DOUBLE with Double values

Description

The Parsing of a Double value was always executing a Long.parseLong(value), which generated a NumberFormatException.

Identifying that a Number is a Double or a Long can be easily achieve (in a naive way) looking for the decimal separator.

This simple change avoids the extra NumberFormatException

A simple JUnit test, parsing a Long or a Double 10K times shows the next values:

  • Double (old parsing): ~42 ms
  • Double (new parsing): ~6 ms
  • Long (old parsing): ~7 ms
  • Long (new parsing): ~7 ms

As we can see, the parsing for Long values stays the same (±1ms), while the parsing for Double is dramatically improved.

Reducing the number of exceptions also has a positive side effect in memory consumption.

Checklist

  • New code follows the Google Java Style Guide
    This is automatically checked by mvn verify, but can also be checked on its own using mvn spotless:check.
    Style violations can be fixed using mvn spotless:apply; this can be done in a separate commit to verify that it did not cause undesired changes.
  • If necessary, new public API validates arguments, for example rejects null
  • New public API has Javadoc
    • Javadoc uses @since $next-version$
      ($next-version$ is a special placeholder which is automatically replaced during release)
  • If necessary, new unit tests have been added
    • Assertions in unit tests use Truth, see existing tests
    • No JUnit 3 features are used (such as extending class TestCase)
    • If this pull request fixes a bug, a new test was added for a situation which failed previously and is now fixed
  • mvn clean verify javadoc:jar passes without errors

Copy link

google-cla bot commented May 9, 2024

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

@ctasada ctasada force-pushed the ctasada/improve-long-double-number-policy branch from 98cff1c to 8d4d95b Compare May 9, 2024 12:54
The Parsing of a Double value was always executing a `Long.parseLong(value)`, which generated a `NumberFormatException`.

Identifying that a Number is a Double or a Long can be easily achieve (in a naive way) looking for the decimal separator.

This simple change avoids the extra `NumberFormatException`

A simple JUnit test, parsing a `Long` or a `Double` 10K times shows the next values:

* Double (old parsing): ~42 ms
* Double (new parsing): ~6 ms
* Long (old parsing): ~7 ms
* Long (new parsing): ~7 ms

As we can see, the parsing for `Long` values stays the same (±1ms), while the parsing for `Double` is dramatically improved.

Reducing the number of exceptions also has a positive side effect in memory consumption.
@ctasada ctasada force-pushed the ctasada/improve-long-double-number-policy branch from 8d4d95b to e175533 Compare May 9, 2024 13:00
Copy link
Collaborator

@Marcono1234 Marcono1234 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this suggested improvement! What you are saying sounds reasonable to me, but I have not measured it yet myself.

@eamonnmcmanus, what do you think?

try {
return Long.parseLong(value);
} catch (NumberFormatException longE) {
if (value.contains(".")) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could also use indexOf('.') >= 0 here, which might be more efficient due to using a char. But I am not sure if that is really more efficient and if that is worth it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Marcono1234 the difference between contains(".") and indexOf('.') >= 0 is very small, and nearly negligible. But you're right, indexOf is slightly faster. I just pushed the tweak.

The usage of `indexOf(char)` is slightly faster
@ctasada ctasada force-pushed the ctasada/improve-long-double-number-policy branch from 8dcfdd0 to aa2db76 Compare May 13, 2024 15:45
Copy link
Member

@eamonnmcmanus eamonnmcmanus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! The old code uses exceptions for control flow which is indeed bad, especially for performance. With this change we will usually not have to do that.

@eamonnmcmanus eamonnmcmanus merged commit 454a491 into google:main May 18, 2024
11 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants