Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Approx(0.0) does not approx anything #1444

Closed
karyon opened this issue Nov 16, 2018 · 20 comments
Closed

Approx(0.0) does not approx anything #1444

karyon opened this issue Nov 16, 2018 · 20 comments

Comments

@karyon
Copy link

karyon commented Nov 16, 2018

Similar to #1079, #1096, and #1174.

Right now, REQUIRE(value == Approx(0.0)) will fail unless value is exactly zero. That is, Approx(0.0) has no effect and value == Approx(0.0) behaves the same as value == 0.0 (not sure about -0.0 though). The reason is that with the current implementation, the epsilon used for comparison is multiplied with zero and thus has no effect.

That is unexpected. I understand the reason why this happens, and I understand that float comparisons are hard. However, this makes using catch harder to use with no advantage obvious to the user, and I do think there has to be a better way, even if it's just a special case for zero.

In any case, I would suggest to at least document that you cannot use Approx on zero, even though I would much prefer if the practicality and user experience would take precedence over the technical purity of the current solution.

@horenmar
Copy link
Member

This is a fundamental problem with 0, in that, a relative comparison with sane (< 1) epsilon will always fail when the target is 0, even if it takes into account either of the sides.

Consider expression actual == Approx(target). This evaluates to true, if | actual - target | < margin holds, so the question boils down to what is the margin like. There are 2 potential sources of margin in Approx, an absolute value set via call to setMargin, which defaults to 0, and a relative value that is currently computed as ε * | target |. However, lets consider instead consider ε * max( |target| , |actual| ) in this example.

By setting target to 0 and simplifying the above expressions, we get | actual | < ε * | actual |... For this to evaluate to true, ε > 1 has to hold, which breaks any semblance of sanity in comparisons for other cases (because it would also mean that e.g. 10 == Approx(20)).

@karyon
Copy link
Author

karyon commented Nov 18, 2018

I think I do understand the math and the fundamental problem. I still think that a better solution can be found, not in terms of mathematical "correctness", but in terms of matching what the user expects, which is that Approx(0.0) does actually approximate something.

I am aware that this has probably discussed to death. I hope to add another point of view in the following, if I'm just repeating what has already been said then I'm sorry.

I would suggest to either add a special case for zero, using some reasonably guesstimated fixed epsilon in that case, simply to create a practical solution for an annoyance in a common case (comparing to zero). A more elaborate rationale would be that when comparing to Approx(<very small number>), chances are that the "scale" of the calculation was indeed very small (i.e. using only very small numbers), so using an even smaller epsilon is fine. However, when comparing to Approx(0.0), chances are that the calculation was actually done at a non-zero scale, since most calculations are not just about zeroes (right? :) ), and applying an epsilon of zero does not help in that case.

Alternatively I would suggest to add a minimum epsilon to the calculated one to not let the applied epsilon become ridiculously small for small numbers and ultimately zero, or in other words, to use min(ε * max( |target| , |actual| ), minimum_ε) or perhaps ε * max( |target| , |actual| ) + minimum_ε (which would be equivalent of setting the existing scale variable to some minimum_ε i think?). again, the rationale could be that the calculations are always happening at some scale that is not zero.

@horenmar
Copy link
Member

The problem with setting a minimum ε is that it also leads to unintuitive, and what is worse, incorrect, results. Let's say that we set minimum ε to std::numeric_limits<double>::epsilon, which is the difference between 1 and the next smallest representable value, or roughly 2.2e-16. This sounds good, until you realise that while this seems insanely small, there are still 4372995238176751615 (4*10^18) numbers between 0 and std::numeric_limits<double>::epsilon...

Also, Approx actually used to have scale default to 1.0, but that lead to false positives in code such as REQUIRE(1e-14 == Approx(1e-15).epsilon(0.01)), which is obviously incorrect (1e-14 is not within 1% of 1e-15), but would pass because the scale would blow out the actual numbers used in the comparison. That and some other similar problems made me ultimately decide to go for the mathematically more correct comparison, even though it is possibly surprising. Fundamentally, comparing floats is hard, and I think it is better to surprise people who don't know what they are doing than it is to surprise people who do know what they are doing.

@karyon
Copy link
Author

karyon commented Nov 18, 2018

Alright, I agree that the minimum epsilon is a very inconsistent solution. Any practical downside for the special-case-for-zero-solution? I can't imagine people who know what they are doing to expect Approx(0.0) to not approximate anything or to be surprised if it did approximate something.

@jayeshbadwaik
Copy link
Contributor

jayeshbadwaik commented Nov 18, 2018

@karyon Would you be happy with using ULP matchers instead? I think that is the better way to achieve what you want.

@karyon
Copy link
Author

karyon commented Nov 21, 2018

Thanks. However, I can't explain my colleagues that they can use Approx if they want, but for zeroes, that does not work and they should use something else instead. We preferred practicality and modified Approx instead (we set the scale to std::numeric_limits::epsilon by default i think)

@Danielmelody
Copy link

A effective comparison between doubles should compare the exponent bits and fraction bits separately, 0.0 can be well handled in this case.

@hollasch
Copy link
Contributor

@Danielhu229 writes:

A effective comparison between doubles should compare the exponent bits and fraction bits separately, 0.0 can be well handled in this case.

Not around zero. The problem isn't the zero point, it's the fact that floating-point numbers near zero (over 8 million values) are denormalized. For these values, the exponent and mantissa are mixed together. Particularly around zero, there's a danger of advertising one approach as the right one. In this particular case, where you're testing around a fixed value, an epsilon value can make sense (though not a thoughtlessly chosen one). An ULP comparison ends up being equivalent, again, due to the fixed comparison point. However, there's a huge relative leap from the smallest positive value to zero to the largest negative value. In addition, a sign-change here could mean a clear error that should be flagged.

For all these reasons, I'd prefer the following example code, as it's much simpler, vastly more clear, and does not depend on the reader having memorized the underlying current implementation of Catch2's Approx() helper:

REQUIRE (x > 0)
REQUIRE (x < 1e-140)

@TheNicker
Copy link

Just encountered this issue as well. I understand the floating point issue, however I think it is a bug in the sense that asking the API doing something does nothing.
One solution is to force the user to provide a margin for Approx(0)
In the current standard I don't think it's possible to do it at compile time, but surely possible at runtime.

@BenjaminNavarro
Copy link

I just stumbled on this problem and lost some time figuring out why the approximation didn't work. I let you decide what's the best solution for this but in the meantime, can this be at least explicitly documented ? This would probably help people to not fall into this trap

@dimaga
Copy link

dimaga commented Mar 19, 2020

Previously, Approx::m_scale was set 1.0 by default and there were no problems with Approx(0.0). By the recent version, Approx::m_scale default value has been changed to 0.0. Does anybody know, why?

@hollasch
Copy link
Contributor

I think that as @jayeshbadwaik commented above, ULP comparison is the way to go. std::frexp() provides an easy way to accomplish this. It makes much more sense to me to assert that f1 and f2 agree to n bits.

@tdegeus
Copy link
Contributor

tdegeus commented Oct 14, 2020

I can only very highly endorse @BenjaminNavarro remark of at least documenting this. But in the end I also think that the API should allow for

REQUIRE(a == Approx(0.0))

to work...

@tjwrona
Copy link

tjwrona commented Nov 15, 2020

@tdegeus I agree, it seems broken that it does not work at all. Even if it worked by requiring the user to set a default epsilon for Approx that would be better than it is now.

@TheNicker
Copy link

TheNicker commented Apr 26, 2021

Many developers have spent too much time on this for many moons now.
There's no perfect solution, that's why not providing a solution is also a solution.
But I believe as previously said that my suggested solution is better the the current situation.

Add strict mode that forces the user to supply a margin for Approx(0) and make it default.
For backward compatibility, current implementations should just disable strict mode.
Yes, technically it is a breaking change.

A more subtle approach would be to have strict mode disabled by default and provide a prominent documentation for new users.

@SirNate0
Copy link

SirNate0 commented Aug 3, 2021

As someone who is looking into using this library, I wanted to share my thoughts. I think the present solution where Approx(0) only matches zero is not expected behavior. Technically it is, if you read the documentation, but just seeing the code it would not be. I think the main thing is that Approx by default only address relative differences (|x/y-1| < ε) when I would also expect it to address absolute differences (|x-y| < δ).

Other than this, an issue I'm glad I found on google when looking up Approx (as it's not mentioned in the documentation), the library is looking great and I think I will use it.

@Sidelobe
Copy link

Sidelobe commented Oct 1, 2022

For future readers -- Martin explains the logic behind the Catch2 implementation here very well:

https://codingnest.com/the-little-things-comparing-floating-point-numbers/

@BenjaminNavarro
Copy link

I guess we all agree, as the linked blog post explains, that floating point comparison is difficult and that's why use Approx to try to deal with this complexity.

The but core issue remains, Approx(0.0) being a valid construct is a problem because the API lies to you. There's no "approximately equal" test here, just an equality check in disguise.

The doc still doesn't cover this issue so we can expect people falling into the same trap.

And the code doesn't check for this particular case to warn the users. But looking at the other member functions, CATCH_ENFORCE is being used to make sure given values are within an acceptable range. Why the constructor couldn't do something like CATCH_ENFORCE(value != 0.0, "Approx can't be used to compare with zero. link_to_the_docs_for_why") ? Sure, a fallback strategy for comparison to zero would be better but in the meantime it would make sure that users are not misusing the API.

@Sidelobe
Copy link

Sidelobe commented Oct 1, 2022

Fully agree with you, @BenjaminNavarro.

As Martin writes in the piece I linked:

There are two crucial things to remember about Approx. The first is that the epsilon comparison scales only with the Approx'd value, not the min/max of both sides of the comparison. The other is that a default-constructed Approx instance only performs epsilon comparison (margin defaults to 0).

Both of these behaviours are there for legacy reasons because I do not want to break backwards compatibility in a way that can cause tests to pass when they would not before. Generally, I recommend considering Approx legacy and building the comparisons out of matchers.

That is all fine and dandy (a valid design/maintenance decision), but then why not mark Approx() explicitly as deprecated/legacy in the documentation?

And on top of that, specifically for the Approx(0.0) (which is what this issue is about, sorry for deviating) adding an enforce (as @BenjaminNavarro suggests) or at least a warning seems a good idea to me (on by default, with an option to silence it for legacy code you don't want to update).

NOTE: If I'm not mistaken, a user-changed non-zero margin does allow us to compare with 0.0 (Approx(1.401e-45f).margin(1.401e-45f) == 0.0) evaluates to true), so the check would need to be in the evaluation function, not contructor.

@horenmar
Copy link
Member

horenmar commented Oct 9, 2022

NOTE: If I'm not mistaken, a user-changed non-zero margin does allow us to compare with 0.0 (Approx(1.401e-45f).margin(1.401e-45f) == 0.0) evaluates to true), so the check would need to be in the evaluation function, not contructor.

This is correct. The constructor cannot guard against 0, because Approx(0).margin(0.5) is a perfectly fine approx instance to create. Overall I am currently not sold on having the check in the comparison function itself either. The reason is that whether it is opt-in or it is opt-out, tthe warning will kinda suck.

The reason is that if it is on-by-default, there has to be a way to suppress it, so it requires adding a new CLI flag as well. It also If it is off-by-default, then it is mostly useless, as people are unlikely to get bitten by this repeatedly, which is the main reason to have a warning flag for something.

About docs, the current Approx docs haven't been updated in ages. The two tell-tale signs are that it still uses "Catch" as the framework name, and that the markdown sources do not adhere to a column limit, unlike docs written in the last 3+ years. I am going to rewrite the docs when I find extra time, including some introduction into floating point comparisons, but that does run into the perennial problem; I have to find free time.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests