Fix null conversion in `Num::fromString` (#1131) #1132

mhermier · 2023-01-05T22:44:43Z

When `strtod` returns `0.0`, it is possible that no conversion happened. Handle that error case correctly.

PureFox48 · 2023-01-05T23:33:47Z

If there's no conversion, `strtod` returns `0.0`, but it also returns `0.0` when passed `"0.0"` or `"0.0suffix"`, say.

However, your actual patch does handle that correctly.
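To illustrate the ambiguity, here is a minimal C sketch of telling "no conversion" apart from a genuine zero by inspecting the end pointer that `strtod` fills in. The helper name `try_strtod` is hypothetical and this is not the code from the patch, only an example of the general technique.

```c
#include <stdbool.h>
#include <stdlib.h>

// Hypothetical helper (not the code from this PR): reports whether strtod
// consumed at least one character of `text`.
// On success the parsed value is stored in *out.
bool try_strtod(const char* text, double* out)
{
  char* end = NULL;
  double value = strtod(text, &end);

  // strtod sets `end` to `text` when no conversion was performed, so this
  // comparison, not the 0.0 return value, is what signals failure.
  if (end == text) return false;

  *out = value;
  return true;
}
```

With this check, `"0.0"` and `"0.0suffix"` both succeed with a value of zero, while `"suffix"` and the empty string are reported as failures.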

Personally, I'd like to have seen "1.2suffix" return 1.2 as it does in several other languages I've used including C itself which saves you having to trim off extraneous characters after the numerical part. However, I've just checked and it returns null at present so, unfortunately, that would be a breaking change.

mhermier · 2023-01-06T00:31:18Z

The statement "C itself which saves you having to trim off extraneous characters after the numerical part" is misleading. The libc API offers incremental parsing. As such it is up to the user to decide/handle error cases, such as extraneous characters. Being lax about extra characters is the lazy route.

But I think I prefer the stricter route for parsing. It is easier to compose (to parse more diverse inputs) and to propagate errors. Relying on these incremental parsing APIs based on expected usage always backfires when you have to fight against that expected usage.

CrazyInfin8 · 2023-01-06T00:57:05Z

> Personally, I'd like to have seen "1.2suffix" return 1.2 as it does in several other languages I've used including C itself which saves you having to trim off extraneous characters after the numerical part. However, I've just checked and it returns null at present so, unfortunately, that would be a breaking change.

I suppose that #984 could provide this functionality. Though I'd argue that this behavior should be a separate function, or use extra optional parameters on fromString to specify whether we want to parse the whole string or just the first parseable number. wrenParseNum is also able to return the number of characters in the number, which could be useful for consuming strings in Wren, as well as providing info about what caused a parse to fail.
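As a rough illustration of why a consumed-character count is useful, the C sketch below builds a small incremental reader on top of `strtod`. The names `parse_num_prefix` and `sum_numbers` are hypothetical and are not the `wrenParseNum` interface from #984, whose signature is not shown in this thread.

```c
#include <stddef.h>
#include <stdlib.h>

// Hypothetical helper: parses one number at the start of `text`.
// Returns the number of characters consumed (including any leading
// whitespace that strtod skips), or 0 if nothing could be parsed.
size_t parse_num_prefix(const char* text, double* out)
{
  char* end = NULL;
  double value = strtod(text, &end);
  if (end == text) return 0;
  *out = value;
  return (size_t)(end - text);
}

// Sums whitespace-separated numbers, stopping at the first token that
// cannot be parsed. The offset where parsing stopped is stored in
// *stopped_at so the caller can report or inspect the offending input.
double sum_numbers(const char* text, size_t* stopped_at)
{
  double total = 0.0;
  size_t offset = 0;
  double value;
  size_t used;

  while ((used = parse_num_prefix(text + offset, &value)) > 0)
  {
    total += value;
    offset += used;
  }

  *stopped_at = offset;
  return total;
}
```

Because the caller learns where parsing stopped, it can decide for itself whether leftover input is an error, which is the kind of information a plain value-or-null result cannot convey.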

I think if there is interest in this topic, we could open a new discussion.

PureFox48 · 2023-01-06T10:44:05Z

@mhermier

Well it's worth noting that Num.fromString is not completely 'pure' in that it does ignore extraneous whitespace (" 123 " works fine) and also parses number forms such as 0X123 with a capital X or 0123 which are not valid Wren though, thankfully, the latter is parsed as a decimal rather than a C-style octal literal.

However, the general vibe of the method seems to be to parse anything that looks in its entirety like a number and return null for anything else so I'm content to leave things as they are.

In most cases that arise in practice you know (or think you know) that the numeric part will either be followed by a space + a word or by a fixed-length non-numeric suffix so it's easy to trim these off using split or by taking a sub-string. Other cases are more awkward to deal with (particularly when the number is a hex literal) and you then need to search for the first non-numeric character. I might add something which does this to my 'str' module though it's already bulging at the seams following the recent addition of some more Unicode-aware versions of lower and upper case conversion methods :)

@CrazyInfin8

Hadn't considered it but, yes, we could provide the above functionality via an extra parameter to Num.fromString. This would be backwards compatible as the current version could then become an overload of it which maintained the status quo by setting the extra parameter to false.

I don't mind having a separate discussion about it though I suspect @mhermier might not be too keen as he's against the idea on principle.

Perhaps it's best to leave it for now until we see what's going to happen about binary/octal literals and digit separators being added to the language which would, of course, necessitate #984 or something like it being seriously entertained.

mhermier · 2023-01-06T11:39:29Z

@PureFox48 I'm not against it, but if an API is complex and users can fall into a pit of partial success by being lazy, they will. The problem with strtod is exactly that. The API is simple and meant to be used for incremental parsing, i.e. the user gets back how much was parsed. Num.fromString, on the other hand, has no way to report how much was parsed. So we can't know whether a string parsed today will be parsed the same way tomorrow unless we properly check that the whole input was parsed.
As an example, 16rcafebabe is a valid hexadecimal number in Smalltalk. With what you suggest, 16 would be returned and rcafebabe would be ignored because it makes no sense. If tomorrow we decide to support that syntax, it would return the hex value cafebabe, breaking code, and there is no sane way to recover from that.

PureFox48 · 2023-01-06T12:50:59Z

Well, if this were done as an extra parameter, then the user would have to explicitly set it. We'd therefore have:

var a = Num.fromString("16rcafebabe")       //> null
var b = Num.fromString("16rcafebabe", true) //> 16

So I don't really accept that this would be an easy pit into which a lazy user could unwittingly fall.
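For comparison, the same opt-in idea can be sketched in C on top of `strtod`. The function `parse_num` and its `allow_trailing` parameter are hypothetical names chosen for this example, and it is only an analogue of the proposal, not how Num.fromString is implemented.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stdlib.h>

// Hypothetical C analogue of the two-argument form discussed above.
// With allow_trailing == false the whole string (apart from surrounding
// whitespace) must be a number; with true, a numeric prefix is enough.
bool parse_num(const char* text, bool allow_trailing, double* out)
{
  char* end = NULL;
  double value = strtod(text, &end);
  if (end == text) return false;  // Nothing was converted at all.

  if (!allow_trailing)
  {
    // Skip trailing whitespace, then require the end of the string.
    while (isspace((unsigned char)*end)) end++;
    if (*end != '\0') return false;
  }

  *out = value;
  return true;
}
```

Here `parse_num("16rcafebabe", false, &v)` fails, while passing `true` stores 16 in `v`, so the lax behavior only appears when the caller asks for it.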

However, I do accept that as we don't know what the future may hold we should stick to the status quo for now as far as the core library is concerned (folks like me can always roll their own) though I'd be very surprised indeed if Smalltalk-type integer literals were ever to be introduced into Wren!

mhermier · 2023-01-06T12:53:30Z

Anyway, I think the way it is fixed here is acceptable.

PureFox48 · 2023-01-06T12:54:45Z

Yes, so do I and the quicker it's merged the better.

PureFox48 · 2023-01-06T15:16:46Z

OK, this is what I've come up with for my 'str' module:

class Str {
    //  ... (existing stuff)

    // After trimming whitespace from the string 's', takes as many characters as possible
    // to form a valid number and converts it thereto using the Num.fromString method.
    // Returns null if such a conversion is impossible.
    static toNum(s) {
        if (s is Num) return s
        if (!(s is String)) s = "%(s)"
        s = s.trim()
        var n = Num.fromString(s)
        if (n) return n
        if (s.count < 2) return null
        var chars = s.toList
        for (i in chars.count-1..1) {
            chars.removeAt(i)
            if (n = Num.fromString(chars.join())) return n
        }
        return null
    }
}

var strs = ["", "?", "7Z", "3 dog night", "123.45 dollars", "$123.45", "-infiniteCalm", "nansAreGoodToEat", "0x1gen"]
for (s in strs) System.print("\"%(s)\" -> %(Str.toNum(s))")

/* output:

"" -> null
"?" -> null
"7Z" -> 7
"3 dog night" -> 3
"123.45 dollars" -> 123.45
"$123.45" -> null
"-infiniteCalm" -> -infinity
"nansAreGoodToEat" -> nan
"0x1gen" -> 1

*/

This isn't very efficient as I'm just removing characters from the end of the string until I find something that Num.fromString can parse or returning null if nothing can be parsed.

However, it's a heck of a lot less code than trying to parse the string from the front and deal with all the possibilities that can arise.

As the strings would not usually be very long, it should be quick enough in practice. However, if anyone has any suggestions for improvement, I'd be glad to hear them :)

Fix null convertion in Num::fromString (wren-lang#1131)

fd3fe7f
