New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Display a clear error on empty character literals '' #10197
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code seems obviously fine - possible alternate message
parsing/lexer.mll
Outdated
"The empty character literal '' is invalid. \ | ||
We expect a character between single quotes (possibly a space ' ') \ | ||
or a single quote for type variables 'a." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Over here, using "we" when you are singular is generally reserved for the monarch 🙂 However, I think it's more consistent with the other messages to describe the properties of what is expected:
"The empty character literal '' is invalid. \ | |
We expect a character between single quotes (possibly a space ' ') \ | |
or a single quote for type variables 'a." | |
"Character expected between two single quotes (' ', for example). \ | |
The empty character literal '' is invalid. \ | |
A single quote is used for type variables 'a." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm unconvinced by your proposition: I have doubts that "Character expected ... ." is easier to read than "We expect a character ... .", and your message splits the allowed forms in two halves, one before and one after the diagnostic, I find this confusing.
If the We
is really a problem (we, the authors of the compiler), I can change to "Invalid empty character literal ''.", but I would rather keep this structure -- or something else. But then there are many occurrences of "we" in the compiler text, mostly in comments or in the manual, but also a handful in error messages. (git grep ' [Ww]e ' | wc -l
returns 2440 on my copy of trunk.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the
We
is really a problem (we, the authors of the compiler),
Personally except for the notable exception of did you mean-like error messages, I dislike the anthropomorphisation of computer program error messages.
Also there are already errors patterns in the character space with which consistency could be maintained e.g. this use of "illegal" vs "invalid".
# "bl\a";;
Warning 14: illegal backslash escape in string.
- : string = "bl\\a"
Here would be my proposal:
illegal empty character literal ''. Did you mean ' ' ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The use of "we" in the manual or comments is entirely distinct from messages displayed by the tools. We may write the compiler, but on encountering an error the message comes from the compiler, not its authors. There may be other error messages elsewhere in the compiler which do this, but the other lexer error messages right around this one do not.
@dbuenzli's suggestion looks much better to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, I went with
Error: Illegal empty character literal; did you mean ' ' or a type variable 'a ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would make the hint part more distinct:
Error: Illegal empty character literal
Hint: Did you mean ' ' or a type variable 'a ?
(With the hope of one day using pretty printer for hints).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Error: Illegal empty character literal; did you mean ' ' or a type variable 'a ?
Sorry for the extreme hci bikesheding @gasche.
Since the original report is about the difficulty to distinguish spaces with certain fonts I think it would be good to write down the empty ''
as was in your your original message and in my proposal.
That way a direct, local, comparison between ''
and ' '
can be made which should easily show the difference; the context switch to the actual code may make you loose the distinction.
Also I notice the existing format for reporting Did you mean
is slightly different:
# let fab x = x;;
val fab : 'a -> 'a = <fun>
# fib;;
Error: Unbound value fib
Hint: Did you mean fab?
Not that I'm necessarily in favor of that one. However I'm slightly averse to have the did you mean after a semicolon, rather than a .
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this case I don't mind the bikeshedding, especially given that you all had good reasons to suggest changes. I will follow the standard "did you mean" format suggested by @Octachron, but include the ''
back. (I agree it's interesting to show it.)
Another confusing error is when we compile e.g. |
We could have a regexp for just some sequences that are single-symbol-wide and likely culprits of this mistakes, but doing anything in the lexer would still prevent people from naming their type variable |
|
908c5f2
to
2b7916a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The error message looks good to me and recognizing ''
as an invalid character literal (or type variable name) seems sensible.
parsing/lexer.mll
Outdated
| Empty_character_literal -> | ||
Location.errorf ~loc | ||
"Illegal empty character literal ''\n\ | ||
Hint: Did you mean ' ' or a type variable 'a?" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nitpicking: the hint could be a suberror message:
Location.error ~loc ~sub:[Location.msg "Hint: Did you mean ' ' or a type variable 'a?"]
"Illegal empty character literal ''"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
2b7916a
to
49cd6c4
Compare
I changed the error message slightly:
|
49cd6c4
to
1e712d7
Compare
Shedding my own bike:
|
I liked the original error message better. |
One thing I don't like about the short version is |
Before: > # '';; > Error: Syntax error After: > # '';; > Error: Illegal empty character literal '' > Hint: Did you mean ' ' or a type variable 'a? Before, this input would get correctly lexed into QUOTE QUOTE and then fail in the parser with a generic "Syntax error" message. (Even if we had better error messages in the parser, here the parser-level error would not be very illuminating as ' is never the start of a valid expression or structure item). Fixes ocaml#10196.
1e712d7
to
966d10c
Compare
(Still went back to the short version. Maybe it's allright now?) |
My approval still stands for this version. I sympathize with the punctuation collision issue, but I don't see a way to circumvent it (without an uniform code quoting standard in error messages). |
Merged, then. Thanks to everyone! |
Before:
After:
Before, this input would get correctly lexed into QUOTE QUOTE and then
fail in the parser with a generic "Syntax error" message. (Even if we
had better error messages in the parser, here the parser-level error
would not be very illuminating as ' is never the start of a valid
expression or structure item).
Fixes #10196.