Allow leading logical operators #5068

helixbass · 2018-05-16T22:40:51Z

@GeoffreyBooth I’ve been thinking about Prettier and formatting of long logical conditions (including your expressed opinion that you don’t like breaking lines after and/or)

One of the few things I’ve consistently wished that I could do in Coffeescript was to lead subsequent lines with the logical operator rather than have to break after the logical operator at the end of the line - I find it much easier to digest the “shape” of the logical expression if I don’t have to look to the ends of the lines to see how they relate logically

And in looking at code I’m formatting with the Prettier plugin, what I would like to see as “pretty” formatting would be when breaking long logical expressions, to lead lines with the logical operator. Eg instead of this which Prettier is currently generating (I have it breaking lines at logical operators similarly to how Prettier for JS does it - if you really don’t want to break long lines at logical operators, that’s something that we could easily add an option for):

shouldOmitBracesButNotIndent =
  (parent.type is 'ExpressionStatement' and
    grandparent.type is 'ClassBody') or
  (parent.type is 'ArrayExpression' and parent.elements.length is 1) or
  isObjectPropertyValue path, {stackOffset, last: yes}

I’d like to see it formatted like this:

shouldOmitBracesButNotIndent =
  (parent.type is 'ExpressionStatement'
    and grandparent.type is 'ClassBody')
  or (parent.type is 'ArrayExpression' and parent.elements.length is 1)
  or isObjectPropertyValue path, {stackOffset, last: yes}

To me that’s way easier to follow the logical flow

And I’d never realized it’s a simple change to allow this long-desired (by me) feature, just making the logical operators be LINE_CONTINUERs. It wouldn’t break existing code, since no valid lines could currently begin with and(space)/or(space)/&&/||

PS the code snippet above is from the Prettier plugin printer source, which I’ve been writing in JS but have hacked together a very primitive version of es2coffee (repo, built as a Babel plugin that feeds its transformed AST to the Prettier plugin) which can transform it to Coffeescript pretty well - the decision to make our AST types as similar as possible to Babel AST types has proven beneficial

GeoffreyBooth · 2018-05-16T23:01:03Z

I’m still more of the mindset that using line continuers like and to imply a line continuation, rather than making it explicit with the \ operator, is hacky and bad style. Either let your line keep going and going (and use soft word wrap in your editor) or use \. But I’m not sure I’m in the majority on this one.

Since this is possible with and etc. at the end of the line, it seems reasonable to also allow it at the start. That doesn’t mean I think Prettier should automatically format like this; that’s another discussion. One of the other tenets of Prettier is that it’s opinionated, so we shouldn’t have too many options for people to have to wade through.

Inve1951 · 2018-05-17T07:27:22Z

One of the few things I’ve consistently wished that I could do in Coffeescript was to lead subsequent lines with the logical operator rather than have to break after the logical operator at the end of the line.

Same here.

There's an addition I'd love to see here which is indentation sensitive line continuation similar to leading dot property accessor. Consider this:

bGreets =
  /^hello/i.test myString
  and
    /\s+world$/i.test myString
    or /\s+coffee$/.test myString
  or
    /^goodbye$/.test myString

jashkenas · 2018-05-17T17:08:26Z

FWIW, I think it makes sense to allow any binary operator, when placed at the beginning or end of a line, to parse as a continuation of the preceding or following line...

GeoffreyBooth · 2018-05-17T17:45:12Z

I would also warn that the leading dot chaining notation, e.g.

foo()
.then fn
.catch handleError

Has been the largest source of bugs in CoffeeScript’s history. The more things that cause line continuation, the more potential trouble that gets caused, e.g.:

foo()
.then fn() and
  otherFn()
  .then yetAnotherFn # Does this chain off of foo() or otherFn()?

The problem is that suddenly whitespace is insignificant here. You can indent or not and it doesn’t matter. It’s hard for users to follow what implied line continuation matches with what, and it goes against the general philosophy that indentation should inform these things.

Since and already implies a line continuation, it’s not terrible to allow it to do so from the start of a line rather than the end. But I would encourage us not to keep adding ways to imply a line continuation, as it’ll keep creating new issues like we’ve seen with leading dot.

helixbass · 2018-05-20T00:43:25Z

@GeoffreyBooth you're right LINE_CONTINUER is sort of inherently problematic. So I thought about it a bit more and reimplemented to use a primarily grammar-based solution

Basically it adds rules for Expression TERMINATOR && Expression and Expression TERMINATOR || Expression. This caused a grammar ambiguity so I added a rewriter pass to convert eg TERMINATOR && -> LEADING_&& and the grammar rule is actually Expression LEADING_&& Expression. It's possible there's some grammar cleverness that could avoid the ambiguity, but it's a simple rewriter pass

Then, since the desired behavior (sort of like chaining) is to be able to either align or indent the leading-logical lines (ie both of these should be ok):

a
and b

a
  and b

, instead of making the leading-logical regex be a LINE_CONTINUER, I made it a new INDENT_SUPPRESSOR - basically then a TERMINATOR will be generated rather than an INDENT even if the line is indented

This has proved to work well for the cases I've come up with (added a bunch of tests involving things like if blocks, implicit objects/calls, functions with inline/indented bodies), everything seems to be behaving in accordance with what the indentation implies. It does close open implicit calls/objects in an intuitive way with respect to indentation by virtue of the rewriter pass coming after addImplicitBracesAndParens() (so the TERMINATOR triggers closing them before being rewritten away)

I think it makes sense to allow any binary operator, when placed at the beginning or end of a line, to parse as a continuation of the preceding or following line

@jashkenas it'd be great to be able to lead lines with any binary operator as the same readability benefits apply (you can just scan the left side of the expression to get a sense of its "shape"). But I got scared off thinking about potential tricky cases eg continuing / vs regex, continuing +/- vs unary , as well as knowing that LINE_CONTINUER is kind of a hack. But maybe if this grammar-based approach proves sound we could allow it for some unambiguous binary operators as well. The only other thing I'd worry about is that by allowing a lot of leading-continuer "symbols" we could potentially be blocking the way for some future language constructs that would want to use those symbols as leading tokens

@Inve1951 with this branch you couldn't write your example exactly as written - for one thing you couldn't indent the clause after "standalone" and/or (which I don't hate the look of but it kind of reads as a "block"). Also are there implied parens around the inner or because it's "nested" inside the and? If so, again I don't hate how it looks but that's a rather complex and specialized indentation style/rule that'd have to be supported

GeoffreyBooth · 2018-05-20T00:49:13Z

src/rewriter.coffee

@@ -661,6 +662,18 @@ exports.Rewriter = class Rewriter
      @detectEnd i + 1, condition, action
      return 1

+  # Convert TERMINATOR followed by && or || into a single LEADING_&& or
+  # LEADING_|| token to disambiguate grammar.


Why not LEADING_AND and LEADING_OR? We prefer the English word names as a general rule in the codebase. Unless you need separate tokens for LEADING_AND and LEADING_&&?

Speaking of which, this should work for and as well as &&.

@GeoffreyBooth sure, I used LEADING_&& to mimic the existing && token type, but agreed it reads badly, updated

It works equivalently for and and &&, as by the time the rewriter/grammar sees them they're the same token type (and both variants are included in INDENT_SUPPRESSOR). A couple of the tests use &&/||

GeoffreyBooth · 2018-05-20T00:51:03Z

src/lexer.coffee

@@ -1294,6 +1302,7 @@ POSSIBLY_DIVISION   = /// ^ /=?\s ///
 HERECOMMENT_ILLEGAL = /\*\//

 LINE_CONTINUER      = /// ^ \s* (?: , | \??\.(?![.\d]) | \??:: ) ///
+INDENT_SUPPRESSOR   = /// ^ \s* (?: and\s+(?!:)\S | or\s+(?!:)\S | && | \|\| ) ///


The code could use a nice detailed comment explaining these two variables. Not the regexes themselves necessarily, though that wouldn’t hurt, but what a line continuer is and what INDENT_SUPPRESSOR is for (in terms of when and why you would suppress indents, of course).

@GeoffreyBooth sure, added comments for LINE_CONTINUER and INDENT_SUPPRESSOR - the comments I added explain the token mechanics, but not sure if you think there should also be eg examples of what syntax these regexes allow?

Inve1951 · 2018-05-20T01:14:13Z

@Inve1951 with this branch you couldn't write your example exactly as written - for one thing you couldn't indent the clause after "standalone" and/or (which I don't hate the look of but it kind of reads as a "block"). Also are there implied parens around the inner or because it's "nested" inside the and? If so, again I don't hate how it looks but that's a rather complex and specialized indentation style/rule that'd have to be supported

@helixbass correct. The indented block after standalone and implies parens in my example. I think that's a very straight forward and readable way to write it. I'm aware that in order to support this there's probably a lot of work involved. I brought it up right away to avoid a breaking change when considering this later only.

Speaking of which, that would already be a breaking change. 😕

helixbass

@GeoffreyBooth updated per your comments

helixbass · 2018-05-20T15:24:02Z

src/lexer.coffee

@@ -1294,6 +1302,7 @@ POSSIBLY_DIVISION   = /// ^ /=?\s ///
 HERECOMMENT_ILLEGAL = /\*\//

 LINE_CONTINUER      = /// ^ \s* (?: , | \??\.(?![.\d]) | \??:: ) ///
+INDENT_SUPPRESSOR   = /// ^ \s* (?: and\s+(?!:)\S | or\s+(?!:)\S | && | \|\| ) ///


@GeoffreyBooth sure, added comments for LINE_CONTINUER and INDENT_SUPPRESSOR - the comments I added explain the token mechanics, but not sure if you think there should also be eg examples of what syntax these regexes allow?

helixbass · 2018-05-20T15:26:15Z

src/rewriter.coffee

@@ -661,6 +662,18 @@ exports.Rewriter = class Rewriter
      @detectEnd i + 1, condition, action
      return 1

+  # Convert TERMINATOR followed by && or || into a single LEADING_&& or
+  # LEADING_|| token to disambiguate grammar.


@GeoffreyBooth sure, I used LEADING_&& to mimic the existing && token type, but agreed it reads badly, updated

It works equivalently for and and &&, as by the time the rewriter/grammar sees them they're the same token type (and both variants are included in INDENT_SUPPRESSOR). A couple of the tests use &&/||

helixbass · 2018-05-20T15:39:26Z

@Inve1951 here's the code you linked to:

bTest = 
  abc and
    d or g

To be clear, this PR isn't a "breaking change" with regards to this example nor should it be a breaking change at all

If you're saying that your proposed "indented or clause implies parens" syntax is breaking then ya I'd say this is a good example of why that syntax, while perhaps convenient, would be at the very least confusing with regards to existing syntax. I don't think there's a great way around the fact that someone can indent in a misleading way like in the above example (which implies abc and (d or g) but in fact is equivalent to (abc and d) or g). Although I'll say that using Prettier helps point it out by both explicitly parenthesizing mixed and/or expressions for you and for longer expressions using indentation to help show the shape of the parsed expression. Eg the Prettier plugin I'm working on formats your example like:

bTest = (abc and d) or g

helixbass · 2018-05-21T12:18:33Z

src/rewriter.coffee

@@ -649,7 +650,8 @@ exports.Rewriter = class Rewriter
    condition = (token, i) ->
      [tag] = token
      [prevTag] = @tokens[i - 1]
-      tag is 'TERMINATOR' or (tag is 'INDENT' and prevTag not in SINGLE_LINERS)
+      [nextTag] = @tokens[i + 1] unless i is @tokens.length - 1
+      tag is 'TERMINATOR' and nextTag not in LEADING_LOGICAL or (tag is 'INDENT' and prevTag not in SINGLE_LINERS)


@GeoffreyBooth made a small update to support leading logical operators inside eg an if condition (added corresponding tests). For example:

if someCondition and someOtherCondition or somethingElse doSomething()

Basically what was happening was since tagPostfixConditionals() runs before tagLeadingLogical(), the IF was seeing a following TERMINATOR (which hadn't yet been converted into eg LEADING_AND) and deciding that it was a POST_IF

Can't just move tagLeadingLogical() to before tagPostfixConditionals(), since there's a dependency: tagLeadingLogical() should run after addImplicitBracesAndParens(), which should run after tagPostfixConditionals()

So basically detect what will become an eg LEADING_AND here to avoid tagging as a postfix conditional

carlsmith · 2018-05-21T13:02:59Z

This is nice:

if someCondition
    and someOtherCondition
    or somethingElse
  doSomething()

But the indentation is confusing. I would have thought the block should be more indented than the predicate, like:

if someCondition
    and someOtherCondition
    or somethingElse
        doSomething()

Does the block need to be less indented?

The named and and or operators complement if, making the whole thing more readable. Having other operators, like

if someValue
    + foo
    * bar
        doSomething()

Doesn't have the same appeal. There's a case for limiting this to named logical operators, and (a bit weaker case for) only when they're part of a predicate.

helixbass · 2018-05-21T14:31:29Z

@carlsmith nope the indentation of the continuing lines is up to you (as long as they’re not outdented), eg all of these work:

if someCondition
and someOtherCondition
or somethingElse
  doSomething()

if someCondition
  and someOtherCondition
  or somethingElse
    doSomething()

if someCondition
  and someOtherCondition
  or somethingElse
  doSomething()

if someCondition
    and someOtherCondition
      or somethingElse
  doSomething()

GeoffreyBooth · 2018-05-21T17:20:21Z

@helixbass The third example shouldn’t work. This is invalid today:

if someCondition and
  someOtherCondition or
  somethingElse
  doSomething()

helixbass · 2018-05-21T18:29:18Z

@GeoffreyBooth ya I noticed that behavior with respect to chaining. While obviously it's a poor choice to "indent" the continuations the same as the following indented block, it also feels inconsistent that every other indentation is allowed:

if a
.b
  c

if a
 .b
  c

if a
    .b
  c

but just that one (where the indentation "happens to match up") fails:

if a
  .b
  c

I also wasn't sure exactly what was causing the current behavior (for chaining or your example), I'm guessing it's that LINE_CONTINUER does this when it sees what would be an indent:

@indebt = size - @indent unless backslash

which I haven't fully thought through but seemed questionable

So if it's desirable to preserve this behavior (of failing if you try to "indent" continuation lines the same as the following actual indent) then I'll look more closely at how LINE_CONTINUER currently achieves that and think about whether there's a better way to implement it for INDENT_SUPPRESSOR

GeoffreyBooth · 2018-05-21T18:33:20Z

I’m not sure what’s desirable. I think the whole idea of indentation being insignificant strikes me as wrong, that we’re going against a core tenet of the language.

helixbass · 2018-05-21T19:08:04Z

@GeoffreyBooth I understand the unease. I do think that for both chaining and this, it makes sense to be flexible as far as how continuation lines are "indented", ie it wouldn't be a great idea syntax-wise to enforce a rigid "you have to align continuation lines with the initial line" or "you have to indent continuation lines consistently with respect to the initial line and a possible following indented block". So I think the mechanism here (INDENT_SUPPRESSOR) is at least an improvement as far as reducing the amount of non-grammatical magic/special-casing

I believe Python works this way, where once you're continuing a line, indentation is not as strict

I think the important questions then are whether there are strange interactions around these features that could be handled better. On a separate branch, I've been trying applying INDENT_SUPPRESSOR to chaining (so basically getting rid of LINE_CONTINUER) and it did basically work (as far as passing existing tests) and simplified code. But it seems clear the biggest edge case is around implicit objects eg

a: b
    .c()

If you see eg #4533, this has already been kind of an issue. But it's still wildly inconsistent, it only seems to apply the chain to the preceding object-property value if it's a non-initial property of a multi-property object, eg the above is currently equivalent to {a: b}.c(). While this:

a: b
c: d
    .e()

is equivalent to {a: b, c: d.e()}. And this:

a: b
    .c()
d: e

is equivalent to {a: b}.c(); {d: e}

So in thinking about how to handle the equivalent case for leading logical operators (which we haven't defined the behavior of yet) and at least in theory how it should work for chaining I'd propose:

the better choice between whether the chain/logical operator applies to the whole object or the preceding value is the preceding value, as use cases for chaining or logical operator following an object seem quite small (logical operators don't make sense, objects are consistently truthy)
it should be consistent, so all three of the above variants (or their corresponding examples with logical operator continuers rather than chains) should compile into a single object where one of the values was continued
I should implement that behavior for leading logical operators as part of this branch
then we should consider whether it'd be worth introducing some breakage at some point to at least make chaining consistent with regards to implicit objects and perhaps migrate chaining to also use INDENT_SUPPRESSOR

vendethiel · 2018-05-23T09:55:11Z

Detecting such things with regexes has a downside, if very minor:

$ ./bin/coffee -bcs
a
  and ::b

[stdin]:2:1: error: unexpected indentation
  and ::b
^^

Also while supporting more operators here would be – imho – a good thing, although there's no way to support operators that are both unary and binary.

helixbass · 2018-05-23T12:37:04Z

@vendethiel ah interesting edge case, ya that wouldn't have parsed anyway but it is a slightly unexpected error message. Updated the regex and added a test case

To explain more fully for those interested:

Initially the INDENT_SUPPRESSOR regex just looked for leading and/or followed by whitespace (or &&/||). Then our lexer.coffee happened to contain a case that I hadn't thought of where that is actually a legitimate beginning of a line: and/or as an object key with whitespace before the :, eg and : 1. So extended the regex to check that the first non-whitespace character after and/or isn't :

But now @vendethiel pointed out an edge case to that edge case: a : may not be a : token, it might be part of :: (even though you'd always get parse errors for and and :: as consecutive tokens). So extended the regex to verify that it's a single : (not followed by another :), so that his example, even though still not valid code, will parse as a continuation and fail on unexpected :: rather than unexpected indentation

carlsmith · 2018-05-27T20:59:03Z

Thanks for the clarification.

helixbass · 2018-07-07T15:00:00Z

@GeoffreyBooth implemented leading-logical-operator-continues-object-property-value behavior per previous comment and merged ast, so this should be ready for review

GeoffreyBooth · 2018-07-07T18:26:33Z

This isn't really required for generating an AST, is it? It's just for making the Prettier output a little prettier?

If so, this feels like something we should add in a second phase, once we have regular AST output working. This change could have other side effects that we should study before merging it in.

voxsoftware · 2019-03-23T19:01:44Z

Hi, for all, thanks for effort in make better coffeescript but I really thinks that this proposal is not good or necessary. I prefer use explicit \

shouldOmitBracesButNotIndent =
  (parent.type is 'ExpressionStatement' \
    and grandparent.type is 'ClassBody') \
  or (parent.type is 'ArrayExpression' and parent.elements.length is 1) \
  or isObjectPropertyValue path, {stackOffset, last: yes}

Inve1951 · 2019-03-28T18:09:29Z

Of course it's not necessary but neither are any of the other implicit line continuations; They are nice to have tho.
When a line starts with a logical operator it can really only mean one thing so why not support it.

I really thinks that this proposal is not good or necessary.

Why do you dislike it? I personally crave leading logicals and hope they get added into master. @voxsoftware
Note that your code exampe would still be valid.

voxsoftware · 2019-03-28T19:49:59Z

@Inve1951 yes, my example is using explicit, I prefer that syntax and avoid errors can inherited from have a non explicit version

helixbass · 2019-12-30T20:20:08Z

Re-opened as #5279

helixbass force-pushed the allow-leading-logical-operators branch from fe0298c to 2a7d1fe Compare May 19, 2018 23:52

GeoffreyBooth requested changes May 20, 2018

View reviewed changes

helixbass commented May 20, 2018

View reviewed changes

helixbass mentioned this pull request May 21, 2018

Chained do IIFE: use grammar, not rewriter #5070

Merged

helixbass commented May 21, 2018

View reviewed changes

GeoffreyBooth changed the base branch from master to ast June 11, 2018 04:45

helixbass added 6 commits March 17, 2019 16:01

allow non-indented leading logical

c85e6c0

suppress indents

2b7725f

fixes from code review

36dd387

allow leading logical inside if condition

4cf6652

fix from code review

0f3f5ac

continue object property values

99a5b81

helixbass force-pushed the allow-leading-logical-operators branch from 57c967e to 99a5b81 Compare March 17, 2019 20:10

fix rebase

11ca17f

GeoffreyBooth closed this Dec 30, 2019

helixbass mentioned this pull request Dec 30, 2019

Allow leading logical operators #5279

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Allow leading logical operators #5068

Allow leading logical operators #5068

helixbass commented May 16, 2018 •

edited by GeoffreyBooth

GeoffreyBooth commented May 16, 2018

Inve1951 commented May 17, 2018

jashkenas commented May 17, 2018

GeoffreyBooth commented May 17, 2018

helixbass commented May 20, 2018

GeoffreyBooth May 20, 2018

helixbass May 20, 2018

GeoffreyBooth May 20, 2018

helixbass May 20, 2018

Inve1951 commented May 20, 2018

helixbass left a comment

helixbass May 20, 2018

helixbass May 20, 2018

helixbass commented May 20, 2018

helixbass May 21, 2018

carlsmith commented May 21, 2018 •

edited

helixbass commented May 21, 2018 •

edited by GeoffreyBooth

GeoffreyBooth commented May 21, 2018

helixbass commented May 21, 2018

GeoffreyBooth commented May 21, 2018

helixbass commented May 21, 2018 •

edited

vendethiel commented May 23, 2018 •

edited

helixbass commented May 23, 2018

carlsmith commented May 27, 2018

helixbass commented Jul 7, 2018

GeoffreyBooth commented Jul 7, 2018

voxsoftware commented Mar 23, 2019

Inve1951 commented Mar 28, 2019

voxsoftware commented Mar 28, 2019

helixbass commented Dec 30, 2019

Allow leading logical operators #5068

Allow leading logical operators #5068

Conversation

helixbass commented May 16, 2018 • edited by GeoffreyBooth

GeoffreyBooth commented May 16, 2018

Inve1951 commented May 17, 2018

jashkenas commented May 17, 2018

GeoffreyBooth commented May 17, 2018

helixbass commented May 20, 2018

GeoffreyBooth May 20, 2018

Choose a reason for hiding this comment

helixbass May 20, 2018

Choose a reason for hiding this comment

GeoffreyBooth May 20, 2018

Choose a reason for hiding this comment

helixbass May 20, 2018

Choose a reason for hiding this comment

Inve1951 commented May 20, 2018

helixbass left a comment

Choose a reason for hiding this comment

helixbass May 20, 2018

Choose a reason for hiding this comment

helixbass May 20, 2018

Choose a reason for hiding this comment

helixbass commented May 20, 2018

helixbass May 21, 2018

Choose a reason for hiding this comment

carlsmith commented May 21, 2018 • edited

helixbass commented May 21, 2018 • edited by GeoffreyBooth

GeoffreyBooth commented May 21, 2018

helixbass commented May 21, 2018

GeoffreyBooth commented May 21, 2018

helixbass commented May 21, 2018 • edited

vendethiel commented May 23, 2018 • edited

helixbass commented May 23, 2018

carlsmith commented May 27, 2018

helixbass commented Jul 7, 2018

GeoffreyBooth commented Jul 7, 2018

voxsoftware commented Mar 23, 2019

Inve1951 commented Mar 28, 2019

voxsoftware commented Mar 28, 2019

helixbass commented Dec 30, 2019

helixbass commented May 16, 2018 •

edited by GeoffreyBooth

carlsmith commented May 21, 2018 •

edited

helixbass commented May 21, 2018 •

edited by GeoffreyBooth

helixbass commented May 21, 2018 •

edited

vendethiel commented May 23, 2018 •

edited