Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug: Incorrect code path analysis for return in try blocks with catch clause #17579

Open
1 task done
fasttime opened this issue Sep 18, 2023 · 30 comments
Open
1 task done
Assignees
Labels
accepted There is consensus among the team that this change meets the criteria for inclusion breaking This change is backwards-incompatible bug ESLint is working incorrectly repro:yes

Comments

@fasttime
Copy link
Member

fasttime commented Sep 18, 2023

Environment

Node version: v20.6.1
npm version: v9.8.1
Local ESLint version: Not found
Global ESLint version: v8.49.0
Operating System: darwin 22.5.0

What parser are you using?

Default (Espree)

What did you do?

// test.js
function foo() {
    try {
        bar();
        return;
    } catch {}
    baz();
}

What did you expect to happen?

From #16593 (comment):

code path analysis considers segments containing returns from the try block as possible previous segments for segments in the catch block. This is logically incorrect, as the execution after return; cannot go into the catch block, but fixing this in the code path analysis would be a breaking change.

I can reproduce this in current versions of ESLint for the above code. For example, when I run this command in a zsh terminal:

ESLINT_USE_FLAT_CONFIG=true DEBUG=eslint:code-path npx eslint --no-config-lookup test.js

I'm getting this Graphviz code in the output:

digraph {
node[shape=box,style="rounded,filled",fillcolor=white];
initial[label="",shape=circle,style=filled,fillcolor=black,width=0.25,height=0.25];
final[label="",shape=doublecircle,style=filled,fillcolor=black,width=0.25,height=0.25];
s2_1[label="FunctionDeclaration:enter\nIdentifier (foo)\nBlockStatement:enter\nTryStatement:enter\nBlockStatement:enter\nExpressionStatement:enter\nCallExpression:enter\nIdentifier (bar)"];
s2_2[label="CallExpression:exit\nExpressionStatement:exit\nReturnStatement"];
s2_3[style="rounded,dashed,filled",fillcolor="#FF9800",label="<<unreachable>>\nBlockStatement:exit"];
s2_4[label="CatchClause:enter\nBlockStatement\nCatchClause:exit"];
s2_5[label="TryStatement:exit\nExpressionStatement:enter\nCallExpression:enter\nIdentifier (baz)\nCallExpression:exit\nExpressionStatement:exit\nBlockStatement:exit\nFunctionDeclaration:exit"];
initial->s2_1->s2_2->s2_3->s2_4->s2_5;
s2_1->s2_4;
s2_3->s2_5;
s2_2->final;
s2_5->final;
}

See the diagram

Notice the edge s2_3->s2_4 connecting the unreachable segment to the catch clause which seems wrong.

What actually happened?

There should be no parent-child connection from the unreachable segment after the return statement to the segment with the catch clause.

stateDiagram-v2
classDef common fill: white, stroke: black
class s2_1, s2_2, s2_4, s2_3, s2_5 common
classDef unreachable fill: #FF9800, stroke-dasharray: 5 5
class s2_3 unreachable
state "FunctionDeclaration:enter\nIdentifier (foo)\nBlockStatement:enter\nTryStatement:enter\nBlockStatement:enter\nExpressionStatement:enter\nCallExpression:enter\nIdentifier (bar)" as s2_1
state "CallExpression:exit\nExpressionStatement:exit\nReturnStatement" as s2_2
state "CatchClause:enter\nBlockStatement\nCatchClause:exit" as s2_4
state "❮❮unreachable❯❯\nBlockStatement:exit" as s2_3
state "TryStatement:exit\nExpressionStatement:enter\nCallExpression:enter\nIdentifier (baz)\nCallExpression:exit\nExpressionStatement:exit\nBlockStatement:exit\nFunctionDeclaration:exit" as s2_5
[*] --> s2_1
s2_1 --> s2_2
s2_1 --> s2_4
s2_2 --> s2_3
s2_4 --> s2_5
s2_3 --> s2_4: WRONG!
s2_3 --> s2_5
s2_2 --> [*]
s2_5 --> [*]

Link to Minimal Reproducible Example

https://stackblitz.com/edit/stackblitz-starters-3zwhfs

Participation

  • I am willing to submit a pull request for this issue.

Additional comments

I haven't figured out how to fix this yet, so I can't say if the fix would have any side effects.

This bug can be worked around (we are doing this for example in the no-useless-return rule), so it's an open question whether this should be tackled in v9.

@fasttime fasttime added bug ESLint is working incorrectly repro:needed labels Sep 18, 2023
@fasttime fasttime changed the title Bug: Incorrect code analysis for return in try blocks with catch clause Bug: Incorrect code path analysis for return in try blocks with catch clause Sep 18, 2023
@fasttime fasttime added breaking This change is backwards-incompatible repro:yes and removed repro:needed labels Sep 18, 2023
@BYK
Copy link
Member

BYK commented Sep 19, 2023

I think #16593 (comment) is a clearer demonstration of the issue.

@nzakas nzakas self-assigned this Sep 20, 2023
@nzakas nzakas added the accepted There is consensus among the team that this change meets the criteria for inclusion label Sep 20, 2023
@nzakas
Copy link
Member

nzakas commented Sep 20, 2023

I think this is something we should fix. Whether or not it's practical to do in the v9 timeframe is the question.

I've just spent a bunch of time digging into code path analysis so I'll take a look.

@nzakas
Copy link
Member

nzakas commented Oct 2, 2023

I took an initial look at this. At this point, it seems like the problem is in this method:

makeCatchBlock() {
const context = this.tryContext;
const forkContext = this.forkContext;
const originalThrownForkContext = context.thrownForkContext;
/*
* We are now in a catch block so we need to update the context
* with that information. This includes creating a new fork
* context in case we encounter any `throw` statements here.
*/
context.position = "catch";
context.thrownForkContext = ForkContext.newEmpty(forkContext);
context.lastOfTryIsReachable = forkContext.reachable;
// Merge the thrown paths from the `try` and `catch` blocks
originalThrownForkContext.add(forkContext.head);
const thrownSegments = originalThrownForkContext.makeNext(0, -1);
// Fork to a bypass and the merged thrown path.
this.pushForkContext();
this.forkBypassPath();
this.forkContext.add(thrownSegments);

When we get here, forkContext.reachable is false, but we don't know that this is because of the return, which actually doesn't have any unreachable segments after it (alas, return statements always add unreachable segments).

Question: Should the unreachable block even be there at all?

(Just leaving these as notes for myself, or in case anyone else wants to take a look.)

@nzakas
Copy link
Member

nzakas commented Oct 3, 2023

Does this look right to everyone? I've hacked together some stuff that is definitely not ready for production. Just trying to see how close I can get.

graphviz(11)

@fasttime
Copy link
Member Author

fasttime commented Oct 4, 2023

Does this look right to everyone? I've hacked together some stuff that is definitely not ready for production. Just trying to see how close I can get.

graphviz(11)

I'm clearly not the one who's supposed to know better, so sorry if my comments sound naive. Just unclear about a couple of things in this graph.

  1. The edge CallExpression:exit -> TryStatement:exit is not obvious to me. After the return statement, baz ist not called, so I wouldn't expect a connection there.
  2. I don't fully grasp the meaning of an edge from a reachable segment to an unreachable one, but I guess the edge CallExpression:exit -> <<unreachable>> is correct at its place (maybe because a return statement must precede the unreachable segment it always creates)?
  3. An edge from an unreachable to a reachable segment is more understandable to me, so I would expect an arrow like <<unreachable>> -> TryStatement:exit, but there is none in the above graph.

@nzakas
Copy link
Member

nzakas commented Oct 4, 2023

  1. There is no link from CallExpression:exit to TryStatement:exit. Inside of each block, it's the last line that indicates what is connecting to the next segment. So in that block, we have CallExpression:exit,ExpressionStatement:exit,ReturnStatement it's ReturnStatement that links to the unreachable segment.
  2. Again, this is actually ReturnStatement -> <<unreachable>>.
  3. Hmmmm, I'm not sure what makes more sense here. To me, it makes sense that an unreachable segment doesn't connect to anywhere else, but I can see your point as well. @mdjermanovic care to chime in?

@nzakas
Copy link
Member

nzakas commented Oct 9, 2023

@mdjermanovic looking for your feedback here.

@mdjermanovic
Copy link
Member

For reference, here's the original issue: #7481

@mdjermanovic
Copy link
Member

3. To me, it makes sense that an unreachable segment doesn't connect to anywhere else, but I can see your point as well.

Unreachable segments continuing to other segments is common in our code path analysis and seems logical to me.

For example, in if (a) { return; } b;, unreachable end of the if block continues to the code after if:

image

@mdjermanovic
Copy link
Member

Does this look right to everyone? I've hacked together some stuff that is definitely not ready for production. Just trying to see how close I can get.

graphviz(11)

The arrows from ReturnStatement to CatchClause:enter and TryStatement:exit seem wrong to me as the execution after return; can't go into either. I'd expect this graph:

image

@mdjermanovic
Copy link
Member

Code path analysis currently creates a connection from the first throwable node (references are considered to be throwable) to the catch clause, and from the end of the try block (BlockStatement:exit) to the catch clause.

For example:

try {
    a;
    b;
    c;
} catch {}

after;

image

Now, connections from BlockStatement:exit of the try block to the catch clause always seem logically wrong, because the execution from that point (whether it's reachable or not) never falls into the catch clause. It either goes to finally, if there's finally, or exits the whole try statement and continues to the code that comes after.

However, if we just remove it:

image

then there will be no paths from other throwable nodes (b and c) to the catch clause.

A solution for logically correct code paths could be to make a connection from each throwable node to the catch clause, as suggested in #7481 (comment), but that could create huge graphs and thus is a performance concern as noted in #7481.

Another solution, that doesn't create that many segments, could be to somehow split segments into throwable and non-throwable parts, but that seems very complicated. The graph could look like this:

image

@nzakas
Copy link
Member

nzakas commented Oct 11, 2023

The arrows from ReturnStatement to CatchClause:enter and TryStatement:exit seem wrong to me as the execution after return; can't go into either.

I can see your point about CatchClause:enter but I'm not sure I agree about TryStatement:exit. Logically, the return causes you to exit the try statement altogether, so an arrow from ReturnStatement to TryStatement:exit makes sense to me.

Also, keep in mind that these are not 100% relevant to the code path and are just output by the debugging to show where in the traversal we are. It doesn't necessarily mean the segment represents exiting a try statement, just that that's where the traversal is when that segment is created.

And your second comment seems like a tangential issue with how try-catch is handled currently? If you agree, I'd suggest opening a separate issue, as this discussion is already difficult enough to follow.

@mdjermanovic
Copy link
Member

I can see your point about CatchClause:enter but I'm not sure I agree about TryStatement:exit. Logically, the return causes you to exit the try statement altogether, so an arrow from ReturnStatement to TryStatement:exit makes sense to me.

Also, keep in mind that these are not 100% relevant to the code path and are just output by the debugging to show where in the traversal we are. It doesn't necessarily mean the segment represents exiting a try statement, just that that's where the traversal is when that segment is created.

Segment with TryStatement:exit includes code after the try statement. Per the graph in #17579 (comment), there's an execution path that goes through return; and baz(), which shouldn't be the case.

And your second comment seems like a tangential issue with how try-catch is handled currently? If you agree, I'd suggest opening a separate issue, as this discussion is already difficult enough to follow.

I think it's the same problem. Because code path analysis always generates a connection from the end of the try block to the catch block, the graph in #17579 (comment) suggests that this return; can prevent the execution from falling into the catch block, which isn't true. This was the problem in #7481.

@nzakas
Copy link
Member

nzakas commented Oct 12, 2023

Okay, I'm having trouble keeping all of that in my brain at the moment. Would you be interested in tackling this issue?

@nzakas
Copy link
Member

nzakas commented Jan 16, 2024

Folks, I'm way too burned out at the moment to even attempt to look at this. @mdjermanovic or @fasttime, do you want to take a look?

Otherwise, let's just move this out of v9.

@nzakas nzakas removed their assignment Jan 16, 2024
@fasttime
Copy link
Member Author

I can have a look, but no problem to move this out of v9.

@nzakas
Copy link
Member

nzakas commented Jan 17, 2024

Thanks @fasttime. I'd love to get this done, but also realize that code path analysis is hard and I don't think it should block v9.

@nzakas
Copy link
Member

nzakas commented Feb 5, 2024

@fasttime are you still working on this?

@fasttime
Copy link
Member Author

fasttime commented Feb 6, 2024

@fasttime are you still working on this?

Sorry, I've been busy doing other work. I can do some analysis on this on the weekend.

@nzakas
Copy link
Member

nzakas commented Feb 6, 2024

Okay, no problem. This is one of the last two issues we had planned for the beta, so if you don't think you'll be able to address this soonish, we can move it to v10.

@sam3k
Copy link
Contributor

sam3k commented Feb 9, 2024

Per 2024-02-08 tsc-meeting, this issue won't be blocking today's beta.0 release and we will target beta.1 release.

Corrected release *

@nzakas
Copy link
Member

nzakas commented Feb 9, 2024

That should be the beta.1 release.

@fasttime
Copy link
Member Author

Just wanted to give an update on what I've found out so far. It seems to me that all unreachable segments produced by a statement that exits a try block are getting an inconsistent connection to the catch clause.

For example, I believe this is another case where code path analysis is producing an incorrect connection:

foo:
for (;;) {
    try {
        bar();
        break foo;
    }
    catch {
        baz();
    }
}

graphviz

In this example, the break statement generates an unreachable segment, similarly to any return statement. All statements that stop the regular top-down control flow produce unreachable segments. If my analysis is correct, the incoming connection to an unreachable segment indicates what would happen if the generating statement (break, return, continue, or an infinite loop) did not change the control flow at that point. The target segment being unreachable is an indication by itself that the connection is actually impossible: in fact, the whole point of an unreachable segment is to indicate an impossible connection from its parent segment to its child segments (it's debatable whether this is the best model to describe the flow of a program, but that's how it's done).

Following this model, in this example there should be no connection from "<<unreachable>> BreakStatement:exit" to "CatchClause:enter", because even if break foo did not cause a change in control flow, there would be way to enter the catch clause after the break statement. At least, this is the case under the assumption that a break statement by itself cannot throw.

So I think that the fix for this bug should consist in disconnecting unreachable segments from segments that enter a catch clause. The unreachable segments should probably still remain in place to indicate the impossible flow of the program from the statement in the try block to what comes after the try-catch statement.

@nzakas
Copy link
Member

nzakas commented Feb 12, 2024

So I think that the fix for this bug should consist in disconnecting unreachable segments from segments that enter a catch clause. The unreachable segments should probably still remain in place to indicate the impossible flow of the program from the statement in the try block to what comes after the try-catch statement.

This seems correct to me. 👍 @mdjermanovic what do you think?

@fasttime
Copy link
Member Author

This is another example that highlights the problem even more:

try { }
catch {
    foo();
}

Here, the code inside the catch clause will never run, because the try block never throws, right? Code path analysis doesn't care and the result is a straight connection from the empty try block into the catch block:

graphviz(1)

It seems unlikely that an empty block can ever throw an exception, but it's a possibility that cannot be completely ruled out. For example, an exception could occur while halting on a breakpoint, or the runtime could throw an exception to report an unexpected error like a stack overflow or out-of-memory condition. At least in theory, it's not impossible for an exception to pop out of nowhere while running code that does nothing. Technically this can happen and it could be entirely unpredictable.

But if we accept that an empty block can throw, then so can an empty return or a break statement, and in this case, the code path analysis would be correct in connecting the unreachable segments of these statements to the catch clause, and this issue should be closed.

I think that for almost all practical cases it's safe to assume that certain nodes never throw, and it would be good to teach code path analysis to recognize those nodes and not to branch into the catch clause. This should fix the incorrect connection in try-catch statements for unreachable segments that just exit some nodes, and also for segments whose nodes never throw, like an empty block.

@mdjermanovic
Copy link
Member

mdjermanovic commented Feb 15, 2024

There is logic that determines whether a node can throw, but it's used only to make the first throwable path:

case "Identifier":
if (isIdentifierReference(node)) {
state.makeFirstThrowablePathInTryBlock();
dontForward = true;
}
break;
case "CallExpression":
case "ImportExpression":
case "MemberExpression":
case "NewExpression":
case "YieldExpression":
state.makeFirstThrowablePathInTryBlock();
break;

The last path is always made from the end of the try block. Additional paths are made only for explicit throw statements. This was intentional (and documented) for performance reasons (#7481 (comment)).

A logically correct solution would probably be to create a path for each throwable node. And never from BlockStatement:exit of the try block regardless of whether it's reachable or not. That would solve problems like this one:

/* eslint no-useless-assignment: "error" */

function foo() {
    // does nothing
}

function bar() {
    throw new Error("Boom!");
}

let x;

try {
    foo();
    x = 1; // false positive, this value will be used in `catch {}` if `bar()` throws.
    bar();
    x = 2; // false negative, this one is actually never used.
} catch {
    console.log(x); // 1
}

@fasttime
Copy link
Member Author

A logically correct solution would probably be to create a path for each throwable node. And never from BlockStatement:exit of the try block regardless of whether it's reachable or not.

Do you mean to split the try block into as many segments as there are throwable nodes and (contiguous) non-throwable nodes, and create paths only from the throwable segments into the catch block?

I guess that should work, although it would make the code graph much larger in some cases. Also, if there is no surrounding try-block, shouldn't the throwable segments have paths to the end of the function or program for consistency?

/* eslint no-useless-assignment: "error" */

function foo() {
    // does nothing
}

function bar() {
    throw new Error("Boom!");
}

let x;

try {
    foo();
    x = 1; // false positive, this value will be used in `catch {}` if `bar()` throws.
    bar();
    x = 2; // false negative, this one is actually never used.
} catch {
    console.log(x); // 1
}

This looks a different bug than the one in the original post, but I agree that it would be good to fix this case as well while we rework the code path analysis.

@mdjermanovic
Copy link
Member

Do you mean to split the try block into as many segments as there are throwable nodes and (contiguous) non-throwable nodes, and create paths only from the throwable segments into the catch block?

Yes, I think that would be a logically correct code path analysis.

I guess that should work, although it would make the code graph much larger in some cases.

Performance is a concern, indeed. That seems to be the reason why code path analysis in try-catch wasn't implemented this way from the start (#7481 (comment)).

Also, if there is no surrounding try-block, shouldn't the throwable segments have paths to the end of the function or program for consistency?

That would be correct for consistency. However, it would be an even bigger performance concern, while not being of much use to the rules, I think, so we could leave code path analysis outside try-catch as is.

/* eslint no-useless-assignment: "error" */

function foo() {
    // does nothing
}

function bar() {
    throw new Error("Boom!");
}

let x;

try {
    foo();
    x = 1; // false positive, this value will be used in `catch {}` if `bar()` throws.
    bar();
    x = 2; // false negative, this one is actually never used.
} catch {
    console.log(x); // 1
}

This looks a different bug than the one in the original post, but I agree that it would be good to fix this case as well while we rework the code path analysis.

I think the x = 2; false negative is the same problem - there should be no connection from BlockStatement:exit to CatchClause:enter.

@sam3k
Copy link
Contributor

sam3k commented Mar 12, 2024

Per 2024-03-07 TSC meeting, we will follow up on the status of this issue in a few weeks. If we are still not close to done by then, we will contemplate moving to v10.

@fasttime
Copy link
Member Author

This issue has been moved to v10 as agreed in the 2024-03-21 TSC meeting.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
accepted There is consensus among the team that this change meets the criteria for inclusion breaking This change is backwards-incompatible bug ESLint is working incorrectly repro:yes
Projects
Status: Evaluating
Development

No branches or pull requests

5 participants