Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

.load infinite loop in REPL #46731

Closed
harrisi opened this issue Feb 19, 2023 · 6 comments · Fixed by #46742
Closed

.load infinite loop in REPL #46731

harrisi opened this issue Feb 19, 2023 · 6 comments · Fixed by #46742
Labels
confirmed-bug Issues with confirmed bugs. repl Issues and PRs related to the REPL subsystem.

Comments

@harrisi
Copy link
Contributor

harrisi commented Feb 19, 2023

Version

v19.6.0

Platform

Darwin 22.3.0 Darwin Kernel Version 22.3.0: Mon Jan 30 20:38:43 PST 2023; root:xnu-8792.81.3~2/RELEASE_ARM64_T8112 arm64

Subsystem

repl

What steps will reproduce the bug?

$ cat foo.js
function a(b) {
  return b }
a(1)
$ node
Welcome to Node.js v19.6.0.
Type ".help" for more information.
> .load foo.js
  function a(b) {
  function a(b) {
  function a(b) {
  ...

This prints function a(b) { forever until killing the process (it doesn't respond to Ctrl-c)

How often does it reproduce? Is there a required condition?

Every time, with the correct file.

What is the expected behavior?

$ cat foo.js
function a(b) {
  return b }
a(1)
$ node
Welcome to Node.js v19.6.0.
Type ".help" for more information.
> .load foo.js
function a(b) {
  return b }
a(1)

1
>

What do you see instead?

  function a(b) {
  function a(b) {
  function a(b) {
  ...

Additional information

I'm struggling to figure out what's happening here. If the file contents don't include a new line after the first {, i.e.

function a(b) { return b }
a(1)

everything works fine. This lead me to believe it was somehow a parsing issue, but there are a number of iterations of the file that either do or do not work that don't seem to line up. The following files do work (separated by lines only containing //):

// same as above
function a(b) { return b }
a(1)

// not calling a
function a(b) {
  return b }

// using arrow function
a = b => { return b }

// random example that works fine
a = 1 + 1

console.log(a)

The following do not work:

// same as above
function a(b) {
  return b }
a(1)

// using arrow function (not calling a)
a = b => {
  return b }

// using arrow function (calling a)
a = b => {
  return b }
a(1)

I believe there were a few other weird things that worked and didn't work.

The odd thing is that reverting 6874aa1, lines 1554-1559 (6874aa1#diff-392b628f0a5eb047b1117351e6eedcefb8ec70a48933479e0e9bdc711a04f047L1554-L1559), seems to fix this. I tried a few other things, such as loading the file into a buffer before writing it out, toString()ing it after Buffer.from()ing it), changing the encoding, and changing the line endings. None of these worked.

Also, the tests run fine, even when modifying them to include the text that breaks outside of the tests.

@harrisi
Copy link
Contributor Author

harrisi commented Feb 19, 2023

The behavior is the same with v18.14.1 and v18.13.0, but works fine with v18.12.1. The only relevant change I can see between the two is from #45539, but I'm not sure why that would be causing this issue. It's also a little too in the weeds for me to spend much more time on it right now, unfortunately. Hopefully this information helps at least.

@cola119 cola119 added the repl Issues and PRs related to the REPL subsystem. label Feb 20, 2023
@Theo-Steiner
Copy link
Contributor

I did some testing and this seems to also relate to newline characters:
The infinite loop does not occur if the loaded file does not end with a newline character.

Node 18.12.0Node 20.0.0-pre
without ending newline character
> const testFile = `function a(b) {\n  return b }\na(1)`
> fs.writeFileSync('foo.js', testFile)
> .load foo.js
function a(b) {
  return b }
  a(1)
1
> const testFile = `function a(b) {\n  return b }\na(1)`
> fs.writeFileSync('foo.js', testFile)
> .load foo.js
function a(b) {
  return b }
undefined
with ending newline character
> const testFile = `function a(b) {\n  return b }\na(1)\n`
> fs.writeFileSync('foo.js', testFile)
> .load foo.js
function a(b) {
  return b }
  a(1)
1

> const testFile = `function a(b) {\n  return b }\na(1)\n`
> fs.writeFileSync('foo.js', testFile)
> .load foo.js
  function a(b) {
  function a(b) {
  function a(b) {
  ...

Note that even without the ending newline character, the output between v20.0.0-pre and 18.12.0 is different.

@targos
Copy link
Member

targos commented Feb 20, 2023

Bisected to #45614

/cc @aduh95

@harrisi
Copy link
Contributor Author

harrisi commented Feb 20, 2023

I'm confused by the new line situation. For one, you can see that without the newline it's not even reading the last line, apparently. Actually, this is interesting:

> const fs = require('fs')
> fs.writeFileSync('foo.js', `function a(b) {\n  return b }`)
> .load foo.js
function a(b) {
... 

i.e., the repl is left in editor mode. That seems like a separate issue, however, so I'd be happy to open another issue if so.

@cola119
Copy link
Member

cola119 commented Feb 20, 2023

Minimum reproduce code: \n}\n

$ node
> fs.writeFileSync('foo.js', ` \n}\n`)
> .load foo.js



@Theo-Steiner
Copy link
Contributor

Theo-Steiner commented Feb 21, 2023

I found the Problem!

Problem

The lineEnding RegEx is declared on the module scope of lib/internal/readline/interface.js.
This means that its state for calling RegExpPrototypeExec can be overwritten if called alternately on different strings.
This is exactly what happens here for certain inputs, as the indentation preservation of the editor mode calls the same write method in lib/internal/readline/interface.js as the load method from within a while loop inside load. Thus RegEx state is overwritten in a very unfortunate way, and we end up with an infinite loop.

Detailed Description

  1. the load command's action method is called with a file path ('foo.js').
  2. editorMode is turned on and the file is read.
  3. this.write (this === repl) is called with the file content (' \n}\n') while in editorMode
  4. Write is not a method on the REPLServer itself, so we go up the prototype chain to the Interface.
  5. The Interface has the write method we were looking for and it is called with d = ' \n}\n', key = undefined.
  6. Since the REPL is not paused and we are in terminal mode, we call kTtyWrite, passing along s = d, key = key.
  7. Since key is undefined we fall right through to the default case
  8. since s = ' \n}\n' we enter the top if clause and call exec on the lineEnding RegEx on our string s.
  9. lineEnding (/\r?\n|\r(?!\n)/g) matches \r\n, \n, or \r followed by something other than \n.
  10. The returned value of calling exec on this RegEx with argument s is saved into a variable named nextMatch. With the minimal reproduction string of ' \n}\n' nextMatch is assigned ['\n', index: 1, input: ' \n{\n', groups: undefined]
  11. Since nextMatch is not null, we enter the next if clause.
  12. Here, we call kInsertString with a slice of the input string. We slice from 0 to the index of nextMatch, which is 1 in our case, hence ' ' is inserted.
  13. Now we move the start of our selection by saving the lastIndex property of the lineEnding RegEx that was set during our last invocation of exec.
  14. To move the end of our selection, exec is called with s again, to get the nextMatch. The variable assignment in our case is: s = ' \n}\n', lastIndex = 2, nextMatch = ['\n', index: 3, input: ' \n{\n', groups: undefined].
  15. Since nextMatch is not null, the while loop's body is entered, and this[kLine] is called. [kLine] calls this[kOnLine], which triggers a callback in lib/repl.js with argument cmd = line. In our case we inserted ' ' into the line, so cmd = ' '.
  16. Here a feature that is meant to preserve the indentation when writing code in the editorMode, calls a RegEx on the cmd to parse preceding whitespace and prefixes it to the next line. So it calls write with an argument of ' '.
  17. We enter kTtyWrite again from here with arguments c = ' ', and key = undefined.
  18. Since key=undefined we end up in the default case, but this time the lineEnding RegEx finds no match, so we jump [into the else clause.]
    (
    this[kInsertString](s);
    )
    ** However, the problem is that by using the same lineEnding RegEx, it loses its state from the top while loop, where we already called lineEnding.exec(s) and the next time we would call it, it would return null, since we iterated through all matches **
  19. After calling [kInsertString] and adding ' ' to the current line we return to the next line of the on line callback in internal/repl.js.
  20. Here we set the line and cursorand return to the while loop that called kLine in the first place.
  21. We now call kInsertString with the next slice, from lastIndex = 2 to nextMatch.index = 3, which means we add } to the current line.
  22. lastIndex is set to the newest lastIndex property of the lineEnding RegEx. It was last called in (19), where it was set to 0 as no match was found.
  23. For the next iteration nextMatch SHOULD BE NULL, but because the lineEnding RegEx was reused in [No. 19] we get nextMatch = ['\n', index: 1, input: ' \n{\n', groups: undefined] again.
  24. Restart from [No. 15] with lastIndex: 0 and nextMatch: 1 -> infinite loop

nodejs-github-bot pushed a commit that referenced this issue Mar 1, 2023
Since the lineEnding Regular Expression is declared on the module scope,
recursive invocations of its `[kTtyWrite]` method share one instance of
this Regular Expression.
Since the state of a RegExp is managed by instance,
alternately calling RegExpPrototypeExec with the same RegExp on
different strings can lead to the state changing unexpectedly.
This is the root cause of this infinite loop bug when calling .load on
javascript files of certain shapes.

PR-URL: #46742
Fixes: #46731
Reviewed-By: Kohei Ueno <kohei.ueno119@gmail.com>
Reviewed-By: Antoine du Hamel <duhamelantoine1995@gmail.com>
targos pushed a commit that referenced this issue Mar 13, 2023
Since the lineEnding Regular Expression is declared on the module scope,
recursive invocations of its `[kTtyWrite]` method share one instance of
this Regular Expression.
Since the state of a RegExp is managed by instance,
alternately calling RegExpPrototypeExec with the same RegExp on
different strings can lead to the state changing unexpectedly.
This is the root cause of this infinite loop bug when calling .load on
javascript files of certain shapes.

PR-URL: #46742
Fixes: #46731
Reviewed-By: Kohei Ueno <kohei.ueno119@gmail.com>
Reviewed-By: Antoine du Hamel <duhamelantoine1995@gmail.com>
targos pushed a commit that referenced this issue Mar 14, 2023
Since the lineEnding Regular Expression is declared on the module scope,
recursive invocations of its `[kTtyWrite]` method share one instance of
this Regular Expression.
Since the state of a RegExp is managed by instance,
alternately calling RegExpPrototypeExec with the same RegExp on
different strings can lead to the state changing unexpectedly.
This is the root cause of this infinite loop bug when calling .load on
javascript files of certain shapes.

PR-URL: #46742
Fixes: #46731
Reviewed-By: Kohei Ueno <kohei.ueno119@gmail.com>
Reviewed-By: Antoine du Hamel <duhamelantoine1995@gmail.com>
danielleadams pushed a commit that referenced this issue Apr 11, 2023
Since the lineEnding Regular Expression is declared on the module scope,
recursive invocations of its `[kTtyWrite]` method share one instance of
this Regular Expression.
Since the state of a RegExp is managed by instance,
alternately calling RegExpPrototypeExec with the same RegExp on
different strings can lead to the state changing unexpectedly.
This is the root cause of this infinite loop bug when calling .load on
javascript files of certain shapes.

PR-URL: #46742
Fixes: #46731
Reviewed-By: Kohei Ueno <kohei.ueno119@gmail.com>
Reviewed-By: Antoine du Hamel <duhamelantoine1995@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
confirmed-bug Issues with confirmed bugs. repl Issues and PRs related to the REPL subsystem.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants