readFile in promises very slow #151

Open
xJeneKx opened this issue Mar 11, 2024 · 25 comments

@xJeneKx

xJeneKx commented Mar 11, 2024

Version

v21.7.1

Platform

Darwin evgeniis.local 23.4.0 Darwin Kernel Version 23.4.0: Wed Feb 21 21:44:43 PST 2024; root:xnu-10063.101.15~2/RELEASE_ARM64_T6000 arm64

Subsystem

No response

What steps will reproduce the bug?

Slow:

const fs = require('fs/promises');

const start = Date.now();
let count = 0;
for (let i = 0; i < 10000; i++) {
	fs.readFile("./text.txt", { encoding: 'utf-8' })
		.then((data) => {
			if (data !== "Hello, world") throw 1;
			count++
			if (count === 10000) {
				console.log('time: ', Date.now() - start);
			}
		})
		.catch((err) => {
			throw 1
		})
}

Fast:

const fs = require('fs');
const util = require('util');
const readFile = util.promisify(fs.readFile);

const start = Date.now();
let count = 0;
for (let i = 0; i < 10000; i++) {
	readFile("./text.txt", { encoding: 'utf-8' })
		.then((data) => {
			if (data !== "Hello, world") throw 1;
			count++
			if (count === 10000) {
				console.log('time: ', Date.now() - start);
			}
		})
		.catch((err) => {
			throw 1
		})
}

How often does it reproduce? Is there a required condition?

No response

What is the expected behavior? Why is that the expected behavior?

No response

What do you see instead?

The promise version is about 2x slower: my tests showed 200 ms vs 100 ms.

Additional information

No response

@AlttiRi

AlttiRi commented Mar 12, 2024

The same on Windows.

However, I needed to decrease the count from 10000 to 8189.
With 8190+ I get this error:

[Error: EMFILE: too many open files, open 'C:\test\text.txt'] {
  errno: -4066,
  code: 'EMFILE',
  syscall: 'open',
  path: 'C:\\test\\text.txt'
}
Node.js v18.18.1
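
For reference, the EMFILE can be avoided in this repro by capping how many reads are in flight at once; a minimal sketch (the limit of 1000 is an arbitrary choice, and the file contents are assumed to match the original repro):

const fs = require('fs/promises');

// Sketch: run the same 10000 reads, but keep at most `limit` of them in
// flight at once, so we stay under the per-process open-file limit.
async function run(total, limit) {
  const start = Date.now();
  let started = 0;
  async function worker() {
    while (started < total) {
      started++;
      const data = await fs.readFile('./text.txt', { encoding: 'utf-8' });
      if (data !== 'Hello, world') throw new Error('unexpected content');
    }
  }
  await Promise.all(Array.from({ length: limit }, () => worker()));
  console.log('time: ', Date.now() - start);
}

run(10000, 1000).catch(console.error);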

@UlisesGascon
Member

I suggest transferring this issue to the performance team repo... what do you think, @anonrig?

@anonrig anonrig transferred this issue from nodejs/node Mar 12, 2024
@lemire
Member

lemire commented Mar 12, 2024

You can go faster (maybe 2x faster) by going synchronous.

const fs = require('fs');
const readFileSync = fs.readFileSync;

const start = Date.now();
for (let i = 0; i < 10000; i++) {
  const data = readFileSync("./text.txt", { encoding: 'utf-8' });
}
console.log('time: ', Date.now() - start);

When benchmarking using bun, all three approaches run at the same speed.

Here bun vs node on the first benchmark:

>  node  test.js
time:  218
>  bun run test.js
time:  53

The cost of promises in Node seems extraordinarily high.

@lemire
Member

lemire commented Mar 12, 2024

You can basically recover the performance by calling readFileSync within an async function...

const fs = require('fs');
const readFileSync = fs.readFileSync;
async function f(name, options) {
  return readFileSync(name, options);
}

const start = Date.now();
let count = 0;
for (let i = 0; i < 10000; i++) {
        f("./text.txt", { encoding: 'utf-8' })
                .then((data) => {
                        count++
                        if (count === 10000) {
                                console.log('time: ', Date.now() - start);
                        }
                })
                .catch((err) => {
                        throw 1
                })
}

@meyfa

meyfa commented Mar 12, 2024

You can basically recover the performance by calling readFileSync within an async function

That can't be optimal, can it? Maybe in this synthetic example it is, but won't readFileSync just block the whole program until the entire file is read? The await in this case just gives the illusion of being async, since a microtick is introduced only after the entire file has already been read synchronously...

@meyfa

meyfa commented Mar 12, 2024

readFileSync has to return the file contents synchronously, so it must read the whole file before returning. And since JS isn't multi-threaded this will block execution. fs.promises.readFile can allow other stuff to run while it fetches the file asynchronously, fulfilling the returned promise when done. The presence of await or the fact that the readFileSync call happens in an async function does not affect its internal behavior.

While readFileSync may speed up the benchmark, I think it is unsuitable for applications that could be doing other work while waiting for I/O to complete.
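
To illustrate the difference described above, a minimal sketch (./big-file.txt is a placeholder for a file large enough for the read to take a noticeable amount of time):

const { readFileSync } = require('fs');
const fsp = require('fs/promises');

// A timer that can only fire while the event loop is free.
const timer = setInterval(() => console.log('event loop is free'), 10);

// Blocking: nothing else (including the timer) runs until the whole file is read.
// const data = readFileSync('./big-file.txt', 'utf-8');

// Non-blocking: the read happens on libuv's thread pool, so the timer keeps
// firing until the promise resolves with the file contents.
fsp.readFile('./big-file.txt', 'utf-8').then((data) => {
  console.log('read', data.length, 'characters');
  clearInterval(timer);
});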

@aduh95

aduh95 commented Mar 12, 2024

You can basically recover the performance by calling readFileSync within an async function

That's going to depend on the size of your file and on whether your program has other things to do – like @meyfa said, using readFileSync might get you the contents of the file slightly faster, but the tradeoff is that your program cannot do anything in the meantime.

@GeoffreyBooth
Member

JS isn’t multi-threaded

https://nodejs.org/api/worker_threads.html

@ronag
Member

ronag commented Mar 12, 2024

I think we are getting a bit off topic here... I would recommend that all mentions of readFileSync be hidden as off-topic.

@lemire
Member

lemire commented Mar 12, 2024

I ran some benchmarks on a large server. See blog post.

[Screenshot: benchmark results from the blog post, 2024-03-12]

@benjamingr
Member

@Linkgoron

@benjamingr
Member

Yeah, the last time we looked at it there were many opportunities to make the slower discrete versions (e.g. readFile as opposed to createReadStream) much, much faster, so this result isn't too surprising.

@lemire
Member

lemire commented Mar 12, 2024

I think we are getting a bit off topic here... I would recommend that all mentions of readFileSync be hidden as off-topic.

In Bun, fs.readFileSync can be slower than fs.promises, and fs.readFile definitely appears capable of being faster than fs.readFileSync. In Node.js, the opposite is true. See my numbers above. Granted, it is one test on one system, and results will vary.

Is that what one expects?

@lemire
Member

lemire commented Mar 12, 2024

@aduh95

the tradeoff is that your program cannot do anything in the meantime.

In my test (see the blog post link), readFileSync uses 300 ms of CPU time, while fs.promises uses 7 seconds of CPU time (across several cores). And fs.promises returns the file with significantly higher latency.

These are not small differences.

@lemire
Member

lemire commented Mar 12, 2024

Our results suggest that fs.promises is significantly slower than fs.readFile. There is a related issue from 2021:

Related: nodejs/node#37583

@lemire
Member

lemire commented Mar 12, 2024

My tests ran with a small file. If you use a large file (e.g., 512kB), then all the Node.js tests except those relying on readFileSync fail/crash with an allocation failure.

Bun does not run out of memory.

@mcollina
Member

The problem is that readFile allocates 3x the memory of the file size:

  1. to read all the chunks
  2. to concatenate them into a single buffer
  3. to convert said buffer to utf8

The solution for utf-8 files is relatively simple to describe:

  1. move most of the readFile logic to C++
  2. use a streaming utf8 parser and use it to allocate V8/JS strings
  3. concatenate said V8 strings (this is cheap)

For binary reads it's a bit more tricky:

  1. move most of the readFile logic to C++
  2. allocate a big chunk of memory to fit the file
  3. read it straight there

In this latter case, this is mostly done, but not in C++, causing a lot of thrashing.

Consider also nodejs/node#17054 and nodejs/node#25741. Note that those fix a real problem, but we might want to make a different tradeoff there, or expose an option for the user to choose.
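
As an illustration of the allocation pattern described above, a simplified sketch of the utf-8 case (this is not the actual lib/fs code, just the shape of the problem):

// Conceptual sketch of why a utf-8 readFile touches roughly 3x the file size.
const { openSync, readSync, closeSync } = require('fs');

function naiveReadFileUtf8(path) {
  const fd = openSync(path, 'r');
  const chunks = [];
  try {
    while (true) {
      const chunk = Buffer.allocUnsafe(64 * 1024); // 1) memory for each chunk read
      const bytesRead = readSync(fd, chunk, 0, chunk.length, null);
      if (bytesRead === 0) break;
      chunks.push(chunk.subarray(0, bytesRead));
    }
  } finally {
    closeSync(fd);
  }
  const whole = Buffer.concat(chunks);  // 2) a second full-size copy
  return whole.toString('utf-8');       // 3) a third copy as a JS string
}

// const text = naiveReadFileUtf8('./text.txt');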

@aduh95

aduh95 commented Mar 13, 2024

For binary reads […]
2. allocate a big chunk of memory to fit the file
3. read it straight there

I guess that's already what we do, although the "big chunk" is maybe not that big (64 KiB) – but it might already be considered too big depending on your use case. Maybe we could add an option to readFile to let the user specify the size of that buffer, a bit like the highWaterMark option for streams.
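
No such option exists on readFile today; the closest existing knob is the highWaterMark option on fs.createReadStream, e.g. (the path and the 1 MiB value are placeholders):

const { createReadStream } = require('fs');

// The only existing way to pick the chunk size yourself is a read stream.
const stream = createReadStream('./big-file.bin', { highWaterMark: 1024 * 1024 });
let total = 0;
stream.on('data', (chunk) => { total += chunk.length; });
stream.on('end', () => console.log('read', total, 'bytes'));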

@mcollina
Member

@aduh95 I think you misunderstood me. I wrote "to fit the file", so that if the file is 100MB, we would allocate 100MB of contiguous memory for it and avoid the concatenation.

@mcollina
Member

Anyhow I think it's more important to optimize the utf8 case.

@aduh95

aduh95 commented Mar 13, 2024

Maybe that's a wrong assumption from my side, but I think it's not possible to know the size of a file ahead of time. So my point was the current buffer is already a "big chunk to fit the file", but that's only true for files smaller than 64kiB – 64K is arbitrary, but unless we can know the size of the file, we will inevitably need some concatenation given a large enough file.
Anyway, optimising the UTF-8 case certainly makes sense.

@Linkgoron
Member

Maybe that's a wrong assumption from my side, but I think it's not possible to know the size of a file ahead of time. So my point was the current buffer is already a "big chunk to fit the file", but that's only true for files smaller than 64kiB – 64K is arbitrary, but unless we can know the size of the file, we will inevitably need some concatenation given a large enough file. Anyway, optimising the UTF-8 case certainly makes sense.

Maybe I'm misunderstanding what you're saying, but today the code uses fstat in order to allocate the returned buffer. If it fails, it allocates 64 KiB chunks for every read and then concatenates them at the end (note that even if fstat succeeds, Node still reads in 512 KiB chunks into the larger buffer).
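
In rough terms, the strategy described here looks like the following sketch (not the actual internals; the 512 KiB and 64 KiB figures are taken from the comment above):

const fsp = require('fs/promises');

// Sketch: size the destination buffer from fstat when possible,
// otherwise fall back to growing a list of chunks.
async function readFileSketch(path) {
  const handle = await fsp.open(path, 'r');
  try {
    let size = 0;
    try {
      ({ size } = await handle.stat());
    } catch {
      // fstat failed: fall through to the chunked path below.
    }

    if (size > 0) {
      // Allocate the whole destination once, then fill it in 512 KiB reads.
      const result = Buffer.allocUnsafe(size);
      let offset = 0;
      while (offset < size) {
        const length = Math.min(512 * 1024, size - offset);
        const { bytesRead } = await handle.read(result, offset, length, offset);
        if (bytesRead === 0) break;
        offset += bytesRead;
      }
      return result.subarray(0, offset);
    }

    // Unknown size: read 64 KiB chunks and concatenate at the end.
    const chunks = [];
    while (true) {
      const chunk = Buffer.allocUnsafe(64 * 1024);
      const { bytesRead } = await handle.read(chunk, 0, chunk.length, null);
      if (bytesRead === 0) break;
      chunks.push(chunk.subarray(0, bytesRead));
    }
    return Buffer.concat(chunks);
  } finally {
    await handle.close();
  }
}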

@mcollina
Member

@Linkgoron @aduh95 you are right, this optimization was already done.

The problem remains because there are a lot of C++/JS transitions for large files, and the utf8 parsing still allocates double the memory.

@BridgeAR
Member

@mcollina the fs.promises.readFile() implementation allocates ~100.5 MB when reading a 100 MB file. See nodejs/node#44295. It still has multiple C++/JS boundary transitions though, so moving things into C++ would likely provide further gains.

We do have different implementations throughout the codebase and that is obviously not ideal. Maybe we can unify the promise and the callback versions internally?

@joyeecheung
Member

joyeecheung commented Apr 14, 2024

You can check the impact combined with JS code execution by the way, for example (running this in the project directory to parse a markdown file):

'use strict';
const { readFileSync } = require('fs');
const { readFile } = require('fs/promises');
const marked = require('./test/fixtures/snapshot/marked');

let count = 0;
let total = 100;
const file = './README.md';  // 38K
//const file = 'GOVERNANCE.md';  // 7.5K

let result;
function syncParse(name, options) {
  const start = Date.now();
  for (let i = 0; i < total; i++) {
    const data = readFileSync(file, { encoding: 'utf-8' });
    result = marked.parse(data);
  }
  console.log('syncParse: ', Date.now() - start, result.length);
}

async function asyncParse(name, options) {
  const start = Date.now();
  for (let i = 0; i < total; i++) {
    readFile(file, { encoding: 'utf-8' })
      .then((data) => {
        result = marked.parse(data);
        if (++count === total) {
          console.log('asyncParse: ', Date.now() - start, result.length);
        }
      }, (error) => {
        console.log(error);
      });
  }
}

// Ensure it's cached.
readFileSync(file, { encoding: 'utf-8' });
readFileSync(file, { encoding: 'utf-8' });
readFileSync(file, { encoding: 'utf-8' });

//syncParse();
asyncParse();

For GOVERNANCE.md, which is 7.5K, the synchronous version is ~20% faster on my Linux server, and for README.md (38K) the two versions perform basically the same. On a MacBook both perform similarly for GOVERNANCE.md, and the asynchronous version is about 1% faster for README.md. For even bigger files (e.g. doc/api/fs.md) the difference is negligible because the actual parsing eats most of the time.

So I guess, yes, the async version would allow you to run some code in the meantime, but the overhead of the asynchronous version may already cancel out the benefit you get from not blocking, especially when your files aren't that big and are cached. Also, on a reasonably fast file system, the more JS execution you need to do, the less the speed of fs calls is going to matter, because the JS execution will completely dominate the profile and the overhead of the file reads, blocking or not, will only be a tiny part of it.

Also, to vary the input a bit, just run marked on all the documents under doc/api:

'use strict';
const { readdirSync, readFileSync } = require('fs');
const { readFile } = require('fs/promises');
const marked = require('./test/fixtures/snapshot/marked');

const files = readdirSync('./doc/api/').map(i => `./doc/api/${i}`);
let result;

function syncParse() {
  const start = Date.now();
  for (let i = 0; i < files.length; i++) {
    const data = readFileSync(files[i], { encoding: 'utf-8' });
    result = marked.parse(data);
  }
  console.log('syncParse: ', Date.now() - start, result.length);
}

async function asyncParse() {
  const start = Date.now();
  let count = 0;
  for (let i = 0; i < files.length; i++) {
    readFile(files[i], { encoding: 'utf-8' })
      .then((data) => {
        result = marked.parse(data);
        if (++count === files.length) {
          console.log('asyncParse: ', Date.now() - start, result.length);
        }
      }, (error) => {
        console.log(error);
      });
  }
}

//syncParse();
asyncParse();

There's no practical difference on either my Linux server (~1.4s) or MacBook (~0.7s) (those documents tend to be big and most time is spent on parsing). Also, to prove my point about JS execution speed mattering more when you actually need to do a lot of JS: despite #151 (comment) showing that Bun has faster file I/O, running the same script on my Linux server with Bun took ~35s for both the async and the synchronous version. I suppose something is wrong with JSC there, but at this point you probably don't even care about the few milliseconds of file I/O if you are going to spend >35x more time on JS execution...
