
fs.readFile possible memory leak #21266

Closed
ip413 opened this issue Jun 11, 2018 · 16 comments

Comments

@ip413

ip413 commented Jun 11, 2018

fs.readFile seems to have a memory leak when reading the same file asynchronously many times.
fs.readFileSync behaves very differently and cleans up memory efficiently.

Image showing the issue: https://github.com/ip413/node-read-file/blob/master/docs/data.png
Code is here: https://github.com/ip413/node-read-file/blob/master/index.js
Spreadsheet is here: https://github.com/ip413/node-read-file/blob/master/docs/data.ods

The same problem occurs on Node 9.5.0.

@mscdex
Contributor

mscdex commented Jun 11, 2018

I'm not convinced there is a leak. Since you're kicking off a large number of requests in a tight loop, there are no available threads to service the requests, so they are placed into a queue. The queued tasks themselves consume memory. That is what you're seeing. You should instead measure memory usage after all of the requests have completed.

The synchronous methods perform the work immediately on the main thread and thus do not consume any extra resources, which is why you see stable memory usage there.
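A minimal sketch of that measurement, for reference (the file name ./foo.txt and the count of 1000 concurrent reads are placeholders, not taken from the original repro): it samples rss only after the last callback has fired, so queued-but-unserviced requests no longer inflate the number.

const fs = require('fs')

const TOTAL = 1000 // assumed request count
let pending = TOTAL

console.log('before:', process.memoryUsage().rss)

for (let i = 0; i < TOTAL; i++) {
  fs.readFile('./foo.txt', (err) => {
    if (err) throw err
    if (--pending === 0) {
      // every queued request has been serviced; this is the point to measure
      console.log('after all reads:', process.memoryUsage().rss)
    }
  })
}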

@aduh95
Contributor

aduh95 commented Jun 12, 2018

FYI, neither Node 4.8.6 nor 9.5.0 is a supported version of Node, and you should report issues that affect up-to-date versions. You can look at the LTS Release Schedule to see which versions are supported by the Node community, and at the CHANGELOG table for the latest release of each semver-major line (at the time of writing, the supported versions are v6.14.2 (maintenance only), v8.11.2, v9.11.1 (maintenance only) and v10.4.0).

@ryzokuken
Contributor

@ip413 could you please check if this affects master? You could also try LTS, but I'd be more interested in knowing if master has this too.

@ip413
Author

ip413 commented Jun 20, 2018

@aduh95 - yep, sorry for that.
@ryzokuken - will check it

@davisjam
Contributor

I believe @mscdex's diagnosis is accurate. But we would need the "after requests are completed" data to prove it. I'm guessing this will be an easy-to-clean-up issue if someone wants to collect that data.

@gireeshpunathil
Member

@ip413 - is this still an issue?

@ip413
Author

ip413 commented Jan 15, 2020

@ryzokuken - I haven't tested master, but I tested the latest Node release, 12.14.1, so master is probably affected too.
@gireeshpunathil - yep, I believe it is still an issue.

I'm personally no longer interested in this issue, but...

  • I removed the memwatch-next dependency so that I could also test Node v12
  • I checked it once again and my conclusion is the same: there could be some issue with memory during many concurrent async reads

Conclusions:

  • the issue is almost the same on LTS Node v6 and v12
  • the bigger the file is, the more memory is used
  • no amount of waiting cleans up the memory (I assume one hour is enough; that is what I tested)
  • no variable "cleanup" has any effect (undefined, null, delete)

Personally I don't believe there is such a big leak... but I don't have an explanation for this behaviour. Maybe with normal file usage everything is fine. Probably @mscdex is right... but I can't prove it.

[Attached charts: file-size, one-hour, read, data]

@gireeshpunathil
Member

I am able to see the issue.

0000000004a34000 67944K rw--- [ anon ]

Sections such as this never get unmapped.

strace also tells a similar story:

[pid 38519] 1579773078.495439 mmap(NULL, 134217728, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0 <unfinished ...>
[pid 38520] 1579773078.495481 mmap(NULL, 134217728, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0 <unfinished ...>
[pid 38519] 1579773078.495489 <... mmap resumed> ) = 0x7f4640000000
[pid 38520] 1579773078.495516 <... mmap resumed> ) = 0x7f4638000000

There are no matching munmap calls.

@gireeshpunathil
Member

(gdb) b mmap if $rsi >= 134217728
Starting program: node --expose-gc 21266
...
Breakpoint 2, 0x00007ffff6ec6d50 in mmap64 () from /lib64/libc.so.6
(gdb) where
#0  0x00007ffff6ec6d50 in mmap64 () from /lib64/libc.so.6
#1  0x00007ffff6e4d231 in new_heap () from /lib64/libc.so.6
#2  0x00007ffff6e4dc64 in arena_get2.isra.3 () from /lib64/libc.so.6
#3  0x00007ffff6e53b0e in malloc () from /lib64/libc.so.6
#4  0x00007ffff792eecd in operator new(unsigned long) () from /lib64/libstdc++.so.6
#5  0x0000000000d87c68 in std::__detail::_Map_base<v8::internal::MemoryChunk*, std::pair<v8::internal::MemoryChunk* const, v8::internal::MemoryChunkData>, std::allocator<std::pair<v8::internal::MemoryChunk* const, v8::internal::MemoryChunkData> >, std::__detail::_Select1st, std::equal_to<v8::internal::MemoryChunk*>, v8::internal::MemoryChunk::Hasher, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<true, false, true>, true>::operator[](v8::internal::MemoryChunk* const&) ()
#6  0x0000000000d88681 in v8::internal::ConcurrentMarkingVisitor::ShouldVisit(v8::internal::HeapObject)
    ()
#7  0x0000000000d8e4f4 in v8::internal::ConcurrentMarking::Run(int, v8::internal::ConcurrentMarking::TaskState*) ()
#8  0x0000000000d0b283 in v8::internal::CancelableTask::Run() ()
#9  0x0000000000b05dc9 in node::(anonymous namespace)::PlatformWorkerThread(void*) ()
#10 0x00007ffff71a3ea5 in start_thread () from /lib64/libpthread.so.0
#11 0x00007ffff6ecc8cd in clone () from /lib64/libc.so.6
(gdb) i r $rsi
rsi            0x8000000	134217728
(gdb) 

@gireeshpunathil
Member

But this one looks like heap expansion!

@gireeshpunathil
Member

Are expanded V8 heaps never shrunk, even after usage has long since ceased?

@gireeshpunathil
Member

Also, is there a race condition by which multiple threads decide (together) that they need to expand the heap?

@HarshithaKP
Member

HarshithaKP commented Jan 31, 2020

I have been working with @gireeshpunathil on this. After some investigation, I have found that this is not an issue at all, but just the way rss works. The resident memory shot up due to the large amount of activity in the program (a thousand concurrent reads), but there wasn't anything in the program or the system to bring it down - i.e., the rss only comes down when the OS swaps out the pages in response to memory demands from other processes.

Here is the proof:
$ cat 21266.js

var fs = require('fs')
function report() {
  global.gc()
  console.log('after gc:')
  setInterval(() => {
    console.log(Math.round(process.memoryUsage()['rss'] / (1024 * 1024)))
  }, 10000)
}
function readFile() {
  fs.readFile('./foo.txt', (e, d) => {
    if (e) console.log(e)
  })
}
console.log(`process id: ${process.pid}`)
fs.readFile('./foo.txt', (e, d) => {
  console.log('baseline:')
  console.log(Math.round(process.memoryUsage()['rss'] / (1024 * 1024)))
  setTimeout(() => {
    for (var i = 0; i < 1000; i++) {
      readFile()
    }
    report()
  }, 10000)
})

$ cat gc.js

let list2 = []
setInterval(() => {
  let list1 = []
  // make 4 KB strings, aligned to the Linux page size
  for (var i = 0; i < 4096; i++)
    list1.push('x'.repeat(4096))
  list2.push(list1)
  list2.forEach((list) => {
    list.forEach((item) => {
      // make sure we 'touch' every page of our memory
      item.indexOf('x')
    })
  })
}, 10)
setInterval(() => {
  console.log(Math.round(process.memoryUsage()['rss'] / (1024 * 1024)))
}, 1000)

Check the free memory on your system. If it is huge, the above program may need to be modified to create larger arrays/strings, or else you need to wait for a longer period of time.

My system has only ~2 GB free, so it's not too bad.

[root@r811 test]# cat /proc/meminfo | grep MemFree
MemFree:         1704748 kB
[root@r811 test]#

Run the first program in a terminal and wait for the rss to grow and stabilise.

[root@r811 test]# node --expose-gc 21266
process id: 2355
baseline:
33
after gc:
128
128
128
...

Then run the second program to eat up all the free memory in the system.

[root@r811 test]# node --max-old-space-size=10000 gc
133
335
577
823
1062
1297
1521
1597
^C

When the consumption approaches the free-memory limit, check the status of the first program. Its rss will have come down!

127
83
2
2

Inference: your program causes resident memory to shoot up, but the system had so much free memory that it never needed to reclaim those pages.

There is no bug or misbehavior observed in Node.
Hope this helps!
It would be great if you could validate this on your system.

@jasnell
Member

jasnell commented Jan 31, 2020

Yep, the behavior you're seeing here is absolutely normal. While seeing that RSS rise and hold for so long is surprising and concerning for some, it's not the memory metric you need to watch with regard to memory leaks.
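For anyone landing here, a minimal sketch of watching the V8 heap rather than rss (./foo.txt, the count of 1000, and running with node --expose-gc are assumptions, not part of this thread's repro):

const fs = require('fs')

function sample(label) {
  if (global.gc) global.gc() // force a full GC so heapUsed reflects live objects only
  const { rss, heapUsed, external } = process.memoryUsage()
  console.log(label,
    'rss:', Math.round(rss / 1048576), 'MB',
    'heapUsed:', Math.round(heapUsed / 1048576), 'MB',
    'external:', Math.round(external / 1048576), 'MB')
}

sample('baseline')
let pending = 1000
for (let i = 0; i < 1000; i++) {
  fs.readFile('./foo.txt', (err) => {
    if (err) throw err
    if (--pending === 0) setTimeout(() => sample('after reads'), 1000)
  })
}

A real leak would show heapUsed (or external, where Buffer contents live) climbing across repeated runs even after a forced GC; a high-but-flat rss on its own does not demonstrate one.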

@machinetherapist

I have the same issue with fs.writeFileSync.

@bnoordhuis
Member

Looks like this was never closed out? #11289 is a similar issue, FWIW.

@machinetherapist If you think you've found a legitimate bug, please open a new issue.
