Investigate flaky crash in test/pummel/test-repl-empty-maybelocal-crash.js on freebsd #42719

Closed
RaisinTen opened this issue Apr 13, 2022 · 5 comments
Labels
flaky-test Issues and PRs related to the tests with unstable failures on the CI.

Comments

@RaisinTen
Contributor

The problem doesn't seem to be completely fixed. The new test sometimes crashes on FreeBSD:
https://ci.nodejs.org/job/node-test-commit-freebsd/43628/
https://ci.nodejs.org/job/node-test-commit-freebsd/43633/

Originally posted by @targos in #42409 (comment)

12:54:37 not ok 3276 pummel/test-repl-empty-maybelocal-crash
12:55:10   ---
12:55:10   duration_ms: 33.90
12:55:10   severity: crashed
12:55:10   exitcode: -9
12:55:10   stack: |-
12:55:10     > 
12:55:10   ...
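
For context on the `exitcode: -9` in the report above: a negative exit code conventionally means the process was terminated by a signal, and signal 9 is SIGKILL on POSIX systems. Below is a minimal sketch of that mapping, assuming the test runner follows the convention used by Python's `subprocess` module. The helper name `describe_exit_code` is hypothetical and not part of the Node.js test tooling.

```python
import signal


def describe_exit_code(code: int) -> str:
    """Return a human-readable description of a child process exit code.

    Assumes the Python subprocess convention: a negative value -N means
    the child was terminated by signal N.
    """
    if code >= 0:
        return f"exited normally with status {code}"
    try:
        name = signal.Signals(-code).name
    except ValueError:
        # The signal number has no name on this platform.
        name = f"signal {-code}"
    return f"killed by {name}"


print(describe_exit_code(-9))
```

SIGKILL cannot be caught or handled by the target process, which is consistent with the empty `stack` field in the report.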
@RaisinTen RaisinTen added the flaky-test Issues and PRs related to the tests with unstable failures on the CI. label Apr 13, 2022
RaisinTen added a commit to RaisinTen/node that referenced this issue Apr 13, 2022
Refs: nodejs#42719
Signed-off-by: Darshan Sen <raisinten@gmail.com>
@targos
Member

targos commented Apr 13, 2022

The issue seems similar to #41513 (review)

@RaisinTen
Contributor Author

@targos if you're talking about the FreeBSD error in this CI run (#41513 (comment)), i.e. the java.nio.channels.ClosedChannelException error, I think that's different from this one. In this issue, only the node process crashes, and the runner remains connected to the CI server. (FWIW, we hit that error on arm for this PR, as discussed in #42409 (comment).)

@richardlau
Member

richardlau commented Apr 13, 2022

I suspect that the new test is causing the disconnects on fedora-latest, e.g. https://ci.nodejs.org/job/node-test-commit-linux/nodes=fedora-latest-x64/45410/console. It looks like the OOM killer is stepping in. I don't have definitive proof that it is this test, but the disconnects happen while the pummel tests are running, and this only started happening recently.

journalctl logs
Apr 13 16:09:40 test-digitalocean-fedora34-x64-1 kernel: sshd invoked oom-killer: gfp_mask=0x100cca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=0
Apr 13 16:09:40 test-digitalocean-fedora34-x64-1 kernel: CPU: 1 PID: 2647789 Comm: sshd Not tainted 5.13.12-200.fc34.x86_64 #1
Apr 13 16:09:40 test-digitalocean-fedora34-x64-1 kernel: Hardware name: DigitalOcean Droplet, BIOS 20171212 12/12/2017
Apr 13 16:09:40 test-digitalocean-fedora34-x64-1 kernel: Call Trace:
Apr 13 16:09:40 test-digitalocean-fedora34-x64-1 kernel:  dump_stack+0x76/0x94
Apr 13 16:09:40 test-digitalocean-fedora34-x64-1 kernel:  dump_header+0x4a/0x1f3
Apr 13 16:09:40 test-digitalocean-fedora34-x64-1 kernel:  oom_kill_process.cold+0xb/0x10
Apr 13 16:09:40 test-digitalocean-fedora34-x64-1 kernel:  out_of_memory+0x229/0x4d0
Apr 13 16:09:40 test-digitalocean-fedora34-x64-1 kernel:  __alloc_pages_slowpath.constprop.0+0xbb4/0xc90
Apr 13 16:09:40 test-digitalocean-fedora34-x64-1 kernel:  __alloc_pages+0x1dc/0x210
Apr 13 16:09:40 test-digitalocean-fedora34-x64-1 kernel:  pagecache_get_page+0x291/0x630
Apr 13 16:09:40 test-digitalocean-fedora34-x64-1 kernel:  filemap_fault+0x615/0x970
Apr 13 16:09:40 test-digitalocean-fedora34-x64-1 kernel:  ? next_uptodate_page+0x1b4/0x2a0
Apr 13 16:09:40 test-digitalocean-fedora34-x64-1 kernel:  ext4_filemap_fault+0x2d/0x40
Apr 13 16:09:40 test-digitalocean-fedora34-x64-1 kernel:  __do_fault+0x36/0x100
Apr 13 16:09:40 test-digitalocean-fedora34-x64-1 kernel:  __handle_mm_fault+0xf8a/0x1570
Apr 13 16:09:40 test-digitalocean-fedora34-x64-1 kernel:  handle_mm_fault+0xd5/0x2b0
Apr 13 16:09:40 test-digitalocean-fedora34-x64-1 kernel:  do_user_addr_fault+0x1b7/0x670
Apr 13 16:09:40 test-digitalocean-fedora34-x64-1 kernel:  exc_page_fault+0x78/0x160
Apr 13 16:09:40 test-digitalocean-fedora34-x64-1 kernel:  ? asm_exc_page_fault+0x8/0x30
Apr 13 16:09:40 test-digitalocean-fedora34-x64-1 kernel:  asm_exc_page_fault+0x1e/0x30
Apr 13 16:09:40 test-digitalocean-fedora34-x64-1 kernel: RIP: 0033:0x7fb93f4e1ce7
Apr 13 16:09:40 test-digitalocean-fedora34-x64-1 kernel: Code: 43 68 48 8b 40 08 48 89 44 24 10 48 8b 83 00 03 00 00 48 85 c0 0f 84 d8 00 00 00 48 8b 7c 24>
Apr 13 16:09:40 test-digitalocean-fedora34-x64-1 kernel: RSP: 002b:00007ffc43b8b550 EFLAGS: 00010246
Apr 13 16:09:40 test-digitalocean-fedora34-x64-1 kernel: RAX: 000056218ce8b3c0 RBX: 00007fb93f50a1e0 RCX: 0000000000000000
Apr 13 16:09:40 test-digitalocean-fedora34-x64-1 kernel: RDX: 0000000000000003 RSI: 00000000a832b2bb RDI: 00000000a832b2bb
Apr 13 16:09:40 test-digitalocean-fedora34-x64-1 kernel: RBP: 0000000000000000 R08: 0000000000000001 R09: 00007fb93f50a4a0
Apr 13 16:09:40 test-digitalocean-fedora34-x64-1 kernel: R10: 00007fb93efdd540 R11: 000056218ce8b3e8 R12: 0000000000000020
Apr 13 16:09:40 test-digitalocean-fedora34-x64-1 kernel: R13: 00007fb93e832828 R14: 00007ffc43b8b680 R15: 0000000000000000
Apr 13 16:09:40 test-digitalocean-fedora34-x64-1 kernel: Mem-Info:
Apr 13 16:09:40 test-digitalocean-fedora34-x64-1 kernel: active_anon:820 inactive_anon:955807 isolated_anon:0
                                                          active_file:42 inactive_file:26 isolated_file:0
                                                          unevictable:0 dirty:0 writeback:0
                                                          slab_reclaimable:5557 slab_unreclaimable:8396
                                                          mapped:36 shmem:2762 pagetables:2721 bounce:0
                                                          free:21658 free_pcp:238 free_cma:0
Apr 13 16:09:40 test-digitalocean-fedora34-x64-1 kernel: Node 0 active_anon:3280kB inactive_anon:3823228kB active_file:168kB inactive_file:104kB unevictabl>
Apr 13 16:09:40 test-digitalocean-fedora34-x64-1 kernel: Node 0 DMA free:14848kB min:260kB low:324kB high:388kB reserved_highatomic:0KB active_anon:0kB ina>
Apr 13 16:09:40 test-digitalocean-fedora34-x64-1 kernel: lowmem_reserve[]: 0 3436 3869 3869 3869
Apr 13 16:09:40 test-digitalocean-fedora34-x64-1 kernel: Node 0 DMA32 free:62552kB min:59796kB low:74744kB high:89692kB reserved_highatomic:0KB active_anon>
Apr 13 16:09:40 test-digitalocean-fedora34-x64-1 kernel: lowmem_reserve[]: 0 0 432 432 432
Apr 13 16:09:40 test-digitalocean-fedora34-x64-1 kernel: Node 0 Normal free:9232kB min:7524kB low:9404kB high:11284kB reserved_highatomic:2048KB active_ano>
Apr 13 16:09:40 test-digitalocean-fedora34-x64-1 kernel: lowmem_reserve[]: 0 0 0 0 0
Apr 13 16:09:40 test-digitalocean-fedora34-x64-1 kernel: Node 0 DMA: 0*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 1*512kB (U) 0*1024kB 1*2048kB (M) 3*4>
Apr 13 16:09:40 test-digitalocean-fedora34-x64-1 kernel: Node 0 DMA32: 823*4kB (UME) 1278*8kB (UME) 848*16kB (UM) 533*32kB (UME) 157*64kB (UME) 54*128kB (U>
Apr 13 16:09:40 test-digitalocean-fedora34-x64-1 kernel: Node 0 Normal: 728*4kB (UMEH) 368*8kB (UMEH) 153*16kB (UMEH) 25*32kB (UE) 2*64kB (E) 0*128kB 0*256>
Apr 13 16:09:40 test-digitalocean-fedora34-x64-1 kernel: Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
Apr 13 16:09:40 test-digitalocean-fedora34-x64-1 kernel: 2787 total pagecache pages
Apr 13 16:09:40 test-digitalocean-fedora34-x64-1 kernel: 0 pages in swap cache
Apr 13 16:09:40 test-digitalocean-fedora34-x64-1 kernel: Swap cache stats: add 0, delete 0, find 0/0
Apr 13 16:09:40 test-digitalocean-fedora34-x64-1 kernel: Free swap  = 0kB
Apr 13 16:09:40 test-digitalocean-fedora34-x64-1 kernel: Total swap = 0kB
Apr 13 16:09:40 test-digitalocean-fedora34-x64-1 kernel: 1048473 pages RAM
Apr 13 16:09:40 test-digitalocean-fedora34-x64-1 kernel: 0 pages HighMem/MovableOnly
Apr 13 16:09:40 test-digitalocean-fedora34-x64-1 kernel: 46368 pages reserved
Apr 13 16:09:40 test-digitalocean-fedora34-x64-1 kernel: 0 pages cma reserved
Apr 13 16:09:40 test-digitalocean-fedora34-x64-1 kernel: 0 pages hwpoisoned
Apr 13 16:09:40 test-digitalocean-fedora34-x64-1 kernel: Tasks state (memory values in pages):
Apr 13 16:09:40 test-digitalocean-fedora34-x64-1 kernel: [  pid  ]   uid  tgid total_vm      rss pgtables_bytes swapents oom_score_adj name
Apr 13 16:09:40 test-digitalocean-fedora34-x64-1 kernel: [    484]     0   484    29210      530   245760        0          -250 systemd-journal
Apr 13 16:09:40 test-digitalocean-fedora34-x64-1 kernel: [    497]     0   497    11506      354    94208        0         -1000 systemd-udevd
Apr 13 16:09:40 test-digitalocean-fedora34-x64-1 kernel: [    522]   998   522     4428      223    69632        0          -900 systemd-oomd
Apr 13 16:09:40 test-digitalocean-fedora34-x64-1 kernel: [    523]   193   523     9579     1689   102400        0             0 systemd-resolve
Apr 13 16:09:40 test-digitalocean-fedora34-x64-1 kernel: [    524]     0   524     6472      161    65536        0         -1000 auditd
Apr 13 16:09:40 test-digitalocean-fedora34-x64-1 kernel: [    553]     0   553    12140      450   106496        0             0 sssd
Apr 13 16:09:40 test-digitalocean-fedora34-x64-1 kernel: [    555]   995   555    23782      193    81920        0             0 chronyd
Apr 13 16:09:40 test-digitalocean-fedora34-x64-1 kernel: [    556]     0   556     4494      235    77824        0             0 systemd-homed
Apr 13 16:09:40 test-digitalocean-fedora34-x64-1 kernel: [    558]    81   558     2472      167    57344        0          -900 dbus-broker-lau
Apr 13 16:09:40 test-digitalocean-fedora34-x64-1 kernel: [    560]    81   560     1359      160    45056        0          -900 dbus-broker
Apr 13 16:09:40 test-digitalocean-fedora34-x64-1 kernel: [    561]     0   561    12582      594    98304        0             0 sssd_be
Apr 13 16:09:40 test-digitalocean-fedora34-x64-1 kernel: [    562]     0   562    15521      370   151552        0             0 sssd_nss
Apr 13 16:09:40 test-digitalocean-fedora34-x64-1 kernel: [    568]     0   568     4483      253    73728        0             0 systemd-logind
Apr 13 16:09:40 test-digitalocean-fedora34-x64-1 kernel: [    580]     0   580    66002      622   135168        0             0 NetworkManager
Apr 13 16:09:40 test-digitalocean-fedora34-x64-1 kernel: [    688]     0   688     5941      246    73728        0         -1000 sshd
Apr 13 16:09:40 test-digitalocean-fedora34-x64-1 kernel: [    690]     0   690     2408       30    49152        0             0 agetty
Apr 13 16:09:40 test-digitalocean-fedora34-x64-1 kernel: [    691]     0   691     3050       35    45056        0             0 agetty
Apr 13 16:09:40 test-digitalocean-fedora34-x64-1 kernel: [    753]     0   753     4387      213    69632        0             0 systemd-userdbd
Apr 13 16:09:40 test-digitalocean-fedora34-x64-1 kernel: [2005598]  1000 2005598   639643    48526   770048        0             0 java
Apr 13 16:09:40 test-digitalocean-fedora34-x64-1 kernel: [2521676]  1000 2521676     1759       94    53248        0             0 bash
Apr 13 16:09:40 test-digitalocean-fedora34-x64-1 kernel: [2521684]  1000 2521684     3565      276    65536        0             0 make
Apr 13 16:09:40 test-digitalocean-fedora34-x64-1 kernel: [2591624]  1000 2591624     3848      562    65536        0             0 make
Apr 13 16:09:40 test-digitalocean-fedora34-x64-1 kernel: [2598859]  1000 2598859    24759     4357    98304        0             0 python3
Apr 13 16:09:40 test-digitalocean-fedora34-x64-1 kernel: [2636865]     0 2636865     4470      217    69632        0             0 systemd-userwor
Apr 13 16:09:40 test-digitalocean-fedora34-x64-1 kernel: [2636968]     0 2636968     4470      218    77824        0             0 systemd-userwor
Apr 13 16:09:40 test-digitalocean-fedora34-x64-1 kernel: [2636989]     0 2636989     4470      217    73728        0             0 systemd-userwor
Apr 13 16:09:40 test-digitalocean-fedora34-x64-1 kernel: [2647771]  1000 2647771  1034804   889751  7659520        0             0 node
Apr 13 16:09:40 test-digitalocean-fedora34-x64-1 kernel: [2647785]     0 2647785     6394      262    77824        0             0 sshd
Apr 13 16:09:40 test-digitalocean-fedora34-x64-1 kernel: [2647786]     0 2647786     5994      250    73728        0             0 sshd
Apr 13 16:09:40 test-digitalocean-fedora34-x64-1 kernel: [2647787]     0 2647787     5998      250    73728        0             0 sshd
Apr 13 16:09:40 test-digitalocean-fedora34-x64-1 kernel: [2647788]     0 2647788     6390      277    73728        0             0 sshd
Apr 13 16:09:40 test-digitalocean-fedora34-x64-1 kernel: [2647789]     0 2647789     3606      151    57344        0             0 sshd
Apr 13 16:09:40 test-digitalocean-fedora34-x64-1 kernel: [2647790]     0 2647790     3606      129    57344        0             0 sshd
Apr 13 16:09:40 test-digitalocean-fedora34-x64-1 kernel: [2647791]     0 2647791     3606      129    57344        0             0 sshd
Apr 13 16:09:40 test-digitalocean-fedora34-x64-1 kernel: [2647792]    74 2647792     5972      246    61440        0             0 sshd
Apr 13 16:09:40 test-digitalocean-fedora34-x64-1 kernel: [2647793]    74 2647793     5972      247    65536        0             0 sshd
Apr 13 16:09:40 test-digitalocean-fedora34-x64-1 kernel: [2647794]    74 2647794     5972      246    65536        0             0 sshd
Apr 13 16:09:40 test-digitalocean-fedora34-x64-1 kernel: [2647795]    74 2647795     5972      247    65536        0             0 sshd
Apr 13 16:09:40 test-digitalocean-fedora34-x64-1 kernel: [2647796]     0 2647796     3606      128    61440        0             0 sshd
Apr 13 16:09:40 test-digitalocean-fedora34-x64-1 kernel: [2647797]     0 2647797     3606       81    57344        0             0 sshd
Apr 13 16:09:40 test-digitalocean-fedora34-x64-1 kernel: [2647798]     0 2647798     3561       71    65536        0             0 sshd
Apr 13 16:09:40 test-digitalocean-fedora34-x64-1 kernel: [2647799]     0 2647799      326        9    36864        0             0 sshd
Apr 13 16:09:40 test-digitalocean-fedora34-x64-1 kernel: oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=>
Apr 13 16:09:40 test-digitalocean-fedora34-x64-1 kernel: Out of memory: Killed process 2647771 (node) total-vm:4139216kB, anon-rss:3559004kB, file-rss:0kB,>
Apr 13 16:09:40 test-digitalocean-fedora34-x64-1 kernel: oom_reaper: reaped process 2647771 (node), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
Apr 13 16:09:40 test-digitalocean-fedora34-x64-1 systemd[1]: jenkins.service: A process of this unit has been killed by the OOM killer.
Apr 13 16:09:41 test-digitalocean-fedora34-x64-1 systemd[1]: jenkins.service: Main process exited, code=exited, status=143/n/a
Apr 13 16:09:41 test-digitalocean-fedora34-x64-1 audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 subj=kernel msg='unit=jenkins comm="syst>
Apr 13 16:09:41 test-digitalocean-fedora34-x64-1 systemd[1]: jenkins.service: Failed with result 'oom-kill'.
Apr 13 16:09:41 test-digitalocean-fedora34-x64-1 systemd[1]: jenkins.service: Consumed 2h 21.133s CPU time.

The droplet has 4 GB of memory.
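
The figures in the log are consistent with that. The kernel reports memory in pages, and with the 4 KiB page size that is usual on x86_64 (an assumption, since the log does not state the page size) the `node` row and the `pages RAM` line convert as sketched below. The constant and function names are illustrative only.

```python
PAGE_SIZE_KIB = 4  # assumed x86_64 default page size; not stated in the log

# Values copied from the journalctl output above.
TOTAL_RAM_PAGES = 1048473  # "1048473 pages RAM"
NODE_RSS_PAGES = 889751    # rss column of the node row (pid 2647771)


def pages_to_kib(pages: int, page_size_kib: int = PAGE_SIZE_KIB) -> int:
    """Convert a page count to KiB."""
    return pages * page_size_kib


total_ram_kib = pages_to_kib(TOTAL_RAM_PAGES)
node_rss_kib = pages_to_kib(NODE_RSS_PAGES)
node_share = node_rss_kib / total_ram_kib

print(f"total RAM: {total_ram_kib} KiB ({total_ram_kib / 1024**2:.2f} GiB)")
print(f"node RSS:  {node_rss_kib} KiB ({node_rss_kib / 1024**2:.2f} GiB)")
print(f"node share of RAM: {node_share:.0%}")
```

Under that assumption the node RSS works out to 3559004 KiB, which matches the `anon-rss:3559004kB` value in the "Out of memory: Killed process" line, i.e. roughly 85% of the RAM on a machine that reports `Total swap = 0kB`.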

Also, this is different from the NIO disconnects we've seen elsewhere, which were caused by Jenkins ping timeouts: there's no corresponding ping thread timeout message in the Jenkins server log.

RaisinTen added a commit to RaisinTen/node that referenced this issue Apr 14, 2022
It was disconnecting the runners from the CI server. Not worth having a
resource-intensive test for this kind of edge case.

Fixes: nodejs#42719
Signed-off-by: Darshan Sen <raisinten@gmail.com>
@BethGriggs
Member

Landed in 19064be

xtx1130 pushed a commit to xtx1130/node that referenced this issue Apr 25, 2022
It was disconnecting the runners from the CI server. Not worth having a
resource-intensive test for this kind of edge case.

Fixes: nodejs#42719
Signed-off-by: Darshan Sen <raisinten@gmail.com>
PR-URL: nodejs#42720
Fixes: nodejs#42719
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Michaël Zasso <targos@protonmail.com>
Reviewed-By: Richard Lau <rlau@redhat.com>
Reviewed-By: Beth Griggs <bgriggs@redhat.com>
Reviewed-By: Stewart X Addison <sxa@redhat.com>
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
Dossar pushed a commit to Dossar/node that referenced this issue May 26, 2022
It was disconnecting the runners from the CI server. Not worth having a
resource-intensive test for this kind of edge case.

Fixes: nodejs#42719
Signed-off-by: Darshan Sen <raisinten@gmail.com>
PR-URL: nodejs#42720
Fixes: nodejs#42719
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Michaël Zasso <targos@protonmail.com>
Reviewed-By: Richard Lau <rlau@redhat.com>
Reviewed-By: Beth Griggs <bgriggs@redhat.com>
Reviewed-By: Stewart X Addison <sxa@redhat.com>
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
juanarbol pushed a commit that referenced this issue May 31, 2022
It was disconnecting the runners from the CI server. Not worth having a
resource-intensive test for this kind of edge case.

Fixes: #42719
Signed-off-by: Darshan Sen <raisinten@gmail.com>
PR-URL: #42720
Fixes: #42719
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Michaël Zasso <targos@protonmail.com>
Reviewed-By: Richard Lau <rlau@redhat.com>
Reviewed-By: Beth Griggs <bgriggs@redhat.com>
Reviewed-By: Stewart X Addison <sxa@redhat.com>
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
juanarbol pushed a commit that referenced this issue Jun 1, 2022
It was disconnecting the runners from the CI server. Not worth having a
resource-intensive test for this kind of edge case.

Fixes: #42719
Signed-off-by: Darshan Sen <raisinten@gmail.com>
PR-URL: #42720
Backport-PR-URL: #42967
Fixes: #42719
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Michaël Zasso <targos@protonmail.com>
Reviewed-By: Richard Lau <rlau@redhat.com>
Reviewed-By: Beth Griggs <bgriggs@redhat.com>
Reviewed-By: Stewart X Addison <sxa@redhat.com>
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
BethGriggs pushed a commit that referenced this issue Jun 1, 2022
It was disconnecting the runners from the CI server. Not worth having a
resource-intensive test for this kind of edge case.

Fixes: #42719
Signed-off-by: Darshan Sen <raisinten@gmail.com>
PR-URL: #42720
Backport-PR-URL: #42967
Fixes: #42719
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Michaël Zasso <targos@protonmail.com>
Reviewed-By: Richard Lau <rlau@redhat.com>
Reviewed-By: Beth Griggs <bgriggs@redhat.com>
Reviewed-By: Stewart X Addison <sxa@redhat.com>
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
4 participants