Kernel panic caused by a bug in the “val_to_ring” function, crashing the host machine #1359

Open
Spartan-65 opened this issue Sep 20, 2023 · 5 comments
@Spartan-65

Describe the bug

[1047486.856617] falco: deallocating consumer ffff9aba94a2a0e0
[1047486.938918] BUG: unable to handle kernel paging request at ffffac4fe972383e
[1047486.943701] falco: no more consumers, stopping capture
[1047486.943583] IP: [<ffffffffc0b6dd70>] val_to_ring+0x80/0x460 [falco]
[1047486.950758] PGD 179982067 PUD 179983067 PMD 4f0fee067 PTE 0
[1047486.955568] Oops: 0002 [#1] SMP
[1047486.973052] Modules linked in: udp_diag binfmt_misc falco(OE) veth ipt_rpfilter vxlan ip6_udp_tunnel udp_tunnel xt_set xt_multiport ip_set_hash_ip ip_set_hash_net ip_set ipip tunnel4 ip_tunnel ip6t_MASQUERADE nf_nat_masquerade_ipv6 xt_statistic xt_nat ipt_REJECT nf_reject_ipv4 ip6table_filter ip6table_mangle ip_vs_sh ip_vs_wrr ip_vs_rr ip_vs nf_tables iptable_raw xt_CT dummy rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache ip6table_nat ip6_tables iptable_mangle xt_comment xt_mark tcp_diag inet_diag xt_conntrack ipt_MASQUERADE nf_nat_masquerade_ipv4 nf_conntrack_netlink nfnetlink xt_addrtype iptable_filter iptable_nat br_netfilter bridge stp llc overlay(T) openvswitch nf_conntrack_ipv6 nf_nat_ipv6 nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_defrag_ipv6 nf_nat nf_conntrack
[1047487.021023]  ppdev cirrus ttm iosf_mbi drm_kms_helper crc32_pclmul syscopyarea sysfillrect sysimgblt ghash_clmulni_intel fb_sys_fops drm aesni_intel joydev lrw gf128mul drm_panel_orientation_quirks glue_helper ablk_helper virtio_balloon i2c_piix4 parport_pc cryptd parport pcspkr drbd_transport_tcp(OE) drbd(OE) ip_tables xfs libcrc32c ata_generic pata_acpi virtio_net virtio_blk scsi_transport_iscsi ata_piix libata crct10dif_pclmul crct10dif_common virtio_pci virtio_ring crc32c_intel serio_raw virtio floppy sunrpc dm_mirror dm_region_hash dm_log dm_mod
[1047487.052673] CPU: 4 PID: 55365 Comm: find Kdump: loaded Tainted: G           OE  ------------ T 3.10.0-1062.9.1.el7.x86_64 #1
[1047487.073362] Hardware name: RDO OpenStack Compute, BIOS 1.11.0-2.el7 04/01/2014
[1047487.077771] task: ffff9ab58f82d230 ti: ffff9ab949200000 task.ti: ffff9ab949200000
[1047487.082308] RIP: 0010:[<ffffffffc0b6dd70>]  [<ffffffffc0b6dd70>] val_to_ring+0x80/0x460 [falco]
[1047487.087522] RSP: 0018:ffff9ab949203b80  EFLAGS: 00010287
[1047487.091793] RAX: 000000000000001e RBX: ffff9ab949203d98 RCX: 0000000000000000
[1047487.096184] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffffac4fe9723818
[1047487.100793] RBP: ffff9ab949203bc0 R08: 0000000000000000 R09: 0000000000000098
[1047487.105080] R10: 0000000000000001 R11: 0000000000000246 R12: 000000000000fde8
[1047487.109184] R13: 0000000000000000 R14: 0000000000000001 R15: ffffac4fe972383e
[1047487.113464] FS:  0000000000000000(0000) GS:ffff9abb3fd00000(0000) knlGS:0000000000000000
[1047487.118294] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[1047487.122034] CR2: ffffac4fe972383e CR3: 000000054ee06000 CR4: 00000000001606e0
[1047487.127158] Call Trace:
[1047487.129721]  [<ffffffffa2104262>] ? security_inode_permission+0x22/0x30
[1047487.134438]  [<ffffffffa20565b2>] ? __inode_permission+0x52/0xd0
[1047487.138980]  [<ffffffffc0b7058d>] f_proc_startupdate+0x77d/0x1250 [falco]
[1047487.144099]  [<ffffffffa2588a26>] ? trace_do_page_fault+0x56/0x150
[1047487.149122]  [<ffffffffc0b6b576>] record_event_consumer+0x4b6/0xdf0 [falco]
[1047487.154468]  [<ffffffffa204737a>] ? __check_object_size+0x1ca/0x250
[1047487.173196]  [<ffffffffa25789b1>] ? create_elf_tables+0x542/0x56d
[1047487.177178]  [<ffffffffc0b6bf24>] record_event_all_consumers+0x74/0xb0 [falco]
[1047487.181531]  [<ffffffffc0b6c27d>] syscall_exit_probe+0xed/0x120 [falco]
[1047487.185909]  [<ffffffffa1e3c22d>] syscall_trace_leave+0xfd/0x110
[1047487.189593]  [<ffffffffa258e220>] int_check_syscall_exit_work+0x13/0x1c
[1047487.193456] Code: 46 e2 8b 53 34 48 c1 e0 06 4c 29 c8 49 89 f6 48 69 d2 30 07 00 00 48 8d 94 10 40 1a b8 c0 8b 42 50 83 f8 1b 74 25 31 d2 83 f8 2e <66> 41 89 17 0f 87 8c 02 00 00 48 8b 04 c5 10 16 b8 c0 e9 89 4d
[1047487.207611] RIP  [<ffffffffc0b6dd70>] val_to_ring+0x80/0x460 [falco]
[1047487.213099]  RSP <ffff9ab949203b80>
[1047487.216526] CR2: ffffac4fe972383e
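For anyone triaging the trace above: the value after `Oops:` is the x86 page-fault error code, and `CR2` holds the faulting address (here equal to `R15`). The sketch below decodes the low bits of that code. The bit meanings are architectural; the helper name `decode_pf_error` is made up for illustration and is not part of Falco or the kernel.

```python
# Low bits of the x86 page-fault error code, as printed after "Oops:".
# Each entry: (mask, meaning if set, meaning if clear or None).
PF_BITS = [
    (0x01, "protection violation", "page not present"),
    (0x02, "write access", "read access"),
    (0x04, "user mode", "kernel mode"),
    (0x08, "reserved bit set in page table", None),
    (0x10, "instruction fetch", None),
]


def decode_pf_error(code):
    """Return a list of human-readable descriptions for a page-fault error code."""
    parts = []
    for mask, if_set, if_clear in PF_BITS:
        if code & mask:
            parts.append(if_set)
        elif if_clear is not None:
            parts.append(if_clear)
    return parts


# "Oops: 0002" from the trace above (the value is printed in hex).
print(decode_pf_error(0x0002))
```

Read this way, `Oops: 0002` describes a kernel-mode write to a page that is not mapped, which agrees with the `PTE 0` entry in the page-table dump line.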

How to reproduce it
Repeatedly reload the Falco process (by sending it a SIGHUP signal).

There is no reliable way to reproduce this, but according to the dmesg output, the anomaly occurred right as Falco attempted to restart the capture for the second time.

[744044.448687] falco: initializing ring buffer for CPU 0
[744044.650350] falco: CPU buffer initialized, size=134217728
[744044.664054] falco: initializing ring buffer for CPU 1
[744045.008622] falco: CPU buffer initialized, size=134217728
[744045.021499] falco: initializing ring buffer for CPU 2
[744045.208599] falco: CPU buffer initialized, size=134217728
[744045.225987] falco: initializing ring buffer for CPU 3
[744045.408549] falco: CPU buffer initialized, size=134217728
[744045.421837] falco: initializing ring buffer for CPU 4
[744045.646908] falco: CPU buffer initialized, size=134217728
[744045.659648] falco: initializing ring buffer for CPU 5
[744045.798287] falco: CPU buffer initialized, size=134217728
[744045.810377] falco: initializing ring buffer for CPU 6
[744046.151417] falco: CPU buffer initialized, size=134217728
[744046.162903] falco: initializing ring buffer for CPU 7
[744046.304424] falco: CPU buffer initialized, size=134217728
[744046.316198] falco: starting capture
[744398.712038] falco: deallocating consumer ffff9ab9fe618000
[744398.788622] falco: no more consumers, stopping capture
[744399.940525] falco: adding new consumer ffff9ab9fe618000
[744399.999193] falco: initializing ring buffer for CPU 0
[744400.199128] falco: CPU buffer initialized, size=134217728
[744400.211459] falco: initializing ring buffer for CPU 1
[744400.599133] falco: CPU buffer initialized, size=134217728
[744400.619842] falco: initializing ring buffer for CPU 2
[744400.899144] falco: CPU buffer initialized, size=134217728
[744400.913185] falco: initializing ring buffer for CPU 3
[744401.299113] falco: CPU buffer initialized, size=134217728
[744401.315185] falco: initializing ring buffer for CPU 4
[744401.599137] falco: CPU buffer initialized, size=134217728
[744401.611852] falco: initializing ring buffer for CPU 5
[744401.899065] falco: CPU buffer initialized, size=134217728
[744401.912112] falco: initializing ring buffer for CPU 6
[744402.159074] falco: CPU buffer initialized, size=134217728
[744402.176069] falco: initializing ring buffer for CPU 7
[744402.599085] falco: CPU buffer initialized, size=134217728
[744402.614843] falco: starting capture
[744606.011475] falco: deallocating consumer ffff9ab9fe618000
[744606.128370] falco: no more consumers, stopping capture
[744607.334996] falco: adding new consumer ffff9ab9fe618000
[744607.393689] falco: initializing ring buffer for CPU 0
[744607.593637] falco: CPU buffer initialized, size=134217728
[744607.613109] falco: initializing ring buffer for CPU 1
[744607.893716] falco: CPU buffer initialized, size=134217728
[744607.907875] falco: initializing ring buffer for CPU 2
[744608.293588] falco: CPU buffer initialized, size=134217728
[744608.307129] falco: initializing ring buffer for CPU 3
[744608.593622] falco: CPU buffer initialized, size=134217728
[744608.606678] falco: initializing ring buffer for CPU 4
[744608.793564] falco: CPU buffer initialized, size=134217728
[744608.810289] falco: initializing ring buffer for CPU 5
[744609.093557] falco: CPU buffer initialized, size=134217728
[744609.112077] falco: initializing ring buffer for CPU 6
[744609.397101] falco: CPU buffer initialized, size=134217728
[744609.414630] falco: initializing ring buffer for CPU 7
[744610.093546] falco: CPU buffer initialized, size=134217728
[744610.105775] falco: starting capture
[744613.235171] falco[977062]: segfault at 14 ip 0000000000dec8b0 sp 00007ffe8256f8d8 error 4 in falco[400000+f7e000]
.
.
.
[748207.518699] falco: initializing ring buffer for CPU 7
[748207.797835] falco: CPU buffer initialized, size=134217728
[748207.813321] falco: starting capture
[748211.141318] falco[1520458]: segfault at 14 ip 0000000000dec8b0 sp 00007ffcef650028 error 4 in falco[400000+f7e000]
.
.
.
[755863.333935] falco: CPU buffer initialized, size=134217728
[755863.348515] falco: initializing ring buffer for CPU 7
[755863.490289] falco: CPU buffer initialized, size=134217728
[755863.504481] falco: starting capture
[755866.915734] falco[1838351]: segfault at 14 ip 0000000000dec8b0 sp 00007ffcfddaff18 error 4 in falco[400000+f7e000]


[803688.923205] falco: CPU buffer initialized, size=134217728
[803688.926634] falco: initializing ring buffer for CPU 7
[803689.123176] falco: CPU buffer initialized, size=134217728
[803689.139546] falco: starting capture
[803692.449034] traps: falco[3458365] general protection ip:dec8b0 sp:7ffe6d4c4958 error:0 in falco[400000+f7e000]
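The user-space `segfault` lines above can be compared mechanically. The following is a minimal sketch that pulls the fields out of one such line; the regular expression and the `parse_segfault` helper are assumptions written for this log format, not an existing tool.

```python
import re

# Matches lines such as:
#   falco[977062]: segfault at 14 ip 0000000000dec8b0 sp ... error 4 in falco[400000+f7e000]
SEGFAULT_RE = re.compile(
    r"(?P<comm>[^\s\[]+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+) "
    r"in (?P<obj>[^\s\[]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def parse_segfault(line):
    """Parse a kernel 'segfault at ...' log line into a dict, or return None."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    ip = int(m.group("ip"), 16)
    base = int(m.group("base"), 16)
    return {
        "comm": m.group("comm"),
        "pid": int(m.group("pid")),
        "addr": int(m.group("addr"), 16),   # faulting address
        "ip": ip,                           # instruction pointer at the fault
        "error": int(m.group("err"), 16),   # page-fault error code
        "object": m.group("obj"),
        "offset": ip - base,                # ip relative to the mapping base
    }


line = ("[744613.235171] falco[977062]: segfault at 14 ip 0000000000dec8b0 "
        "sp 00007ffe8256f8d8 error 4 in falco[400000+f7e000]")
info = parse_segfault(line)
print(hex(info["addr"]), hex(info["ip"]), hex(info["offset"]))
```

All three `segfault` lines in the log report the same faulting address (`14`) and the same `ip` (`dec8b0`), and the final general protection fault reports that `ip` as well, so the user-space crashes appear to come from a single code location.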

Expected behaviour

Screenshots

Environment

Kernel module
Linux ecs-sit-0002 3.10.0-1062.9.1.el7.x86_64 #1 SMP Fri Dec 6 15:49:49 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
  • Falco version:
falco version=0.33.0-22+c353bb4, driver version=3.0.1+driver, arch=x86_64, kernel release=3.10.0-1062.9.1.el7.x86_64, kernel version=1
  • System info:
root@ecs-sit-0002:/# /usr/bin/falco --support
Wed Sep 20 04:04:38 2023: Falco version 0.33.0-22+c353bb4 (x86_64)
Wed Sep 20 04:04:38 2023: Falco initialized with configuration file /etc/holmes/holmes.yaml
Segmentation fault (core dumped)
  • Cloud provider or hardware configuration:
  • OS:
[root@ecs-sit-0002 ~]# cat /etc/os-release 
NAME="CentOS Linux"
VERSION="7 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="7"
PRETTY_NAME="CentOS Linux 7 (Core)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:7"
HOME_URL="https://www.centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"

CENTOS_MANTISBT_PROJECT="CentOS-7"
CENTOS_MANTISBT_PROJECT_VERSION="7"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="7"
  • Kernel:
Linux ecs-sit-0002 3.10.0-1062.9.1.el7.x86_64 #1 SMP Fri Dec 6 15:49:49 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
  • Installation method:
Kubernetes

Additional context

@Spartan-65 Spartan-65 added the kind/bug Something isn't working label Sep 20, 2023
@Andreagit97
Member

Hi @Spartan-65, I'm sorry about that! Would you mind testing the latest Falco version (https://github.com/falcosecurity/falco/releases/tag/0.35.1), just to see whether the issue is still present?

@Andreagit97 Andreagit97 self-assigned this Sep 20, 2023
@Andreagit97 Andreagit97 added this to the 0.14.0 milestone Sep 20, 2023
@Spartan-65
Author

> Hi @Spartan-65, I'm sorry about that! Would you mind testing the latest Falco version (https://github.com/falcosecurity/falco/releases/tag/0.35.1), just to see whether the issue is still present?

Sorry, operations engineers are not allowed to redeploy Falco to this environment until we identify the root cause of the issue.

@Andreagit97
Member

ok makes sense, don't worry!

> Repeatedly reload the Falco process (by sending it a SIGHUP signal).
> There is no reliable way to reproduce this, but according to the dmesg output, the anomaly occurred right as Falco attempted to restart the capture for the second time.

We will try to reproduce the issue using the repro you suggested
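A minimal sketch of what such a reload loop could look like, assuming the Falco PID is already known. The function name `reload_repeatedly`, the count, and the interval are all hypothetical choices, not values taken from the report.

```python
import os
import signal
import time


def reload_repeatedly(pid, count, interval_s):
    """Send SIGHUP to `pid` up to `count` times, sleeping between signals.

    Returns the number of signals actually sent; stops early if the
    process no longer exists (for example, because it crashed).
    """
    sent = 0
    for _ in range(count):
        try:
            os.kill(pid, signal.SIGHUP)
        except ProcessLookupError:
            break  # target process is gone
        sent += 1
        time.sleep(interval_s)
    return sent


# Example (hypothetical PID and timing):
# reload_repeatedly(pid=12345, count=100, interval_s=10.0)
```

In the dmesg excerpt, roughly three seconds pass between `adding new consumer` and `starting capture`, and the user-space crashes follow about three seconds after that, so an interval of several seconds is probably needed for each reload to finish before the next one starts. That timing is an inference from the logs, not a confirmed trigger.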

@FedeDP
Contributor

FedeDP commented Apr 17, 2024

We weren't able to repro this :/ moving to next milestone. Hopefully we will be able to tackle this one.
/milestone 0.17.0

@poiana poiana modified the milestones: 0.16.0, 0.17.0 Apr 17, 2024
@FedeDP
Contributor

FedeDP commented May 21, 2024

/milestone 0.18.0
We still had no luck in reproducing this.

@poiana poiana modified the milestones: 0.17.0, 0.18.0 May 21, 2024