Build failure on CPU without popcnt #48305

Closed
thesamesam opened this issue Jun 2, 2023 · 7 comments

Comments

@thesamesam
Contributor

Version

20.2.0

Platform

5.15.114-gentoo-dist-hardened

Subsystem

simdutf

What steps will reproduce the bug?

Attempt to build nodejs-18.14.0 or nodejs-20.2.0 on a CPU that doesn't have popcnt, like the Intel(R) Core(TM)2 Duo CPU T9900. The issue seems to arise when the CPU has e.g. SSE4.1 but not popcnt.

The build will fail within simdutf.

How often does it reproduce? Is there a required condition?

Reproducible on each attempt with an Intel Core 2 Duo using -march=native but with no popcnt available.

The machine has the following:

arch=native; for t in param target; do cmd="gcc -Q -O2 -march=$arch --help=$t"; diff -U0 <(LANG=C $cmd -march=x86-64) <(LANG=C $cmd -march=$arch); done
--- /dev/fd/63  2023-06-02 18:52:39.582008632 +0200
+++ /dev/fd/62  2023-06-02 18:52:39.582008632 +0200
@@ -91 +91 @@
-  --param=l2-cache-size=               512
+  --param=l2-cache-size=               6144
--- /dev/fd/63  2023-06-02 18:52:39.755341026 +0200
+++ /dev/fd/62  2023-06-02 18:52:39.758674342 +0200
@@ -27 +27 @@
-  -march=                              x86-64
+  -march=                              core2
@@ -66 +66 @@
-  -mcx16                               [disabled]
+  -mcx16                               [enabled]
@@ -122 +122 @@
-  -mmwait                              [disabled]
+  -mmwait                              [enabled]
@@ -130 +130 @@
-  -mno-sse4                            [enabled]
+  -mno-sse4                            [disabled]
@@ -160 +160 @@
-  -msahf                               [disabled]
+  -msahf                               [enabled]
@@ -170 +170 @@
-  -msse3                               [disabled]
+  -msse3                               [enabled]
@@ -172 +172 @@
-  -msse4.1                             [disabled]
+  -msse4.1                             [enabled]
@@ -177 +177 @@
-  -mssse3                              [disabled]
+  -mssse3                              [enabled]
@@ -192 +192 @@
-  -mtune=                              generic
+  -mtune=                              core2
@@ -205 +205 @@
-  -mxsave                              [disabled]
+  -mxsave                              [enabled]

And /proc/cpuinfo:

processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 23
model name      : Intel(R) Core(TM)2 Duo CPU     T9900  @ 3.06GHz
stepping        : 10
microcode       : 0xa0c
cpu MHz         : 798.006
cache size      : 6144 KB
physical id     : 0
siblings        : 2
core id         : 0
cpu cores       : 2
apicid          : 0
initial apicid  : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 13
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good nopl cpuid aperfmperf pni dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm sse4_1 xsave lahf_lm pti tpr_shadow vnmi flexpriority vpid dtherm ida
vmx flags       : vnmi flexpriority tsc_offset vtpr vapic
bugs            : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs itlb_multihit mmio_unknown
bogomips        : 6120.96
clflush size    : 64
cache_alignment : 64
address sizes   : 36 bits physical, 48 bits virtual
power management:

processor       : 1
vendor_id       : GenuineIntel
cpu family      : 6
model           : 23
model name      : Intel(R) Core(TM)2 Duo CPU     T9900  @ 3.06GHz
stepping        : 10
microcode       : 0xa0c
cpu MHz         : 800.000
cache size      : 6144 KB
physical id     : 0
siblings        : 2
core id         : 1
cpu cores       : 2
apicid          : 1
initial apicid  : 1
fpu             : yes
fpu_exception   : yes
cpuid level     : 13
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good nopl cpuid aperfmperf pni dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm sse4_1 xsave lahf_lm pti tpr_shadow vnmi flexpriority vpid dtherm ida
vmx flags       : vnmi flexpriority tsc_offset vtpr vapic
bugs            : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs itlb_multihit mmio_unknown
bogomips        : 6120.96
clflush size    : 64
cache_alignment : 64
address sizes   : 36 bits physical, 48 bits virtual
power management:

What is the expected behavior? Why is that the expected behavior?

A successful build, as was the case before simdutf was imported.

What do you see instead?

Build failure:

FAILED: obj/deps/simdutf/simdutf.simdutf.o
x86_64-pc-linux-gnu-g++ -MMD -MF obj/deps/simdutf/simdutf.simdutf.o.d -DV8_DEPRECATION_WARNINGS -DV8_IMMINENT_DEPRECATION_WARNINGS -D_GLIBCXX_USE_CXX11_ABI=1 -DNODE_OPENSSL_CONF_NAME=nodejs_conf -DNODE_OPENSSL_CERT_STORE -DICU_NO_USER_DATA_OVERRIDE -D__STDC_FORMAT_MACROS -I../../deps/simdutf -pthread -Wall -Wextra -Wno-unused-parameter -m64 -fno-omit-frame-pointer -march=native -O2 -pipe -fno-rtti -fno-exceptions -std=gnu++17  -c ../../deps/simdutf/simdutf.cpp -o obj/deps/simdutf/simdutf.simdutf.o
In file included from /usr/lib/gcc/x86_64-pc-linux-gnu/12/include/x86gprintrin.h:71,
                 from /usr/lib/gcc/x86_64-pc-linux-gnu/12/include/x86intrin.h:27,
                 from ../../deps/simdutf/simdutf.cpp:1180:
/usr/lib/gcc/x86_64-pc-linux-gnu/12/include/popcntintrin.h: In function ‘size_t simdutf::icelake::{anonymous}::utf16_to_utf8_avx512i(const char16_t*, size_t, unsigned char*, size_t*) [with simdutf::endianness big_endian = simdutf::LITTLE]’:
/usr/lib/gcc/x86_64-pc-linux-gnu/12/include/popcntintrin.h:35:1: error: inlining failed in call to ‘always_inline’ ‘int _mm_popcnt_u32(unsigned int)’: target specific option mismatch
   35 | _mm_popcnt_u32 (unsigned int __X)
      | ^~~~~~~~~~~~~~
../../deps/simdutf/simdutf.cpp:18010:36: note: called from here
18010 |       outbuf += 31 + _mm_popcnt_u32(_cvtmask32_u32(is234byte));
      |                      ~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~
/usr/lib/gcc/x86_64-pc-linux-gnu/12/include/popcntintrin.h:35:1: error: inlining failed in call to ‘always_inline’ ‘int _mm_popcnt_u32(unsigned int)’: target specific option mismatch
   35 | _mm_popcnt_u32 (unsigned int __X)
      | ^~~~~~~~~~~~~~
../../deps/simdutf/simdutf.cpp:18010:36: note: called from here
18010 |       outbuf += 31 + _mm_popcnt_u32(_cvtmask32_u32(is234byte));
      |                      ~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~
/usr/lib/gcc/x86_64-pc-linux-gnu/12/include/popcntintrin.h:42:1: error: inlining failed in call to ‘always_inline’ ‘long long int _mm_popcnt_u64(long long unsigned int)’: target specific option mismatch
   42 | _mm_popcnt_u64 (unsigned long long __X)
      | ^~~~~~~~~~~~~~
../../deps/simdutf/simdutf.cpp:18116:36: note: called from here
18116 |     uint64_t advhi = _mm_popcnt_u64(wanthi_uint64);
      |                      ~~~~~~~~~~~~~~^~~~~~~~~~~~~~~
/usr/lib/gcc/x86_64-pc-linux-gnu/12/include/popcntintrin.h:42:1: error: inlining failed in call to ‘always_inline’ ‘long long int _mm_popcnt_u64(long long unsigned int)’: target specific option mismatch
   42 | _mm_popcnt_u64 (unsigned long long __X)
      | ^~~~~~~~~~~~~~
../../deps/simdutf/simdutf.cpp:18115:36: note: called from here
18115 |     uint64_t advlo = _mm_popcnt_u64(wantlo_uint64);
      |                      ~~~~~~~~~~~~~~^~~~~~~~~~~~~~~
/usr/lib/gcc/x86_64-pc-linux-gnu/12/include/popcntintrin.h:42:1: error: inlining failed in call to ‘always_inline’ ‘long long int _mm_popcnt_u64(long long unsigned int)’: target specific option mismatch
   42 | _mm_popcnt_u64 (unsigned long long __X)
      | ^~~~~~~~~~~~~~
../../deps/simdutf/simdutf.cpp:18116:36: note: called from here
18116 |     uint64_t advhi = _mm_popcnt_u64(wanthi_uint64);
      |                      ~~~~~~~~~~~~~~^~~~~~~~~~~~~~~
/usr/lib/gcc/x86_64-pc-linux-gnu/12/include/popcntintrin.h:42:1: error: inlining failed in call to ‘always_inline’ ‘long long int _mm_popcnt_u64(long long unsigned int)’: target specific option mismatch
   42 | _mm_popcnt_u64 (unsigned long long __X)
      | ^~~~~~~~~~~~~~
../../deps/simdutf/simdutf.cpp:18115:36: note: called from here
18115 |     uint64_t advlo = _mm_popcnt_u64(wantlo_uint64);
      |                      ~~~~~~~~~~~~~~^~~~~~~~~~~~~~~
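
For context on the message above: GCC reports "target specific option mismatch" when an always_inline intrinsic such as _mm_popcnt_u32 is called from a function whose enabled target features do not include the one the intrinsic needs. The following is a minimal sketch of one common way to structure such code, not simdutf's actual code or fix. It assumes GCC or Clang on x86, and the names count_bits_hw, count_bits_portable, count_bits and self_check are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

#if defined(__x86_64__) || defined(__i386__)
#include <x86intrin.h>  // same umbrella header that appears in the error output
#define HAVE_X86_POPCNT_PATH 1
#else
#define HAVE_X86_POPCNT_PATH 0
#endif

// Portable fallback: builds for any target and does not rely on the
// POPCNT intrinsic at all.
inline std::uint32_t count_bits_portable(std::uint32_t x) {
    std::uint32_t n = 0;
    while (x != 0) {
        x &= x - 1;  // clear the lowest set bit
        ++n;
    }
    return n;
}

#if HAVE_X86_POPCNT_PATH
// Hypothetical helper: popcnt is enabled for this function only, so the
// always_inline intrinsic can be inlined here even when the baseline
// flags (for example -march=native on a Core 2) do not enable popcnt.
__attribute__((target("popcnt")))
inline std::uint32_t count_bits_hw(std::uint32_t x) {
    return static_cast<std::uint32_t>(_mm_popcnt_u32(x));
}
#endif

// Picks the hardware path only if the running CPU reports popcnt.
inline std::uint32_t count_bits(std::uint32_t x) {
#if HAVE_X86_POPCNT_PATH
    if (__builtin_cpu_supports("popcnt")) {
        return count_bits_hw(x);
    }
#endif
    return count_bits_portable(x);
}

// Small sanity check that is valid on either path.
inline void self_check() {
    assert(count_bits(0u) == 0u);
    assert(count_bits(0xF0F0u) == 8u);
    assert(count_bits(0xFFFFFFFFu) == 32u);
}
```

Removing the target attribute from count_bits_hw and compiling with flags that leave popcnt disabled should produce the same class of error as shown in the log.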

Additional information

Forwarding this on behalf of some downstream Gentoo users.

Reported downstream in Gentoo at https://bugs.gentoo.org/900513.

See also #46789.

@mscdex
Contributor

mscdex commented Jun 2, 2023

As noted in #46789, do you have a new enough build environment (especially binutils >= 2.30)?

@mscdex
Contributor

mscdex commented Jun 2, 2023

/cc @anonrig

@anonrig
Member

anonrig commented Jun 2, 2023

Can you open an issue with simdutf as well?

@thesamesam
Contributor Author

Yeah, this is with GCC 12 (12.2.1 20230428) and Binutils 2.39. I can file it with simdutf, but fwiw I haven't tried to reproduce it outside of nodejs (and I'm forwarding a bug for somebody else, so I haven't reproduced it myself yet).

@thesamesam
Contributor Author

Reported to simdutf at simdutf/simdutf#251.

@lemire
Member

lemire commented Jun 3, 2023

Updating simdutf should fix this issue. However, I cannot verify the fix since I don't have this hardware. It is quite old.

The issue is unexpected.

@LiviaMedeiros
Contributor

Fixed by #48344

Verified the fix on Intel Q8200 (bug is reproducible):

$ grep popcnt /proc/cpuinfo
$ grep ^flags /proc/cpuinfo | head -n1
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good nopl cpuid aperfmperf pni dtes64 monitor ds_cpl est tm2 ssse3 cx16 xtpr pdcm sse4_1 lahf_lm dtherm
$ emerge --info | grep ^CFLAGS
CFLAGS="-O2 -pipe -march=native"
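
The grep output above shows what the CPU reports. A compile-time counterpart is to look at the feature macros the compiler predefines for the active flags. The sketch below assumes GCC or Clang, which define __SSE4_1__ and __POPCNT__ when those features are enabled. The function name build_flag_summary is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Reports which of the two features the compiler has enabled for this
// translation unit, based on its predefined macros.
std::string build_flag_summary() {
#if defined(__SSE4_1__)
    const int sse41 = 1;
#else
    const int sse41 = 0;
#endif
#if defined(__POPCNT__)
    const int popcnt = 1;
#else
    const int popcnt = 0;
#endif
    char buf[64];
    const int n = std::snprintf(buf, sizeof buf, "sse4.1=%d popcnt=%d",
                                sse41, popcnt);
    // The formatted text is short, so it must fit in the buffer.
    assert(n > 0 && static_cast<std::size_t>(n) < sizeof buf);
    (void)n;  // avoid an unused-variable warning when NDEBUG is defined
    return std::string(buf);
}
```

Built with -march=native on a machine like the ones in this thread, the summary would be expected to show sse4.1 enabled and popcnt disabled, though that depends on the compiler and flags in use.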
