Internal compiler error with Cuda/11.4 GCC/10.3.0 #4334
I think that bug just comes from using Chrono itself? |
Simple reproducer: #include <chrono> in a source (.cpp) file, then compile with nvcc -x cu source.cpp |
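A minimal sketch of such a reproducer file is below. The file name repro.cpp and the helper seconds_to_ms are hypothetical and only there to give the translation unit some content; according to the thread, the include alone is enough to trigger the internal compiler error when GCC 10.3 is nvcc's host compiler.

```cpp
// repro.cpp (hypothetical file name, not taken from the thread).
// Including <chrono> is reportedly sufficient to hit the ICE with
// nvcc + GCC 10.3; the function below only makes the file do something
// checkable when built with an unaffected compiler.
#include <chrono>

// Converts whole seconds to milliseconds via std::chrono, which forces a
// duration conversion to be instantiated.
long long seconds_to_ms(long long s) {
  using namespace std::chrono;
  return duration_cast<milliseconds>(seconds(s)).count();
}
```

Since the file has no main, it would be compiled as an object only, for example with nvcc -x cu -c repro.cpp. The crash is reported during compilation, so linking should not matter for reproducing it.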
@ajpowelsnl will contact Max Katz and Matt Stack to tell them about the bug, find out which other gcc / cuda combinations fail in this manner, and file a bug report, if necessary. |
You probably meant to tag @maxpkatz |
This issue has already been reported upstream to gcc (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100102) and is reported there as fixed in 10.4 and 11.0+. But since it's present in gcc 10.3, and it seems to be always exposed when using gcc as nvcc's host compiler, I think that basically means gcc 10.3 is unusable with CUDA for Kokkos (or any other application that uses std::chrono). |
Many thanks for the info, @maxpkatz. It sounds like the "fix" is to not use gcc-10.3. I'll check with @crtrott on this point. |
put it on the agenda for tomorrow. We could in theory avoid using chrono for this combo, and fall back to what we had before we used chrono (i.e. timespec structs). |
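As a rough sketch of what such a fallback could look like (not the actual Kokkos implementation): the class name SketchTimer and the macro TIMER_AVOID_CHRONO are hypothetical, and the version check assumes that GCC's __GNUC__ and __GNUC_MINOR__ macros are visible when GCC is nvcc's host compiler. The important detail is that the affected combination must not include <chrono> at all, so the include itself sits behind the guard.

```cpp
// Hypothetical guard: true only for nvcc with GCC 10.3 as host compiler.
#if defined(__CUDACC__) && defined(__GNUC__) && (__GNUC__ == 10) && \
    (__GNUC_MINOR__ == 3)
#define TIMER_AVOID_CHRONO 1
#endif

#ifdef TIMER_AVOID_CHRONO
#include <time.h>  // POSIX clock_gettime and struct timespec
#else
#include <chrono>
#endif

// Wall-clock timer that reports elapsed seconds since construction or reset.
class SketchTimer {
 public:
  SketchTimer() { reset(); }

#ifdef TIMER_AVOID_CHRONO
  // Fallback path: plain timespec arithmetic, no <chrono> involved.
  void reset() { clock_gettime(CLOCK_MONOTONIC, &start_); }
  double seconds() const {
    timespec now;
    clock_gettime(CLOCK_MONOTONIC, &now);
    return static_cast<double>(now.tv_sec - start_.tv_sec) +
           1.0e-9 * static_cast<double>(now.tv_nsec - start_.tv_nsec);
  }

 private:
  timespec start_;
#else
  // Default path: std::chrono with a monotonic clock.
  void reset() { start_ = std::chrono::steady_clock::now(); }
  double seconds() const {
    std::chrono::duration<double> d =
        std::chrono::steady_clock::now() - start_;
    return d.count();
  }

 private:
  std::chrono::steady_clock::time_point start_;
#endif
};
```

A guard like this only helps if every header in the build that pulls in <chrono> is covered by it; a single unguarded include would still expose the compiler bug.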
Looks like the error is still there:
$ make
[ 3%] Building CXX object core/src/CMakeFiles/kokkoscore.dir/impl/Kokkos_Core.cpp.o
/opt/software/GCCcore/10.3.0/include/c++/10.3.0/chrono: In substitution of ‘template<class _Rep, class _Period> template<class _Period2> using __is_harmonic = std::__bool_constant<(std::ratio<((_Period2::num / std::chrono::duration<_Rep, _Period>::_S_gcd(_Period2::num, _Period::num)) * (_Period::den / std::chrono::duration<_Rep, _Period>::_S_gcd(_Period2::den, _Period::den))), ((_Period2::den / std::chrono::duration<_Rep, _Period>::_S_gcd(_Period2::den, _Period::den)) * (_Period::num / std::chrono::duration<_Rep, _Period>::_S_gcd(_Period2::num, _Period::num)))>::den == 1)> [with _Period2 = _Period2; _Rep = _Rep; _Period = _Period]’:
/opt/software/GCCcore/10.3.0/include/c++/10.3.0/chrono:473:154: required from here
/opt/software/GCCcore/10.3.0/include/c++/10.3.0/chrono:428:27: internal compiler error: Segmentation fault
428 | _S_gcd(intmax_t __m, intmax_t __n) noexcept
| ^~~~~~
0xcbb3ff crash_signal
../../gcc/toplev.c:328
0x7b259d tsubst(tree_node*, tree_node*, int, tree_node*)
../../gcc/cp/pt.c:15310
0x7c5596 tsubst_template_args(tree_node*, tree_node*, int, tree_node*)
../../gcc/cp/pt.c:13225
0x7bdf96 tsubst_aggr_type
../../gcc/cp/pt.c:13428
0x7c827f tsubst_function_decl
../../gcc/cp/pt.c:13816
0x7bec39 tsubst_decl
../../gcc/cp/pt.c:14267
0x7acc21 tsubst_copy
../../gcc/cp/pt.c:16512
0x7b051a tsubst_copy_and_build(tree_node*, tree_node*, int, tree_node*, bool, bool)
../../gcc/cp/pt.c:20707
0x7af076 tsubst_copy_and_build(tree_node*, tree_node*, int, tree_node*, bool, bool)
../../gcc/cp/pt.c:19274
0x7af076 tsubst_copy_and_build(tree_node*, tree_node*, int, tree_node*, bool, bool)
../../gcc/cp/pt.c:19896
0x7ae4ad tsubst_copy_and_build(tree_node*, tree_node*, int, tree_node*, bool, bool)
../../gcc/cp/pt.c:19274
0x7ae4ad tsubst_copy_and_build(tree_node*, tree_node*, int, tree_node*, bool, bool)
../../gcc/cp/pt.c:19588
0x7ae476 tsubst_copy_and_build(tree_node*, tree_node*, int, tree_node*, bool, bool)
../../gcc/cp/pt.c:19274
0x7ae476 tsubst_copy_and_build(tree_node*, tree_node*, int, tree_node*, bool, bool)
../../gcc/cp/pt.c:19587
0x7c0a54 tsubst_copy_and_build(tree_node*, tree_node*, int, tree_node*, bool, bool)
../../gcc/cp/pt.c:19274
0x7c0a54 tsubst_expr(tree_node*, tree_node*, int, tree_node*, bool)
../../gcc/cp/pt.c:18886
0x7c5596 tsubst_template_args(tree_node*, tree_node*, int, tree_node*)
../../gcc/cp/pt.c:13225
0x7bdf96 tsubst_aggr_type
../../gcc/cp/pt.c:13428
0x7ade97 tsubst_qualified_id
../../gcc/cp/pt.c:16215
0x7afbbd tsubst_copy_and_build(tree_node*, tree_node*, int, tree_node*, bool, bool)
../../gcc/cp/pt.c:19625
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.
make[2]: *** [core/src/CMakeFiles/kokkoscore.dir/impl/Kokkos_Core.cpp.o] Error 1
make[1]: *** [core/src/CMakeFiles/kokkoscore.dir/all] Error 2
make: *** [all] Error 2
[05:03:24][gretephi@dev-amd20-v100: ~/src/kokkos/build-fix]
$ git status
# HEAD detached at masterleinad/timer_fallback_gcc1030_nvcc
nothing to commit, working directory clean |
Hmmm... I'm having problems reproducing this. I'm getting
|
I just tried again from scratch and it still fails:
[08:05:16][gretephi@dev-amd20-v100: ~/src]
$ git clone -b timer_fallback_gcc1030_nvcc https://github.com/masterleinad/kokkos.git kokkos-gcc1030
Cloning into 'kokkos-gcc1030'...
remote: Enumerating objects: 77211, done.
remote: Counting objects: 100% (127/127), done.
remote: Compressing objects: 100% (96/96), done.
remote: Total 77211 (delta 77), reused 55 (delta 31), pack-reused 77084
Receiving objects: 100% (77211/77211), 21.32 MiB | 12.93 MiB/s, done.
Resolving deltas: 100% (63317/63317), done.
[08:06:29][gretephi@dev-amd20-v100: ~/src]
$ cd kokkos-gcc1030
[08:06:36][gretephi@dev-amd20-v100: ~/src/kokkos-gcc1030]
$ cmake -Bbuild -DKokkos_ENABLE_CUDA=ON .
-- Setting default Kokkos CXX standard to 14
-- The CXX compiler identification is GNU 10.3.0
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /opt/software/GCCcore/10.3.0/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Setting build type to 'RelWithDebInfo' as none was specified.
-- Setting policy CMP0074 to use <Package>_ROOT variables
-- The project name is: Kokkos
-- Compiler Version: 11.4.48
-- kokkos_launch_compiler (/mnt/home/gretephi/src/kokkos-gcc1030/bin/kokkos_launch_compiler) is enabled...
-- SERIAL backend is being turned on to ensure there is at least one Host space. To change this, you must enable another host execution space and configure with -DKokkos_ENABLE_SERIAL=OFF or change CMakeCache.txt
-- Using -std=c++14 for C++14 standard as feature
-- CUDA auto-detection of architecture failed with /opt/software/GCCcore/10.3.0/bin/c++. Enabling CUDA language ONLY to auto-detect architecture...
-- Looking for a CUDA compiler
-- Looking for a CUDA compiler - /opt/software/CUDAcore/11.4.0/bin/nvcc
-- The CUDA compiler identification is NVIDIA 11.4.48
-- Detecting CUDA compiler ABI info
-- Detecting CUDA compiler ABI info - done
-- Check for working CUDA compiler: /opt/software/CUDAcore/11.4.0/bin/nvcc - skipped
-- Detecting CUDA compile features
-- Detecting CUDA compile features - done
-- Detected CUDA Compute Capability 70
-- Setting Kokkos_ARCH_VOLTA70=ON
-- Built-in Execution Spaces:
-- Device Parallel: Kokkos::Cuda
-- Host Parallel: NoTypeDefined
-- Host Serial: SERIAL
--
-- Architectures:
-- VOLTA70
-- Found CUDAToolkit: /opt/software/CUDAcore/11.4.0/include (found version "11.4.48")
-- Looking for C++ include pthread.h
-- Looking for C++ include pthread.h - found
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Failed
-- Looking for pthread_create in pthreads
-- Looking for pthread_create in pthreads - not found
-- Looking for pthread_create in pthread
-- Looking for pthread_create in pthread - found
-- Found Threads: TRUE
-- Found TPLCUDA: TRUE
-- Found TPLLIBDL: /usr/lib64/libdl.so
-- Kokkos Devices: CUDA;SERIAL, Kokkos Backends: CUDA;SERIAL
-- Configuring done
-- Generating done
-- Build files have been written to: /mnt/home/gretephi/src/kokkos-gcc1030/build
[08:07:40][gretephi@dev-amd20-v100: ~/src/kokkos-gcc1030]
$ cmake --build build
[ 3%] Building CXX object core/src/CMakeFiles/kokkoscore.dir/impl/Kokkos_CPUDiscovery.cpp.o
[ 7%] Building CXX object core/src/CMakeFiles/kokkoscore.dir/impl/Kokkos_Core.cpp.o
/opt/software/GCCcore/10.3.0/include/c++/10.3.0/chrono: In substitution of ‘template<class _Rep, class _Period> template<class _Period2> using __is_harmonic = std::__bool_constant<(std::ratio<((_Period2::num / std::chrono::duration<_Rep, _Period>::_S_gcd(_Period2::num, _Period::num)) * (_Period::den / std::chrono::duration<_Rep, _Period>::_S_gcd(_Period2::den, _Period::den))), ((_Period2::den / std::chrono::duration<_Rep, _Period>::_S_gcd(_Period2::den, _Period::den)) * (_Period::num / std::chrono::duration<_Rep, _Period>::_S_gcd(_Period2::num, _Period::num)))>::den == 1)> [with _Period2 = _Period2; _Rep = _Rep; _Period = _Period]’:
/opt/software/GCCcore/10.3.0/include/c++/10.3.0/chrono:473:154: required from here
/opt/software/GCCcore/10.3.0/include/c++/10.3.0/chrono:428:27: internal compiler error: Segmentation fault
428 | _S_gcd(intmax_t __m, intmax_t __n) noexcept
| ^~~~~~
0xcbb3ff crash_signal
../../gcc/toplev.c:328
0x7b259d tsubst(tree_node*, tree_node*, int, tree_node*)
../../gcc/cp/pt.c:15310
0x7c5596 tsubst_template_args(tree_node*, tree_node*, int, tree_node*)
../../gcc/cp/pt.c:13225
0x7bdf96 tsubst_aggr_type
../../gcc/cp/pt.c:13428
0x7c827f tsubst_function_decl
../../gcc/cp/pt.c:13816
0x7bec39 tsubst_decl
../../gcc/cp/pt.c:14267
0x7acc21 tsubst_copy
../../gcc/cp/pt.c:16512
0x7b051a tsubst_copy_and_build(tree_node*, tree_node*, int, tree_node*, bool, bool)
../../gcc/cp/pt.c:20707
0x7af076 tsubst_copy_and_build(tree_node*, tree_node*, int, tree_node*, bool, bool)
../../gcc/cp/pt.c:19274
0x7af076 tsubst_copy_and_build(tree_node*, tree_node*, int, tree_node*, bool, bool)
../../gcc/cp/pt.c:19896
0x7ae4ad tsubst_copy_and_build(tree_node*, tree_node*, int, tree_node*, bool, bool)
../../gcc/cp/pt.c:19274
0x7ae4ad tsubst_copy_and_build(tree_node*, tree_node*, int, tree_node*, bool, bool)
../../gcc/cp/pt.c:19588
0x7ae476 tsubst_copy_and_build(tree_node*, tree_node*, int, tree_node*, bool, bool)
../../gcc/cp/pt.c:19274
0x7ae476 tsubst_copy_and_build(tree_node*, tree_node*, int, tree_node*, bool, bool)
../../gcc/cp/pt.c:19587
0x7c0a54 tsubst_copy_and_build(tree_node*, tree_node*, int, tree_node*, bool, bool)
../../gcc/cp/pt.c:19274
0x7c0a54 tsubst_expr(tree_node*, tree_node*, int, tree_node*, bool)
../../gcc/cp/pt.c:18886
0x7c5596 tsubst_template_args(tree_node*, tree_node*, int, tree_node*)
../../gcc/cp/pt.c:13225
0x7bdf96 tsubst_aggr_type
../../gcc/cp/pt.c:13428
0x7ade97 tsubst_qualified_id
../../gcc/cp/pt.c:16215
0x7afbbd tsubst_copy_and_build(tree_node*, tree_node*, int, tree_node*, bool, bool)
../../gcc/cp/pt.c:19625
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.
gmake[2]: *** [core/src/CMakeFiles/kokkoscore.dir/impl/Kokkos_Core.cpp.o] Error 1
gmake[1]: *** [core/src/CMakeFiles/kokkoscore.dir/all] Error 2
gmake: *** [all] Error 2
I noticed that the CUDA version on our machine is slightly older. I wonder if that could be related (unfortunately, this is a shared cluster, so there's no easy update path I could follow). |
Can you check if commenting |
Hi @masterleinad -- To reproduce, I just included the header (i.e., #include <chrono>) in a file and compiled with this command: nvcc -x cu <source_file.cpp>. You see the error during compilation. Here is my environment:
[ajpowel@kokkos-dev-2 kokkos]$ module list
|
@ajpowelsnl Since you can reproduce the issue, can you check if commenting |
@pgrete Since you can reproduce the issue, can you check if commenting |
@pgrete and @masterleinad -- commenting out "#include <chrono>" in the file you mention does appear to fix the issue. What would you like me to do next? |
$ git diff
diff --git a/core/src/impl/Kokkos_ClockTic.hpp b/core/src/impl/Kokkos_ClockTic.hpp
index 4e46b8d..9085314 100644
--- a/core/src/impl/Kokkos_ClockTic.hpp
+++ b/core/src/impl/Kokkos_ClockTic.hpp
@@ -47,7 +47,7 @@
#include <Kokkos_Macros.hpp>
#include <stdint.h>
-#include <chrono>
+//#include <chrono>
#ifdef KOKKOS_ENABLE_OPENMPTARGET
#include <omp.h>
#endif
The error still persists (fails with an identical error message), but I'm also not sure where chrono is picked up. |
Same here. Looks like (I ran |
NVIDIA/nccl#494 mentions some workarounds to |
Observed internal compiler error last week on Perlmutter with gcc/10.3.0 and cuda/11.3.0 when trying to compile ArborX. Can't recall the exact message right now. |
CUDA Toolkit 11.4.0 clearly states that gcc-9 is the maximum supported version, while CUDA Toolkit 11.4.1 supports up to gcc-11. That would explain the inability to reproduce on 11.4.1, whereas the original post was on 11.4.0 (which does not support gcc-10 or gcc-11 yet). |
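Because the outcome apparently depends on the exact toolkit release, it can help to record both compiler versions precisely when reporting a failure. The sketch below builds such a report from predefined macros; the function name compiler_report is hypothetical. __GNUC__, __GNUC_MINOR__ and __GNUC_PATCHLEVEL__ are GCC's version macros (other GNU-compatible compilers define them too, with their own values), and the __CUDACC_VER_*__ macros are defined by nvcc.

```cpp
#include <string>

// Returns a one-line description of the host compiler version and, when the
// file is compiled by nvcc, the CUDA compiler version as well.
std::string compiler_report() {
  std::string out;
#if defined(__GNUC__)
  out += "GNU-compatible host " + std::to_string(__GNUC__) + "." +
         std::to_string(__GNUC_MINOR__) + "." +
         std::to_string(__GNUC_PATCHLEVEL__);
#else
  out += "non-GNU host";
#endif
#if defined(__CUDACC__)
  out += ", nvcc " + std::to_string(__CUDACC_VER_MAJOR__) + "." +
         std::to_string(__CUDACC_VER_MINOR__) + "." +
         std::to_string(__CUDACC_VER_BUILD__);
#else
  out += ", not compiled by nvcc";
#endif
  return out;
}
```

Printing the returned string from a small test program built with the same toolchain gives the full three-part versions, comparable to the "GNU 10.3.0" and "11.4.48" identifiers that CMake prints in the configure log above.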
There is no reasonable workaround for us, and this is not officially supported by NVCC anyway. |
@kurtsansom reported this for our downstream code parthenon-hpc-lab/parthenon#585 (comment) and I was able to confirm for a plain Kokkos build using current develop:
Environment:
Using GCC 10.2 works just fine.