Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update compiler on Linux s390x #3630

Open
richardlau opened this issue Feb 13, 2024 · 10 comments
Open

Update compiler on Linux s390x #3630

richardlau opened this issue Feb 13, 2024 · 10 comments

Comments

@richardlau
Copy link
Member

@miladfarca has found that there is a nasty gcc bug on s390x with std::optional has_value() (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106355) during review of https://chromium-review.googlesource.com/c/v8/v8/+/5246687 (where V8 is proposing to drop their custom v8::base::Optional for std::optional).

While the upstream gcc bug has supposedly been fixed in gcc 10.5.0, that isn't available via dnf on RHEL 8 as gcc-toolset-10 was "retired" in November 2022 (FWIW gcc-toolset-11 is also considered "retired"), which probably means it is unlikely to get updates (we'll check): https://access.redhat.com/support/policy/updates/rhel-app-streams-life-cycle

What we know so far:

Tested the bug (using the example in bugzilla) on ubi8:

  • gcc-toolset-10 with cc 10.3.1: broken
  • gcc-toolset-11 with cc 11.2.1: broken
  • gcc-toolset-12 with cc 12.2.1: ok

A quick scan of the Node.js codebase shows we are already using std::optional outside of V8, although I don't know if we're hitting the bug with those usages.

So we're probably going to have to move to at least gcc-toolset-12 for Linux on s390x, and the only question is when and whether we make the switch for existing release lines to be on the safe side. Switching to gcc-toolset-12 should be safe from a runtime compatibility perspective as the code will still be linked against the glibc and libstdc++ versions in base RHEL 8.

@targos
Copy link
Member

targos commented Feb 13, 2024

FWIW I'm always up for upgrading compilers anywhere 😄

@mhdawson
Copy link
Member

@richardlau is the V8 code with the bug already in existing release lines or if not, likely to be backported ?

@richardlau
Copy link
Member Author

@richardlau is the V8 code with the bug already in existing release lines or if not, likely to be backported ?

Current V8 in Node.js uses V8's custom optional. I don't know if the change will end up in the version of V8 we'll end up using in Node.js 22.

@mhdawson
Copy link
Member

Ok so the question is if we need to have a new gcc for Node.js 22 or Node.js 23, right?

@mhdawson
Copy link
Member

Thinking about it, we do often pull in a version of v8 after 22 is cut, so it may not be easy to know that until close to the LTS promotion.

It may be best to assume it will be needed for 22 so that we don't need to update compilers after 22 is cut.

@richardlau
Copy link
Member Author

Another s390x gcc bug has been identified, this time in gcc 12 and 13 when c++20 is enabled:

  • Bug 113960 - [11/12 Regression] std::map with std::vector as input overwrites itself with c++20, on s390x platform

This one currently doesn't affect Node.js because we haven't landed nodejs/node#45427. We are working with the RH team that looks after the gcc-toolsets to get the fix into gcc-toolsets (preferably 12) once upstream gcc have backported the fixes.

I'm about to open a PR to Ansible installation of gcc-toolsets 12 and 13 onto our RHEL 8 machines. This will make those compilers available on the machine so we can run test builds, etc. We can then make any required select-compiler.sh changes in a follow up PR.

@richardlau
Copy link
Member Author

richardlau commented Mar 21, 2024

Another s390x gcc bug has been identified, this time in gcc 12 and 13 when c++20 is enabled:

  • Bug 113960 - [11/12 Regression] std::map with std::vector as input overwrites itself with c++20, on s390x platform

This one currently doesn't affect Node.js because we haven't landed nodejs/node#45427. We are working with the RH team that looks after the gcc-toolsets to get the fix into gcc-toolsets (preferably 12) once upstream gcc have backported the fixes.

For tracking Bug 113960 being backported to gcc-toolset-12: https://issues.redhat.com/browse/RHEL-29952

richardlau added a commit that referenced this issue Mar 21, 2024
To avoid a compiler bug in gcc, use gcc-toolset-12 on rhel8-s390x.

Refs: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106355
Refs: #3630
richardlau added a commit that referenced this issue Mar 21, 2024
To avoid a compiler bug in gcc, use gcc-toolset-12 on rhel8-s390x.

Refs: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106355
Refs: #3630
richardlau added a commit that referenced this issue Mar 21, 2024
To avoid a compiler bug in gcc, use gcc-toolset-12 on rhel8-s390x.

Refs: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106355
Refs: #3630
@richardlau
Copy link
Member Author

richardlau commented Mar 21, 2024

I just tried running our V8 CI on rhel8-s390x for main with gcc-toolset-12 and it failed 😞.

select-compiler.sh changes in https://github.com/nodejs/build/tree/gts
https://ci.nodejs.org/job/richardlau-node-test-commit-v8-linux/544/nodes=rhel8-s390x,v8test=v8test/consoleFull

17:37:32 /opt/rh/gcc-toolset-12/root/usr/include/c++/12/bits/char_traits.h:431:56: error: ‘void* __builtin_memcpy(void*, const void*, long unsigned int)’ accessing 9223372036854775810 or more bytes at offsets [2, 9223372036854775807] and 1 may overlap up to 9223372036854775813 bytes at offset -3 [-Werror=restrict]
17:37:32   431 |         return static_cast<char_type*>(__builtin_memcpy(__s1, __s2, __n));
17:37:32       |                                        ~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~

From @miladfarca:

yes the fix is is only available in main and beta. You will need to use this v8 patch: https://chromium-review.googlesource.com/c/v8/v8/+/5331756 and this gn patch: https://gn-review.googlesource.com/c/gn/+/16820

For the gn patch, we can possibly roll forwards to main (or at least to a later commit). We pinned before for compatibility for Node.js 14 -- #3335 is open to unpin or update since Node.js 14 is no longer supported.

richardlau added a commit to richardlau/build that referenced this issue Mar 21, 2024
To avoid a compiler bug in gcc, use gcc-toolset-12 on rhel8-s390x.

Refs: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106355
Refs: nodejs#3630
@richardlau
Copy link
Member Author

Opened #3659 to roll forward gn and use gcc-toolset-12 on rhel8-s390x.
Also opened nodejs/node#52183 to backport https://chromium-review.googlesource.com/c/v8/v8/+/5331756.

richardlau added a commit that referenced this issue Mar 21, 2024
To avoid a compiler bug in gcc, use gcc-toolset-12 on rhel8-s390x.

Refs: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106355
Refs: #3630
richardlau added a commit to richardlau/build that referenced this issue Apr 16, 2024
To avoid a compiler bug in gcc, use gcc-toolset-12 on rhel8-s390x.

Refs: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106355
Refs: nodejs#3630
richardlau added a commit that referenced this issue Apr 17, 2024
To avoid a compiler bug in gcc, use gcc-toolset-12 on rhel8-s390x.

Refs: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106355
Refs: #3630
@richardlau
Copy link
Member Author

I've merged #3659. It contains a temporary patch to one of the libstdc++ header files for gcc-toolset-12 while we wait for the RPM to be updated (current estimate we were given is "around May"). I'll keep this issue open to remind us to revert 9c149d0 when the fix for https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113960 makes it into gcc-toolset-12.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants