Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tracking Issue: enable sanitizers on our CI #32380

Open
13 tasks
mmarchini opened this issue Mar 20, 2020 · 13 comments
Open
13 tasks

Tracking Issue: enable sanitizers on our CI #32380

mmarchini opened this issue Mar 20, 2020 · 13 comments
Labels
memory Issues and PRs related to the memory management or memory footprint. meta Issues and PRs related to the general management of the project.

Comments

@mmarchini
Copy link
Contributor

mmarchini commented Mar 20, 2020

What are sanitizers?

Sanitizers are build options which can be used to identify programming errors in C++. They add overhead to the program, which is why it's not recommended to use in releases, but they can be valuable to find memory leaks and other potential crashes on our tests. V8 has ASAN buildbots which run on every CL, and they are working on making their code base UBSan compatible as well.

Cool, what do we need to make it happen?

Well, first thing, asan has to work on Node.js. We have --enable-asan on configure today, and the build will work on most computers (it requires a considerable amount of memory). But most of our tests will fail with it, because our code base is not ASAN clean. We might be also missing some ASAN flags: looking at V8 source code and build dependencies, they do one thing or two different.

There's also the memory issue: as of right now, we need a big machine to build with ASAN. GitHub Actions can't run an ASAN build because it "only" has 7Gb. Hopefully there's something we can tune in the build to use less memory. Otherwise, we'll need a custom machine for it (probably on Jenkins).

As for other sanitizers (like UBSan), upstream projects need to support those first, so we are blocked for now.

To Do List (ASAN)

  • Investigate huge memory consumption during build
  • Move gengjiawen/node-build to nodejs/ org
    • Move repository
    • Move Docker build to Node.js Docker hub
  • Enable build on our CI (no tests)
    • This will ensure we don't break the build moving forward
  • Fix failing tests
    • Once everything is working, enable tests on our CI
  • Fix ASAN flags for OS X
  • Fix ASASN flags for Windows

To Do List (UBSan)

  • Wait for V8 to be UBSan compatible (v8:3770)
    • Enable build on our CI (no tests)
      • This will ensure we don't break the build moving forward
    • Fix failing tests
      • Once everything is working, enable tests on our CI

We can list current failing tests here once we get rid of the most noisy ones.

@targos
Copy link
Member

targos commented Mar 20, 2020

If we have a machine with more memory, we could make it a self-hosted github actions runner

@mmarchini
Copy link
Contributor Author

I thought about it, but I'm worried about maintenance burden (it's already hard to keep Jenkins hosts healthy). It should still be worth investigating (not only for this use case).

@gengjiawen
Copy link
Member

Some todo from my original PR #31902 :

  • move node docker to official repo, need to setup a new repo.
  • skip some failed test due to asan flag
  • should check gcc and clang both, they have difference google/sanitizers/wiki/AddressSanitizerClangVsGCC-(6.0-vs-8.1)

cc @mmarchini

@gengjiawen
Copy link
Member

If we have a machine with more memory, we could make it a self-hosted github actions runner

If we using it, we also should take actions to avoid it being abused.

@mmarchini
Copy link
Contributor Author

Not sure we need the docker image: gcc and clang on the Actions host are fairly recent, and we don't use a docker image in any of the other build Actions. Also, instead of skipping tests, we could disable all ASAN checks and enable them one by one, that way we can fix everything from one check before moving to the other, and we're guarantee the ones we already fixed will be flagged on CI if they regress.

@gengjiawen
Copy link
Member

Docker image will has stable dev env for everyone, same gcc, clang, this will help a lot we reproduce problem. Also with docker we can also have parallel env. Like gcc8 and gcc9 same time (create different image). Also it create usefully env for macOS and windows users. (Currently macOS asan is broken for Node.js)

Also any thought on should check gcc and clang both, they have difference google/sanitizers/wiki/AddressSanitizerClangVsGCC-(6.0-vs-8.1)

@mmarchini
Copy link
Contributor Author

For ASAN, there's probably not much value on using an older compiler version (an older version won't have a check not present on a newer one). I'll read the differences tomorrow. We might have better coverage with both, or maybe just one of them is enough. I think V8 only uses clang.

@mmarchini
Copy link
Contributor Author

Going through the differences in AddressSanitizerClangVsGCC-(6.0-vs-8.1), we should be looking into using clang for ASAN. GCC doesn't provide anything extra that we can use (most things gcc has over clang are only useful for building the Linux kernel with ASAN), and gcc lacks some features.

@gengjiawen
Copy link
Member

Investigate huge memory consumption during build

From: http://llvm.org/docs/CMake.html

LLVM_USE_LINKER:STRING
Add -fuse-ld={name} to the link invocation. The possible value depend on your compiler, for clang the value can be an absolute path to your custom linker, otherwise clang will prefix the name with ld. and apply its usual search. For example to link LLVM with the Gold linker, cmake can be invoked with -DLLVM_USE_LINKER=gold

Can we try gold linker instead of ld ?

@gengjiawen
Copy link
Member

Or pass extra flags --hash-size=31 --reduce-memory-overhead https://stackoverflow.com/a/41980097/1713757.

I can try it, but I don't know where do I add them ? cc @bnoordhuis

@gengjiawen
Copy link
Member

Somehow github action can build asan again:

we have 30 test case failed:

[40:39|% 100|+ 2803|-  30]: Done

Full log: https://github.com/gengjiawen/node-github-workflow/runs/551124593?check_suite_focus=true

@mmarchini
Copy link
Contributor Author

I did some testing with clang instead of gcc and the results were promising, was able to build with ASAN (not in Debug mode) using a max of 2Gb memory. I'll open a PR to re-enable ASAN on Actions soon. there's one pending PR to get all cctests passing on ASAN, after that we can start working on getting JS tests passing as well.

@targos targos added memory Issues and PRs related to the memory management or memory footprint. meta Issues and PRs related to the general management of the project. labels Dec 26, 2020
@bcoe
Copy link
Contributor

bcoe commented Mar 24, 2021

@mmarchini I've hit a bit of a wall trying to backport source-map functionality to the v14.x-staging release line, due to ASAN tests in #37717.

Bisecting, the exceptions being raised appear to be caused by modifications to source_map_cache.js. But, the changes in this file seem fairly innocent to me, and the tests that begin failing on test-asan appear unrelated.

Any thoughts?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
memory Issues and PRs related to the memory management or memory footprint. meta Issues and PRs related to the general management of the project.
Projects
None yet
Development

No branches or pull requests

4 participants