Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Signed integer overflow in latest node 18 LTS release #48621

Closed
jkrems opened this issue Jul 1, 2023 · 8 comments
Closed

Signed integer overflow in latest node 18 LTS release #48621

jkrems opened this issue Jul 1, 2023 · 8 comments
Assignees
Labels
confirmed-bug Issues with confirmed bugs. whatwg-url Issues and PRs related to the WHATWG URL implementation.

Comments

@jkrems
Copy link
Contributor

jkrems commented Jul 1, 2023

Version

18.16.1

Platform

No response

Subsystem

deps/ada

What steps will reproduce the bug?

As far as I can tell, any use of the URL constructor which now calls into native code from ada:

new URL('s://g');

This is reported when building with ubsan.

How often does it reproduce? Is there a required condition?

No response

What is the expected behavior? Why is that the expected behavior?

No undefined behavior caused by using the URL constructor.

What do you see instead?

An integer overflow.

Additional information

This was fixed upstream by using uint64 literals instead of signed ints: ada-url/ada@38d6eae.

@jkrems
Copy link
Contributor Author

jkrems commented Jul 1, 2023

The patch is a relatively simple find&replace but I wasn't sure if that's an acceptable patch in deps/ (especially since it's a generated file). The fix only landed in 2.x of ada afaict.

@jkrems
Copy link
Contributor Author

jkrems commented Jul 1, 2023

This will likely be fixed in #48345 but the fact that we ended up with undefined behavior (which yes, didn't appear to break anything, maybe) in an LTS release make me question the recent pace of adding new C++ deps. It's a lot of new, non-trivial, and non-hardened C++ code entering core.

@anonrig
Copy link
Member

anonrig commented Jul 2, 2023

Thanks for the report. I'm one of the authors of Ada. As you said, this is fixed in Ada v2.x, and is pending to be merged into the v18-x staging branch. (cc @nodejs/releasers)

The patch is a relatively simple find&replace but I wasn't sure if that's an acceptable patch in deps/ (especially since it's a generated file). The fix only landed in 2.x of ada afaict.

Any changes to deps are prohibited and it's recommended for contributors to upstream any issues & fixes.

release make me question the recent pace of adding new C++ deps

Unfortunately, Ada was the default URL parser for Node 19 and Node 20, and because most people prefer to stay in LTS, it is impossible to have the same amount of user feedback as a package would receive from a Node.js LTS release. Practically speaking, I believe it's inevitable to catch bugs like these without a certain users trying/using it.

in an LTS release make me question the recent pace of adding new C++ deps

Here's the timeframe:

  • January 29 - Ada v1.0 landed in main branch.
  • April 4 - Ada backport to v18 open.
  • April 12 - v18 first release including Ada parser.

To summarize: it took 74 days to land a new dependency to LTS release.

It's a lot of new, non-trivial, and non-hardened C++ code entering core.

I disagree with your non-trivial classification. URL is one of the most important and used components in the Node.js runtime. If you have any specific recommendations about how to classify Ada as a hardened dependency, please open an issue to Ada's repository, and I'll be happy personally to follow up with them.

Regarding the status of Ada, let me give you some insight:

Ada has more tests than Web platform tests (which influenced URL parsers like curl and boost/url with finding bugs in their implementation, and has a significant test coverage & fuzzing coverage in Google's OSS Fuzz.

But sometimes, it is not sufficient to catch bugs like these due to specific constraints, and despite how much code you write, and how much test you have in your project from time to time some significant bugs like these pass through.

cc @nodejs/url

@anonrig anonrig added confirmed-bug Issues with confirmed bugs. whatwg-url Issues and PRs related to the WHATWG URL implementation. labels Jul 2, 2023
@anonrig anonrig self-assigned this Jul 2, 2023
@RafaelGSS
Copy link
Member

This PR should help too

@lemire
Copy link
Member

lemire commented Jul 2, 2023

I think that @RafaelGSS’s answer is the right one. I expect that the community does not want to discourage fast paced contributions that improve the performance of Node. However, they also want high reliability. The two values are not in opposition: you can get more quality by improving testing as part of CI, improving tooling etc. In turn, this can make you more confident and able to receive contributions.

Regarding the URL parsing…

Since Node.js 18, a new URL parser dependency was added to Node.js — Ada. This addition bumped the Node.js performance when parsing URLs to a new level. Some results could reach up to an improvement of 400%. (State of Node.js Performance 2023)

Generally, my impression is Node should ramp up both the testing and the benchmarking… It seems that’s what Rafael is doing.

@jkrems
Copy link
Contributor Author

jkrems commented Jul 5, 2023

Thanks for the detailed response and for your work making URL parsing faster in node!

Any changes to deps are prohibited and it's recommended for contributors to upstream any issues & fixes.

I'm generally aware of that but I wasn't sure how that policy deals with cases like this where node is using a major version that seemingly is no longer maintained upstream. If this was a more critical issue - would fixing it be blocked on doing a major version upgrade of the dependency? Or would we hope that upstream would create a new release line with a backport of the fix?

I disagree with your non-trivial classification. URL is one of the most important and used components in the Node.js runtime.

By "non-trivial" I meant "not just a few lines of obviously-correct code". I didn't meant to imply anything about how important or heavily used the code is. We can definitely disagree on whether 2.5k LOC is "trivial" or not. I personally couldn't skim it and say "yep, that's all correct" which is roughly the bar I'd set for "trivial code". I'd be surprised if many developers could and that is the important part to me here, not the term "trivial".

Practically speaking, I believe it's inevitable to catch bugs like these without a certain users trying/using it.

I definitely can understand that there's a tricky chicken&egg thing going on here. But in practice, this will likely mean that we're stuck on an older 18.x and can't upgrade to the latest version and the security fixes that brings. Which means that we get neither the faster URL parsing (which would have been a neat but non-critical win) nor all the other fixes (which we'd actually care about).

So ada is getting real world exposure but at the expense of at least one (group of a few thousand) real world users not being able to upgrade within an LTS line. I'll readily admit that our little world is "special" since a typical nodejs setup doesn't require that ~all test suites pass with a bunch of sanitizers enabled. But I also can't tell teams to disable sanitizers on CI for a bit because node is incompatible.

If you have any specific recommendations about how to classify Ada as a hardened dependency, please open an issue to Ada's repository, and I'll be happy personally to follow up with them.

To me, by definition, hardening only happens over time and with real usage. Which is a frustrating answer because it's not "actionable" but that cost of new code is very hard to "solve". New code will always be unstable (see also the very quick major bump).

But I want to be a bit more clear here: When I pointed out the pace of new native deps, I wasn't talking about ada in particular. Of all the new deps, ada is likely the one I have the least concerns about. Within the lifetime of node 18, multiple new native deps were added:

  • First aklomp/base64. Around the time this was added, the author was already talking in issues about making breaking changes to the library. It had (afaik) a single author/maintainer, no other larger production uses, and no sanitizers or comparable checks (which required Fix out-of-bounds access in test aklomp/base64#105 since it had an out-of-bounds read in its tests).
  • Then simdutf, postject, and ada. This time, there were 2 sanitizer violations/failed assertions and they made it into the LTS release itself, not just tests of those libraries. Ada had already made breaking changes, so getting the upstream fixes directly wasn't easy.

The issue in the OP is the one that I could easily identify and already have a workaround for. There's a second one in ada that I still can't easily reproduce that causes a failed assertion in the string_view constructor in one of our tests (length violation). But knowing that the 1.x version has known bugs that are fixed in 2.x makes debugging this further seem like wasted effort. At this point it seems like a better use of time to give up on 18.16.1, wait for the ada 2.x backport to make it into a nodejs 18.x release, and then try again.

Which is totally fine, btw! This isn't really blocking anything and because of the way we're running nodejs, we can safely skip 18.16.1. The only fallout is a couple of days of wasted time on my end which I'll survive. ;)

@lemire
Copy link
Member

lemire commented Jul 5, 2023

@jkrems

You ran Node with ubsan, but ubsan is not used as part of continuous integration tests of Node. In fact, it appears that it is not used because Node is not generally ubsan clean, at least as far as the comments on this issue are concerned: #46297

small-icu isn't ubsan-clean, I'm finding a way to skip it for now. @RafaelGSS

Example of an error that comes up a lot: SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior ../deps/v8/src/api/api-arguments-inl.h:332:3 @targos

And these problems predate the recent changes you allude to.

It is perfectly reasonable to demand the Node be ubsan clean, and I think it is reasonable to expect new dependencies to be ubsan clean... but it seems that it is not a goal that has been reached yet, generally.

@jkrems
Copy link
Contributor Author

jkrems commented Jul 5, 2023

@lemire Our nodejs binaries are special™ in many ways. They link against BoringSSL, have dedicated V8 imports, are built using bazel, etc.. So I don't have any expectations of them being officially supported. The good news is that they've been ubsan clean for many years now, so code within nodejs core itself is usually already clean and I'm always happy to upstream fixes if there's any sanitizer errors.

The reason I opened this issue and made the comment above is that these kinds of failures are actually very rare and hadn't happened in multiple years. So node used to be fairly asan/ubsan clean but has stopped being reliably clean in the more recent past. And the majority (admittedly, small sample size) of issues came from the new deps with only one coming from node core itself (#48566).

@anonrig anonrig closed this as completed Jul 21, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
confirmed-bug Issues with confirmed bugs. whatwg-url Issues and PRs related to the WHATWG URL implementation.
Projects
None yet
Development

No branches or pull requests

4 participants