Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug]: crash of a window, renderer process cause of memory cage and asar #36999

Closed
3 tasks done
julianhille opened this issue Jan 23, 2023 · 21 comments · Fixed by stream-labs/desktop#4576
Closed
3 tasks done

Comments

@julianhille
Copy link

julianhille commented Jan 23, 2023

Preflight Checklist

Electron Version

22.0.x

What operating system are you using?

Windows

Operating System Version

Windows 7, 10, 11

What arch are you using?

x64

Last Known Working Electron version

20.x

Expected Behavior

no crash

Actual Behavior

It actually crashes in:

cpp-marking-state-inl.h

void CppMarkingState::MarkAndPush(const EmbedderDataSlot type_slot,
                                  const EmbedderDataSlot instance_slot) {
  LocalEmbedderHeapTracer::WrapperInfo info;
>>>>  if (LocalEmbedderHeapTracer::ExtractWrappableInfo(
          isolate_, wrapper_descriptor_, type_slot, instance_slot, &info)) {
    marking_state_.MarkAndPush(
        cppgc::internal::HeapObjectHeader::FromObject(info.second));```

with exception code: 0xc0000005

At first glace this seems to be coming from a native module which does not work with the new caging system, but i removed all native modules and it still crashed.
I figured that it does not happen when running locally through npm/yarn starts but after building to a package it does.

After looking a bit more careful at the stack trace i found that the calls originate from

::Archive::New(const v8::FunctionCallbackInfo<v8::Value> & args) Line 63
C:\projects\src\electron\shell\common\api\electron_api_asar.cc (63)
    auto* archive_wrap = new Archive(std::move(archive));
    archive_wrap->Wrap(args.This());

If the origin is asar this would explain that it only crashes after build and not
on locally runs started with npm.

I debugged with 22.0.3 (but happens with any 22.0.x) which happens to be chrome: 108.0.5359.179 which has v8 version 10.8.168.25

Testcase Gist URL

No response

Additional Information

This crashes also on linux (kernel 6.1, x64 arch)

@julianhille julianhille changed the title [Bug]: [Bug]: crash of a window, renderer process cause of memory cage and asar Jan 23, 2023
@codebytere codebytere added the blocked/need-repro Needs a test case to reproduce the bug label Jan 23, 2023
@github-actions
Copy link
Contributor

Hello @julianhille. Thanks for reporting this and helping to make Electron better!

Would it be possible for you to make a standalone testcase with only the code necessary to reproduce the issue? For example, Electron Fiddle is a great tool for making small test cases and makes it easy to publish your test case to a gist that Electron maintainers can use.

Stand-alone test cases make fixing issues go more smoothly: it ensure everyone's looking at the same issue, it removes all unnecessary variables from the equation, and it can also provide the basis for automated regression tests.

Now adding the blocked/need-repro label for this reason. After you make a test case, please link to it in a followup comment. This issue will be closed in 10 days if the above is not addressed.

@Abhijrathod

This comment was marked as abuse.

@julianhille
Copy link
Author

julianhille commented Jan 24, 2023

yes it seems asar related.
no it is not resolved by a newer version (still have the same issue with electron 23 beta 5)

@Abhijrathod what do you suggest using instead of asar?

| You can also try to create a test case and file an issue on the electron repository so that the developers can investigate this issue.

im actually trying to create a minimal version to trigger this, but im confused as i thought this is the electron repository?

Edit:
The above answer looks like an autogenerated ChatGPT answer. (no offense if not)

@github-actions github-actions bot removed the blocked/need-repro Needs a test case to reproduce the bug label Jan 25, 2023
@julianhille
Copy link
Author

julianhille commented Jan 25, 2023

I can't seem to find the root cause.
Neither on my builds nor on a brand new minimal version.
On Linux the outcome is diffent then on windows it it just not opened and not a crash.

We're reusing the window, which crashes, and on Linux closing and reopening it, instead of reusing, fixes the hiding of the window, but windows still crashes the renderer process for that particular window.

@julianhille
Copy link
Author

i was now able to fix it on windows and here it is the same, closing and reopening instead of reusing the window fixes the problem.

i removed ALL calls to any native modules so this cant be any issue.
It feels like a race condition somewhere.

What i basically do is:

  • create window
  • hide window
  • size and position window
  • load content into window
  • show window
  • user clicks in that window
  • size and position window
  • load content into window
  • show window

sometimes this crashes with a minidump and sometimes it does not. (im unable to supply the minidump)

@VerteDinde
Copy link
Member

Hey @julianhille, sorry you're experiencing this! I know you mentioned that you were unable to supply the minidump, but if you're able, seeing the full minidump would really help us figure out what's going on and why you might be hitting crashes here. If you are able to share one, you can collect one easily by adding the following snippet to your main process code, before app.whenReady:

const { app, crashReporter } = require('electron')

console.log(app.getPath('crashDumps'))
crashReporter.start({ submitURL: '', uploadToServer: false })

Then reproduce the crash, zip up the crash dumps directory and attach it here.

@julianhille
Copy link
Author

Here is a browser view crash fix: #37267

I'll try that and report back.

@julianhille
Copy link
Author

i tried 22.3.0 but that did not help. the window is still crashing

@markh-discord
Copy link

We're also seeing this, although only in CI machines. This seems to occur on Intel and M1 Macs as well as on Windows 64-bit, cannot reproduce with Windows 32-bit.

Started seeing this in 22.2.0, the last known working build in our case was 17.x.

This seems to occur early in the process currently, but could be simply based on when the asar file is accessed.

I happened to find a dump that gives a decent callstack from Windows x64 (unfortunately I can't share, but other thread stacks are nowhere near v8 at this point):

[Inline Frame] Electron.exe!std::Cr::__cxx_atomic_load(const volatile std::Cr::__cxx_atomic_base_impl<long long> * __a, std::Cr::memory_order __order) Line 948	C++
[Inline Frame] Electron.exe!std::Cr::__atomic_base<long long,0>::load(std::Cr::memory_order __m) Line 1537	C++
[Inline Frame] Electron.exe!std::Cr::atomic_load_explicit(const volatile std::Cr::atomic<long long> * __o, std::Cr::memory_order __m) Line 1916	C++
[Inline Frame] Electron.exe!v8::base::Relaxed_Load(const volatile __int64 * ptr) Line 331	C++
[Inline Frame] Electron.exe!v8::internal::ExternalPointerTable::RelaxedLoad(unsigned int index) Line 528	C++
[Inline Frame] Electron.exe!v8::internal::ExternalPointerTable::Get(unsigned int handle, v8::internal::ExternalPointerTag tag) Line 22	C++
[Inline Frame] Electron.exe!v8::internal::ReadExternalPointerField(unsigned __int64 field_address, const v8::internal::Isolate * isolate) Line 63	C++
[Inline Frame] Electron.exe!v8::internal::EmbedderDataSlot::ToAlignedPointer(v8::internal::Isolate * isolate, void * * out_pointer) Line 93	C++
[Inline Frame] Electron.exe!v8::internal::LocalEmbedderHeapTracer::ExtractWrappableInfo(v8::internal::Isolate * isolate, const v8::WrapperDescriptor & wrapper_descriptor, const v8::internal::EmbedderDataSlot & type_slot, const v8::internal::EmbedderDataSlot & instance_slot, std::Cr::pair<void *,void *> * info) Line 33	C++
Electron.exe!v8::internal::CppMarkingState::MarkAndPush(const v8::internal::EmbedderDataSlot type_slot, const v8::internal::EmbedderDataSlot instance_slot) Line 37	C++
[Inline Frame] Electron.exe!v8::internal::WriteBarrier::MarkingFromInternalFields(v8::internal::JSObject host) Line 364	C++
Electron.exe!v8::Object::SetAlignedPointerInInternalField(int index, void * value) Line 6144	C++
[Inline Frame] Electron.exe!node::ObjectWrap::Wrap(v8::Local<v8::Object> handle) Line 78	C++
Electron.exe!`anonymous namespace'::Archive::New(const v8::FunctionCallbackInfo<v8::Value> & args) Line 63	C++
Electron.exe!v8::internal::FunctionCallbackArguments::Call(v8::internal::CallHandlerInfo handler) Line 147	C++
~snip~

@markh-discord
Copy link

markh-discord commented Feb 25, 2023

With a debug Electron build a DCHECK provides a bit more info:

# Fatal error in ..\..\v8\src\objects\js-objects-inl.h, line 329
# Debug check failed: static_cast<unsigned>(index) < static_cast<unsigned>(GetEmbedderFieldCount()) (1 vs. 1).
Electron.exe!v8::base::OS::Abort()
Electron.exe!V8_Fatal(const char *)
Electron.exe!v8::base::`anonymous namespace'::DefaultDcheckHandler(const char *)
Electron.exe!v8::internal::JSObject::GetEmbedderFieldOffset(int)
[Inline Frame] Electron.exe!v8::internal::EmbedderDataSlot::EmbedderDataSlot(v8::internal::JSObject)
Electron.exe!v8::internal::LocalEmbedderHeapTracer::EmbedderWriteBarrier(v8::internal::Heap *)
Electron.exe!v8::Object::SetAlignedPointerInInternalField(int)
[Inline Frame] Electron.exe!node::ObjectWrap::Wrap(v8::Local<v8::Object>)
Electron.exe!`anonymous namespace'::Archive::New(const v8::FunctionCallbackInfo<v8::Value> &)
Electron.exe!v8::internal::FunctionCallbackArguments::Call(v8::internal::CallHandlerInfo)

@markh-discord
Copy link

markh-discord commented Feb 26, 2023

I believe this is a bug in v8, exposed recently in v22 since the v8 sandbox is enabled by default for x64/arm64 (https://chromium.googlesource.com/v8/v8.git/+/a8c27fcc9f9f15a0110a409190a2b514ec86e37f).

This seems to be caused by an out-of-bounds embedder slot index, as pinpointed by the DCHECK failure above:

# Fatal error in ..\..\v8\src\objects\js-objects-inl.h, line 329
# Debug check failed: static_cast<unsigned>(index) < static_cast<unsigned>(GetEmbedderFieldCount()) (1 vs. 1).

There's several frames here inlined, seemingly even in the debug build, so the full call stack is really:

v8::internal::JSObject::GetEmbedderFieldOffset(int)
v8::internal::EmbedderDataSlot::EmbedderDataSlot(v8::internal::JSObject)
v8::internal::LocalEmbedderHeapTracer::EmbedderWriteBarrier(v8::internal::Heap *)
WriteBarrier::MarkingSlowFromInternalFields
internal::WriteBarrier::MarkingFromInternalFields
v8::Object::SetAlignedPointerInInternalField(int)
node::ObjectWrap::Wrap(v8::Local<v8::Object>)

Rough reproduce steps:

  1. Internal application and build of Electron, so unfortunately can't provide project, utilizes asar file
  2. In our case this is exhibited after several attempts (1-30x) when reloading (ctrl-r)
  3. Possibly exhibited with --log-level=0 and --trace-warnings flags
  4. 64-bit build that uses the v8 sandbox

Analysis:

  1. DCHECK asserts from a lack of a field at index 1, where it's expecting an instance field. The embedder field count in this case is 1, and it needs to be at least 2.

  2. There is an existing check on the embedder field count in src/heap/embedder-tracing-inl.h in another overload to LocalEmbedderHeapTracer::ExtractWrappableInfo:
    https://chromium.googlesource.com/v8/v8.git/+/refs/heads/10.8.169/src/heap/embedder-tracing-inl.h#19

  3. In the original change that check also protected the EmbedderDataSlot ctor that looked up the indexes for wrapper_descriptor.wrappable_type_index (0) and wrapper_descriptor.wrappable_instance_index (1):
    https://chromium.googlesource.com/v8/v8.git/+/d10f61e1/src/heap/embedder-tracing-inl.h#19

  4. However a later refactoring moved the EmbedderDataSlot out into a separate overloaded function (https://chromium-review.googlesource.com/c/v8/v8/+/3386600):
    https://chromium.googlesource.com/v8/v8.git/+/804aaa5c6%5E!/src/heap/embedder-tracing-inl.h

  5. In the same change the LocalEmbedderHeapTracer::EmbedderWriteBarrier was updated to lookup the type_slot and instance_slot indices:
    https://chromium-review.googlesource.com/c/v8/v8/+/3386600/14/src/heap/embedder-tracing.cc
    From the call stack above though this path isn't protected by the original count check.

  6. I believe this may not be observed in non-sandbox builds because of the point where it later attempts to actually convert the embedded data slot as a pointer in EmbedderDataSlot::ToAlignedPointer:
    https://chromium.googlesource.com/v8/v8.git/+/refs/heads/10.8.169/src/objects/embedder-data-slot-inl.h#91
    This is called from LocalEmbedderHeapTracer::ExtractWrappableInfo, in release builds this path is reached: https://chromium.googlesource.com/v8/v8.git/+/refs/heads/10.8.169/src/heap/embedder-tracing-inl.h#33

  7. The fix here seems to be to add a similar count check in LocalEmbedderHeapTracer::EmbedderWriteBarrier prior to the EmbedderDataSlot construction src/heap/embedder-tracing.cc:

void LocalEmbedderHeapTracer::EmbedderWriteBarrier(Heap* heap,
						JSObject js_object) {
	DCHECK(InUse());
	DCHECK(js_object.MayHaveEmbedderFields());
	if (cpp_heap_) {
		DCHECK_NOT_NULL(heap->mark_compact_collector());
		if (js_object.GetEmbedderFieldCount() < 2) return;

		const EmbedderDataSlot type_slot(js_object,
				  wrapper_descriptor_.wrappable_type_index);
		const EmbedderDataSlot instance_slot(
				  js_object, wrapper_descriptor_.wrappable_instance_index);
...

Note 1: The root cause is unclear to me how the embedder field count can be become different from the assumed fields for the type and instance. It's also unclear why the Electron asar path exhibits this issue, I'm not familiar enough with the v8 project to really dig in here. However, since they already have a field count check it seems like there were some expected cases where it might have mismatched.

Note 2: As far as future v8 releases, they recently removed the local embedder code in favor of cppgc: https://chromium-review.googlesource.com/c/v8/v8/+/4174081
Unsure if the current cppgc logic is affected: https://chromium.googlesource.com/v8/v8.git/+/fd6efc347d78ff/src/heap/cppgc-js/cpp-heap.cc#785

I've opened a v8 issue here to get their input: https://bugs.chromium.org/p/v8/issues/detail?id=13777

@julianhille
Copy link
Author

Last known good is 20.x for me. 22.1.0 is broken for me together with any 21.x version

@bukharin
Copy link

Same problem. Is there an workaround?

@julianhille
Copy link
Author

Haven't found one besides not upgrading.

@bukharin
Copy link

I see that chromium bug is fixed. Is there any information about when this patch is delivered for Electron ?

@julianhille
Copy link
Author

As I understood the bug was fixed in a file and then got moved and maybe reintroduced in cppgc

@bukharin
Copy link

bukharin commented Apr 17, 2023

It seems that crash reproduces only in packaged app with nodeIntegration: true. Disabling nodeIntegration may be a workaround

@julianhille
Copy link
Author

any news on this one? had anyone a chance to test this on a 23.x or newer?

@13031060269
Copy link

Has the issue been fixed?

@julianhille
Copy link
Author

Not as far as I know

@codebytere
Copy link
Member

Closed in https://bugs.chromium.org/p/v8/issues/detail?id=13777 - available in v25.0.0.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants