Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Electron: crash reporter slows down startup #120902

Closed
bpasero opened this issue Apr 9, 2021 · 21 comments
Closed

Electron: crash reporter slows down startup #120902

bpasero opened this issue Apr 9, 2021 · 21 comments
Assignees
Labels
perf perf-startup workbench-diagnostics General VS Code built-in diagnostic issues

Comments

@bpasero
Copy link
Member

bpasero commented Apr 9, 2021

There is a new perf mark to measure the impact of the crash reporter on the startup which is a sync blocking call.

Even on my fast macOS ARM machine, this seems to consume a considerable amount of time:

image

Given the crash reporter does not provide any async API, I see no good way of fixing this unless we can start the crash reporter after the window has opened, which I think is not an option?

//cc @jrieken

@bpasero bpasero added perf-startup workbench-diagnostics General VS Code built-in diagnostic issues labels Apr 9, 2021
@bpasero bpasero added the perf label Apr 9, 2021
@bpasero
Copy link
Member Author

bpasero commented Apr 9, 2021

Hm, maybe this is a red herring: we do start the crash reporter before the ready event is fired. That makes it somewhat hard to know:

  • is the crash reporter basically blocking the ready event from firing
  • or is the ready event not impacted by the crash reporter and we are basically having some free time to do stuff without impacting startup perf?

@jrieken
Copy link
Member

jrieken commented Jun 30, 2021

I believe this is much worst than we initially though, esp. on mac. For that, even the 50th percentile is close to half a second. p90 is bad on windows and really bad on mac, only linux seems to be fine.

Screen Shot 2021-06-30 at 14 26 14

I would suggest to understand why it is so slow and/or disable the crash reporter by default. Today we are paying a very high price for it being always on

@bpasero
Copy link
Member Author

bpasero commented Jun 30, 2021

I added it as topic to discuss for our next Electron sync 👍

@deepak1556
Copy link
Contributor

deepak1556 commented Jun 30, 2021

I looked into this a while back, this is purely cost from the crashpad native component initialization https://github.com/chromium/crashpad/blob/master/doc/overview_design.md#the-crashpad-client, there is no additional cost from vscode or electron. crashReporter.start call in our main.js is a sync call to the crashpad core that spawns a crashpad handler process which is like a server process that monitors all the processes in the application client. This step is same for both windows and mac, the difference is the communication mechanism the crashpad handler process creates during this stage to monitor the client processes. On windows it creates named pipe and passes it down to the client whereas on mac it will designate a mach port as exception port (pretty sure the difference in cost comes from here). Linux uses a legacy crash client called breakpad which will be moved eventually over to crashpad.

I will get the numbers again from a minimal electron app to confirm the above details.

@bpasero
Copy link
Member Author

bpasero commented Jun 30, 2021

I think I had asked this before: is it a design limitation of the crash handlers that we cannot start the crash reporter after the window has opened? It would be great if the crash reporter could observe existing processes after they have been created instead of requiring it to be started first.

@deepak1556
Copy link
Contributor

It is a design limitation from crashpad. For ex: the mach port created on macOS is only done once and registered on the main process, this is because the OS will take care of inheriting this port to all child processes spawned from it, thereby making it easier for the crashpad handler to monitor all processes without additional effort. Starting the crash reporter in a later stage would not work with this modal to monitor existing processes.

@jrieken
Copy link
Member

jrieken commented Jun 30, 2021

Given it has such negative impact on users we consider to

  • have it off by default
  • only enable it for X% of our users (flip a coin on startup)
  • allow user to enable/disable it

@bpasero
Copy link
Member Author

bpasero commented Jun 30, 2021

Maybe one idea is to have it only enabled for our insider users? That is still an OK number of users (>30k?) and would still give us a chance to see a trend of crashes over time.

@bpasero
Copy link
Member Author

bpasero commented Jul 22, 2022

I wonder if electron/electron#34609 has addressed this.

@deepak1556
Copy link
Contributor

deepak1556 commented Jul 22, 2022

Yes it will shave off some considerable milliseconds which are responsible for spawning the crashpad process, so we will see some improvements with that PR.

@bpasero
Copy link
Member Author

bpasero commented Jul 22, 2022

The author states that there is even more room for improvement but the second PR is pending: https://chromium-review.googlesource.com/c/crashpad/crashpad/+/3711503

@bpasero
Copy link
Member Author

bpasero commented Aug 20, 2022

Comparing stable to insiders on my machine (macOS):

Insiders
insiders

Stable
stable

@bpasero
Copy link
Member Author

bpasero commented Aug 26, 2022

Anyway, this is still somewhat slow on Windows, and 50ms probably translates to more than 100ms on our perf bot, so worthwhile investigating further here.

@jrieken
Copy link
Member

jrieken commented Aug 26, 2022

Comparing stable to insiders on my machine:

Is that windows or mac?

@bpasero
Copy link
Member Author

bpasero commented Aug 26, 2022

Yeah I updated the comment: it was macOS and the perf gains are macOS only unfortunately.

@bpasero
Copy link
Member Author

bpasero commented Aug 27, 2022

Disabling crash reporter on Windows seems to bring a 30-50ms gain on perf bot (tried with a special exploration build with disabled crash reporter).

@jrieken
Copy link
Member

jrieken commented Aug 29, 2022

@bpasero Can you check locally? E.g does that build run faster on your machine too?

@bpasero
Copy link
Member Author

bpasero commented Aug 29, 2022

@jrieken it is actually more significant on my machine, 80-100ms. It is easy to compare when you pass in --disable-crash-reporter which will disable crash reporter from starting up.

Btw we have the duration as a telemetry mark and also show it in the startup editor:

image

We can probably measure in Kusto what users on Windows see on average for crash reporter startup. The marks are code/willStartCrashReporter and code/didStartCrashReporter.

@jrieken
Copy link
Member

jrieken commented Aug 29, 2022

Yeah, perf-tics for the perf-machine show that it is at ~30ms. Quite stable and the exploration runs where it was disabled also show nicely

@bpasero
Copy link
Member Author

bpasero commented Aug 29, 2022

👍 , one thing to note is that crash reporter starts before app.on("ready") and thus races with that event, so it is possible that the main process is actually somewhat idle only busy with the crash reporter while other processes in Chromium get ready. That could explain why disabling the crash reporter is not giving us back more time than those 30ms.

@deepak1556
Copy link
Contributor

There have been improvements with crashpad launching times in upstream electron/electron#34609, closing as we don't plan to disable crash reporter and investigating other areas in startup for perf gains.

@github-actions github-actions bot locked and limited conversation to collaborators Jan 28, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
perf perf-startup workbench-diagnostics General VS Code built-in diagnostic issues
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants
@bpasero @deepak1556 @jrieken and others