[Bug]: 3x performance drop since v21.2.0 #11944

feesler · 2024-02-19T06:36:58Z

Minimal, reproducible example

git clone https://github.com/feesler/puppeteer-perf.git
cd puppeteer-perf
npm install
npm test
npm install puppeteer@21.2.0
npm test

Error string

no error

Bug behavior

Flaky
PDF

Background

Similar to #9993 and #8650.

Steps to reproduce the problem:

1. Install the puppeteer v21.1.1 dependency.
2. Run some script and measure performance.
3. Update puppeteer to v21.2.0 and repeat step 2.

Script performing simple page query/evaluate calls.
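
For readers without the repository at hand, the kind of script described here might look roughly like the sketch below. It is not the actual test from the linked repo: the function name `measureQueryEvaluate` and the choice of `textContent` are assumptions for illustration. It shows the pattern of one query followed by one evaluate call per returned element, timed with `performance.now()`.

```javascript
// Hypothetical benchmark shape; the real script lives in the linked repository.
// `page` is expected to be a Puppeteer Page (or any object with a compatible $$ method).
async function measureQueryEvaluate(page, selector) {
  const start = performance.now();

  // One query that returns a handle per matching element...
  const handles = await page.$$(selector);

  // ...followed by one evaluate round trip per handle.
  const texts = [];
  for (const handle of handles) {
    texts.push(await handle.evaluate((el) => el.textContent));
  }

  const duration = performance.now() - start;
  return { count: handles.length, texts, duration };
}
```

With a real browser, `page` would come from `puppeteer.launch()` followed by `browser.newPage()`; the same function can then be run under each Puppeteer version to compare durations.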

Expectation

Expected same performance of query/evaluate calls as version 21.1.1 and earlier.

Reality

Starting from version 21.2.0, a script with query/evaluate calls runs about 3 times slower, and it stays that slow in later releases (up to 22.1.0).
No difference between ESM and CJS.

Log output:
Node: v21.6.2
Puppeteer: 21.1.1
Browser: Chrome/116.0.5845.96
Duration: 218.91580000000022

Node: v21.6.2
Puppeteer: 21.2.0
Browser: Chrome/116.0.5845.96
Duration: 783.2916

Node: v21.6.2
Puppeteer: 22.1.0
Browser: Chrome/121.0.6167.85
Duration: 775.9939000000004
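
To put a number on the slowdown, the ratios can be computed directly from the three durations logged above. This is only arithmetic on the reported figures; the variable names are illustrative, and the log does not state the unit of `Duration`.

```javascript
// Durations copied from the log output above (unit not stated in the log).
const durations = {
  '21.1.1': 218.91580000000022,
  '21.2.0': 783.2916,
  '22.1.0': 775.9939000000004,
};

// Express every run relative to the 21.1.1 baseline.
const baseline = durations['21.1.1'];
const slowdown = Object.fromEntries(
  Object.entries(durations).map(([version, d]) => [version, d / baseline])
);

for (const [version, factor] of Object.entries(slowdown)) {
  console.log(`${version}: ${factor.toFixed(2)}x`);
}
```

On these figures, 21.2.0 and 22.1.0 both come out at roughly 3.5 to 3.6 times the 21.1.1 duration.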

Puppeteer configuration file (if used)

No response

Puppeteer version

21.2.0 - 22.1.0

Node version

21.6.2

Package manager

npm

Package manager version

9.8.1

Operating system

Windows

OrKoN · 2024-02-19T07:58:29Z

@jrandolf PTAL

OrKoN · 2024-02-19T08:29:28Z

Bisected to https://github.com/puppeteer/puppeteer/pull/10810/files

jrandolf · 2024-02-19T12:28:09Z

After investigation, we found that the performance drop is due to sandboxed queries in Puppeteer. Older versions of Puppeteer did not sandbox queries, so there was no performance loss from transferring sandboxed objects into the main execution context (the one Puppeteer library users use).

Puppeteer doesn't have an internal method of migrating arbitrary objects (in particular, arrays of nodes) out of a sandbox. The CDP method DOM.describeNode only allows transfers of individual Nodes, so to enable adoption, we adopt each node individually, which costs a significant amount of time for queries that return a lot of nodes.

Perhaps the solution would be to allow users to disable sandboxing. Deferring to @OrKoN .

feesler · 2024-02-21T02:10:10Z

Well, it will be sad if there is no good solution to this! I just wonder how to work around this from a user's perspective.

The sample script is, of course, intentionally not optimized.
Moving the calculations for all nodes inside a single page.evaluate call saves the situation, but only once.
Over a long test run it still adds up to an impressive amount of time.
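
A minimal sketch of the workaround mentioned above, assuming the per-node work is just reading `textContent`: the query and the calculation both run inside one `page.evaluate` call, so only a plain array of strings is returned to Node and no element handles need to be transferred. The helper name `collectTexts` is hypothetical.

```javascript
// Sketch: do the query and the per-node work in a single page.evaluate call.
// `page` is assumed to be a Puppeteer Page; `collectTexts` is a hypothetical helper.
async function collectTexts(page, selector) {
  return page.evaluate((sel) => {
    // This callback runs in the page, where `document` is available.
    return Array.from(document.querySelectorAll(sel), (el) => el.textContent);
  }, selector);
}
```

This only helps where the result is serializable; code that needs element handles back in Node (for clicking, for example) still pays the per-node cost described earlier in the thread.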

NC-piercej · 2024-02-27T23:02:06Z

I can confirm that headless: true on v22 is significantly slower than headless: 'shell' for our PDF generation workloads.

Interestingly, if we are just generating a single PDF in a single tab, the performance is similar. It's only when multiple tabs are open, running heavy CPU-bound JavaScript, and generating PDFs at once that performance degrades significantly.

If I didn't know better, I'd say it's almost as if script execution in one tab (dependent on requestAnimationFrame) is blocked until another tab completes. It's weird and hard to pin down.

jrandolf · 2024-02-28T05:16:54Z

> Well, it will be sad if there is no good solution to this! I just wonder how to work around this from a user's perspective.
>
> The sample script is, of course, intentionally not optimized. Moving the calculations for all nodes inside a single page.evaluate call saves the situation, but only once. Over a long test run it still adds up to an impressive amount of time.

I think if we can get more examples of cases where the script cannot be optimized and the test runs show a high delta in time, this would strengthen the use case. In our experience, it's never querying that increases the testing time. For example, the test may be testing too many cases at once, or it may be using some flaky, time-consuming heuristic.

OrKoN · 2024-02-28T07:03:06Z

@NC-piercej this is an unrelated issue. The new headless mode might be slower than the old headless mode, but it is the same browser implementation as the regular headful Chrome. See https://developer.chrome.com/docs/chromium/new-headless

OrKoN · 2024-06-06T08:17:56Z

So one way to fix this regression would be to introduce a new method that does not isolate the query code. This way we could still keep the isolation by default, as in other methods, and allow use cases where many elements need to be returned. See #12539. Note that there is still a 30% increase in querying time remaining after turning off the isolation, so there is probably another regression.

feesler added the bug label Feb 19, 2024

github-actions bot added needs-feedback not-reproducible labels Feb 19, 2024

OrKoN self-assigned this Feb 19, 2024

OrKoN added confirmed and removed needs-feedback labels Feb 19, 2024

OrKoN removed the not-reproducible label Feb 19, 2024

OrKoN assigned jrandolf and unassigned OrKoN Feb 19, 2024

OrKoN added the P1 label Feb 19, 2024

jrandolf assigned OrKoN and unassigned jrandolf Feb 19, 2024

OrKoN added P2 and removed P1 labels Feb 19, 2024

OrKoN closed this as not planned Feb 28, 2024

OrKoN reopened this Feb 28, 2024

mostafa-hisham mentioned this issue Jun 3, 2024

[Bug]: x1.5 CPU usage increase using Puppeteer between 20.8.3 to 20.9.0+ #12524

Open

2 tasks

OrKoN mentioned this issue Jun 5, 2024

fix: re-introduce a method for querying multiple elements without isolation #12539

Draft

OrKoN mentioned this issue Jun 10, 2024

fix(performance): use Runtime.getProperties for improved performance #12561

Merged
