Hi, I'm assuming headless mode skips rendering entirely, so all it does is execute the page's JavaScript and update the HTML without displaying anything. I'm scraping websites, and the problem appears when I run everything in headless mode. If I open 30 pages (tabs) in the same browser context with headless = false, everything runs smoothly with reasonable CPU usage. If I do the same in headless mode, it uses a lot of CPU, reaching 100% depending on the website being scraped. The rate at which JavaScript runs and the HTML updates in each tab seems to be the same in both modes, so why does headless consume so much more? Is there an option I don't know about to reduce the usage of each tab? Is there a way to "idle" the headless browser (make it run less often) the same way the non-headless one does? Thanks.
Bug behavior
Flaky
PDF
Minimal, reproducible example
    const puppeteer = require("puppeteer");

    async function main() {
      const _browser = await puppeteer.launch({
        defaultViewport: null,
        headless: false, // change to true to verify it consumes more
        ignoreHTTPSErrors: true,
        ignoreDefaultArgs: ["--enable-automation"],
        args: [
          '--kiosks',
          '--disable-accelerated-2d-canvas',
          '--disable-backgrounding-occluded-windows',
          '--disable-renderer-backgrounding',
          '--disable-canvas-aa',
          '--disable-2d-canvas-clip-aa',
          '--disable-gl-drawing-for-tests',
          '--disable-dev-shm-usage',
          '--disable-gpu',
          '--no-zygote',
          '--use-gl=desktop',
          '--hide-scrollbars',
          '--mute-audio',
          '--no-first-run',
          '--disable-infobars',
          '--disable-breakpad',
          '--no-sandbox',
          '--disable-setuid-sandbox',
        ],
      });

      let pages = [];
      for (let i = 0; i < 30; ++i) {
        let page = await _browser.newPage();
        await page.goto('https://store.steampowered.com/charts/mostplayed'); // use any website as an example
        pages.push(page);
      }

      while (true) {
        let promises = [];
        for (const page of pages) {
          async function parsePage() {
            await page.waitForNetworkIdle(); // this is optional
            let content = await page.content();
            console.log(content.length);
          };
          promises.push(parsePage()); // do stuff with page
        }
        await Promise.all(promises);
      }
    }

    main();
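On the "make it run less often" part of the question: the `while (true)` loop in the repro starts a new round of `page.content()` calls as soon as the previous round finishes, and if the optional `waitForNetworkIdle()` is removed nothing paces it at all. Whether that contributes to the headless CPU difference is not established here. The sketch below only shows one way to cap how often the pages are read. `pollPages`, `intervalMs`, `rounds`, and `onContent` are hypothetical names chosen for illustration, not Puppeteer APIs; the only thing assumed of each page is a `content()` method, which Puppeteer's `Page` provides.

```javascript
// Sketch only: pollPages and its options are hypothetical, not part of Puppeteer.
// Each `page` only needs an async content() method, as Puppeteer's Page has.

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function pollPages(
  pages,
  { intervalMs = 1000, rounds = Infinity, onContent = () => {} } = {}
) {
  for (let round = 0; round < rounds; ++round) {
    const started = Date.now();

    // Read all pages concurrently, as the loop in the repro does.
    await Promise.all(
      pages.map(async (page, index) => {
        const content = await page.content();
        onContent(index, content);
      })
    );

    // Sleep for whatever is left of the interval, so rounds start at most
    // once every intervalMs instead of back to back.
    const remaining = intervalMs - (Date.now() - started);
    if (remaining > 0 && round + 1 < rounds) {
      await sleep(remaining);
    }
  }
}
```

In the repro, the `while (true)` block could be swapped for something like `await pollPages(pages, { intervalMs: 2000, onContent: (i, html) => console.log(i, html.length) });`, with the interval tuned to how fresh the scraped data needs to be.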
You can try the new headless mode (https://developer.chrome.com/articles/new-headless/); you should get the same behaviour as in headful mode. As for the CPU consumption, it might have something to do with the fact that you disable the GPU (that flag might be ignored in headful mode). In any case, if you have a bug report about headless performance, please open an issue at crbug.com, as Puppeteer is only a client to the browser binary.
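To compare the suggestions above side by side, a small helper can build the launch options for each mode, with `--disable-gpu` as a toggle. This is a minimal sketch, not a recommended configuration: `buildLaunchOptions`, `openPages`, and the mode names are hypothetical, and the argument list is trimmed for brevity. It assumes a Puppeteer version around the v19.8.0 reported in this issue, where `headless: 'new'` is the opt-in value described in the linked Chrome article; later releases changed the accepted values, so check the documentation for your version.

```javascript
// Sketch only: buildLaunchOptions and openPages are hypothetical helpers,
// not part of Puppeteer.

function buildLaunchOptions(mode, { disableGpu = false } = {}) {
  const headlessByMode = {
    headful: false,
    "old-headless": true,
    "new-headless": "new", // opt-in value from the linked Chrome article
  };
  if (!Object.prototype.hasOwnProperty.call(headlessByMode, mode)) {
    throw new Error(`Unknown mode: ${mode}`);
  }

  // Trimmed argument list; --disable-gpu is only added on request so its
  // effect on CPU usage can be checked in isolation.
  const args = ["--mute-audio", "--no-first-run"];
  if (disableGpu) {
    args.push("--disable-gpu");
  }

  return { headless: headlessByMode[mode], defaultViewport: null, args };
}

// Needs the puppeteer package; it is loaded lazily so the helper above
// has no dependencies of its own.
async function openPages(mode, url, count, options) {
  const puppeteer = require("puppeteer");
  const browser = await puppeteer.launch(buildLaunchOptions(mode, options));
  const pages = [];
  for (let i = 0; i < count; ++i) {
    const page = await browser.newPage();
    await page.goto(url);
    pages.push(page);
  }
  return { browser, pages };
}
```

For example, `await openPages("new-headless", "https://store.steampowered.com/charts/mostplayed", 30, { disableGpu: true })` opens the same 30 tabs as the repro. Running it once per mode, with and without `disableGpu`, while watching CPU usage would show which setting makes the difference on a given machine.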
Error string
no error
Puppeteer configuration
Puppeteer version
v19.8.0
Node version
v18.15.0
Package manager
npm
Package manager version
9.5.0
Operating system
Windows