Puppeteer bot detection with out-of-process iframes by user-agent and screen size #196

Closed
ukrexpo opened this issue May 10, 2020 · 11 comments
Labels: good first issue (Good for newcomers), plugin: stealth ㊙️ (Detection evasion related), workaround-available

ukrexpo commented May 10, 2020

The real User-Agent is detectable through an out-of-process <iframe> when using "puppeteer-extra-plugin-stealth". To prevent this, pass the '--disable-features=site-per-process' argument to puppeteer.launch(). See the explanation in puppeteer/puppeteer#2548.

The real screen size is also detectable through an iframe. To prevent that, I used a preload() script.

OS: Windows 10
"puppeteer": 3.0.4
"puppeteer-extra": 3.1.9
"puppeteer-extra-plugin-stealth": 2.4.9

Code to reproduce:

const puppeteer = require('puppeteer');
const puppeteerExtraPluginStealth = require('puppeteer-extra-plugin-stealth');
const puppeteerExtraPluginUserAgentOverride = require('puppeteer-extra-plugin-stealth/evasions/user-agent-override');
const {PuppeteerExtra} = require('puppeteer-extra');

function preload(device) {
  Object.defineProperty(navigator, 'platform', {
    value: device.platform,
    writable: true,
  });
  Object.defineProperty(navigator, 'userAgent', {
    value: device.userAgent,
    writable: true,
  });
  Object.defineProperty(screen, 'height', {
    value: device.viewport.height,
    writable: true,
  });
  Object.defineProperty(screen, 'width', {
    value: device.viewport.width,
    writable: true,
  });
  Object.defineProperty(window, 'devicePixelRatio', {
    value: device.viewport.deviceScaleFactor,
    writable: true,
  });
}

const device = {
  userAgent: 'Mozilla/5.0 (Macintosh)',
  viewport: {
    width: 1200,
    height: 800,
    deviceScaleFactor: 1,
    isMobile: false,
    hasTouch: false,
    isLandscape: true,
  },
  locale: 'en-US,en;q=0.9',
  platform: 'Macintosh',
};

(async () => {
  try {
    const pptr = new PuppeteerExtra(puppeteer);
    const pluginStealth = puppeteerExtraPluginStealth();
    pluginStealth.enabledEvasions.delete('user-agent-override'); // Remove this specific stealth plugin from the default set
    pptr.use(pluginStealth);

    const pluginUserAgentOverride = puppeteerExtraPluginUserAgentOverride({
      userAgent: device.userAgent,
      locale: device.locale,
      platform: device.platform,
    });
    pptr.use(pluginUserAgentOverride);

    const browser = await pptr.launch({
      args: [
        '--disable-features=site-per-process',
        `--window-size=${device.viewport.width},${device.viewport.height}`,
      ],
      headless: false,
      defaultViewport: device.viewport,
    });
    const page = await browser.newPage();
    await page.evaluateOnNewDocument(preload, device);
    await page.goto('https://codepen.io/ukrexpo/pen/JjYverG');
  } catch (err) {
    console.error(err);
  }
})();
ukrexpo changed the title from "Detection puppeteer with out-of-process iframes by user-agent and screen size" to "Puppeteer bot detection with out-of-process iframes by user-agent and screen size" on May 10, 2020
@ruimarinho

@ukrexpo shouldn't platform be MacIntel instead?

brdeav39 commented May 14, 2020

Awesome! This just fixed a problem I was encountering on a particular site. Thanks @ukrexpo.

kensnyder commented Jun 12, 2020

I also discovered that puppeteer reports a screen.colorDepth and screen.pixelDepth of 30 instead of the 24 I see on other desktop browsers (Windows and Mac). It sounds like a few devices do support 30-bit color, but in theory the depth should match up with the graphics card vendor reported by WebGL.

Additionally, a plugin like this could consider changing window.outerHeight and screen.availHeight to account for the OS dock height and the height of the browser tabs plus omnibar. It gives puppeteer away quickly if window.innerHeight doesn't make sense next to the values for screen.height and screen.availHeight.
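
A minimal sketch of the idea above: start from the screen size and derive window and viewport values that are consistent with each other. The helper names (`deriveWindowMetrics`, `applyMetrics`) and the dock and browser-chrome heights are hypothetical, illustrative assumptions and not part of the stealth plugin.

```javascript
// Hypothetical helper: derive mutually consistent screen/window/viewport values.
// dockHeight and chromeHeight are illustrative guesses, not measured values.
function deriveWindowMetrics(screenSize, { dockHeight = 40, chromeHeight = 85 } = {}) {
  // Area left over once the OS dock/taskbar is subtracted.
  const availHeight = screenSize.height - dockHeight;
  return {
    screen: {
      width: screenSize.width,
      height: screenSize.height,
      availWidth: screenSize.width,
      availHeight,
      colorDepth: 24, // the value reported by the desktop browsers mentioned above
      pixelDepth: 24,
    },
    // A maximized window fills the available area.
    outer: { outerWidth: screenSize.width, outerHeight: availHeight },
    // The page viewport is what remains below the tabs + omnibar.
    viewport: { width: screenSize.width, height: availHeight - chromeHeight },
  };
}

// Hypothetical helper: define each value as a property on a target object
// (for example `screen` or `window` inside a preload script).
function applyMetrics(target, values) {
  for (const [key, value] of Object.entries(values)) {
    Object.defineProperty(target, key, { value, configurable: true });
  }
}
```

Under these assumptions, the `screen` and `outer` values could be applied from inside a preload() script like the one in the original post, while `viewport` would be passed as `defaultViewport` so that the real layout size agrees with the spoofed values.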

@momala454

Also be careful with

document.documentElement.clientWidth
document.documentElement.clientHeight

It looks like headless Chrome doesn't account for the scrollbar size.

Regarding window.innerHeight / screen.availHeight: maybe also be careful with CSS media queries, which could detect the real size?
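
To make the concern concrete, here is a rough sketch of the kind of consistency check a page could run against these values. The function name `findSizeMismatches` is hypothetical, and the exact-match media queries assume integer CSS pixel sizes without zoom; real detection scripts may work quite differently.

```javascript
// Hypothetical consistency check over a window-like object.
// Returns a list of human-readable mismatches; an empty list means the values agree.
function findSizeMismatches(win) {
  const problems = [];
  const root = win.document.documentElement;

  // The layout box can be narrower than the viewport (scrollbar), never larger.
  if (root.clientWidth > win.innerWidth) problems.push('clientWidth exceeds innerWidth');
  if (root.clientHeight > win.innerHeight) problems.push('clientHeight exceeds innerHeight');

  // The viewport cannot be taller than the available screen area.
  if (win.innerHeight > win.screen.availHeight) problems.push('innerHeight exceeds screen.availHeight');

  // CSS media queries see the real viewport, so spoofed JS values may disagree with them.
  if (!win.matchMedia(`(width: ${win.innerWidth}px)`).matches) {
    problems.push('media query width disagrees with innerWidth');
  }
  if (!win.matchMedia(`(height: ${win.innerHeight}px)`).matches) {
    problems.push('media query height disagrees with innerHeight');
  }
  return problems;
}
```

In a browser this could be called as `findSizeMismatches(window)`. Running it against your own automated page is one way to see which overrides still contradict each other.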

Owner

berstend commented Oct 3, 2020

I think in more recent Chrome versions this might be required: --flag-switches-begin --disable-site-isolation-trials --flag-switches-end
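
As a sketch of how these flags might be combined with the one from the original post: each switch has to be its own entry in the `args` array rather than one space-separated string. The helper name `buildIsolationArgs` and its options are hypothetical, and whether either flag still has an effect depends on the Chrome version.

```javascript
// Hypothetical helper: build the launch args discussed in this thread.
// Each command-line switch must be a separate array element.
function buildIsolationArgs({ sitePerProcess = true, isolationTrials = true } = {}) {
  const args = [];
  if (sitePerProcess) {
    // Flag from the original post.
    args.push('--disable-features=site-per-process');
  }
  if (isolationTrials) {
    // Flags suggested for more recent Chrome versions.
    args.push(
      '--flag-switches-begin',
      '--disable-site-isolation-trials',
      '--flag-switches-end',
    );
  }
  return args;
}
```

The result would then be passed along the lines of `pptr.launch({ args: buildIsolationArgs() })`.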

@flashjames

Also make sure you're using a recent puppeteer version; in puppeteer@1.8.0, @berstend's answer crashes chromium

@berstend
Owner

Also make sure you're using a recent puppeteer version, in puppeteer@1.8.0 @berstend answer crashes chromium

pptr 1.8.0 was released 3 years ago and uses chrome v70 😅 but happy to hear our efforts for backwards compatibility are appreciated :-)

@berstend
Owner

Opportunity for someone to get involved (#439):

It would be great if someone could make a proper test case that confirms the failing behavior, and the fixed behavior when the two launch args are used. It would also be worthwhile to document negative side effects: if they're negligible, we can make an evasion that enables this by default.
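
One possible starting point for such a test case, as a sketch only: read navigator.userAgent in every frame of the page and report the frames whose value differs from the spoofed one. `collectFrameUserAgents` and `findLeakingFrames` are hypothetical names; `page.frames()`, `frame.url()` and `frame.evaluate()` are the Puppeteer calls assumed here, and whether out-of-process frames show up in `page.frames()` depends on the Puppeteer version.

```javascript
// Hypothetical helper: read navigator.userAgent inside every frame of a page.
// `page` is expected to behave like a Puppeteer Page.
async function collectFrameUserAgents(page) {
  const results = [];
  for (const frame of page.frames()) {
    // The callback runs inside the frame, so `navigator` is the frame's own.
    const userAgent = await frame.evaluate(() => navigator.userAgent);
    results.push({ url: frame.url(), userAgent });
  }
  return results;
}

// Hypothetical helper: frames whose user agent differs from the spoofed one.
function findLeakingFrames(results, expectedUserAgent) {
  return results.filter((entry) => entry.userAgent !== expectedUserAgent);
}
```

A test could then launch once with and once without the launch args, load a page with a cross-origin iframe, and compare the output of `findLeakingFrames` for both runs.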

@berstend berstend added good first issue Good for newcomers plugin: stealth ㊙️ Detection evasion related workaround-available labels Mar 22, 2021
@ya-mouse

The workaround has been broken since puppeteer@12.0.0; the last working version is puppeteer@11.0.0.
At least on a Linux host it doesn't work: running the workaround reveals the real platform details and UA.

Owner

berstend commented Jul 8, 2022

Pretty sure this issue is not applicable anymore. Closing, as there's no substantive discussion either.

berstend closed this as completed on Jul 8, 2022

onlynfp commented Sep 11, 2022

@ukrexpo How can I use multiple devices from your code?
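
One possible approach, sketched under assumptions: keep a list of device descriptors in the same shape as `device` in the original post and build the launch options per device. The `devices` entries, `launchOptionsFor` and `pickDevice` are hypothetical and only illustrate the structure; the user agent strings are placeholders.

```javascript
// Hypothetical device list, same shape as `device` in the original post.
const devices = [
  {
    userAgent: 'Mozilla/5.0 (Macintosh)',
    platform: 'MacIntel',
    locale: 'en-US,en;q=0.9',
    viewport: { width: 1200, height: 800, deviceScaleFactor: 1 },
  },
  {
    userAgent: 'Mozilla/5.0 (Windows NT 10.0; Win64; x64)',
    platform: 'Win32',
    locale: 'en-US,en;q=0.9',
    viewport: { width: 1920, height: 1080, deviceScaleFactor: 1 },
  },
];

// Build launch options for one device, mirroring the original launch() call.
function launchOptionsFor(device) {
  return {
    args: [
      '--disable-features=site-per-process',
      `--window-size=${device.viewport.width},${device.viewport.height}`,
    ],
    headless: false,
    defaultViewport: device.viewport,
  };
}

// Rotate through the list, wrapping around at the end.
function pickDevice(list, index) {
  return list[index % list.length];
}
```

Each run would then pick a device, launch with `launchOptionsFor(device)`, and pass the same device to `page.evaluateOnNewDocument(preload, device)`. Since the user-agent-override plugin in the original code is configured per device, this sketch assumes a fresh PuppeteerExtra instance for each one.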

9 participants