@tegansnyder
Created February 23, 2018 02:41
Preventing Puppeteer Detection

I’m looking for any tips or tricks for making chrome headless mode less detectable. Here is what I’ve done so far:

Set my args as follows:

const puppeteer = require('puppeteer');
const fs = require('fs');

const run = (async () => {

    const args = [
        '--no-sandbox',
        '--disable-setuid-sandbox',
        '--disable-infobars',
        '--window-position=0,0',
        '--ignore-certificate-errors',
        '--ignore-certificate-errors-spki-list',
        // no inner quotes: args reach Chrome without a shell, so quotes would end up inside the UA string
        '--user-agent=Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/65.0.3312.0 Safari/537.36'
    ];

    const options = {
        args,
        headless: true,
        ignoreHTTPSErrors: true,
        userDataDir: './tmp'
    };

    const browser = await puppeteer.launch(options);

I’m loading in a preload file that overrides some window.navigator globals:

    const page = await browser.newPage();
    const preloadFile = fs.readFileSync('./preload.js', 'utf8');
    await page.evaluateOnNewDocument(preloadFile);
})();
preload.js
// overwrite the `languages` property to use a custom getter
Object.defineProperty(navigator, "languages", {
  get: function() {
    return ["en-US", "en"];
  },
});

// overwrite the `plugins` property to use a custom getter
Object.defineProperty(navigator, 'plugins', {
  get: function() {
    // this just needs to have `length > 0`, but we could mock the plugins too
    return [1, 2, 3, 4, 5];
  },
});
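
One way to make these overrides easier to check outside the browser is to wrap them in a function that takes the target object as a parameter. The sketch below is only an illustration: `applyNavigatorOverrides` is a hypothetical name that does not appear in the gist, and the default argument assumes the code runs in a page where `navigator` exists.

```javascript
// Hypothetical helper (not part of the original preload.js): applies the same
// two overrides to any navigator-like object.
function applyNavigatorOverrides(nav = navigator) {
  Object.defineProperty(nav, 'languages', {
    configurable: true,
    get: () => ['en-US', 'en'],
  });
  Object.defineProperty(nav, 'plugins', {
    configurable: true,
    // only `length > 0` matters here, as in the original preload.js
    get: () => [1, 2, 3, 4, 5],
  });
  return nav;
}

// Quick check in plain Node against a stand-in object:
const fakeNavigator = applyNavigatorOverrides({});
console.log(fakeNavigator.languages, fakeNavigator.plugins.length);
```

In puppeteer the function itself could be passed instead of the file contents, for example `await page.evaluateOnNewDocument(applyNavigatorOverrides);`, since `evaluateOnNewDocument` accepts either a function or a string.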

I see there are some other things suggested here https://intoli.com/blog/making-chrome-headless-undetectable/ but I'm not 100% certain how to implement them in puppeteer. Any ideas, tips, or tricks?

@debaosuidecl

How, then, do we avoid webdriver detection?

@borispov

For those who are still interested:
https://antoinevastel.com/bot%20detection/2018/01/17/detect-chrome-headless-v2.html

It covers a few ways that websites detect headless connections. Some of them are covered in this gist and thread; a few, however, are still unsolved (here, at least).

@ioannist

@timzaak

timzaak commented Dec 30, 2019

you might want to check out

https://www.npmjs.com/package/puppeteer-extra-plugin-stealth

thanks, it's useful
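
For anyone unsure how the plugin is wired in, here is a minimal sketch following the usage shown in the package's README: register the plugin with `puppeteer-extra`, then launch as usual. `launchWithStealth` is a hypothetical wrapper, and the two modules are passed in as parameters so the sketch can be tried with stand-ins.

```javascript
// Hypothetical wrapper: `puppeteerExtra` is the `puppeteer-extra` module and
// `createStealthPlugin` is the factory exported by `puppeteer-extra-plugin-stealth`.
function launchWithStealth(puppeteerExtra, createStealthPlugin, launchOptions = { headless: true }) {
  // register the stealth evasions before launching
  puppeteerExtra.use(createStealthPlugin());
  // returns whatever launch() returns (a promise for the browser in puppeteer)
  return puppeteerExtra.launch(launchOptions);
}

// Real usage (assumes both packages are installed):
// const browser = await launchWithStealth(
//   require('puppeteer-extra'),
//   require('puppeteer-extra-plugin-stealth')
// );
```

As the replies in this thread show, results vary by site, so treat this as a starting point rather than a guarantee.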

@Shery11

Shery11 commented Jan 14, 2020

you might want to check out

https://www.npmjs.com/package/puppeteer-extra-plugin-stealth

Works like a charm. Thanks!

@wobsoriano

you might want to check out

https://www.npmjs.com/package/puppeteer-extra-plugin-stealth

Doesn't work when trying to scrape youtube videos in headless mode.

@ahmedam55

Worked like a charm!

@gokaybiz

How do you handle popup tabs?
It's not working with them...

@vndevil

vndevil commented May 25, 2020

you might want to check out

https://www.npmjs.com/package/puppeteer-extra-plugin-stealth

It's not working with goat.com and stockx.com. They are protected by perimeterx.com

@qo4on

qo4on commented May 26, 2020

It's not working with goat.com and stockx.com. They are protected by perimeterx.com

This thing works with all of them.

@seahindeniz

you might want to check out
https://www.npmjs.com/package/puppeteer-extra-plugin-stealth

It's not working with goat.com and stockx.com. They are protected by perimeterx.com

@vndevil I have just run a local test and I think it works

@krychaj5

@shi-yan

shi-yan commented Aug 7, 2020

https://www.houzz.com/ can detect puppeteer

@xjurko

xjurko commented Aug 28, 2020

you might want to check out
https://www.npmjs.com/package/puppeteer-extra-plugin-stealth

Doesn't work when trying to scrape youtube videos in headless mode.

Are you trying to scrape anything besides the actual media content? If not, I'd recommend youtube-dl with some IP rotation (which might not be necessary).

@leehuwuj

Magic!!!! Can you explain each parameter and the case it covers?

@aalfiann

aalfiann commented Dec 5, 2020

This approach doesn't work for https://imgfo.com.

You can try it against their demo.

@BensimonSamy

@vndevil did you find a solution for Goat and StockX, please? Goat banned my browser's signature...

@NikolaiT

NikolaiT commented Jan 13, 2021

Hey guys, I am trying to give back a bit to the community.

https://bot.sannysoft.com/ is a bit old, isn't it?

I found a couple of new ways to detect the latest puppeteer.

Check your bot here: https://incolumitas.com/pages/BotOrNot/

Best,
Nikolai

@rafakwolf

Akamai's anti-bot protection is still blocking me, even with these techniques :|

@vndevil

vndevil commented May 19, 2021

@vndevil did you find a solution for Goat and StockX, please? Goat banned my browser's signature...

This is my solution; you can see it on my website now: https://shoegameviet.com/all-air-jordan-shoes/air-jordan-1/air-jordan-1-high (I fetch the latest price from Goat/StockX/MonoKabu/snkrDunk every time the page loads).
My solution:

  1. Run Tor as a service on the server so the IP changes automatically each time it connects to the StockX/Goat API to crawl data.
  2. For StockX, plain axios is enough.
  3. For Goat, use puppeteer.
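
Steps 1 and 3 can be combined by pointing puppeteer at a proxy with Chrome's `--proxy-server` flag. The sketch below assumes a Tor daemon is already running locally and listening on its default SOCKS port (9050). `torProxyArgs` is a hypothetical helper, and the host and port are assumptions rather than details from the comment.

```javascript
// Hypothetical helper: builds Chrome args that route traffic through a SOCKS5 proxy.
// 127.0.0.1:9050 is Tor's default SOCKS address (an assumption about the setup).
function torProxyArgs(host = '127.0.0.1', port = 9050) {
  return [`--proxy-server=socks5://${host}:${port}`];
}

console.log(torProxyArgs());
```

The result would be spread into the launch args, for example `puppeteer.launch({ args: [...torProxyArgs(), '--no-sandbox'] })`. Changing the exit IP between runs is handled on the Tor side and is not shown here.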

@micha1333

micha1333 commented May 31, 2021

Hi,
It seems that LinkedIn detects puppeteer-extra-plugin-stealth.
Has anyone managed to get around LinkedIn's anti-bot detection?
Please help me.

@alpharameeztech

Hi all,
It doesn't work with Kickstarter.

@lakpahana

you might want to check out

https://www.npmjs.com/package/puppeteer-extra-plugin-stealth

This worked for me

@123fischer

@rafakwolf have you found a solution for the akamai protection? Have been trying for a while now, but to no real avail

ghost commented Feb 24, 2022

Hi folks, are there any ways to prevent Nordstrom's detection?

@betogzo

betogzo commented Mar 25, 2022

worked for me, now recaptcha isn't bothering me anymore. thanks!

@uzair004

In the gist above, some of the arguments no longer work because they are deprecated. For example, --disable-infobars no longer hides the "Chrome is being controlled by automated test software" info bar, because the Chrome team removed the flag as a security bug.
Instead, pass another array to the launch method:
ignoreDefaultArgs: ["--enable-automation"]
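
As a rough sketch of where that array goes, the options object might look like the following. `buildLaunchOptions` is a hypothetical helper; `ignoreDefaultArgs` and `args` are regular `puppeteer.launch()` options, and the specific args are carried over from the gist.

```javascript
// Hypothetical helper: assembles an options object for puppeteer.launch().
function buildLaunchOptions(extraArgs = []) {
  return {
    headless: true,
    // drop the --enable-automation switch that puppeteer adds by default
    ignoreDefaultArgs: ['--enable-automation'],
    args: ['--no-sandbox', '--disable-setuid-sandbox', ...extraArgs],
  };
}

console.log(buildLaunchOptions(['--window-position=0,0']));
```

It would then be used as `const browser = await puppeteer.launch(buildLaunchOptions());`.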

@lifeboatpres

lifeboatpres commented Sep 22, 2022

" Object.defineProperty(navigator, 'webdriver', { get: () => false, }); " doesn't work. It is not enough, because if you then evaluate " 'webdriver' in navigator ", the result is still true, so it will still be detected.

Better to use:

const newProto = navigator.__proto__;
delete newProto.webdriver;
navigator.__proto__ = newProto;
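
To see why the two approaches differ, here is a small sketch that can be run in plain Node. It uses a stand-in object rather than a real browser `navigator`; the assumption (matching how Chrome exposes it) is that `webdriver` is a configurable getter on the prototype rather than on the instance. All function names are hypothetical.

```javascript
// Stand-in for the browser layout: `webdriver` is a getter on the prototype.
function makeFakeNavigator() {
  const proto = {};
  Object.defineProperty(proto, 'webdriver', {
    configurable: true,
    enumerable: true,
    get: () => true,
  });
  return Object.create(proto);
}

// Approach 1: shadow the property with a getter that returns false.
function overrideWithGetter(nav) {
  Object.defineProperty(nav, 'webdriver', { configurable: true, get: () => false });
  return nav;
}

// Approach 2: delete the property from the prototype, as suggested above.
function deleteFromPrototype(nav) {
  delete Object.getPrototypeOf(nav).webdriver;
  return nav;
}

const shadowed = overrideWithGetter(makeFakeNavigator());
const deleted = deleteFromPrototype(makeFakeNavigator());

console.log('webdriver' in shadowed, shadowed.webdriver); // true false
console.log('webdriver' in deleted, deleted.webdriver);   // false undefined
```

With the getter override the property still exists, so an `in` check gives it away; after the delete the property is gone from the chain entirely.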

@Vordlex

Vordlex commented Mar 20, 2023

" Object.defineProperty(navigator, 'webdriver', { get: () => false, }); " doesn't work. It is not enough, because if you then evaluate " 'webdriver' in navigator ", the result is still true, so it will still be detected.

Better to use:

const newProto = navigator.__proto__;
delete newProto.webdriver;
navigator.__proto__ = newProto;

This solved the WebDriver (New) check for me.

@IggsGrey

Works on localhost for me, fails on remote vps
