Headless Chrome Node.js API

Overview

Puppeteer

API | FAQ | Contributing | Troubleshooting

Puppeteer is a Node library which provides a high-level API to control Chrome or Chromium over the DevTools Protocol. Puppeteer runs headless by default, but can be configured to run full (non-headless) Chrome or Chromium.

What can I do?

Most things that you can do manually in the browser can be done using Puppeteer! Here are a few examples to get you started:

  • Generate screenshots and PDFs of pages.
  • Crawl a SPA (Single-Page Application) and generate pre-rendered content (i.e. "SSR" (Server-Side Rendering)).
  • Automate form submission, UI testing, keyboard input, etc.
  • Create an up-to-date, automated testing environment. Run your tests directly in the latest version of Chrome using the latest JavaScript and browser features.
  • Capture a timeline trace of your site to help diagnose performance issues.
  • Test Chrome Extensions.

Give it a spin: https://try-puppeteer.appspot.com/

Getting Started

Installation

To use Puppeteer in your project, run:

npm i puppeteer
# or "yarn add puppeteer"

Note: When you install Puppeteer, it downloads a recent version of Chromium (~170MB Mac, ~282MB Linux, ~280MB Win) that is guaranteed to work with the API. To skip the download, or to download a different browser, see Environment variables.

puppeteer-core

Since version 1.7.0 we publish the puppeteer-core package, a version of Puppeteer that doesn't download any browser by default.

npm i puppeteer-core
# or "yarn add puppeteer-core"

puppeteer-core is intended to be a lightweight version of Puppeteer for launching an existing browser installation or for connecting to a remote one. Be sure that the version of puppeteer-core you install is compatible with the browser you intend to connect to.
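
As a rough sketch of the two modes described above, the helper below either connects to a remote browser or launches a local installation. The helper name `openBrowser` and its option handling are illustrative and not part of Puppeteer; the underlying calls are `launch({ executablePath })` and `connect({ browserWSEndpoint })`. The path in the usage note is a placeholder.

```javascript
// Hypothetical helper (not part of Puppeteer). `puppeteer` is the module
// returned by require('puppeteer-core'), passed in so it is easy to stub.
async function openBrowser(puppeteer, options = {}) {
  const { executablePath, browserWSEndpoint } = options;

  if (browserWSEndpoint) {
    // Attach to a browser that is already running elsewhere.
    return puppeteer.connect({ browserWSEndpoint });
  }
  if (executablePath) {
    // puppeteer-core downloads no browser, so a path must be supplied.
    return puppeteer.launch({ executablePath });
  }
  throw new Error('Provide either executablePath or browserWSEndpoint');
}
```

Usage might look like `const browser = await openBrowser(require('puppeteer-core'), { executablePath: '/path/to/Chrome' });`.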

See puppeteer vs puppeteer-core.

Usage

Puppeteer follows the latest maintenance LTS version of Node.

Note: Prior to v1.18.1, Puppeteer required at least Node v6.4.0. Versions from v1.18.1 to v2.1.0 rely on Node 8.9.0+. From v3.0.0 onwards, Puppeteer relies on Node 10.18.1+. All examples below use async/await, which is only supported in Node v7.6.0 or greater.

Puppeteer will be familiar to people using other browser testing frameworks. You create an instance of Browser, open pages, and then manipulate them with Puppeteer's API.

Example - navigating to https://example.com and saving a screenshot as example.png:

Save file as example.js

const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.goto('https://example.com');
  await page.screenshot({ path: 'example.png' });

  await browser.close();
})();

Execute script on the command line

node example.js

Puppeteer sets the initial page size to 800×600px, which defines the screenshot size. The page size can be customized with Page.setViewport().
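
A minimal sketch of changing the page size before taking a screenshot. The helper name `screenshotAtSize` and the file name in the usage note are illustrative; the helper assumes it receives a Page created with `browser.newPage()`.

```javascript
// Hypothetical helper: resize the page, then capture it.
async function screenshotAtSize(page, width, height, path) {
  // The viewport determines the size of the captured image.
  await page.setViewport({ width, height });
  await page.screenshot({ path });
}
```

For example, `await screenshotAtSize(page, 1280, 720, 'example-720p.png');` would replace the `page.screenshot()` call in example.js.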

Example - create a PDF.

Save file as hn.js

const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.goto('https://news.ycombinator.com', {
    waitUntil: 'networkidle2',
  });
  await page.pdf({ path: 'hn.pdf', format: 'a4' });

  await browser.close();
})();

Execute script on the command line

node hn.js

See Page.pdf() for more information about creating PDFs.

Example - evaluate script in the context of the page

Save file as get-dimensions.js

const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.goto('https://example.com');

  // Get the "viewport" of the page, as reported by the page.
  const dimensions = await page.evaluate(() => {
    return {
      width: document.documentElement.clientWidth,
      height: document.documentElement.clientHeight,
      deviceScaleFactor: window.devicePixelRatio,
    };
  });

  console.log('Dimensions:', dimensions);

  await browser.close();
})();

Execute script on the command line

node get-dimensions.js

See Page.evaluate() for more information on evaluate and related methods like evaluateOnNewDocument and exposeFunction.

Default runtime settings

1. Uses Headless mode

Puppeteer launches Chromium in headless mode. To launch a full version of Chromium, set the headless option when launching a browser:

const browser = await puppeteer.launch({ headless: false }); // default is true

2. Runs a bundled version of Chromium

By default, Puppeteer downloads and uses a specific version of Chromium so its API is guaranteed to work out of the box. To use Puppeteer with a different version of Chrome or Chromium, pass in the executable's path when creating a Browser instance:

const browser = await puppeteer.launch({ executablePath: '/path/to/Chrome' });

You can also use Puppeteer with Firefox Nightly (experimental support). See Puppeteer.launch() for more information.

See this article for a description of the differences between Chromium and Chrome. This article describes some differences for Linux users.

3. Creates a fresh user profile

Puppeteer creates its own browser user profile which it cleans up on every run.

Resources

Debugging tips

  1. Turn off headless mode - sometimes it's useful to see what the browser is displaying. Instead of launching in headless mode, launch a full version of the browser using headless: false:

    const browser = await puppeteer.launch({ headless: false });
  2. Slow it down - the slowMo option slows down Puppeteer operations by the specified amount of milliseconds. It's another way to help see what's going on.

    const browser = await puppeteer.launch({
      headless: false,
      slowMo: 250, // slow down by 250ms
    });
  3. Capture console output - You can listen for the console event. This is also handy when debugging code in page.evaluate():

    page.on('console', (msg) => console.log('PAGE LOG:', msg.text()));
    
    await page.evaluate(() => console.log(`url is ${location.href}`));
  4. Use debugger in application code browser

    There are two execution contexts: Node.js, which runs the test code, and the browser, which runs the application code being tested. This tip lets you debug code running in the browser, i.e. code inside evaluate().

    • Use {devtools: true} when launching Puppeteer:

      const browser = await puppeteer.launch({ devtools: true });
    • Change default test timeout:

      jest: jest.setTimeout(100000);

      jasmine: jasmine.DEFAULT_TIMEOUT_INTERVAL = 100000;

      mocha: this.timeout(100000); (don't forget to change test to use function and not '=>')

    • Add an evaluate statement with debugger inside / add debugger to an existing evaluate statement:

      await page.evaluate(() => {
        debugger;
      });

      The test will now stop executing in the above evaluate statement, and Chromium will stop in debug mode.

  5. Use debugger in node.js

    This will let you debug test code. For example, you can step over await page.click() in the node.js script and see the click happen in the application code browser.

    Note that you won't be able to run await page.click() in DevTools console due to this Chromium bug. So if you want to try something out, you have to add it to your test file.

    • Add debugger; to your test, e.g.:

      debugger;
      await page.click('a[target=_blank]');
    • Set headless to false

    • Run node --inspect-brk, e.g. node --inspect-brk node_modules/.bin/jest tests

    • In Chrome open chrome://inspect/#devices and click inspect

    • In the newly opened test browser, press F8 to resume test execution

    • Now your debugger will be hit and you can debug in the test browser

  6. Enable verbose logging - internal DevTools protocol traffic will be logged via the debug module under the puppeteer namespace.

     # Basic verbose logging
     env DEBUG="puppeteer:*" node script.js
    
     # Protocol traffic can be rather noisy. This example filters out all Network domain messages
     env DEBUG="puppeteer:*" env DEBUG_COLORS=true node script.js 2>&1 | grep -v '"Network'
    
  7. Debug your Puppeteer (node) code easily, using ndb

    • npm install -g ndb (or even better, use npx!)

    • add a debugger to your Puppeteer (node) code

    • add ndb (or npx ndb) before your test command. For example:

      ndb jest or ndb mocha (or npx ndb jest / npx ndb mocha)

    • debug your test inside chromium like a boss!

Usage with TypeScript

We have recently completed a migration to move the Puppeteer source code from JavaScript to TypeScript and as of version 7.0.1 we ship our own built-in type definitions.

If you are on a version older than 7, we recommend installing the Puppeteer type definitions from the DefinitelyTyped repository:

npm install --save-dev @types/puppeteer

The types that you'll see appearing in the Puppeteer source code are based off the great work of those who have contributed to the @types/puppeteer package. We really appreciate the hard work those people put in to providing high quality TypeScript definitions for Puppeteer's users.

Contributing to Puppeteer

Check out the contributing guide to get an overview of Puppeteer development.

FAQ

Q: Who maintains Puppeteer?

The Chrome DevTools team maintains the library, but we'd love your help and expertise on the project! See Contributing.

Q: What is the status of cross-browser support?

Official Firefox support is currently experimental. The ongoing collaboration with Mozilla aims to support common end-to-end testing use cases, for which developers expect cross-browser coverage. The Puppeteer team needs input from users to stabilize Firefox support and to bring missing APIs to our attention.

From Puppeteer v2.1.0 onwards you can specify puppeteer.launch({product: 'firefox'}) to run your Puppeteer scripts in Firefox Nightly, without any additional custom patches. While an older experiment required a patched version of Firefox, the current approach works with “stock” Firefox.

We will continue to collaborate with other browser vendors to bring Puppeteer support to browsers such as Safari. This effort includes exploration of a standard for executing cross-browser commands (instead of relying on the non-standard DevTools Protocol used by Chrome).

Q: What are Puppeteer’s goals and principles?

The goals of the project are:

  • Provide a slim, canonical library that highlights the capabilities of the DevTools Protocol.
  • Provide a reference implementation for similar testing libraries. Eventually, these other frameworks could adopt Puppeteer as their foundational layer.
  • Grow the adoption of headless/automated browser testing.
  • Help dogfood new DevTools Protocol features...and catch bugs!
  • Learn more about the pain points of automated browser testing and help fill those gaps.

We adapt Chromium principles to help us drive product decisions:

  • Speed: Puppeteer has almost zero performance overhead over an automated page.
  • Security: Puppeteer operates off-process with respect to Chromium, making it safe to automate potentially malicious pages.
  • Stability: Puppeteer should not be flaky and should not leak memory.
  • Simplicity: Puppeteer provides a high-level API that’s easy to use, understand, and debug.

Q: Is Puppeteer replacing Selenium/WebDriver?

No. Both projects are valuable for very different reasons:

  • Selenium/WebDriver focuses on cross-browser automation; its value proposition is a single standard API that works across all major browsers.
  • Puppeteer focuses on Chromium; its value proposition is richer functionality and higher reliability.

That said, you can use Puppeteer to run tests against Chromium, e.g. using the community-driven jest-puppeteer. While this probably shouldn’t be your only testing solution, it does have a few good points compared to WebDriver:

  • Puppeteer requires zero setup and comes bundled with the Chromium version it works best with, making it very easy to start with. At the end of the day, it’s better to have a few tests running chromium-only, than no tests at all.
  • Puppeteer has event-driven architecture, which removes a lot of potential flakiness. There’s no need for evil “sleep(1000)” calls in puppeteer scripts.
  • Puppeteer runs headless by default, which makes it fast to run. Puppeteer v1.5.0 also exposes browser contexts, making it possible to efficiently parallelize test execution.
  • Puppeteer shines when it comes to debugging: flip the “headless” bit to false, add “slowMo”, and you’ll see what the browser is doing. You can even open Chrome DevTools to inspect the test environment.
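
As an illustration of the browser-context point above, the sketch below loads several URLs in parallel, each in its own isolated context within a single browser. It assumes the installed Puppeteer release exposes `browser.createIncognitoBrowserContext()`, the method name used in releases contemporary with this document; other releases may name it differently. The helper name `titlesInIsolation` is hypothetical.

```javascript
// Hypothetical helper: fetch each page title in a separate browser context.
// Assumes browser.createIncognitoBrowserContext() exists in this release.
async function titlesInIsolation(browser, urls) {
  return Promise.all(
    urls.map(async (url) => {
      // Each context has its own cookies and cache.
      const context = await browser.createIncognitoBrowserContext();
      try {
        const page = await context.newPage();
        await page.goto(url);
        return await page.title();
      } finally {
        // Closing the context also closes the pages that belong to it.
        await context.close();
      }
    })
  );
}
```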

Q: Why doesn’t Puppeteer v.XXX work with Chromium v.YYY?

We see Puppeteer as an indivisible entity with Chromium. Each version of Puppeteer bundles a specific version of Chromium – the only version it is guaranteed to work with.

This is not an artificial constraint: A lot of work on Puppeteer is actually taking place in the Chromium repository. Here’s a typical story:

However, oftentimes it is desirable to use Puppeteer with the official Google Chrome rather than Chromium. For this to work, you should install a puppeteer-core version that corresponds to the Chrome version.

For example, in order to drive Chrome 71 with puppeteer-core, use chrome-71 npm tag:

npm install puppeteer-core@chrome-71

Q: Which Chromium version does Puppeteer use?

Look for the chromium entry in revisions.ts. To find the corresponding Chromium commit and version number, search for the revision prefixed by an r in OmahaProxy's "Find Releases" section.

Q: Which Firefox version does Puppeteer use?

Since Firefox support is experimental, Puppeteer downloads the latest Firefox Nightly when the PUPPETEER_PRODUCT environment variable is set to firefox. That's also why the value of firefox in revisions.ts is latest -- Puppeteer isn't tied to a particular Firefox version.

To fetch Firefox Nightly as part of Puppeteer installation:

PUPPETEER_PRODUCT=firefox npm i puppeteer
# or "yarn add puppeteer"

Q: What’s considered a “Navigation”?

From Puppeteer’s standpoint, “navigation” is anything that changes a page’s URL. Aside from regular navigation where the browser hits the network to fetch a new document from the web server, this includes anchor navigations and History API usage.

With this definition of “navigation,” Puppeteer works seamlessly with single-page applications.
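
A small sketch of waiting for a navigation triggered by a click, which should work the same way for a full page load and for a single-page-application route change. The helper name `clickAndWaitForNavigation` is illustrative.

```javascript
// Hypothetical helper: click an element and wait for the resulting navigation.
async function clickAndWaitForNavigation(page, selector) {
  // Start waiting before clicking so a fast navigation is not missed.
  const [response] = await Promise.all([
    page.waitForNavigation(),
    page.click(selector),
  ]);
  // For anchor and History API navigations no new document is fetched,
  // so the resolved value is null rather than a response object.
  return response;
}
```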

Q: What’s the difference between a “trusted” and “untrusted” input event?

In browsers, input events can be divided into two big groups: trusted vs. untrusted.

  • Trusted events: events generated by users interacting with the page, e.g. using a mouse or keyboard.
  • Untrusted events: events generated by Web APIs, e.g. document.createEvent or element.click() methods.

Websites can distinguish between these two groups:

  • using an Event.isTrusted event flag
  • sniffing for accompanying events. For example, every trusted 'click' event is preceded by 'mousedown' and 'mouseup' events.

For automation purposes it’s important to generate trusted events. All input events generated with Puppeteer are trusted and fire proper accompanying events. If, for some reason, one needs an untrusted event, it’s always possible to hop into a page context with page.evaluate and generate a fake event:

await page.evaluate(() => {
  document.querySelector('button[type=submit]').click();
});
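
To see the difference for yourself, a sketch like the following records Event.isTrusted for one click of each kind. The helper name `compareClickTrust` and the `window.__clickTrust` property are made up for this illustration, and the sketch assumes that clicking the element does not navigate away from the page.

```javascript
// Hypothetical helper: click the same element twice and report isTrusted.
async function compareClickTrust(page, selector) {
  // Install a listener in the page that records isTrusted for each click.
  await page.evaluate((sel) => {
    window.__clickTrust = [];
    document.querySelector(sel).addEventListener('click', (event) => {
      window.__clickTrust.push(event.isTrusted);
    });
  }, selector);

  // Click generated through Puppeteer's input API.
  await page.click(selector);
  // Click generated by a Web API call inside the page.
  await page.evaluate((sel) => document.querySelector(sel).click(), selector);

  // Read back what the listener recorded, in click order.
  return page.evaluate(() => window.__clickTrust);
}
```

Going by the description above, the first recorded value should be true and the second false.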

Q: What features does Puppeteer not support?

You may find that Puppeteer does not behave as expected when controlling pages that incorporate audio and video. (For example, video playback/screenshots is likely to fail.) There are two reasons for this:

  • Puppeteer is bundled with Chromium, not Chrome, and so by default it inherits all of Chromium's media-related limitations. This means that Puppeteer does not support licensed formats such as AAC or H.264. (However, it is possible to force Puppeteer to use a separately-installed version of Chrome instead of Chromium via the executablePath option to puppeteer.launch. You should only use this configuration if you need an official release of Chrome that supports these media formats.)
  • Since Puppeteer (in all configurations) controls a desktop version of Chromium/Chrome, features that are only supported by the mobile version of Chrome are not supported. This means that Puppeteer does not support HTTP Live Streaming (HLS).

Q: I am having trouble installing / running Puppeteer in my test environment. Where should I look for help?

We have a troubleshooting guide for various operating systems that lists the required dependencies.

Q: How do I try/test a prerelease version of Puppeteer?

You can check out this repo or install the latest prerelease from npm:

npm i --save puppeteer@next

Please note that prerelease versions may be unstable and contain bugs.

Q: I have more questions! Where do I ask?

There are many ways to get help on Puppeteer:

Make sure to search these channels before posting your question.

Comments
  • Chrome Headless doesn't launch on Debian

    Running this example code from the README:

    const puppeteer = require('puppeteer');
    
    (async() => {
    
    const browser = await puppeteer.launch();
    const page = await browser.newPage();
    await page.goto('https://example.com');
    await page.screenshot({path: 'example.png'});
    
    browser.close();
    })();
    

    I get the following error output:

    (node:30559) UnhandledPromiseRejectionWarning: Unhandled promise rejection (rejection id: 1): Error: Failed to connect to chrome!
    (node:30559) [DEP0018] DeprecationWarning: Unhandled promise rejections are deprecated. In the future, promise rejections that are not handled will terminate the Node.js process with a non-zero exit code.
    

    Platform info:

    % uname -a
    Linux localhost 3.14.0 #1 SMP PREEMPT Thu Jul 13 12:08:15 PDT 2017 x86_64 GNU/Linux
    % lsb_release -a
    Distributor ID: Debian
    Description:    Debian GNU/Linux 9.0 (stretch)
    Release:        9.0
    Codename:       stretch
    % node --version
    v8.1.1
    % cat package.json
    {
      "dependencies": {
        "puppeteer": "^0.9.0"
      }
    }
    
    host 
    opened by fortes 193
  • TimeoutError: Timed out after 30000 ms while trying to connect to Chrome!

    Hello

    I am trying to deploy Puppeteer on Google Cloud but have run into an issue. What I don't understand is that it worked well 2 days ago. Since then, I have been getting this error:

    TimeoutError: Timed out after 30000 ms while trying to connect to Chrome! The only Chrome revision guaranteed to work is r674921 at Timeout.onTimeout

    I ran this command and everything seems fine : DEBUG=* node app.js

    Version puppeteer : ^1.19.0 Version Node : v8.11.3

    And here is the code:

    async function main_screenshot(project_id) {
      try {
        const browser = await puppeteer.launch({
          headless: true,
          args: ["--window-size=1440,1000", "--no-sandbox", "--disable-setuid-sandbox", "--disable-gpu"]
        });
        // const browser = await puppeteer.launch({dumpio: true});
        const page = await browser.newPage()
        page.setUserAgent('Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.77 Safari/537.36')
        page.setViewport({ width: 1440, height: 721 })
        console.log('entering in screenshot zone')
        page.setDefaultNavigationTimeout(30000)
        await page.goto('https://app.slack.com/client/TLW4JE7AA/CLTFQT9GC')
        await page.waitFor(3000)
        await page.type('#email', process.env.SLACK_EMAIL)
        await page.type('#password', process.env.SLACK_PWD)
        await page.waitFor(3000)
        await page.click('#signin_btn')
        console.log('credentials done')
        await page.waitFor(3000)
        console.log('so?')
        await page.goto('https://app.slack.com/client/TLW4JE7AA/CLTFQT9GC')
        await page.waitFor(3000)
        console.log('connected to the right environment')
        const div = await page.$$('.c-message.c-message--light')
        var number_loop = 0
        for (var i = div.length; i > 0; i--) {
          var text = await (await div[i - 1].getProperty('textContent')).jsonValue();
          if (text.includes(project_id)) {
            number_loop = i
            break
          }
        }
        // const text = await (await div.getProperty('textContent')).jsonValue();

        const shotResult = await div[number_loop - 1].screenshot();
        console.log(shotResult)
        const cloudinaryOptions = {
          public_id: `${project_id}`
        }

        const results = await uploadToCloudinary(shotResult, cloudinaryOptions)
        console.log(results)
        browser.close()
        return results
      } catch (e) {
        console.log(e)
      }
    }

    opened by knandraina 112
  • When would an error "Cannot find context with specified id undefined" happen?

    I am getting the following error:

    Error: Protocol error (Runtime.callFunctionOn): Cannot find context with specified id undefined
        at Session._onMessage (/srv/node_modules/puppeteer-edge/lib/Connection.js:205:25)
        at Connection._onMessage (/srv/node_modules/puppeteer-edge/lib/Connection.js:105:19)
        at emitOne (events.js:115:13)
        at WebSocket.emit (events.js:210:7)
        at Receiver._receiver.onmessage (/srv/node_modules/ws/lib/WebSocket.js:143:47)
        at Receiver.dataMessage (/srv/node_modules/ws/lib/Receiver.js:389:14)
        at Receiver.getData (/srv/node_modules/ws/lib/Receiver.js:330:12)
        at Receiver.startLoop (/srv/node_modules/ws/lib/Receiver.js:165:16)
        at Receiver.add (/srv/node_modules/ws/lib/Receiver.js:139:10)
        at Socket._ultron.on (/srv/node_modules/ws/lib/WebSocket.js:139:22)
        at emitOne (events.js:115:13)
        at Socket.emit (events.js:210:7)
        at addChunk (_stream_readable.js:266:12)
        at readableAddChunk (_stream_readable.js:253:11)
        at Socket.Readable.push (_stream_readable.js:211:10)
        at TCP.onread (net.js:585:20)
    
    

    It is a large codebase and it is unclear whats triggering this error.

    Any guides?

    On that note, there needs to be a better way to throw error. Without knowing the origin of the error in the code it is impossible to trace down these exceptions.

    opened by gajus 103
  • can't run puppeteer in centos7

    Server info:

    • CPU: Intel(R) Xeon(R) Platinum 8163 CPU @ 2.50GHz
    • MemTotal: 1016396 kB
    • OS: CentOS Linux release 7.3.1611 (Core)
    • Node: v8.4.0

    When I try to run my app on this server, an error is thrown:

    (node:29208) UnhandledPromiseRejectionWarning: Unhandled promise rejection (rejection id: 1): Error: Failed to connect to chrome!

    I then tried to manually run the Chromium that Puppeteer downloaded, and got this log output:

    robin@eve ~/project/memostickyserver/node_modules/puppeteer/.local-chromium/linux-494755/chrome-linux (master*) $ chrome
    robin@eve ~/project/memostickyserver/node_modules/puppeteer/.local-chromium/linux-494755/chrome-linux (master*) $ ./chrome
    [28268:28268:0819/223159.486750:FATAL:zygote_host_impl_linux.cc(123)] No usable sandbox! Update your kernel or see https://chromium.googlesource.com/chromium/src/+/master/docs/linux_suid_sandbox_development.md for more information on developing with the SUID sandbox. If you want to live dangerously and need an immediate workaround, you can try using --no-sandbox.
    #0 0x7fd4bc6d6657 base::debug::StackTrace::StackTrace()
    #1 0x7fd4bc6ea311 logging::LogMessage::~LogMessage()
    #2 0x7fd4bb8db1f1 content::ZygoteHostImpl::Init()
    #3 0x7fd4bb575da0 content::BrowserMainLoop::EarlyInitialization()
    #4 0x7fd4bb57c4c3 content::BrowserMainRunnerImpl::Initialize()
    #5 0x7fd4bb575532 content::BrowserMain()
    #6 0x7fd4bc3e17fd content::ContentMainRunnerImpl::Run()
    #7 0x7fd4bc3e9314 service_manager::Main()
    #8 0x7fd4bc3e0462 content::ContentMain()
    #9 0x7fd4bb020b74 ChromeMain
    #10 0x7fd4b3da2b35 __libc_start_main
    #11 0x7fd4bb0209d0

    Received signal 6
    #0 0x7fd4bc6d6657 base::debug::StackTrace::StackTrace()
    #1 0x7fd4bc6d61cf base::debug::(anonymous namespace)::StackDumpSignalHandler()
    #2 0x7fd4ba04d370
    #3 0x7fd4b3db61d7 __GI_raise
    #4 0x7fd4b3db78c8 __GI_abort
    #5 0x7fd4bc6d5202 base::debug::BreakDebugger()
    #6 0x7fd4bc6ea7cc logging::LogMessage::~LogMessage()
    #7 0x7fd4bb8db1f1 content::ZygoteHostImpl::Init()
    #8 0x7fd4bb575da0 content::BrowserMainLoop::EarlyInitialization()
    #9 0x7fd4bb57c4c3 content::BrowserMainRunnerImpl::Initialize()
    #10 0x7fd4bb575532 content::BrowserMain()
    #11 0x7fd4bc3e17fd content::ContentMainRunnerImpl::Run()
    #12 0x7fd4bc3e9314 service_manager::Main()
    #13 0x7fd4bc3e0462 content::ContentMain()
    #14 0x7fd4bb020b74 ChromeMain
    #15 0x7fd4b3da2b35 __libc_start_main
    #16 0x7fd4bb0209d0
    r8: 00007fff21b42080 r9: 0000000000000395 r10: 0000000000000008 r11: 0000000000000206
    r12: 00007fff21b424e0 r13: 000000000000016d r14: 00007fff21b424d8 r15: 00007fff21b424d0
    di: 0000000000006e6c si: 0000000000006e6c bp: 00007fff21b42080 bx: 00007fff21b42080
    dx: 0000000000000006 ax: 0000000000000000 cx: ffffffffffffffff sp: 00007fff21b41ed8
    ip: 00007fd4b3db61d7 efl: 0000000000000206 cgf: 0000000000000033 erf: 0000000000000000
    trp: 0000000000000000 msk: 0000000000000000 cr2: 0000000000000000
    [end of stack trace]
    Calling _exit(1). Core file will not be generated.
    robin@eve ~/project/memostickyserver/node_modules/puppeteer/.local-chromium/linux-494755/chrome-linux (master*) $ ./chrome --no-sandbox
    [0819/223234.158902:ERROR:nacl_helper_linux.cc(310)] NaCl helper process running without a sandbox!
    Most likely you need to configure your SUID sandbox correctly

    Please help. How can I fix this?

    host 
    opened by rlog 84
  • Cooperative request intercepts

    Hello, I wanted to share my fork with the Puppeteer team to see if there is anything here you'd like to bring into the core.

    This PR introduces request intercept handlers that play well with others. I needed this for a recent project and thought it might help to share my fork because it seems that there is a lot of community confusion around the way request intercepts are handled.

    The Problem: Puppeteer's core expects only one page.on('request') handler to call abort/respond/continue

    Puppeteer anticipates only one request intercept handler (page.on('request', ...)) to perform continue(), abort(), or respond(). There are valid use cases where multiple request handlers may call abort/respond/continue, unaware of each other. For example, puppeteer-extra attempts a plugin architecture, yet enabling any plugin that utilizes request intercepts (such as ad blocker) means that I cannot further alter request interceptions because calling abort/respond/continue a second time will throw an error.

    The current core design makes it impossible for multiple request handlers to cooperatively determine what to do with a request.

    Here is a sample of issues for background:

    https://github.com/berstend/puppeteer-extra/issues/364
    https://github.com/berstend/puppeteer-extra/issues/156
    https://github.com/smooth-code/jest-puppeteer/issues/308
    https://github.com/puppeteer/puppeteer/issues/5334
    https://github.com/puppeteer/puppeteer/issues/3853#issuecomment-458193921
    https://github.com/puppeteer/puppeteer/issues/2687
    https://github.com/puppeteer/puppeteer/issues/4524

    The solution: Cooperative Request Interception mode

    This PR proposes a cooperative interception strategy which allows multiple request intercept handlers to call abort/respond/continue in any order and multiplicity, while also making NetworkManager wait for async handlers to be fulfilled before finalizing the request interception.

    
    // 2nd param enables Cooperative mode rather than Legacy mode
    // In Cooperative mode, abort/respond/continue can be called multiple times by multiple handlers.
    // The winning action is decided after all handlers have resolved.
    page.setRequestInterception(true, true) 
    
    // If any handler calls abort(), the request will be aborted.
    // All handlers will still run, but ultimately the request will be aborted because at least one handler
    // requested an abort.
    page.on('request', req=> {
      req.abort()
    });
    
    // Had no handler called abort(), but at least one handler called respond(),
    // then the request will be responded.
    page.on('request', req=> {
      req.respond({...})
    });
    
    // This is the lowest priority. continue() is the default action and no handler even needs to call it explicitly
    // unless request info needs to be changed. A request will be continued only if no abort() or respond()
    // has been called by any handler.
    // Cooperative mode is more forgiving because no request will hang since continue() is the default action.
    page.on('request', req=> {
      req.continue() // Not even necessary, NetworkManager will fall through to continue() by default
    });
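
The priority rule described in the comments above can be sketched as a small pure function. This only illustrates the stated ordering (abort wins over respond, which wins over continue) and is not code from the fork; `resolveInterception` and the action strings are hypothetical names.

```javascript
// Illustrative only: pick the winning action once all handlers have run.
// Higher number wins; 'continue' is the default when no handler acted.
const ACTION_PRIORITY = new Map([
  ['continue', 0],
  ['respond', 1],
  ['abort', 2],
]);

function resolveInterception(requestedActions) {
  let winner = 'continue';
  for (const action of requestedActions) {
    if (!ACTION_PRIORITY.has(action)) {
      throw new TypeError(`Unknown interception action: ${action}`);
    }
    if (ACTION_PRIORITY.get(action) > ACTION_PRIORITY.get(winner)) {
      winner = action;
    }
  }
  return winner;
}
```

With this rule the order in which handlers run does not matter: the same set of requested actions always yields the same result.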
    

    Breaking changes

    None. Legacy intercept mode is still the default. A new "Cooperative Intercept Mode" is introduced by a second Page.setRequestInterception parameter.

    Todo

    • [ ] Document cooperative mode
    • [ ] Add unit tests
    • [ ] Fix failing tests/regressions
    cla: yes 
    opened by benallfree 79
  • Support browser contexts to launch different sessions

    Support browser contexts (Target.createBrowserContext) so as to avoid launching multiple instances when one wants a pristine session. See proposal https://github.com/GoogleChrome/puppeteer/issues/66 and original discussion at https://github.com/cyrus-and/chrome-remote-interface/issues/118.

    A viable user scenario might be testing several users logged in simultaneously into the service. We might expose browser contexts as a string literal option to the browser.newPage:

    browser.newPage(); // creates a new page in a default browser context
    browser.newPage({ context: 'default' }); // same as previous call
    browser.newPage({ context: 'another-context' }); // creates a page in another browser context
    
    feature upstream 
    opened by aslushnikov 79
  • Puppeteer not working when is not focused

    Environment:

    • Puppeteer version: 1.8.0
    • Platform / OS version: Win 10 64b and Ubuntu 18.04.1 LTS
    • URLs (if applicable): No
    • Node.js version: v10.6.0

    I am using Puppeteer to test my pages. When I run the app and move it to the background, Puppeteer pauses (or freezes). When I bring it back to the foreground, it runs normally.

    I think the problem is that it does not work when the window is not focused.

    This behavior occurs on both Windows and Linux.

    Set up code:

    			const args = []
    			args.push('--start-maximized')
    			args.push('--disable-gpu')
    			args.push('--disable-setuid-sandbox')
    			args.push('--force-device-scale-factor')
    			args.push('--ignore-certificate-errors')
    			args.push('--no-sandbox')
    
    			this.driver = await puppeteer.launch({
    				headless: false,
    				args,
    				userDataDir: `puppeteer_profile`,
    			})
    
    
    

    It happens only sometimes; I don't know when.

    chromium unconfirmed 
    opened by JaLe29 78
  • Having trouble with page.type not typing all characters

    Having some issues where it seems this.page.type is not working properly and I can't figure out why. It works 100% of the time in one of my tests, but I have a second test doing the same thing on a different page (only difference is the new page has more input fields) and it fails most of the time (but occasionally passes). Here's an example of what I'm doing:

    await this.page.type('[placeholder="Enter ID"]', myObj.id);
    await this.page.waitForSelector(`input[value="${myObj.id}"]`);
    
    await this.page.type('[placeholder="Enter Type"]', myObj.type);
    await this.page.waitForSelector(`input[value="${myObj.type}"]`);
    

    I have 11 total input fields that follow the above pattern on this page, the page that has never flaked out on me has only 6 input fields.

    The problem is that page.type doesn't appear to type everything in myObj. For example, myObj.type contains 10 random words, and occasionally Puppeteer types only some of its characters; sometimes it cuts off in the middle of a word, other times at the end of a word. The property that is cut short varies from run to run, as does the number of characters actually typed. It is not a character limit in the database or input field: I can manually type significantly more without a problem, and each run cuts off at a different point, sometimes after as few as 5-6 characters, other times after 20 or so.

    Are there any debugging methods that would help me figure this out? Console logging myObj shows the full text for every property; taking screenshots or running Puppeteer outside of headless mode shows that it stops inputting characters at random times.

    Additionally I've tried adding a delay option into my page.type calls and it seems to make the issue worse, typing only a single character into my first input field before breaking. It seems like it might be losing focus (when watching it in headless: false mode).
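
    One way to narrow this down is to read the field back right after typing and compare it with the intended text. The sketch below assumes a Puppeteer-style page object exposing page.type and page.$eval; typeAndVerify is a hypothetical helper name, not part of Puppeteer. It reads the element's live value property, which can differ from the value attribute that a selector such as input[value="..."] matches.

```javascript
// Hypothetical helper (not part of Puppeteer): type into a field, then read
// the live value back and report whether the whole text arrived.
async function typeAndVerify(page, selector, text) {
  await page.type(selector, text);
  // page.$eval runs the callback in the page against the first matching element.
  const actual = await page.$eval(selector, (el) => el.value);
  return { ok: actual === text, expected: text, actual };
}
```

    Logging the returned object for each of the input fields would show which field was cut short and how many characters reached it.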

    opened by Nagisan 76
  • userDataDir + headless = lost authorization

    • Puppeteer version: v0.12.0-alpha
    • Platform / OS version: Windows 7 x64
    • URLs (if applicable): any that needs authorization
    1. Create a test script and an empty folder test-profile-dir:
    'use strict';
    
    const puppeteer = require('puppeteer');
    
    (async function main() {
      try {
        const browser = await puppeteer.launch({
          headless: false,
          userDataDir: 'test-profile-dir',
        });
        const page = await browser.newPage();
    
        await page.goto('https://twitter.com/');
    
        console.log(await page.evaluate(() => document.title));
        console.log(await page.evaluate(() => document.cookie));
    
        // await browser.close();
      } catch (err) {
        console.error(err);
      }
    })();
    

    You will see something like this:

    Twitter. It's what's happening.
    
    personalization_id=...; guest_id=...; ct0=...
    

    Sign in to Twitter and close the browser.

    2. Run the same script a second time (await browser.close(); can be uncommented). You will see something like this:
    Twitter
    
    personalization_id=...; guest_id=...; ct0=...; ads_prefs=...; remember_checked_on=...;
    twid=...; tip_nightmode=...; _ga=...; _gid=...; dnt=...; lang=...
    
    3. Run the same script, but with headless: true. The output is the same as before authorization:
    Twitter. It's what's happening.
    
    personalization_id=...; guest_id=...; ct0=...
    

    I have tried various sites, all of them seem to lose authorization in headless mode.

    Some more notes:

    • response.request().headers does not contain cookies in either headless: false or headless: true mode.

    • console.log(await page.cookies('https://twitter.com/')); contains many cookies in the headless: false mode. In the headless: true mode it gives an empty array [].

    bug P1 
    opened by vsemozhetbyt 75
  • Puppeteer slow execution on Cloud Functions

    I am experimenting with Puppeteer on Cloud Functions.

    After a few tests, I noticed that taking a screenshot of https://google.com takes about 5 seconds on average when deployed on Google Cloud Functions infrastructure, while the same function tested locally (using firebase serve) takes only 2 seconds.

    At first, I suspected a classic cold-start issue. Unfortunately, after several consecutive calls, the results remain the same.

    Is Puppeteer (transitively Chrome headless) so CPU-intensive that the best '2GB' Cloud Functions class is not powerful enough to achieve the same performance as a middle-class desktop?

    Could something else explain the results I am getting? Are there any options that could help to get an execution time that is close to the local test?

    Here is the code I use:

    import * as functions from 'firebase-functions';
    import * as puppeteer from 'puppeteer';
    
    export const capture =
        functions.runWith({memory: '2GB', timeoutSeconds: 60})
            .https.onRequest(async (req, res) => {
    
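        const url = req.query.url;
    
        // Validate the input before launching a browser, and return early so
        // that no second response is attempted after the 400.
        if (!url) {
            res.status(400).send(
                'Please provide a URL. Example: ?url=https://example.com');
            return;
        }
    
        const browser = await puppeteer.launch({
            args: ['--no-sandbox']
        });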
    
        try {
            const page = await browser.newPage();
            await page.goto(url, {waitUntil: 'networkidle2'});
            const buffer = await page.screenshot({fullPage: true});
            await browser.close();
            res.type('image/png').send(buffer);
        } catch (e) {
            await browser.close();
            res.status(500).send(e.toString());
        }
    });
    

    Deployed with Firebase Functions using Node.js 8.
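
    The function above launches a new browser on every request. One mitigation often suggested for serverless environments, and not confirmed anywhere in this report, is to launch the browser once per function instance and reuse it across invocations. A minimal sketch of that idea follows; memoizeAsync is a hypothetical helper, and whether it narrows the gap to the local timings is an open assumption.

```javascript
// Hypothetical helper: wrap an async factory so that repeated calls share
// one in-flight or completed result instead of starting the work again.
function memoizeAsync(factory) {
  let cached = null;
  return function get() {
    if (cached === null) {
      cached = Promise.resolve()
        .then(factory)
        .catch((err) => {
          cached = null; // allow a retry after a failed attempt
          throw err;
        });
    }
    return cached;
  };
}
```

    In the function above this could be used as const getBrowser = memoizeAsync(() => puppeteer.launch({args: ['--no-sandbox']})); at module scope, with the handler closing only the page, not the browser, after each request.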

    chromium unconfirmed 
    opened by lpellegr 68
  • window inner size not equal to viewport size

    Tell us about your environment:

    • Puppeteer version: 0.13.0-alpha
    • Platform / OS version: macos high sierra 10.13
    • URLs (if applicable):

    What steps will reproduce the problem?

    Please include code that reproduces the issue.

    // start.js
    
    const puppeteer = require('puppeteer')
    
    const run = async () => {
      const browser = await puppeteer.launch({
        headless: false,
        slowMo: 250,
        args: [
          '--disable-infobars',
        ],
      });
    
      const page = await browser.newPage();
    
      await page.goto('https://www.google.com/');
    
      await page.waitFor(60000)
      await browser.close();
    }
    
    run()
    
    1. node start.js

    What is the expected result?

    1. The window inner size and the viewport should be equal, both at the default 800px x 600px

    What happens instead?

    1. window inner size is larger than the viewport size

    google-puppeteer.png
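
    A possible workaround in Puppeteer releases later than the 0.13.0-alpha reported here is to disable viewport emulation so that the page fills whatever window the browser opens. The sketch below only builds the launch options. headfulOptions is a hypothetical helper name, and it assumes a Puppeteer version whose launch() accepts defaultViewport: null.

```javascript
// Hypothetical helper: build launch options for a headful run in which the
// page is not emulated at 800x600 but follows the real window size.
function headfulOptions(width, height) {
  return {
    headless: false,
    // null turns off the default viewport emulation (assumed to be supported
    // by the installed Puppeteer version).
    defaultViewport: null,
    // Chromium flag that sets the outer window size, browser UI included.
    args: [`--window-size=${width},${height}`],
  };
}
```

    The result would be passed as puppeteer.launch(headfulOptions(1280, 800)). Because the flag sets the outer window size, the page area ends up somewhat smaller than the numbers given.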

    feature 
    opened by hclj37 65
  • [Bug]: Can't install

    Bug description

    Steps to reproduce the problem:

    1. pnpm i puppeteer (or npm)

    Puppeteer version

    _

    Node.js version

    16.16.0

    npm version

    NPM: 9.2.0 - PNPM: 7.22.0

    What operating system are you seeing the problem on?

    Windows

    Configuration file

    No response

    Relevant log output

    PS C:\Users\babak\Desktop\epicgames-webscraper> pnpm add puppeteer
    Packages: +76
    ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
    node_modules/.pnpm/[email protected]/node_modules/puppeteer: Running postinstall script, failed in 799ms
    .../node_modules/puppeteer postinstall$ node install.js
    │ ERROR: Failed to set up Chromium r1069273! Set "PUPPETEER_SKIP_DOWNLOAD" env variable 
    │ Error: Download failed: server returned code 403. URL: https://storage.googleapis.com/
    │     at C:\Users\babak\Desktop\epicgames-webscraper\node_modules\.pnpm\puppeteer-core@1
    │     at ClientRequest.requestCallback (C:\Users\babak\Desktop\epicgames-webscraper\node
    │     at Object.onceWrapper (node:events:642:26)
    │     at ClientRequest.emit (node:events:527:28)
    │     at HTTPParser.parserOnIncomingClient (node:_http_client:631:27)
    │     at HTTPParser.parserOnHeadersComplete (node:_http_common:128:17)
    │     at TLSSocket.socketOnData (node:_http_client:494:22)
    │     at TLSSocket.emit (node:events:527:28)
    │     at addChunk (node:internal/streams/readable:315:12)
    │     at readableAddChunk (node:internal/streams/readable:289:9)
    └─ Failed in 800ms at C:\Users\babak\Desktop\epicgames-webscraper\node_modules\.pnpm\[email protected]\node_modules\puppeteer
    Progress: resolved 76, reused 76, downloaded 0, added 0, done
     ELIFECYCLE  Command failed with exit code 1.
    
    bug 
    opened by babakfp 1
  • [Bug]:  puppeteer.launch() is taking too long

    Bug description

    When starting Puppeteer, puppeteer.launch() takes too long to execute. "Too long" means an arbitrary time of 5 minutes or more: it can be 5, 10, or 30 minutes.

    Run Puppeteer by launching the browser with these settings:

    const args = [
      '--no-sandbox',
      '--disable-setuid-sandbox',
      '--window-size=1920,1080',
      '--disable-infobars',
      '--window-position=0,0',
      '--ignore-certificate-errors',
      '--ignore-certificate-errors-spki-list',
      '--user-agent="Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/65.0.3312.0 Safari/537.36"',
    ];
    
      const browser = await puppeteer
        .launch({
          headless: true,
          args,
          timeout: 0,
          ignoreDefaultArgs: [],
          ignoreHTTPSErrors: true,
        })
        .catch((err) =>
          console.log(chalk.red('Error launching puppeteer: ', err))
        );
    

    However, if I set "headless: false", everything runs instantly.

    Puppeteer version

    19.3.0

    Node.js version

    16.18.1

    npm version

    8.19.2

    What operating system are you seeing the problem on?

    macOS

    Configuration file

    No response

    Relevant log output

    No response

    bug 
    opened by fjpedrosa 0
  • [Bug]: Always takes full screen even if ```fullPage``` options is not set as true

    Bug description

    When I call page.screenshot without any options, the resulting image is always a full (long) screenshot. This happens when I use the Chrome browser; with Chromium the problem does not occur.

    Steps to reproduce the problem:

    1. Create a script file named index.js
    let puppeteer = require('puppeteer')
    let fs = require('fs')
    
    async function ssweb(url) {
    	let browser = await puppeteer.launch({
    		headless: false,
    		executablePath: '/path/to/chrome'
    	})
    	let page = (await browser.pages())[0]
    	await page.goto(url, { waitUntil: 'networkidle0' })
    	let response = await page.screenshot()
    	await page.close()
    	return response
    }
    
    ssweb('https://www.google.com/search?q=puppeteer')
    .then(buffer => fs.writeFileSync('/path/to/image', buffer))
    .catch(console.error)
    
    2. After I run the script, the image that comes out is always full screen.
    3. I tried setting the fullPage option to true, but the result is the same. Why is that?

    Puppeteer version

    19.4.1

    Node.js version

    18.12.1

    npm version

    8.19.2

    What operating system are you seeing the problem on?

    Windows

    Configuration file

    No response

    Relevant log output

    No response

    bug 
    opened by RC047 0
  • [Regression]: Launching Chrome after Firefox breaks due to a regression in version 19.1.0

    Bug description

    During the upgrade of Puppeteer from version 19.0.0 to version 19.1.0 in Mozilla's PDF.js project (https://github.com/mozilla/pdf.js/issues/15865) we ran into the issue that our two browsers (Firefox and Chrome) didn't start correctly anymore. I found out that this is a regression and will provide a reduced test case and what I think is the location of the bug below.

    Steps to reproduce the problem:

    1. Create a file called reduced.js with the following reduced test case code:
    const puppeteer = require("puppeteer");
    
    async function startBrowser(browserName) {
      const options = {
        product: browserName,
        headless: false,
      };
      await puppeteer.launch(options);
    }
    
    async function startBrowsers() {
      for (const browserName of ["firefox", "chrome"]) {
        await startBrowser(browserName);
      }
    }
    
    startBrowsers();
    
    2. Install Puppeteer 19.0.0.
    3. Run node reduced.js and notice that Firefox and Chrome start.
    4. Install Puppeteer 19.1.0.
    5. Run node reduced.js and notice that Firefox starts but Chrome errors with "Tried to find the browser at the configured path (/home/timvandermeij/.cache/puppeteer/chrome/linux-110.0a1/chrome-linux/chrome) for revision 110.0a1, but no executable was found." Notice that Puppeteer erroneously puts the Firefox revision in the Chrome path.
    6. Switch Firefox and Chrome around in the array in the reduced test case code. Notice that Firefox and Chrome now start just fine again. Therefore, the order suddenly matters.

    I traced this back to a regression in commit https://github.com/puppeteer/puppeteer/commit/ec201744f077987b288e3dff52c0906fe700f6fb which landed in version 19.1.0. I think the bug is here: https://github.com/puppeteer/puppeteer/commit/ec201744f077987b288e3dff52c0906fe700f6fb#diff-545f1eb6b4d2825410b4be2eb6f6baa65ab44841c7ef5ff444bb01a5ece66c2aR198. The FirefoxLauncher class needs to translate the "latest" revision, which is the default for Firefox, to a concrete revision such as "110.0a1". However, this line overrides the revision globally instead of putting it in a private member variable like before, so from now on any Chrome revision that was set during the initialization of Puppeteer is overwritten with the Firefox revision. Therefore, if you start Chrome next, it will try to use the Firefox revision because in the ChromeLauncher code there is no code to set the revision back. If you reverse the browsers in the reduced test case code it doesn't happen anymore because Firefox is last, so any global this.puppeteer object mutations it does are still done, but just not used anymore.
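
    The mechanism described above can be illustrated with a simplified model. Everything in this sketch is hypothetical (the SharedConfig class, the launcher classes, and the "chrome-default" placeholder revision) and does not mirror Puppeteer's actual internals. It only shows why writing a resolved revision into shared state makes launch order matter, while keeping it in a private member does not.

```javascript
// Simplified, hypothetical model of the regression; not Puppeteer's real code.
class SharedConfig {
  constructor() {
    this.revision = 'chrome-default'; // placeholder for the configured Chrome revision
  }
}

class BuggyFirefoxLauncher {
  constructor(config) { this.config = config; }
  launch() {
    // Resolves "latest" to a concrete revision but stores it in shared state.
    this.config.revision = '110.0a1';
    return `firefox@${this.config.revision}`;
  }
}

class FixedFirefoxLauncher {
  constructor(config) { this.config = config; this.resolvedRevision = null; }
  launch() {
    // Keeps the resolved revision private to this launcher.
    this.resolvedRevision = '110.0a1';
    return `firefox@${this.resolvedRevision}`;
  }
}

class ChromeLauncher {
  constructor(config) { this.config = config; }
  launch() {
    // Reads whatever revision the shared state currently holds.
    return `chrome@${this.config.revision}`;
  }
}

// Launch Firefox first, then Chrome, as in the reduced test case.
function launchBoth(FirefoxLauncher) {
  const config = new SharedConfig();
  const firefox = new FirefoxLauncher(config).launch();
  const chrome = new ChromeLauncher(config).launch();
  return { firefox, chrome };
}
```

    In this model, launchBoth(BuggyFirefoxLauncher) yields a Chrome entry carrying the Firefox revision, while launchBoth(FixedFirefoxLauncher) leaves the Chrome revision untouched.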

    /cc @jrandolf as the author of the commit who might have an idea on how to best resolve this issue.

    Puppeteer version

    19.1.0

    Node.js version

    19.3.0

    npm version

    8.19.2

    What operating system are you seeing the problem on?

    Linux, Windows

    Configuration file

    No response

    Relevant log output

    No response

    feature confirmed 
    opened by timvandermeij 1
  • [Bug]:  net::ERR_ABORTED when request responds with 204 status

    Bug description

    Steps to reproduce the problem:

    For ease, test with a server you control. Otherwise test with a URL that you know will respond with 204 status.

    Go to a URL on that server.

    await args.page.goto('http://localhost:8090/test');

    On the server at localhost:8090, make /test respond with a 204 status.
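
    A test server for this reproduction might look like the following sketch, which uses only Node's built-in http module. The port 8090 and the /test path come from the report; the handler and startServer names are hypothetical, and the 404 fallback for other paths is an added assumption.

```javascript
const http = require('http');

// Respond to /test with 204 No Content and to anything else with 404.
function handler(req, res) {
  if (req.url === '/test') {
    res.statusCode = 204;
    res.end();
    return;
  }
  res.statusCode = 404;
  res.end('Not found');
}

// Hypothetical helper: start listening; call server.close() to stop.
function startServer(port = 8090) {
  return http.createServer(handler).listen(port);
}
```

    With startServer() running, the page.goto call above can be pointed at http://localhost:8090/test to observe the behavior.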

    Puppeteer version

    19.2.2

    Node.js version

    16.16.0

    npm version

    8.15.0

    What operating system are you seeing the problem on?

    Linux

    Configuration file

    No response

    Relevant log output

    ERROR [ExceptionsHandler] net::ERR_ABORTED at http://localhost:8090/test
    Error: net::ERR_ABORTED at http://localhost:8090/test
        at navigate (/projects/x/node_modules/puppeteer-core/src/common/Frame.ts:346:13)
        at processTicksAndRejections (node:internal/process/task_queues:96:5)
        at Frame.goto (/projects/x/node_modules/puppeteer-core/src/common/Frame.ts:310:17)
        at CDPPage.goto (/projects/x/node_modules/puppeteer-core/src/common/Page.ts:1523:12)
        at myFunc (/projects/x/src/script.ts:251:27)
    
    bug needs-feedback 
    opened by dan-scott-dev 2
  • [Bug]: When executablePath contains UNC file path for windows, chrome pops up and closes immediately.

    Bug description

    Steps to reproduce the problem:

    Launch with a UNC path:

    browser = await puppeteer.launch({
          headless: false,
          executablePath: '\\?\C:\some\path\to\chrome.exe'
    })
    
    page = (await browser.pages())[0]
    await page.goto('https://google.com', {
      waitUntil: 'networkidle0',
    });
    

    When launching this way (with \\?\), the browser window pops up very briefly and closes immediately, leaving the connection waiting (it times out after the default 30 seconds).

    If you remove the \\?\ part of the path, everything works as expected (assuming you use the correct Chrome path).
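
    One thing worth ruling out, assuming the snippet was run exactly as printed, is backslash handling in the string literal itself. In a regular JavaScript string a backslash starts an escape sequence, so the path reaching the browser launcher would differ from the one intended. The sketch below compares the literal as printed with two ways of writing the intended path. Whether this explains the reported behavior is not confirmed by the report.

```javascript
// The literal as printed in the report: escape sequences are processed, so
// "\\" becomes one backslash, "\t" becomes a tab, and "\C", "\s", "\p", "\c"
// each lose their backslash.
const asPrinted = '\\?\C:\some\path\to\chrome.exe';

// Two ways to keep every backslash of the intended path.
const raw = String.raw`\\?\C:\some\path\to\chrome.exe`;
const escaped = '\\\\?\\C:\\some\\path\\to\\chrome.exe';

console.log(JSON.stringify(asPrinted)); // shows what the literal really contains
console.log(raw === escaped); // true
```

    If the real script already escaped the backslashes and the page rendering dropped them, this check does not apply.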

    Puppeteer version

    16.2.0

    Node.js version

    14.19.0

    npm version

    6.14.16

    What operating system are you seeing the problem on?

    Windows

    Configuration file

    No response

    Relevant log output

    No response

    bug needs-feedback 
    opened by webnoob 1
Releases(puppeteer-v19.4.1)
Scriptable Headless Browser

PhantomJS - Scriptable Headless WebKit PhantomJS (phantomjs.org) is a headless WebKit scriptable with JavaScript. The latest stable release is version

Ariya Hidayat 29.1k Jan 5, 2023
End-to-end testing framework written in Node.js and using the Webdriver API

Nightwatch.js Homepage | Getting Started | Developer Guide | API Reference | About Automated end-to-end testing framework powered by Node.js and using

Nightwatch.js 11.3k Jan 7, 2023
Next-gen browser and mobile automation test framework for Node.js

Next-gen browser and mobile automation test framework for Node.js. Homepage | Developer Guide | API Reference | Contribute | Changelog | Roadmap Webdr

WebdriverIO 7.9k Jan 3, 2023
A node.js library for testing modern web applications

Taiko Docs | API reference A Node.js library for testing modern web applications What’s Taiko? Taiko is a free and open source browser automation tool

Gauge 3.2k Dec 30, 2022
Evaluate JavaScript on a URL through headless Chrome browser.

jseval Evaluate JavaScript on a URL through headless Chrome browser. build docker build -t jseval -f jseval.dockerfile . usage docker run --rm jseval

Fumiya A 22 Nov 29, 2022
Reaction is an API-first, headless commerce platform built using Node.js, React, GraphQL. Deployed via Docker and Kubernetes.

Reaction Commerce Reaction is a headless commerce platform built using Node.js, React, and GraphQL. It plays nicely with npm, Docker and Kubernetes. G

Reaction Commerce 11.9k Jan 3, 2023
Next-gen mobile first analytics server (think Mixpanel, Google Analytics) with built-in encryption supporting HTTP2 and gRPC. Node.js, headless, API-only, horizontally scaleable.

Introduction to Awacs Next-gen behavior analysis server (think Mixpanel, Google Analytics) with built-in encryption supporting HTTP2 and gRPC. Node.js

Socketkit 52 Dec 19, 2022
An HTTP Web Server for Chrome (chrome.sockets API)

An HTTP Web Server for Chrome (chrome.sockets API)

Kyle Graehl 1.2k Dec 31, 2022
The most powerful headless CMS for Node.js — built with GraphQL and React

A scalable platform and CMS to build Node.js applications. schema => ({ GraphQL, AdminUI }) Keystone Next is a preview of the next major release of Ke

KeystoneJS 7.3k Jan 4, 2023
👻 The #1 headless Node.js CMS for professional publishing

Ghost.org | Features | Showcase | Forum | Docs | Contributing | Twitter Love open source? We're hiring Node.js Engineers to work on Ghost full-time Th

Ghost 42.1k Jan 5, 2023
ApostropheCMS is a full-featured, open-source CMS built with Node.js that seeks to empower organizations by combining in-context editing and headless architecture in a full-stack JS environment.

ApostropheCMS ApostropheCMS is a full-featured, open source CMS built with Node.js that seeks to empower organizations by combining in-context editing

Apostrophe Technologies 3.9k Jan 4, 2023
🚀 Open source Node.js Headless CMS to easily build customisable APIs

API creation made simple, secure and fast. The most advanced open-source headless CMS to build powerful APIs with no effort. Try live demo Strapi is a

strapi 50.8k Dec 27, 2022
Insanely fast, full-stack, headless browser testing using node.js

Zombie.js Insanely fast, headless full-stack testing using Node.js The Bite If you're going to write an insanely fast, headless browser, how can you n

Assaf Arkin 5.6k Dec 22, 2022
Chrome-extension-react-boilerplate - Simple Chrome extension React boilerplate.

Simple Chrome extension React boilerplate This is a simple chrome extension boilerplate made in React to use as a default structure for your future pr

Younes 6 May 25, 2022
A plugin for Strapi Headless CMS that provides the ability to transform the API request or response.

strapi-plugin-transformer A plugin for Strapi that provides the ability to transform the API request and/or response. Requirements The installation re

daedalus 71 Jan 6, 2023