Puppeteer

Puppeteer is a Node library that provides a high-level API to control Chrome or Chromium over the DevTools Protocol.

Installation

npm i puppeteer

If a 'Failed to download Chromium' error occurs during Puppeteer installation, use the following installation command instead, which downloads Chromium from a mirror host:

PUPPETEER_DOWNLOAD_HOST=https://storage.googleapis.com.cnpmjs.org npm i puppeteer
  • Installing puppeteer downloads a recent version of Chromium.

  • To skip the Chromium download, set the PUPPETEER_SKIP_CHROMIUM_DOWNLOAD environment variable.

Environment Variables

Puppeteer reads a set of environment variables that adjust how it operates, for example PUPPETEER_DOWNLOAD_HOST and PUPPETEER_SKIP_CHROMIUM_DOWNLOAD.

puppeteer-core

puppeteer-core is a version of Puppeteer that doesn't download Chromium by default.

npm i puppeteer-core
  • puppeteer-core is intended to be a lightweight version of Puppeteer for launching an existing browser installation or for connecting to a remote one

  • Make sure the installed version of puppeteer-core is compatible with the browser you intend to connect to
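As a rough illustration of connecting to a remote browser, the sketch below uses puppeteer.connect() with the browserWSEndpoint option. The helper name getRemoteBrowserVersion is hypothetical, the module is passed in as a parameter so the function itself does not load anything, and the WebSocket endpoint shown in the usage comment is a placeholder.

```javascript
// Minimal sketch (hypothetical helper): connect to an already running browser,
// read its version, then detach again.
async function getRemoteBrowserVersion(puppeteerCore, browserWSEndpoint) {
  const browser = await puppeteerCore.connect({ browserWSEndpoint });
  try {
    // browser.version() reports the browser name and version string.
    return await browser.version();
  } finally {
    // disconnect() detaches Puppeteer but leaves the remote browser running.
    await browser.disconnect();
  }
}

// Usage (placeholder endpoint, not a real address):
// const version = await getRemoteBrowserVersion(
//   require('puppeteer-core'),
//   'ws://127.0.0.1:9222/devtools/browser/<id>'
// );
```

Calling disconnect() rather than close() is a deliberate choice here: the remote browser may be shared, so the sketch avoids shutting it down.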

puppeteer vs puppeteer-core

  • puppeteer is a product for browser automation. When installed, it downloads a version of Chromium
  • puppeteer-core is a library to help drive anything that supports DevTools protocol. puppeteer-core doesn't download Chromium when installed.

Where to use puppeteer-core

  • You're building another end-user product or library on top of the DevTools protocol. For example, one might build a PDF generator using puppeteer-core and write a custom install.js script that downloads headless_shell instead of Chromium to save disk space.

  • You're bundling Puppeteer for use in a Chrome extension or a browser that supports the DevTools protocol, where downloading an additional Chromium binary is unnecessary.

  • You're building a set of tools where puppeteer-core is one of the ingredients and you want to postpone install.js script execution until Chromium is about to be used.

Usage

Require

require puppeteer

const puppeteer = require('puppeteer');

require puppeteer-core

const puppeteer = require('puppeteer-core');

Note:

If you require 'puppeteer-core', you must then call either puppeteer.connect([options]), or puppeteer.launch([options]) with an explicit executablePath option:

const browser = await puppeteer.launch({executablePath: '/path/to/Chrome'});

On a Windows machine, the path is typically:

await puppeteer.launch({
  executablePath: 'C:/Program Files (x86)/Google/Chrome/Application/chrome.exe'
});
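Because the executablePath differs per operating system, one possible approach is a small lookup helper. This is only a sketch: resolveChromePath and the CHROME_PATH environment variable are hypothetical names, and the macOS and Linux paths are assumed typical install locations that may not match a given machine.

```javascript
// Typical Chrome install locations. These are assumptions; verify them locally.
const TYPICAL_CHROME_PATHS = {
  win32: 'C:/Program Files (x86)/Google/Chrome/Application/chrome.exe',
  darwin: '/Applications/Google Chrome.app/Contents/MacOS/Google Chrome',
  linux: '/usr/bin/google-chrome',
};

// Hypothetical helper: prefer an explicit override, else fall back to the table.
function resolveChromePath(platform = process.platform, override = process.env.CHROME_PATH) {
  if (override) return override;
  const chromePath = TYPICAL_CHROME_PATHS[platform];
  if (!chromePath) {
    throw new Error(`No typical Chrome path known for platform: ${platform}`);
  }
  return chromePath;
}

// Usage with puppeteer-core:
// const browser = await puppeteer.launch({ executablePath: resolveChromePath() });
```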

puppeteer.launch(options)

Launches a browser instance. The optional options argument is a set of configurable settings for the browser.

const browser = await puppeteer.launch();

Open new page

const page = await browser.newPage();

Go To URL

await page.goto('http://xyz.abc');

Take screenshot

await page.screenshot({path: '<image_name>.<extension>'});

page.setViewport()

Puppeteer sets the initial page size to 800×600px, which defines the screenshot size. The page size can be customized with page.setViewport().

await page.setViewport({
  width: 640,
  height: 480,
  deviceScaleFactor: 1,
});

Create PDF

await page.pdf({path: '<file_name>.pdf', format: 'A4'});

Page.evaluate()

// Get the "viewport" of the page, as reported by the page.
const dimensions = await page.evaluate(() => {
  return {
    width: document.documentElement.clientWidth,
    height: document.documentElement.clientHeight,
    deviceScaleFactor: window.devicePixelRatio
  };
});
console.log('Dimensions:', dimensions);
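The function given to page.evaluate() runs inside the page, not in Node.js, so values from the Node.js side have to be passed as extra arguments. The following is a minimal sketch of that pattern; getTextContent is a hypothetical helper name and 'h1' in the usage comment is just an example selector.

```javascript
// Hypothetical helper: read the text of the first element matching a selector.
async function getTextContent(page, selector) {
  // The callback executes in the browser context, so it cannot see Node.js
  // variables. `selector` is therefore passed as an explicit argument.
  return page.evaluate((sel) => {
    const element = document.querySelector(sel);
    return element ? element.textContent : null;
  }, selector);
}

// Usage:
// const heading = await getTextContent(page, 'h1');
```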

Close Browser

await browser.close();
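The individual calls above use await, so they need to run inside an async function. A minimal sketch that strings them together might look like this; captureScreenshot is a hypothetical name, and the module is passed in as a parameter so the same function works with either puppeteer or puppeteer-core.

```javascript
// Hypothetical helper: launch, open a page, navigate, take a screenshot, close.
async function captureScreenshot(puppeteer, url, outputPath, launchOptions = {}) {
  const browser = await puppeteer.launch(launchOptions);
  try {
    const page = await browser.newPage();
    await page.setViewport({ width: 640, height: 480, deviceScaleFactor: 1 });
    await page.goto(url);
    await page.screenshot({ path: outputPath });
    return outputPath;
  } finally {
    // Always close the browser, even if navigation or the screenshot fails.
    await browser.close();
  }
}

// Usage:
// captureScreenshot(require('puppeteer'), 'http://xyz.abc', 'page.png')
//   .then((file) => console.log('Saved', file));
```

Wrapping the work in try/finally is one way to avoid leaving a browser process behind when a step throws.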

Differences between Chromium and Chrome

  • Chromium is an open-source browser project that forms the basis for the Chrome web browser.
  • When Google first introduced Chrome back in 2008, they also released the Chromium source code on which Chrome was based as an open-source project. That open-source code is maintained by the Chromium Project, while Chrome itself is maintained by Google.

Resources

Debugging

  1. Turn off headless mode - sometimes it's useful to see what the browser is displaying. Instead of launching in headless mode, launch a full version of the browser using headless: false:

    const browser = await puppeteer.launch({headless: false});
  2. Slow it down - the slowMo option slows down Puppeteer operations by the specified number of milliseconds. It's another way to help see what's going on.

    const browser = await puppeteer.launch({
      headless: false,
      slowMo: 250 // slow down by 250ms
    });
  3. Capture console output - You can listen for the console event. This is also handy when debugging code in page.evaluate()

    page.on('console', msg => console.log('PAGE LOG:', msg.text()));
    await page.evaluate(() => console.log(`url is ${location.href}`));
  4. Use debugger in the application code (browser)

    There are two execution contexts: Node.js, which runs the test code, and the browser, which runs the application code being tested. This lets you debug code in the browser context, i.e. code inside evaluate().

    • Use {devtools: true} when launching Puppeteer:

      const browser = await puppeteer.launch({devtools: true});
    • Change default test timeout:

      • jest: jest.setTimeout(100000);

      • jasmine: jasmine.DEFAULT_TIMEOUT_INTERVAL = 100000;

      • mocha: this.timeout(100000);

    • Add an evaluate statement with debugger inside / add debugger to an existing evaluate statement:

      await page.evaluate(() => {debugger;});
  5. Use debugger in Node.js

    • Add debugger;

      debugger;
      await page.click('a[target=_blank]');
    • Set headless to false

    • Run node --inspect-brk, e.g. node --inspect-brk node_modules/.bin/jest tests

    • In Chrome open chrome://inspect/#devices and click inspect

    • In the newly opened test browser, press F8 to resume test execution

    • Now your debugger will be hit and you can debug in the test browser

  6. Enable verbose logging - internal DevTools protocol traffic will be logged via the debug module under the puppeteer namespace. Set the DEBUG environment variable to turn it on, e.g. DEBUG="puppeteer:*".

  7. Debug your Puppeteer (node) code easily, using ndb

    • npm install -g ndb (or even better, use npx!)

    • add a debugger; statement to your Puppeteer (node) code

    • add ndb (or npx ndb) before your test command. For example:

      ndb jest or ndb mocha (or npx ndb jest / npx ndb mocha)

    • debug your test inside Chromium
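The launch-related tips above (headless: false, slowMo, and devtools: true) can be combined behind a single switch. The sketch below is one possible way to do that; buildLaunchOptions and the PPTR_DEBUG environment variable in the usage comment are hypothetical names, not part of Puppeteer.

```javascript
// Hypothetical helper: return debugging-friendly launch options when `debug`
// is true, and an empty object (Puppeteer defaults) otherwise.
function buildLaunchOptions({ debug = false, slowMoMs = 250 } = {}) {
  if (!debug) return {};
  return {
    headless: false,  // show the full browser window
    slowMo: slowMoMs, // slow each operation down by this many milliseconds
    devtools: true,   // open DevTools so debugger; statements in evaluate() are hit
  };
}

// Usage:
// const browser = await puppeteer.launch(
//   buildLaunchOptions({ debug: process.env.PPTR_DEBUG === '1' })
// );
```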