API and CLI tool to fetch and query Chrome DevTools heap snapshots.

Overview

Puppeteer Heap Snapshot

Capture heap snapshots and query the snapshot for objects matching a set of properties. Read more about it in this blog post.

Install

Install via npm/yarn.

$ npm install puppeteer-heap-snapshot

API

captureHeapSnapshot(target: Puppeteer.Target): HeapSnapshot

Capture a heap snapshot from a Puppeteer page target (obtained via await page.target()). Warning: the snapshot can grow very large (1 MB to 250 MB or more), so make sure you have enough memory available.

Example:

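const Puppeteer = require("puppeteer");
const { captureHeapSnapshot } = require("puppeteer-heap-snapshot");

const browser = await Puppeteer.launch();
const page = await browser.newPage();

// Navigate first so the page's objects are on the heap when the snapshot is taken.
await page.goto("https://google.com");

const heapSnapshot = await captureHeapSnapshot(await page.target());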

findObjectsWithProperties(heapSnapshot: HeapSnapshot, properties: string[], options?: { ignoreProperties?: string[] }): BuiltHeapValue[]

Find objects in the heap snapshot that include a set of properties. This operation traverses the large heap snapshot graph, so it is computationally expensive and may be slow. To improve performance, you can list properties in ignoreProperties; they are skipped during traversal and serialization and will not be present on the result.

const objects = findObjectsWithProperties(heapSnapshot, ["foo", "bar"]);

// objects = [{
//   "foo": true,
//   "bar": false,
//   "baz": "lorem ipsum"
// }]

Warning: this code is not optimized and may be slow if many objects match your query and/or the matching objects are large.
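
To make the matching rule and the effect of ignoreProperties concrete, here is a minimal sketch that works on plain JavaScript objects instead of a real heap snapshot. The matchObjects function and its sample data are hypothetical stand-ins written for illustration; they are not part of the library and do not reflect how it walks the snapshot graph.

```javascript
// Hypothetical stand-in for the query semantics, on plain objects only.
// Keeps every candidate that has all requested properties, then drops
// any keys listed in options.ignoreProperties from the returned copies.
function matchObjects(candidates, properties, options = {}) {
  const ignored = new Set(options.ignoreProperties ?? []);
  return candidates
    .filter((obj) =>
      properties.every((prop) => Object.prototype.hasOwnProperty.call(obj, prop))
    )
    .map((obj) =>
      Object.fromEntries(Object.entries(obj).filter(([key]) => !ignored.has(key)))
    );
}

// Sample data, assumed for illustration.
const candidates = [
  { foo: true, bar: false, baz: "lorem ipsum" },
  { foo: 1 }, // lacks "bar", so it does not match
];

const matches = matchObjects(candidates, ["foo", "bar"], {
  ignoreProperties: ["baz"],
});

console.log(matches); // [ { foo: true, bar: false } ]
```

With the real API, the equivalent call would pass the same options object as the third argument to findObjectsWithProperties, as shown in the signature above.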

findObjectWithProperties(heapSnapshot: HeapSnapshot, properties: string[], options?: { ignoreProperties?: string[] }): BuiltHeapValue[]

Identical to findObjectsWithProperties, except that it throws an error if no object is found or if more than one object is found.
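
The "exactly one match" contract can be sketched as a small guard over a list of results. The expectSingleMatch helper, its return value, and its error messages are assumptions made for this example; the library's own implementation and error text may differ.

```javascript
// Hypothetical helper illustrating the "exactly one match" rule.
// Throws when the list is empty or has more than one entry;
// otherwise returns the single entry.
function expectSingleMatch(matches) {
  if (matches.length === 0) {
    throw new Error("No object found");
  }
  if (matches.length > 1) {
    throw new Error(`Expected one object, found ${matches.length}`);
  }
  return matches[0];
}

// Usage: one match passes through, zero matches throws.
console.log(expectSingleMatch([{ foo: true }]));

try {
  expectSingleMatch([]);
} catch (error) {
  console.log(error.message);
}
```

Because the strict variant throws, callers would typically wrap it in try/catch when a page may legitimately contain zero or several matching objects.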

CLI

This package comes with a small CLI that allows you to fetch heap snapshots for URLs and run queries on them.

$ npx puppeteer-heap-snapshot --help
Usage: puppeteer-heap-snapshot [options] [command]

Options:
  --debug               Enable debug mode (non-headless Chrome, debug logging)
  --no-headless         Do not run Chrome in headless mode
  -w, --wait <timeout>  Add a wait time before taking a heap snapshot (default: "10000")
  -h, --help            display help for command

Commands:
  fetch [options]       fetch a heap snapshot for a URL and write to a file
  query [options]       fetch/read a heap snapshot and output the matching objects in JSON
  help [command]        display help for command

For example, fetch from a URL and output matching objects:

$ npx puppeteer-heap-snapshot query -u https://www.instagram.com/p/CVEJmFTgdRw/ -p video_view_count,video_play_count,shortcode,video_url --no-headless | jq .
>> Opening Puppeteer page at: https://www.instagram.com/p/CVEJmFTgdRw/
>> Taking heap snapshot..
[
  {
    "__typename": "GraphVideo",
    "id": "2685313477274358896",
    "shortcode": "CVEJmFTgdRw",
    "dimensions": {
      "height": 1138,
      "width": 640
    },
    "has_audio": true,
    "video_url": "https://scontent-dub4-1.cdninstagram.com/v/t50.2886-16/245967496_255835479890012_5087347215509320349_n.mp4?efg=eyJ2ZW5jb2RlX3RhZyI6InZ0c192b2RfdXJsZ2VuLjY0MC5jbGlwcy5iYXNlbGluZSIsInFlX2dyb3VwcyI6IltcImlnX3dlYl9kZWxpdmVyeV92dHNfb3RmXCJdIn0&_nc_ht=scontent-dub4-1.cdninstagram.com&_nc_cat=108&_nc_ohc=rWf9nUMf15MAX-SnjOC&edm=AABBvjUBAAAA&vs=1007318140116355_4062183883&_nc_vs=HBksFQAYJEdJZ3FxUTVjV09aV3J1Z0FBSjF5Z2ExVzQ1bEdicV9FQUFBRhUAAsgBABUAGCRHSnZtbmc0SUxSVkZlNDBLQU54WGpxeTVyR2M4YnFfRUFBQUYVAgLIAQAoABgAGwAVAAAm6q%2FRme3ItUAVAigCQzMsF0A35mZmZmZmGBJkYXNoX2Jhc2VsaW5lXzFfdjERAHX%2BBwA%3D&ccb=7-4&oe=626815EC&oh=00_AT_n_BkYsvtICC3t_C2HlRaILWv4xsqZAjcZKcRoR36fng&_nc_sid=83d603",
    "video_view_count": 1728940,
    "video_play_count": 3612084,
    "is_video": true,
    "tracking_token": "eyJ2ZXJzaW9uIjo1LCJwYXlsb2FkIjp7ImlzX2FuYWx5dGljc190cmFja2VkIjp0cnVlLCJ1dWlkIjoiYjNmNGRlYjAxMzk1NGZhM2FmNmQ1OWY1YTUwYzEzZmEyNjg1MzEzNDc3Mjc0MzU4ODk2In0sInNpZ25hdHVyZSI6IiJ9",
    "upcoming_event": null,
    "edge_media_to_tagged_user": {
      "edges": []
    },
    "edge_media_to_caption": {
      "edges": [
        {
          "node": {
            "created_at": "1634334356",
            "text": "You can feel the pain through that facial expression!\n @jago.artist\nRome, Italy"
          }
        }
      ]
    },
    "can_see_insights_as_brand": false,
    "caption_is_edited": false,
    "has_ranked_comments": false,
    "like_and_view_counts_disabled": false,
    "comments_disabled": false,
    "commenting_disabled_for_viewer": false,
    "taken_at_timestamp": 1634334355,
    "edge_media_preview_like": {
      "count": 166233,
      "edges": []
    },
    "edge_media_to_sponsor_user": {
      "edges": []
    },
    "is_affiliate": false,
    "is_paid_partnership": false,
    "location": null,
    "nft_asset_info": null,
    "viewer_has_liked": false,
    "viewer_has_saved": false,
    "viewer_has_saved_to_collection": false,
    "viewer_in_photo_of_you": false,
    "viewer_can_reshare": true,
    "owner": {
      "id": "303273692",
      "is_verified": false,
      "profile_pic_url": "https://scontent-dub4-1.cdninstagram.com/v/t51.2885-19/277325903_668349461072817_8676852949764101515_n.jpg?stp=dst-jpg_s150x150&_nc_ht=scontent-dub4-1.cdninstagram.com&_nc_cat=1&_nc_ohc=kh9ga1KrRAMAX9grvVd&edm=AABBvjUBAAAA&ccb=7-4&oh=00_AT_cEiCoW8MI44lLvf9UAyzlx0oFE2nOBKb1fz5egVb36g&oe=626CCED0&_nc_sid=83d603",
      "username": "earthpix",
      "blocked_by_viewer": false,
      "restricted_by_viewer": null,
      "followed_by_viewer": false,
      "full_name": "  EarthPix  ",
      "has_blocked_viewer": false,
      "is_embeds_disabled": false,
      "is_private": false,
      "is_unpublished": false,
      "requested_by_viewer": false,
      "pass_tiering_recommendation": true,
      "edge_owner_to_timeline_media": {
        "count": 8690
      },
      "edge_followed_by": {
        "count": 23020451
      }
    },
    "is_ad": false,
    "coauthor_producers": [],
    "pinned_for_users": [],
    "encoding_status": null,
    "is_published": true,
    "product_type": "clips",
    "title": "",
    "video_duration": 23.9,
    "thumbnail_src": "https://scontent-dub4-1.cdninstagram.com/v/t51.2885-15/245961614_4361781063908112_409992614002041515_n.jpg?stp=c0.249.640.640a_dst-jpg_e35&_nc_ht=scontent-dub4-1.cdninstagram.com&_nc_cat=100&_nc_ohc=n88eii2iM2gAX_UthWA&edm=AABBvjUBAAAA&ccb=7-4&oh=00_AT84-O-4_gUIoKKa2IfsGy4eiw3jCbO09oi4rLA5P_1Nvw&oe=6267DE3F&_nc_sid=83d603",
    // ... <snip>
  }
]
Comments
  • Wait for selector instead of timeout

    I think waiting a set amount of time before capturing a snapshot is not ideal.

    Instead I think using Page.waitForSelector() might work better. This means that once the data is loaded and rendered you can capture it without wasting any time. To use this option there'd be a flag like --selector (and perhaps -s) that is passed to the function above directly.

    I would be happy to try implementing this but before I would like some opinions on some details:

    1. Should there also be an option for Page.waitForXPath()? (--xpath?)
    2. Timeout
      1. Should it time out at some point? After 10s or more?
      2. Should it throw on timeout?

    Personally, I don't see the need for XPath.
    I think it should time out, but maybe after 30s or so by default, and be overridable with --wait.
    I'm not too sure about throwing, but it might be beneficial to know that it timed out.

    opened by melusc 0
  • TypeError: Cannot read properties of undefined (reading 'graph')

    Run:

    npx puppeteer-heap-snapshot query -f './heap.heapsnapshot' -p full_name --debug

    and got:

    TypeError: Cannot read properties of undefined (reading 'graph')
        at compileGraphNodeObject (/usr/local/lib/node_modules/puppeteer-heap-snapshot/dist/cjs/src/build-object.js:88:11)
        at /usr/local/lib/node_modules/puppeteer-heap-snapshot/dist/cjs/src/build-object.js:68:79
        at Array.reduce (<anonymous>)
        at compileGraphNodeObject (/usr/local/lib/node_modules/puppeteer-heap-snapshot/dist/cjs/src/build-object.js:65:26)
        at /usr/local/lib/node_modules/puppeteer-heap-snapshot/dist/cjs/src/build-object.js:68:79
        at Array.reduce (<anonymous>)
        at compileGraphNodeObject (/usr/local/lib/node_modules/puppeteer-heap-snapshot/dist/cjs/src/build-object.js:65:26)
        at /usr/local/lib/node_modules/puppeteer-heap-snapshot/dist/cjs/src/build-object.js:68:79
        at Array.reduce (<anonymous>)
        at compileGraphNodeObject (/usr/local/lib/node_modules/puppeteer-heap-snapshot/dist/cjs/src/build-object.js:65:26)
    

    Heap snapshot https://drive.google.com/file/d/1BHiPsK4bVA-v0vPFvS-2siFrHWFyWS6Y/view?usp=sharing

    opened by ivorpad 0
  • [Error: EISDIR: illegal operation on a directory, open 'C:\puppeteer-heap-snapshot']

    Hi, I'm testing with Win10 and Node and got this error.

    [Error: EISDIR: illegal operation on a directory, open 'C:\puppeteer-heap-snapshot']

    How can I fix this?

    opened by angelorubin 0
  • feat: add options to set depth and to skip unwanted node names

    This PR brings the following changes:

    1. extends the options object that can be passed to query (both from the CLI and when used as a dependency) with a depth value, to explicitly limit recursion (CLI: depth / prop: maxDepth), and a list of unwanted node names^ (CLI: exclude / prop: unwantedNodeNames);
    2. enables stricter TS rules;
    3. other minor changes.

    ^ These changes can help in situations like the one described in #1.

    Example of usage: npx puppeteer-heap-snapshot.js query -u https://polypane.app/css-specificity-calculator/ -p href -e Location,HTMLAnchorElement,URL,HTMLLinkElement,SVGGradientElement,SVGFilterElement -d 1.

    opened by Nedgeva 0
  • Unknown or unsupported object with type 'Location'

    Hi,

    I am trying to capture hrefs from the website below:

    https://www.sarenza.com

    with the following code snippet:

    const Puppeteer = require("puppeteer");
    const { captureHeapSnapshot, findObjectsWithProperties } = require("puppeteer-heap-snapshot");
    
    const start = async () => {
        const browser = await Puppeteer.launch();
        const page = await browser.newPage();
    
        await page.goto("https://www.sarenza.com");
    
        let heapSnapshot = await captureHeapSnapshot(await page.target());
    
        console.log('heapSnapshot:', findObjectsWithProperties(heapSnapshot, ['href']));
    }
    
    start();
    

    I got this error:

    (node:38964) UnhandledPromiseRejectionWarning: Error: Unknown or unsupported object with type 'Location'
        at compileGraphNodeObject (C:\ws\white-label\code\test\pupeteer-heap-snapshot\node_modules\puppeteer-heap-snapshot\dist\cjs\src\build-object.js:75:19) 
        at buildObjectFromNodeId (C:\ws\white-label\code\test\pupeteer-heap-snapshot\node_modules\puppeteer-heap-snapshot\dist\cjs\src\build-object.js:34:12)  
        at C:\ws\white-label\code\test\pupeteer-heap-snapshot\node_modules\puppeteer-heap-snapshot\dist\cjs\src\query.js:16:57
        at Array.map (<anonymous>)
        at findObjectsWithProperties (C:\ws\white-label\code\test\pupeteer-heap-snapshot\node_modules\puppeteer-heap-snapshot\dist\cjs\src\query.js:14:20)     
        at start (C:\ws\white-label\code\test\pupeteer-heap-snapshot\index.js:16:34)
        at processTicksAndRejections (internal/process/task_queues.js:95:5)
    (Use `node --trace-warnings ...` to show where the warning was created)
    (node:38964) UnhandledPromiseRejectionWarning: Unhandled promise rejection. This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). To terminate the node process on unhandled promise rejection, use the CLI flag `--unhandled-rejections=strict` (see https://nodejs.org/api/cli.html#cli_unhandled_rejections_mode). (rejection id: 1)
    (node:38964) [DEP0018] DeprecationWarning: Unhandled promise rejections are deprecated. In the future, promise rejections that are not handled will terminate the Node.js process with a non-zero exit code.
    

    Thanks for replies.

    opened by yohikofox 4
Releases(v0.1.7)
Owner
Adrian Cooney