An ultra-high performance stream reader for browser and Node.js

QuickReader

An ultra-high-performance stream reader for the browser and Node.js: easy to use, with zero dependencies.

Install

npm i quickreader

Demo

import {QuickReader, A} from 'quickreader'

const res = await fetch('https://unpkg.com/quickreader-demo/demo.bin')
const stream = res.body   // ReadableStream
const reader = new QuickReader(stream)

do {
  const id   = reader.u32() ?? await A
  const name = reader.txt() ?? await A
  const age  = reader.u8()  ?? await A

  console.log(id, name, age)

} while (!reader.eof)

https://jsbin.com/loyuxad/edit?html,console

With a stream reader, you can parse data in a specified format while it is still downloading, which improves the user experience. You don't have to slice and buffer chunks yourself; the reader does all of that.

Without it, you have to wait for all the data to be downloaded before you can read it (e.g., via DataView). Since JS has no struct types, you also have to pass an explicit offset to every read, which is inconvenient.
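For comparison, here is a minimal sketch of the manual DataView approach. The record layout (little-endian u32 id, NUL-terminated UTF-8 name, u8 age) is an assumption chosen to mirror the order of reads in the demo; encodeRecord and decodeRecord are hypothetical helpers, not part of QuickReader.

```javascript
// Builds a sample record: u32 id (little-endian), NUL-terminated name, u8 age.
function encodeRecord(id, name, age) {
  const nameBytes = new TextEncoder().encode(name)
  const bytes = new Uint8Array(4 + nameBytes.length + 1 + 1)
  const view = new DataView(bytes.buffer)
  view.setUint32(0, id, true)
  bytes.set(nameBytes, 4)
  // The byte after the name is already 0, which serves as the terminator.
  view.setUint8(4 + nameBytes.length + 1, age)
  return bytes
}

// Manual parsing: the whole record must be in memory, and every read
// needs an explicit offset that is advanced by hand.
function decodeRecord(bytes) {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength)
  let offset = 0

  const id = view.getUint32(offset, true)
  offset += 4

  const end = bytes.indexOf(0, offset)
  const name = new TextDecoder().decode(bytes.subarray(offset, end))
  offset = end + 1

  const age = view.getUint8(offset)
  offset += 1

  return {id, name, age}
}
```

Each field costs two statements (read, then advance), and nothing can be decoded until the full buffer has arrived.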

Why Quick

We use two tricks to improve performance:

  • selective await

  • synchronized EOF

selective await

The overhead of await is considerable. Here is a test:

let s1 = 0, s2 = 0

console.time('no-await')
for (let i = 0; i < 1e7; i++) {
  s1 += i
}
console.timeEnd('no-await')   // ~15ms

console.time('await')
for (let i = 0; i < 1e7; i++) {
  s2 += await i
}
console.timeEnd('await')      // Chrome: ~800ms, Safari: ~3000ms

https://jsbin.com/gehazin/edit?html,output

The two loops do the same work, but the await version is 50x to 200x slower than the no-await version. On Chrome it is even ~2000x slower when the console is open (only ~500 awaits per ms).

This test may seem artificial, but in practice we sometimes call await heavily in logic that is almost always synchronous, such as an async query function that usually hits an in-memory cache and returns immediately.

async function query(key) {
  if (cacheMap.has(key)) {
    return ...  // 99.9%
  }
  await ...
}

Reading data from a stream has the same issue. A single integer or a short string takes only a few bytes, so in most cases it can be read directly from the buffer without any I/O call, and await is unnecessary. await is only needed when the buffer does not hold enough data.

If await is called only when needed, the overhead can be reduced many times over.

console.time('selective-await')
for (let i = 0; i < 1e7; i++) {
  const value = (i % 1000)  // buffer enough?
    ? i         // 99.9%
    : await i   //  0.1%
}
console.timeEnd('selective-await')  // ~40ms 🚀
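The same pattern can be applied to the cached query function shown earlier. This is a minimal sketch: queryCached, queryAsync, and lookup are hypothetical names, and the slow path fakes its result instead of doing real I/O.

```javascript
const cacheMap = new Map([['a', 1]])

// Fast path: synchronous, returns undefined on a cache miss.
function queryCached(key) {
  return cacheMap.get(key)
}

// Hypothetical slow path, standing in for a real network or disk lookup.
async function queryAsync(key) {
  await null                 // stand-in for real I/O
  const value = key.length   // fake result, for illustration only
  cacheMap.set(key, value)
  return value
}

// await runs only on a cache miss.
async function lookup(keys) {
  const out = []
  for (const key of keys) {
    out.push(queryCached(key) ?? await queryAsync(key))
  }
  return out
}
```

Note that this only works when undefined and null are never valid cached values, since ?? treats both as a miss.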

When QuickReader's buffer holds enough data, a read returns the result immediately; otherwise it returns nothing (undefined), and the result can be obtained with await A.

function readBytes(len) {
  if (len < availableLen) {
    return buf.subarray(offset, offset + len)   // likely
  }
  A = readBytesAsync(len)
}

async function readBytesAsync(len) {
  await stream.read()
  ...
}

The calling logic can be simplified into one line using the nullish coalescing operator:

result = readBytes(10) ?? await A

This is both fast and easy to use.

Note: In the real code, A is not a global variable. It is an imported object that implements the thenable interface, and you can rename it on import.
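To make the thenable idea concrete, here is a minimal sketch of how such an object could work. This is not QuickReader's actual source: TinyReader, pending, and u8Async are hypothetical, the input is an in-memory array of non-empty chunks rather than a real stream, and unlike QuickReader this sketch does not keep a byte in reserve.

```javascript
// Promise created by the most recent slow-path read.
let pending = null

// A thenable: `await A` calls A.then(), which forwards to that promise.
const A = {
  then(resolve, reject) {
    return pending.then(resolve, reject)
  },
}

class TinyReader {
  constructor(chunks) {
    this.iter = chunks[Symbol.iterator]()
    this.buf = new Uint8Array(0)
    this.offset = 0
  }

  u8() {
    if (this.offset < this.buf.length) {
      return this.buf[this.offset++]   // fast path: synchronous result
    }
    pending = this.u8Async()           // slow path: returns undefined
  }

  async u8Async() {
    await null                         // stand-in for real I/O
    const {value, done} = this.iter.next()
    if (done) throw new Error('unexpected end of stream')
    this.buf = value
    this.offset = 0
    return this.buf[this.offset++]
  }
}

async function readThree() {
  const reader = new TinyReader([Uint8Array.of(1, 2), Uint8Array.of(3)])
  const out = []
  for (let i = 0; i < 3; i++) {
    out.push(reader.u8() ?? await A)
  }
  return out
}
```

Because A is a single shared object, each `await A` has to run before the next slow-path read overwrites pending.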

synchronized EOF

QuickReader always keeps at least 1 byte in its buffer. For example, when 4 bytes are available and reader.u32() is called, the result is not returned immediately but requires await. The buffer is not fully drained until the stream is closed.

This way, the EOF state can be checked synchronously, with nearly zero overhead.
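A minimal sketch of this idea, under stated assumptions: EofReader and u8Async are hypothetical names, chunks are assumed to be non-empty, and the slow path is awaited directly instead of going through A to keep the example short. It is not QuickReader's actual implementation.

```javascript
class EofReader {
  constructor(source) {
    this.iter = source[Symbol.asyncIterator]()
    this.buf = new Uint8Array(0)
    this.offset = 0
    this.closed = false
  }

  // Synchronous: true only once the stream has ended and the buffer is empty.
  get eof() {
    return this.closed && this.offset === this.buf.length
  }

  // Fast path: succeeds only if one byte stays in reserve, or the stream ended.
  u8() {
    const avail = this.buf.length - this.offset
    if (avail > 1 || (this.closed && avail === 1)) {
      return this.buf[this.offset++]
    }
  }

  // Slow path: pull chunks until 2 bytes are buffered or the stream ends.
  async u8Async() {
    while (!this.closed && this.buf.length - this.offset < 2) {
      const {value, done} = await this.iter.next()
      if (done) {
        this.closed = true
        break
      }
      const rest = this.buf.subarray(this.offset)
      const merged = new Uint8Array(rest.length + value.length)
      merged.set(rest)
      merged.set(value, rest.length)
      this.buf = merged
      this.offset = 0
    }
    if (this.offset === this.buf.length) throw new Error('unexpected end of stream')
    return this.buf[this.offset++]
  }
}

async function* sampleChunks() {
  yield Uint8Array.of(1, 2)
  yield Uint8Array.of(3)
}

async function readAllBytes() {
  const reader = new EofReader(sampleChunks())
  const out = []
  do {
    out.push(reader.u8() ?? await reader.u8Async())
  } while (!reader.eof)
  return out
}
```

The last buffered byte is only handed out after the reader has learned whether more data follows, so eof is always accurate without an extra asynchronous check.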

Node.js

Node.js streams are also supported:

import * as fs from 'node:fs'
import {QuickReader} from 'quickreader'

const stream = fs.createReadStream('/path/to/file')
const reader = new QuickReader(stream)
// ...

More broadly, any AsyncIterable<Uint8Array> can be passed as a stream.

Node.js readable streams implement AsyncIterable, and Buffer is a subclass of Uint8Array.
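As an illustration of what an AsyncIterable<Uint8Array> source can look like, here is a small sketch built from an async generator. The names chunked and totalLength are hypothetical. Based on the statement above, a generator like this could presumably be passed to new QuickReader(...), but the sketch only consumes it with for await so that it runs on its own.

```javascript
// Splits a byte array into fixed-size chunks, yielded asynchronously.
async function* chunked(bytes, size) {
  for (let i = 0; i < bytes.length; i += size) {
    await null   // stand-in for real I/O latency
    yield bytes.subarray(i, i + size)
  }
}

// Any consumer of AsyncIterable<Uint8Array> can read it with for await.
async function totalLength(source) {
  let total = 0
  for await (const chunk of source) {
    total += chunk.length
  }
  return total
}
```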

API

By length

  • bytes(len: number) : Uint8Array

  • skip(len: number) : number

  • txtNum(len: number) : string

By delimiter

  • bytesTo(delim: number) : Uint8Array

  • skipTo(delim: number) : number

  • txtTo(delim: number) : string

Helper

  • txt() : string - Equivalent to txtTo(0).

  • txtLn() : string - Equivalent to txtTo(10).

Numbers

  • {u, i}{8, 16, 32, 16be, 32be}() : number

  • {u, i}{64, 64be}() : bigint

  • f{32, 64, 32be, 64be}() : number

More: index.d.ts
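To illustrate the delimiter-based reads, here is an in-memory sketch of what a call such as txtTo(10) is assumed to do. The assumptions are that the delimiter is consumed but excluded from the result and that text is decoded as UTF-8. The helper txtToFrom is hypothetical and operates on a complete Uint8Array, not on a stream; check index.d.ts for the real behavior.

```javascript
// Hypothetical helper: reads text up to `delim`, returns it with the next offset.
function txtToFrom(bytes, offset, delim) {
  const end = bytes.indexOf(delim, offset)
  if (end === -1) throw new Error('delimiter not found')
  const text = new TextDecoder().decode(bytes.subarray(offset, end))
  return {text, next: end + 1}   // skip past the delimiter
}

const data = new TextEncoder().encode('first\nsecond\n')
const lineA = txtToFrom(data, 0, 10)            // 10 is the byte value of '\n'
const lineB = txtToFrom(data, lineA.next, 10)
```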

Usage Rules

The ?? await A must be executed immediately after each read; otherwise the pending result is not consumed and later reads will go wrong.

TypeScript is recommended. If you forget to add ?? await A, the result type includes undefined, which makes the mistake easier to catch.

const id = reader.u32()   // number | undefined
id.toString()             // ❌

Since A is consumed immediately after it is set, multiple QuickReader instances do not conflict with each other.

Concurrency

A single reader must not be used by multiple coroutines in parallel, as this would break the waiting order. The following logic should therefore not be used:

const reader = new QuickReader(stream)

async function routine() {
  do {
    const id = reader.u32() ?? await A
    const name = reader.txt() ?? await A
    // ...
  } while (!reader.eof)
}
// ❌
for (let i = 0; i < 10; i++) {
  routine()
}

Read Line

QuickReader can also be used for line-by-line reading. It reduces the overhead by ~60% compared to the native Node.js readline module, because its parsing logic is simpler: for example, it uses only \n as the delimiter and does not treat \r specially.

import * as fs from 'node:fs'
import {QuickReader, A} from 'quickreader'

const stream = fs.createReadStream('log.txt')
const reader = new QuickReader(stream)

// no error if the file does not end with '\n'
reader.eofAsDelim = true

do {
  const line = reader.txtLn() ?? await A
  // ...
} while (!reader.eof)

As mentioned above, concurrency is not supported. If multiple coroutines need to read from the same file, it is better to use the native readline module:

import * as fs from 'node:fs'
import * as readline from 'node:readline'

const stream = fs.createReadStream('urls.txt')
const rl = readline.createInterface({input: stream})
const iter = rl[Symbol.asyncIterator]()

async function routine() {
  for (;;) {
    const {value: url} = await iter.next()
    if (!url) { 
      break
    }
    const res = await fetch(url)
    // ...
  }
}

for (let i = 0; i < 100; i++) {
  routine()
}

About

The idea of this project was born when the await keyword was introduced. The earliest solution was:

const result = reader.read() || await A

Since the || operator also falls through on 0 and '', this was not perfect until ?? was introduced in ES2020.
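The difference is easy to see with a few sample values. In this sketch, the string fallback stands in for the result of await A.

```javascript
const fallback = 'from-slow-path'
const samples = [0, '', undefined]

// || falls through on every falsy value, so valid results 0 and '' are lost.
const withOr = samples.map(v => v || fallback)

// ?? falls through only on null and undefined.
const withNullish = samples.map(v => v ?? fallback)
```

With ||, a legitimately read 0 or empty string would trigger an unnecessary (and incorrect) await; with ??, only a missing result does.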

However, the performance of await has improved greatly since then, so the idea is not as significant as it once was. I am still sharing it in 2022, because performance optimization is never-ending.

Because of my limited time and English, this document and some code comments (e.g. index.d.ts) were translated with Google Translate. Improvements are welcome.

License

MIT

Owner
EtherDream
A Web Hacker & Geeker