sqlite3 in ur indexeddb

Overview

This is an absurd project.

It implements a backend for sql.js (sqlite3 compiled for the web) that treats IndexedDB like a disk and stores data in blocks there. That means your sqlite3 database is persisted. And not in the terrible way of reading and writing the whole image at once -- it reads and writes your db in small chunks.

It basically stores a whole database into another database. Which is absurd.

See the demo

Why do that?

IndexedDB is not a great database. It's slow, hard to work with, and has very few advantages for small local apps. Most cases are served better with SQL.

... How well does it work?

It works absurdly well. It consistently beats IndexedDB performance 10-fold or more.

Why? It's simple once you think about it: since we are reading/writing data in 4K chunks (size is configurable), we automatically batch reads and writes. If you want to store 1 million objects into IDB, you need to do 1 million writes. With this absurd backend, it only needs to do ~12500 writes.
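As a rough sanity check of that ~12,500 number (assuming the 1 million objects add up to a database of about 51 MB; that size is an assumption for illustration, not a measured figure):

```javascript
// Assumed numbers for illustration: 1M rows producing a ~51.2 MB database.
const BLOCK_SIZE = 4096;         // the configurable chunk size
const dbSizeBytes = 51_200_000;  // assumed total database size

// Plain IndexedDB: one write per stored object.
const idbWrites = 1_000_000;

// This backend: one write per 4K block.
const blockWrites = Math.ceil(dbSizeBytes / BLOCK_SIZE);

console.log(blockWrites);              // 12500
console.log(idbWrites / blockWrites);  // 80 (80x fewer writes)
```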

Usually when doing this kind of thing, there is a serious downside. But in this case, there isn't really. We get access to tons of features we didn't have before: views, full-text search, proper indexes, anything sqlite3 can do. It's a win-win.

The only real downside is you have to download a 1MB WebAssembly file. That might be a non-starter for you, but for any real apps that's fine.

There's one catch: it requires SharedArrayBuffer. Safari is the last browser that hasn't enabled it yet, but it's coming. In other browsers you need to serve some special headers to enable it, which is fine. In the future it will be available everywhere.
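For reference, those special headers are the standard cross-origin isolation headers that browsers require before exposing SharedArrayBuffer; your server (or dev server) has to send them with the top-level document:

```
Cross-Origin-Opener-Policy: same-origin
Cross-Origin-Embedder-Policy: require-corp
```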

How do I use it?

This is in very early stages. First, install the packages:

yarn add @jlongster/sql.js absurd-sql.js-backend

Right now you need to use my fork of sql.js, but I'm going to open a PR and hopefully get it merged. The changes are minimal.

The following code will get you up and running:

import initSqlJS from '@jlongster/sql.js';
import { BlockedFS } from 'absurd-sql.js-backend';
import IndexedDBBackend from 'absurd-sql.js-backend/dist/indexeddb-backend';

async function run() {
  // Initialize sql.js (loads wasm)
  let SQL = await initSqlJS();

  // Create the backend and filesystem
  let backend = new IndexedDBBackend(4096);
  let BFS = new BlockedFS(SQL.FS, backend);

  // For now, we need to initialize some internal state. This
  // API will be improved
  SQL.register_for_idb(BFS);
  await BFS.init();

  // Mount the filesystem
  SQL.FS.mount(BFS, {}, '/blocked');

  let db = new SQL.Database('/blocked/db.sqlite', { filename: true });

  // Always use a memory journal; writing it makes no sense
  db.exec('PRAGMA journal_mode=MEMORY;');

  // Use sqlite and never lose data!
}

If you look in your IndexedDB database, you should see something like this:

[Screenshot: the database's blocks stored as entries in IndexedDB]

How does it work?

I will write this out more later, but there are many fun tricks that make this work:

SharedArrayBuffer and Atomics.wait

The biggest problem is when sqlite does a read or write, the API is totally synchronous because it's based on the C API. Accessing IndexedDB is always async, so how do we get around that?

We spawn a read/write worker and give it a SharedArrayBuffer, then use the Atomics API to communicate via the buffer. For example, our backend writes a read request into the shared buffer; the worker reads it, performs the read asynchronously, and then writes the result back.

The real magic is the Atomics.wait API. It's a beautiful thing. When you call it, it completely blocks JS until the condition is met. You use it to wait on some data in the SharedArrayBuffer, and this is what enables us to turn the async read/write into a sync one. The backend calls it to wait on the result from the worker and blocks until it's done.

Long-lived IndexedDB transactions

IndexedDB has an awful behavior where it auto-commits transactions once the event loop is done processing. This makes it impossible to use a transaction over time, and requires you to create a new one if you are doing many reads over time. Creating a transaction is super slow and this is a massive perf hit.

Atomics.wait saves us again: we also use it in the read/write worker to block it, which keeps transactions alive. That means while processing requests from the backend, we can reuse a single transaction for all of them. If 1000 reads come through, we use the same readonly transaction for all of them, which is a massive speedup.

Automatically choosing between get and cursors

Because we can keep a transaction alive for reads over time, we can use IndexedDB cursors to iterate over data when handling sequential read requests. There are a lot of interesting tradeoffs here because opening a cursor is actually super slow in some browsers, but iterating is a lot faster than many get requests. This backend will intelligently detect when several sequential reads happen and automatically switch to using a cursor.
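The detection could be sketched like this (the threshold and names here are hypothetical, not the backend's actual code):

```javascript
// Hypothetical heuristic: after enough consecutive sequential block
// reads, switch from individual get() requests to an IDB cursor.
const CURSOR_THRESHOLD = 3; // made-up value

function makeReadStrategy() {
  let lastBlock = null;
  let runLength = 0;
  return function strategyFor(blockIndex) {
    runLength =
      lastBlock !== null && blockIndex === lastBlock + 1 ? runLength + 1 : 0;
    lastBlock = blockIndex;
    return runLength >= CURSOR_THRESHOLD ? 'cursor' : 'get';
  };
}

const strategy = makeReadStrategy();
console.log([10, 11, 12, 13, 14, 99].map(b => strategy(b)));
// → [ 'get', 'get', 'get', 'cursor', 'cursor', 'get' ]
```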

Leverage IndexedDB transaction semantics for locking

We fully embrace IndexedDB transaction semantics to ensure correct ordering of read/writes. We map sqlite's lock/unlock requests to transactions in a way that works (still needs to be 100% verified), and the best thing about this is a database can never leave a lock open.

Browsers already handle terminating IDB transactions in weird situations. Because we only rely on IDB transactions, our locks will get properly terminated as well.

Browser differences

If you look at the demo, we insert 1,000,000 items into a database and scan through all of them with SELECT COUNT(*) FROM kv. This causes a lot of reads. We've recorded a lot of statistics on how IDB performs across browsers and will write up more soon.

For now, here are a couple of things. This is a graph of all the reads recorded during that SQL query. The X axis is the time at which each read finished (ms), and the Y axis is the time the read took (ms). There are ~12,500 reads in total.

Chrome

Chrome has a p50 read time of .280ms and a total time of ~4.2s:

[Screenshot: scatter plot of Chrome read times]

Firefox

Firefox has a p50 read time of .101ms and a total time of ~1.8s:

[Screenshot: scatter plot of Firefox read times]

Look how nicely consistent that is.

Others

The demo works in the latest version of Safari Technology Preview if you enable SharedArrayBuffer, but unfortunately high-resolution timers are not available. That's sad because Safari seems to have great perf, similar to Firefox. Chrome is the real slow one here.

I haven't tried other browsers.

We should run these stats with a lot of other types of queries as well, and I'll do that in the future.

Comments
  • IDBMutableFile / webkitRequestFileSystem

    Chrome and firefox each have a non-standard way of storing and retrieving flat files that can be much faster and less overhead than using IDB for file-like data. In the browser I normally use https://github.com/random-access-storage/random-access-web to pull in all the hacks required and there is an IDB fallback if the faster versions aren't detected.

    Using the random-access-storage API you'd get all of these backends with less work but that would involve replacing or wrapping the storage layer. I'm doing this for a similar wasm project (a spatial database) written in rust when it runs in the browser.

    But all of that would make the project more useful and less absurd.

    opened by ghost 6
  • How does query work (without loading everything into memory)?

    Nice work for exploring a new way to improve the persistent storage on web. The performance improvement is very impressive.

    I have a question around querying. In your blog, you mentioned it never loads the database into memory because it only loads whatever SQLite asks for. It confuses me how the library is able to execute SQL statements against ArrayBuffers persisted in idb without loading them into memory first. Can you please share more details on the memory usage of the library? Sharing JS heap snapshots would help a lot.

    Thanks!

    opened by zeroliu 3
  • What is the license for this project?

    Hello! Thanks for the effort you have put in this very cool project. I was wondering if you thought about licensing and are going to add one on the repo since it's currently missing and people may start using it on live project.

    opened by squallstar 2
  • Discussion regarding performance / absurd

    Since this implementation consistently beats the plain IndexedDB implementation AND provides a consistent syntax for SQLite operations (e.g. you can define operations the same way as in an Electron app), isn't the plain IndexedDB approach the one that's actually absurd?

    Good use cases seem pretty clear to me. If there is another well-maintained project for the same purpose, maybe it would be good to add a link to that repo in the description?

    opened by linegel 2
  • Kudos and a question

    First, I just wanted to say: WOW. Thank you for all your hard and brilliant work on this.

    Second, my question: do you think this could replace MySQL as far as Wordpress goes? One of the pain-points in my maintenance of several WP sites is the performance issues and quirks of admining MySQL instances. Obviously, it would be a lot of effort for someone (or a group of someones) to substitute absurd since WP seems to be largely MySQL-specific, but with all the sites out there that use WP as a CMS, the gains could be substantial.

    opened by sydbarrett74 2
  • can't resolve fs

    I'm using Webpack and somehow it can't resolve 'fs'. Do I need a polyfill?

    Note: I'm trying to use it in background js in Chrome Extension

    ERROR in ./node_modules/@jlongster/sql.js/dist/sql-wasm.js 93:81-94
    Module not found: Error: Can't resolve 'fs' in './node_modules/@jlongster/sql.js/dist'
    resolve 'fs' in './node_modules/@jlongster/sql.js/dist'
      Parsed request is a module
      using description file: ./node_modules/@jlongster/sql.js/package.json (relative path: ./dist)
        Field 'browser' doesn't contain a valid alias configuration
        resolve as module
          ./node_modules/@jlongster/sql.js/dist/node_modules doesn't exist or is not a directory
          ./node_modules/@jlongster/sql.js/node_modules doesn't exist or is not a directory
          ./node_modules/@jlongster/node_modules doesn't exist or is not a directory
          ./node_modules/node_modules doesn't exist or is not a directory
          looking for modules in ./node_modules
            single file module
              using description file: ./package.json (relative path: ./node_modules/fs)
                no extension
     @ ./src/worker/index.ts 43:31-59
    
    opened by Debdut 1
  • [Question] Does this help make it easier to build offline apps and also sync with a centralized database?

    I read the blog post https://jlongster.com/future-sql-web and I suppose absurd makes it way faster to read and write to IndexedDB for offline web apps.

    But I guess, syncing with centralized database like a postgres, is out of scope for this.

    Am I right?

    opened by simkimsia 1
  • Fix fallback doc

    The fallback mode was not working in my project because the fallback code was missing from the doc, but present at https://github.com/jlongster/absurd-sql/blob/master/src/examples/bench/main.worker.js#L99

    opened by quolpr 0
  • absurd-sql.js-backend package not found

    In Readme.md instead of :

    yarn add @jlongster/sql.js absurd-sql.js-backend
    

    Should be:

    yarn add @jlongster/sql.js absurd-sql
    

    I guess.

    Otherwise, the package is not found: https://www.npmjs.com/package/absurd-sql.js-backend

    opened by quolpr 0
  • Asyncify version of SQL.js (no COOP required!)

    I was able to achieve the same performance as absurd-sql has, but with an Asyncify build of SQLite (using wa-sqlite).

    Here is the comparison:

    https://user-images.githubusercontent.com/7958527/179220916-6d28aaa3-9ca4-453b-bccb-677aec516b83.mp4

    And, you can also run some benchmarks at https://cerulean-bublanina-96bf3a.netlify.app/

    The results are roughly the same, but no COOP is required anymore!

    Here is the source code of the backend: https://github.com/trong-orm/wa-sqlite-web-backend/blob/main/src/IDBCachedWritesVFS.ts (it's a bit messy). I used the same tricks as absurd-sql does: don't write until the transaction finishes, and use a cursor for sequential reads.

    The problem now is that wa-sqlite is under the GPL3 license, so we need someone to make an MIT-licensed version of wa-sqlite 😅

    opened by quolpr 2
  • Fork of absurd-sql with merged PRs and planned improvements

    I made a fork of absurd-sql: https://github.com/trong-orm/better-absurd-sql/edit/main/README.md. For now, absurd-sql looks more abandoned than alive. I also asked James for access, but he didn't answer, unfortunately 😓. And that's ok, he doesn't have much free time, I totally understand. So I forked it to continue work on it. If James starts merging PRs (I hope so!), I will close this repo.

    opened by quolpr 0
  • Fix bulkSet

    ~I think it may potentially fix https://github.com/jlongster/absurd-sql/issues/30~

    After this fix, write performance becomes slower. But we should wait until all the writes finish before finishing the transaction. Otherwise, the next IDB read (when you want to execute a SELECT, for example) has to wait until the previous INSERT finishes, which is not fair to the SELECT query. And it seems the benchmarks in the article are wrong (but not by much! The write performance is still amazing).

    Also, I don't see any performance penalty from using Promise.all: no new IDB transactions are spawned, and overall performance at first glance is the same.

    opened by quolpr 2
  • Run absurd-sql in a serviceworker

    I did some quick tests to run absurd-sql in a serviceworker, but it fails like this:

    serviceworker.js:63 Failed to open the database ErrnoError { node: undefined, Oa: 20, message: 'FS error' }
        at Object.Yb (webpack://absurd-example-project/./node_modules/@jlongster/sql.js/dist/sql-wasm.js?:160:231)
        at Object.lc (webpack://absurd-example-project/./node_modules/@jlongster/sql.js/dist/sql-wasm.js?:160:402)
        at eval (webpack://absurd-example-project/./node_modules/@jlongster/sql.js/dist/sql-wasm.js?:177:15)
        at new Promise ()
        at initSqlJs (webpack://absurd-example-project/./node_modules/@jlongster/sql.js/dist/sql-wasm.js?:24:24)
        at openDB (webpack://absurd-example-project/./src/serviceworker.js?:19:77)
        at eval (webpack://absurd-example-project/./src/serviceworker.js?:47:20)

    Before diving deeper I would like to know if this is something that is supposed to work in general. My idea was to serve static HTML/JS files and images out of an absurd-sql instance and add some sync capabilities to fetch them from a server-side Sqlite DB. But there's probably a workaround in using the Worker from the sample project and store duplicates of the web resources in the Cache Storage API.

    opened by klehmann 3
  • memory leak in sql.js, new package needed

    @jlongster It appears that the npm package you have posted for sql.js does not include the following commit:

    https://github.com/sql-js/sql.js/commit/ec44c18a421ea2d9934acbc54fc12b31b35635eb

    We are seeing this memory leak, and with large data sets it's pretty severe.

    Your fork does include this commit, but the npm package for 1.6.7 looks like it was created Aug 8th, which would not have contained this fix.

    Is it possible to get a new package created? I notice you are attempting to merge your sql.js branch back into the upstream, but it hasn't been completed.

    opened by arsnyder16 3
  • Use cache api instead of indexeddb

    I have worked with indexeddb several times. Every time it was nothing but pain and suffering. Last time I tried something demanding with indexeddb (implementing a blob store for in-browser ipfs), it completely broke. Not just the indexeddb I was working with, but also all other indexeddb based websites on the same browser.

    I have been looking around for alternatives. Obviously local storage won't work, because it is too coarse-grained and in any case limited to 5MB or so.

    But there is the browser cache api which might work: https://developer.mozilla.org/en-US/docs/Web/API/Cache

    At least according to the docs, you have complete control over when a user-generated cache is being purged. "An origin can have multiple, named Cache objects. You are responsible for implementing how your script (e.g. in a ServiceWorker) handles Cache updates. Items in a Cache do not get updated unless explicitly requested; they don't expire unless deleted. Use CacheStorage.open() to open a specific named Cache object and then call any of the Cache methods to maintain the Cache."

    And it will be purged either all at once, or not at all. So kinda similar to indexeddb, but with less overhead and overall insanity. I did some experiments, and it seems to work quite well from wasm.

    Browser caches are essential for functioning, so I think they are quite efficient. For example they are not using one file per cache entry. They offer a key value store interface without transactions, but it seems that you should be able to build a sqlite backend / emscripten compatible file system on top of it, similar to https://github.com/jlongster/absurd-sql/tree/master/src/indexeddb

    One cache entry would correspond to a block of a file. Modifying would work by reading, modifying, writing. This is quite similar to how it seems to be done in the indexeddb backend.

    opened by rklaehn 0