sqlite3 in ur indexeddb

Overview

This is an absurd project.

It implements a backend for sql.js (sqlite3 compiled for the web) that treats IndexedDB like a disk and stores data in blocks there. That means your sqlite3 database is persisted. And not in the terrible way of reading and writing the whole image at once -- it reads and writes your db in small chunks.

It basically stores a whole database into another database. Which is absurd.

See the demo

Why do that?

IndexedDB is not a great database. It's slow, hard to work with, and has very few advantages for small local apps. Most cases are served better with SQL.

... How well does it work?

It works absurdly well. It consistently beats IndexedDB performance 10-fold or more.

Why? It's simple once you think about it: since we are reading/writing data in 4K chunks (size is configurable), we automatically batch reads and writes. If you want to store 1 million objects into IDB, you need to do 1 million writes. With this absurd backend, it only needs to do ~12500 writes.
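As a rough sanity check of that ~12,500 number (assuming the 1 million objects add up to a database of about 51 MB; that size is an assumption for illustration, not a measured figure):

```javascript
// Assumed numbers for illustration: 1M rows producing a ~51.2 MB database.
const BLOCK_SIZE = 4096;         // the configurable chunk size
const dbSizeBytes = 51_200_000;  // assumed total database size

// Plain IndexedDB: one write per stored object.
const idbWrites = 1_000_000;

// This backend: one write per 4K block.
const blockWrites = Math.ceil(dbSizeBytes / BLOCK_SIZE);

console.log(blockWrites);              // 12500
console.log(idbWrites / blockWrites);  // 80 (80x fewer writes)
```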

Usually when doing this kind of thing, there is a serious downside. But in this case, there isn't really. We get access to tons of features we didn't have before: views, full-text search, proper indexes, anything sqlite3 can do. It's a win-win.

The only real downside is you have to download a 1MB WebAssembly file. That might be a non-starter for you, but for any real apps that's fine.

There's one catch: it requires SharedArrayBuffer. Safari is the last browser that hasn't enabled it yet, but it's coming. In other browsers you need to serve some special headers to enable it, which is fine. In the future it will be available everywhere.
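For reference, those special headers are the standard cross-origin isolation headers that browsers require before exposing SharedArrayBuffer; your server (or dev server) has to send them with the top-level document:

```
Cross-Origin-Opener-Policy: same-origin
Cross-Origin-Embedder-Policy: require-corp
```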

How do I use it?

This is in very early stages. First, install the packages:

yarn add @jlongster/sql.js absurd-sql.js-backend

Right now you need to use my fork of sql.js, but I'm going to open a PR and hopefully get it merged. The changes are minimal.

The following code will get you up and running:

import initSqlJS from '@jlongster/sql.js';
import { BlockedFS } from 'absurd-sql.js-backend';
import IndexedDBBackend from 'absurd-sql.js-backend/dist/indexeddb-backend';

async function run() {
  // Initialize sql.js (loads wasm)
  let SQL = await initSqlJS();

  // Create the backend and filesystem
  let backend = new IndexedDBBackend(4096);
  let BFS = new BlockedFS(SQL.FS, backend);

  // For now, we need to initialize some internal state. This
  // API will be improved
  SQL.register_for_idb(BFS);
  await BFS.init();

  // Mount the filesystem
  SQL.FS.mount(BFS, {}, '/blocked');

  let db = new SQL.Database('/blocked/db.sqlite', { filename: true });

  // Always use a memory journal; writing it makes no sense
  db.exec('PRAGMA journal_mode=MEMORY;');

  // Use sqlite and never lose data!
}

If you look in your IndexedDB database, you should see something like this:

[Screenshot: the database's blocks stored as entries in IndexedDB]

How does it work?

I will write this out more later, but there are many fun tricks that make this work:

SharedArrayBuffer and Atomics.wait

The biggest problem is when sqlite does a read or write, the API is totally synchronous because it's based on the C API. Accessing IndexedDB is always async, so how do we get around that?

We spawn a read/write worker and give it a SharedArrayBuffer, then use the Atomics API to communicate via the buffer. For example, our backend writes a read request into the shared buffer; the worker reads it, performs the read asynchronously, and then writes the result back.

The real magic is the Atomics.wait API. It's a beautiful thing. When you call it, it completely blocks JS until the condition is met. You use it to wait on some data in the SharedArrayBuffer, and this is what enables us to turn the async read/write into a sync one. The backend calls it to wait on the result from the worker and blocks until it's done.

Long-lived IndexedDB transactions

IndexedDB has an awful behavior where it auto-commits transactions once the event loop is done processing. This makes it impossible to use a transaction over time, and requires you to create a new one if you are doing many reads over time. Creating a transaction is super slow and this is a massive perf hit.

Atomics.wait saves us again: we also use it in the read/write worker to block it, which keeps transactions alive. That means while processing requests from the backend, we can reuse a single transaction for all of them. If 1000 reads come through, we use the same readonly transaction for all of them, which is a massive speedup.

Automatically choosing between get and cursors

Because we can keep a transaction alive for reads over time, we can use IndexedDB cursors to iterate over data when handling sequential read requests. There are a lot of interesting tradeoffs here because opening a cursor is actually super slow in some browsers, but iterating is a lot faster than many get requests. This backend will intelligently detect when several sequential reads happen and automatically switch to using a cursor.
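The detection could be sketched like this (the threshold and names here are hypothetical, not the backend's actual code):

```javascript
// Hypothetical heuristic: after enough consecutive sequential block
// reads, switch from individual get() requests to an IDB cursor.
const CURSOR_THRESHOLD = 3; // made-up value

function makeReadStrategy() {
  let lastBlock = null;
  let runLength = 0;
  return function strategyFor(blockIndex) {
    runLength =
      lastBlock !== null && blockIndex === lastBlock + 1 ? runLength + 1 : 0;
    lastBlock = blockIndex;
    return runLength >= CURSOR_THRESHOLD ? 'cursor' : 'get';
  };
}

const strategy = makeReadStrategy();
console.log([10, 11, 12, 13, 14, 99].map(b => strategy(b)));
// → [ 'get', 'get', 'get', 'cursor', 'cursor', 'get' ]
```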

Leverage IndexedDB transaction semantics for locking

We fully embrace IndexedDB transaction semantics to ensure correct ordering of read/writes. We map sqlite's lock/unlock requests to transactions in a way that works (still needs to be 100% verified), and the best thing about this is a database can never leave a lock open.

Browsers already handle terminating IDB transactions in weird situations. Because we only rely on IDB transactions, our locks will get properly terminated as well.

Browser differences

If you look at the demo, we insert 1,000,000 items into a database and scan through all of them with SELECT COUNT(*) FROM kv. This causes a lot of reads. We've recorded a lot of statistics on how IDB performs across browsers and will write up more soon.

For now, here are a couple of things. This is a graph of all the reads recorded during that SQL query. The X axis is the time at which each read finished (ms), and the Y axis is the time the read took (ms). There are ~12,500 reads in total.

Chrome

Chrome has a p50 read time of .280ms and a total time of ~4.2s:

[Screenshot: scatter plot of Chrome read times]

Firefox

Firefox has a p50 read time of .101ms and a total time of ~1.8s:

[Screenshot: scatter plot of Firefox read times]

Look how nicely consistent that is.

Others

The demo works in the latest version of Safari Technology Preview if you enable SharedArrayBuffer, but unfortunately high-resolution timers are not available. That's sad because Safari seems to have great perf, similar to Firefox. Chrome is the real slow one here.

I haven't tried other browsers.

We should run these stats with a lot of other types of queries as well, and I'll do that in the future.

Comments
  • IDBMutableFile / webkitRequestFileSystem

    Chrome and firefox each have a non-standard way of storing and retrieving flat files that can be much faster and less overhead than using IDB for file-like data. In the browser I normally use https://github.com/random-access-storage/random-access-web to pull in all the hacks required and there is an IDB fallback if the faster versions aren't detected.

    Using the random-access-storage API you'd get all of these backends with less work but that would involve replacing or wrapping the storage layer. I'm doing this for a similar wasm project (a spatial database) written in rust when it runs in the browser.

    But all of that would make the project more useful and less absurd.

    opened by ghost 6
  • How does query work (without loading everything into memory)?

    Nice work for exploring a new way to improve the persistent storage on web. The performance improvement is very impressive.

    I have a question around querying. In your blog, you mentioned it never loads the database into memory because it only loads whatever SQLite asks for. It confuses me how the library is able to execute SQL statements against ArrayBuffers persisted in idb without loading them into memory first. Can you please share more details on the memory usage of the library? Sharing JS heap snapshots would help a lot.

    Thanks!

    opened by zeroliu 3
  • What is the license for this project?

    Hello! Thanks for the effort you have put in this very cool project. I was wondering if you thought about licensing and are going to add one on the repo since it's currently missing and people may start using it on live project.

    opened by squallstar 2
  • Discussion regarding performance / absurd

    Since this implementation consistently beats the plain IndexedDB implementation AND provides a consistent syntax for SQLite operations (e.g. you can define operations the same way as in an Electron app), isn't the plain IndexedDB approach the one that's actually absurd?

    Good use cases seem pretty clear to me. If there is another well-maintained project for the same purpose, maybe it would be good to add a link to that repo in the description?

    opened by linegel 2
  • Kudos and a question

    First, I just wanted to say: WOW. Thank you for all your hard and brilliant work on this.

    Second, my question: do you think this could replace MySQL as far as Wordpress goes? One of the pain-points in my maintenance of several WP sites is the performance issues and quirks of admining MySQL instances. Obviously, it would be a lot of effort for someone (or a group of someones) to substitute absurd since WP seems to be largely MySQL-specific, but with all the sites out there that use WP as a CMS, the gains could be substantial.

    opened by sydbarrett74 2
  • can't resolve fs

    I'm using Webpack and somehow it can't resolve 'fs'. Do I need a polyfill?

    Note: I'm trying to use it in background js in Chrome Extension

    ERROR in ./node_modules/@jlongster/sql.js/dist/sql-wasm.js 93:81-94
    Module not found: Error: Can't resolve 'fs' in './node_modules/@jlongster/sql.js/dist'
    resolve 'fs' in './node_modules/@jlongster/sql.js/dist'
      Parsed request is a module
      using description file: ./node_modules/@jlongster/sql.js/package.json (relative path: ./dist)
        Field 'browser' doesn't contain a valid alias configuration
        resolve as module
          ./node_modules/@jlongster/sql.js/dist/node_modules doesn't exist or is not a directory
          ./node_modules/@jlongster/sql.js/node_modules doesn't exist or is not a directory
          ./node_modules/@jlongster/node_modules doesn't exist or is not a directory
          ./node_modules/node_modules doesn't exist or is not a directory
          looking for modules in ./node_modules
            single file module
              using description file: ./package.json (relative path: ./node_modules/fs)
                no extension
     @ ./src/worker/index.ts 43:31-59
    
    opened by Debdut 1
  • [Question] Does this help make it easier to build offline apps and also sync with a centralized database?

    I read the blog post https://jlongster.com/future-sql-web and I suppose absurd makes it way faster to read and write to IndexedDB for offline web apps.

    But I guess, syncing with centralized database like a postgres, is out of scope for this.

    Am I right?

    opened by simkimsia 1
  • Fix fallback doc

    The fallback mode was not working in my project because the fallback code was missing from the doc, but present at https://github.com/jlongster/absurd-sql/blob/master/src/examples/bench/main.worker.js#L99

    opened by quolpr 0
  • absurd-sql.js-backend package not found

    In Readme.md instead of :

    yarn add @jlongster/sql.js absurd-sql.js-backend
    

    Should be:

    yarn add @jlongster/sql.js absurd-sql
    

    I guess.

    Otherwise, the package is not found: https://www.npmjs.com/package/absurd-sql.js-backend

    opened by quolpr 0
  • Asyncify version of SQL.js (no COOP required!)

    I was able to achieve the same performance as absurd-sql has, but with an Asyncify build of SQLite (using wa-sqlite).

    Here is the comparison:

    https://user-images.githubusercontent.com/7958527/179220916-6d28aaa3-9ca4-453b-bccb-677aec516b83.mp4

    And, you can also run some benchmarks at https://cerulean-bublanina-96bf3a.netlify.app/

    The results are roughly the same, but no COOP is required anymore!

    Here is the source code of the backend: https://github.com/trong-orm/wa-sqlite-web-backend/blob/main/src/IDBCachedWritesVFS.ts (it's a bit messy). I used the same tricks as absurd-sql does: don't write until the transaction finishes, and use a cursor for sequential reads.

    The problem now is that wa-sqlite is under the GPL3 license, so we need someone to make an MIT-licensed version of wa-sqlite 😅

    opened by quolpr 2
  • Fork of absurd-sql with merged PRs and planned improvements

    I made a fork of absurd-sql: https://github.com/trong-orm/better-absurd-sql/edit/main/README.md. For now, absurd-sql looks more abandoned than alive. I also asked James for access, but he didn't answer, unfortunately 😓. And that's ok, he doesn't have much free time, I totally understand. So I forked it to continue work on it. If James starts merging PRs (I hope so!), I will close this repo.

    opened by quolpr 0
  • Fix bulkSet

    ~I think it may potentially fix https://github.com/jlongster/absurd-sql/issues/30~

    After this fix, write performance becomes slower. But we should wait until all the writes finish before finishing the transaction. Otherwise, the next IDB read (when you want to execute a SELECT, for example) has to wait until the previous INSERT finishes, which is not fair to the SELECT query. And it seems the benchmarks in the article are wrong (but not by much! The write performance is still amazing).

    Also, I don't see any performance penalty from using Promise.all: no new IDB transactions are spawned, and overall performance at first glance is the same.

    opened by quolpr 2
  • Run absurd-sql in a serviceworker

    I did some quick tests to run absurd-sql in a serviceworker, but it fails like this:

    serviceworker.js:63 Failed to open the database ErrnoError { node: undefined, Oa: 20, message: 'FS error' }
        at Object.Yb (webpack://absurd-example-project/./node_modules/@jlongster/sql.js/dist/sql-wasm.js?:160:231)
        at Object.lc (webpack://absurd-example-project/./node_modules/@jlongster/sql.js/dist/sql-wasm.js?:160:402)
        at eval (webpack://absurd-example-project/./node_modules/@jlongster/sql.js/dist/sql-wasm.js?:177:15)
        at new Promise ()
        at initSqlJs (webpack://absurd-example-project/./node_modules/@jlongster/sql.js/dist/sql-wasm.js?:24:24)
        at openDB (webpack://absurd-example-project/./src/serviceworker.js?:19:77)
        at eval (webpack://absurd-example-project/./src/serviceworker.js?:47:20)

    Before diving deeper I would like to know if this is something that is supposed to work in general. My idea was to serve static HTML/JS files and images out of an absurd-sql instance and add some sync capabilities to fetch them from a server-side Sqlite DB. But there's probably a workaround in using the Worker from the sample project and store duplicates of the web resources in the Cache Storage API.

    opened by klehmann 3
  • memory leak in sql.js, new package needed

    @jlongster It appears that the npm package you have posted for sql.js does not include the following commit:

    https://github.com/sql-js/sql.js/commit/ec44c18a421ea2d9934acbc54fc12b31b35635eb

    We are seeing this memory leak, and with large data sets it's pretty severe.

    Your fork does include this commit, but the npm package for 1.6.7 looks like it was created Aug 8th, which would not have contained this fix.

    Is it possible to get a new package created? I notice you are attempting to merge your sql.js branch back into the upstream, but it hasn't been completed.

    opened by arsnyder16 3
  • Use cache api instead of indexeddb

    I have worked with indexeddb several times. Every time it was nothing but pain and suffering. Last time I tried something demanding with indexeddb (implementing a blob store for in-browser ipfs), it completely broke. Not just the indexeddb I was working with, but also all other indexeddb based websites on the same browser.

    I have been looking around for alternatives. Obviously local storage won't work, because it is too coarse-grained and in any case limited to 5MB or so.

    But there is the browser cache api which might work: https://developer.mozilla.org/en-US/docs/Web/API/Cache

    At least according to the docs, you have complete control over when a user-generated cache is being purged. "An origin can have multiple, named Cache objects. You are responsible for implementing how your script (e.g. in a ServiceWorker) handles Cache updates. Items in a Cache do not get updated unless explicitly requested; they don't expire unless deleted. Use CacheStorage.open() to open a specific named Cache object and then call any of the Cache methods to maintain the Cache."

    And it will be purged either all at once, or not at all. So kinda similar to indexeddb, but with less overhead and overall insanity. I did some experiments, and it seems to work quite well from wasm.

    Browser caches are essential for functioning, so I think they are quite efficient. For example they are not using one file per cache entry. They offer a key value store interface without transactions, but it seems that you should be able to build a sqlite backend / emscripten compatible file system on top of it, similar to https://github.com/jlongster/absurd-sql/tree/master/src/indexeddb

    One cache entry would correspond to a block of a file. Modifying would work by reading, modifying, writing. This is quite similar to how it seems to be done in the indexeddb backend.

    opened by rklaehn 0