IPLD transaction as CAR buffer [for use in databases]

Overview

car-transaction

IPLD transaction as CAR buffer that can be used as a database transaction.

Usage

import Transaction from 'car-transaction' 

const run = async () => {
  // start a basic transaction
  const t = Transaction.create()

  const subCID = await t.write({ some: 'data' })
  await t.write({ sub: subCID })
  const buffer = await t.commit()

  // read a transaction
  // the last write is always the root
  const { root, get } = await Transaction.load(buffer)
  // root is a cid
  const { sub } = await get(root)
  const { some } = await get(sub)
  // get retrieves the block and decodes it
  if (some !== 'data') throw new Error('data error')
}

run()

Guide to IPLD-over-ObjectStores (S3, R2, etc)

This is how we build a decentralized web of all web3 application data on top of widely available and competitively priced cloud object stores.

IPLD is the data structure layer beneath IPFS. It works in IPFS protocols and outside them, on disk, in memory, etc.

So you can build these little merkle trees with the library above that

  • are encoded in dag-cbor, a fairly efficient format,
  • and all the hash addressed web3 and blockchain stuff works and interops,
  • and it's as easy as working with JSON,
    • with inline binary,
    • and you have these hash links that allow you to make little trees,
    • which is how you can get de-duplication and diffing properties of git, cause it's all merkle graphs,
    • and all the cool graph things you can do with graph databases,
    • and you can also link to data in git, bittorrent, ETH, Bitcoin, etc.
    • you can link to IPFS files,
      • or you can encode those IPFS files into the unixfs block format and include them in the transactions.
  • and those transactions are encoded in a well known format called CAR (kinda like git-pack files for IPFS),
    • and we just released this open source project that is a cloud native implementation of IPFS
    • so all you need is a way to store CAR files and hand the URL to Elastic IPFS
      • which is a cloud native thing and pretty much as hard to operate as all the other cloud native things.
    • but if you don't want to run it yourself, and you really want your data in the public IPFS network, just DM me on Twitter (@mikeal) and we'll figure something out. We're already running this and it's not that hard to hook up more data sources; we just haven't productized it yet. We're just running it to keep ALL the NFTs safe and available (god bless the gifs).

And merkle trees are very cool. You can do all kinds of diffing and CRDT structures, but I won't get into all that yet, because just storing these little trees allows you to build graphs of incredible complexity, and if we start there we won't see the basic stuff we also get from leveraging these ObjectStores to store them.

Conceptually, you can think of IPLD-over-ObjectStores as being

  • IPLD databases,
    • that are key/value stores,
    • with a single index,
      • with a fairly powerful query language,
      • that can implement some interesting privacy and access patterns
        • cause hashes,
    • that could represent an IPLD "network,"
      • which you can decide to keep open, closed, limited access, whatever.
    • and they can also represent an IPLD replication set,
      • that could be a filter
      • or an index
      • that can be streaming to another index
        • which has all the same properties as the database we started with recursively until you stop extending this particular network branch and that's how you know you're working with a graph.

And each instance of an ObjectStore can be all of these things simultaneously. These features aren't mutually exclusive, they're combinatory, as long as you follow a few simple patterns.

All these services have a roughly equivalent interface

  • S3 (AWS, DigitalOcean, pretty much every cloud provider has a compatible interface),
  • R2 (Cloudflare's wonderful new product),
  • Also CouchDB, and PouchDB, cause I got roots,
    • and while we're at it, the whole level ecosystem.

Some of them do A LOT more, but they all have at least these properties:

  • You can store a binary value, even if that isn't the default value type.
  • You can store that binary value by a string key.
    • Which forms a stored index,
      • that you can perform range queries against,
    • that, while highly distributed, tends to slow down if you bang on the same keyspace enough.
    • Which is pretty different from some of the databases we're used to. Most open source databases have a local disk optimization in the file writer that, when you group data together like this, will bulk it together.
    • But with these big distributed things like S3, the keyspace is distributed across a bunch of machines, so when they do a similar optimization on the read side, adding locality to the keyspace, you lose the distribution of your writes across the bucket.
    • But we're working with hashes!
    • We've got perfectly balanced distribution across a keyspace for days!
    • So if we key things by hash prefixes we'll always evenly distribute across the keyspace.
    • As the keyspace grows, the distribution of writes is even across whatever load balancing any of these vendors are doing which means that the writes just get faster the bigger it gets. It's beautiful to watch.
    • Something I started telling people to do a while back was move from using
      • /$hash to using
      • /$hash/data for their keys in S3.
    • Because S3's performance docs said that performance was only limited "per prefix," which gave an indication of how they were optimizing some of this by looking at the '/' as a key prefix.
    • I pointed about 4K concurrent Lambdas at open data encoding for the Filecoin launch, each one writing thousands of individual blocks this way, so I put a few billion keys into an S3 bucket as fast as it could take them, and when I went over a billion keys it got noticeably faster. I had to ask AWS to raise the cap on our Lambdas (this is way easier now, and is per CloudFormation stack).
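A minimal sketch of the `/$hash/data` key layout described above. It assumes a hex SHA-256 digest stands in for a real block hash (an actual IPLD block would be keyed by its CID string), and `blockKey` is a made-up helper name, not part of car-transaction.

```javascript
// CommonJS shown for brevity; in an ES module use:
//   import { createHash } from 'node:crypto'
const { createHash } = require('node:crypto')

// Hypothetical helper: key a block by its hash, with the hash as the prefix.
// Assumption: a hex SHA-256 digest stands in for a real CID string.
const blockKey = bytes => {
  const hash = createHash('sha256').update(bytes).digest('hex')
  return `${hash}/data`
}

// Count how many keys land under each first hex character of the prefix.
const prefixCounts = {}
for (let i = 0; i < 10000; i++) {
  const prefix = blockKey(`block-${i}`)[0]
  prefixCounts[prefix] = (prefixCounts[prefix] || 0) + 1
}

// Each of the 16 possible prefixes should hold roughly 1/16 of the keys.
console.log(prefixCounts)
```

Because the hash leads the key, the writes spread across the keyspace no matter what the data looks like, which is the property the prefix-based load balancing rewards.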

So we can really blow these things up with IPLD data.

This means that anything you build on this is pretty close to the fastest cloud database offering available

  • at whatever price these gigantic companies have driven the price down to in a rapidly commoditizing market.
    • that is now competing with blockchains like Filecoin
      • which you can also store those CAR files in natively.
  • Cloudflare even has free egress w/ R2, and it's cheaper than S3.
  • That's bananas! Free reads!
  • I've been at this a while, I wrote PouchDB in 2010, which apparently you could now configure to write to R2 and get free reads from a CDN!
  • Anyway, you can also write these little graphs into it and they are even more powerful.

And, if you write a cloud function that derives a single string key from the transaction, you've got a query language in all of these vendors for range queries across the index of that key

  • that can return queries with or without the values included,
  • with pagination,
  • and a buncha client libraries that already exist,
  • and HTTP caching infra already built for them and integrated into these vendors.

And if you stick to the rule of only deriving the key from the data itself you never bake outside context into the key that can't be replicated along with the data to another location when you need to solve a new problem.
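Here is a rough sketch of both ideas: a key derived only from the transaction, and a paginated range query over those keys. Everything in it is an assumption made for illustration: the `type` and `created` fields, the placeholder CIDs, and `listKeys`, an in-memory stand-in whose `prefix`, `startAfter`, and `maxKeys` options are loosely modeled on the listing parameters these stores expose.

```javascript
// Hypothetical key derivation: every part of the key comes from the
// transaction itself. Assumption: the decoded root block carries `type` and
// `created` fields, and `cid` is the transaction's CID as a string.
const deriveKey = ({ cid, root }) => `${root.type}/${root.created}/${cid}`

// In-memory stand-in for an object store listing call.
// Assumption: keys are ASCII, so the default string sort matches the
// byte-order sorting these stores apply.
const listKeys = (keys, { prefix = '', startAfter = '', maxKeys = 1000 } = {}) => {
  const matches = [...keys].sort().filter(k => k.startsWith(prefix) && k > startAfter)
  const page = matches.slice(0, maxKeys)
  // The last key of a truncated page is the cursor for the next call.
  const next = matches.length > page.length ? page[page.length - 1] : null
  return { keys: page, next }
}

// Placeholder transactions; the CIDs are made up for the example.
const transactions = [
  { cid: 'cid-c', root: { type: 'post', created: '2022-01-03' } },
  { cid: 'cid-a', root: { type: 'post', created: '2022-01-01' } },
  { cid: 'cid-d', root: { type: 'like', created: '2022-01-02' } },
  { cid: 'cid-b', root: { type: 'post', created: '2022-01-02' } }
]
const keys = transactions.map(deriveKey)

// Page through every `post`, two keys at a time, in creation order.
const firstPage = listKeys(keys, { prefix: 'post/', maxKeys: 2 })
const secondPage = listKeys(keys, { prefix: 'post/', startAfter: firstPage.next, maxKeys: 2 })
console.log(firstPage.keys, secondPage.keys)
```

Since `deriveKey` reads nothing but the transaction, replaying the same transactions into another bucket rebuilds the same index there.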

And of course, you can configure cloud functions to fire on every write,

  • so you can do filtered replication to other buckets and datasources
  • which can create new transactions using the same library above
  • each of which will inherit all the same replication abilities of this database,
  • so there's no longer any differentiation between the capabilities of primary stores and indexes.
  • because we're not building flat databases anymore,
    • this is much more useful, and way cooler,
    • we're just writing branches of gigantic graphs to little (or huge!) transaction tables,
    • so don't think of it as a KV store, the key AND THE VALUE are in or derived from the value data,
    • and that produces a single index over those transactions,
    • and if we want to write multiple indexes for the same data we have two choices,
      • store the data again in two buckets (potentially discarding blocks in the transaction we don't need in the value)
      • or take the hash of the transaction (CAR CID) and put that at the end of the key with a zero byte value.
        • which I don't think S3 knows how to charge you for other than per-request, because they don't charge you for keyspace 🤩
      • which gives you the choice between paying for a copy of the data or eating the performance hit of a secondary read when you query those indexes.
  • and since all this data can easily be put on IPFS,
    • all the graphs you write can be read as a single graph by anyone who traverses it
    • and their graphs can link to yours
    • and vice versa
    • and that's why we've been calling it Web3 this whole time 
  • and it's not a blockchain
    • until you put a consensus layer over it.
    • so if you need this to be a blockchain thing,
    • or you really need this NOT to be a blockchain thing,
    • it's whatever one you want it to be.
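The second indexing choice above (a key that ends in the CAR CID, with a zero byte value) might look something like this sketch. The two `Map`s stand in for buckets, `onWrite` stands in for a cloud function fired on each write, and the `author` field and CID strings are invented for the example.

```javascript
// Maps stand in for two buckets; a real deployment would use a cloud
// function triggered by writes to the primary bucket.
const primary = new Map()
const byAuthor = new Map()

// Hypothetical write hook. Assumption: `carCid` is the CAR CID as a string
// and `root` is the decoded root block with an `author` field.
const onWrite = ({ carCid, root, carBytes }) => {
  primary.set(`${carCid}/data`, carBytes)
  // Secondary index entry: the CAR CID ends the key, the value is empty.
  byAuthor.set(`${root.author}/${carCid}`, new Uint8Array(0))
}

// Querying the index costs a second read against the primary store.
const findByAuthor = author =>
  [...byAuthor.keys()]
    .filter(key => key.startsWith(`${author}/`))
    .sort()
    .map(key => primary.get(`${key.slice(author.length + 1)}/data`))

onWrite({ carCid: 'cid-1', root: { author: 'alice' }, carBytes: new Uint8Array([1]) })
onWrite({ carCid: 'cid-2', root: { author: 'bob' }, carBytes: new Uint8Array([2]) })
onWrite({ carCid: 'cid-3', root: { author: 'alice' }, carBytes: new Uint8Array([3]) })
console.log(findByAuthor('alice').length)
```

Storing `carBytes` in `byAuthor` instead of an empty value would be the other choice: no second read, at the price of a second copy.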

Because what was looking like a flat database a moment ago is actually

  • an even larger graph database that can travel like a graph itself,
  • mutate into other states,
  • filter out data,
  • combine data from different sources,

And it can do this across

  • different ObjectStores
    • in the same data center,
    • the same cloud provider,
    • different providers,
    • p2p networks,
    • local nodes
      • disc
      • memory
      • browsers
    • that's entirely up to you.

And that string key we're using for the index.

Something I wish more people knew about, is this brilliant library Dominic Tarr wrote

  • who also wrote ssb
  • and is one of the nicest most genuine human beings on the planet
  • who lives on a boat
  • and with an excited smile on his face will describe himself as a "cyber hobo."

He wrote this library, charwise, that implements the typewise/bytewise encoder/decoder in regular strings. So you can use it for the keys on these ObjectStores and then use the ListObject interfaces to write queries against these more advanced sorting functions.

You might be used to modelling your bucket keys to leverage the "file hierarchy" in S3. These products tend to describe this as a feature of '/' to trigger your familiarity with file directory hierarchies, which are more widely understood than something like typewise/bytewise. But under the hood they just implement regular utf8 string sorting, which gives the sorting properties you'd want in a '/' based directory hierarchy.

What you get with charwise is nested sorting within any element of the keyspace to an arbitrary depth. First, just read the rules about which JSON types get ordered where. Now, notice that sorting within objects and arrays enables the nesting of keys.

So you can do something like

[ null, [ 'a' ] ]
[ 'a' , [ 'a' ] ]
[ 'a' , [ 'b', [ 'and on and on' ] ] ]
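The nested ordering behind keys like these can be sketched with a plain comparator. This is not charwise's string encoding, just an illustration of the sort it preserves, and it assumes a reduced rule set (null, then booleans, numbers, strings, arrays); check the library's README for the full ordering.

```javascript
// Simplified comparator for nested keys. Assumption: only null, booleans,
// numbers, strings, and arrays are handled, ordered in that sequence; the
// real typewise/bytewise rules cover more types.
const rank = v => {
  if (v === null) return 0
  if (typeof v === 'boolean') return 1
  if (typeof v === 'number') return 2
  if (typeof v === 'string') return 3
  if (Array.isArray(v)) return 4
  throw new TypeError('unsupported key type')
}

const compare = (a, b) => {
  const ra = rank(a)
  const rb = rank(b)
  if (ra !== rb) return ra - rb
  if (ra === 4) {
    // Arrays compare element by element; a shorter prefix sorts first.
    for (let i = 0; i < Math.min(a.length, b.length); i++) {
      const c = compare(a[i], b[i])
      if (c !== 0) return c
    }
    return a.length - b.length
  }
  if (ra === 0) return 0
  return a < b ? -1 : a > b ? 1 : 0
}

const sorted = [
  ['a', ['b', ['and on and on']]],
  ['a', ['a']],
  [null, ['a']]
].sort(compare)
console.log(JSON.stringify(sorted))
```

An encoder that turns each key into a string with this same ordering is what lets a plain string-sorted ObjectStore answer nested range queries.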

And since, when you're working with these little graphs, they all have hash addresses, you get to do some cool things within these nested sorting structures.

  • Like if you want to make sure any part of it is evenly distributed, use a hash that describes the idempotency of the index.
  • And if you have some security or privacy context you're enforcing over reading the index, you put the hash of something they would need to know into the nested sorting structure, which saves you from maintaining a secondary index of permissions
    • cause cryptography is really cool like that.
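One way to read the privacy point above, as a sketch under stated assumptions: the first element of the key is the hash of a secret, so only readers who know the secret can build the prefix to query. The `Map` stands in for a bucket, the secrets and ids are invented, and keys are joined with '/' for brevity where a real nested key would go through an encoder such as charwise. It only helps if readers cannot list the whole bucket.

```javascript
// CommonJS shown for brevity; in an ES module use:
//   import { createHash } from 'node:crypto'
const { createHash } = require('node:crypto')

const sha256 = text => createHash('sha256').update(text).digest('hex')

// Hypothetical index keyed as [hash of a shared secret, item id].
const index = new Map()
const put = (secret, id, value) => index.set(`${sha256(secret)}/${id}`, value)

// A reader can only build the prefix if they already know the secret.
const list = secret => {
  const prefix = `${sha256(secret)}/`
  return [...index.keys()]
    .filter(key => key.startsWith(prefix))
    .sort()
    .map(key => index.get(key))
}

put('team-a-secret', 'doc-1', 'alpha')
put('team-a-secret', 'doc-2', 'beta')
put('team-b-secret', 'doc-1', 'gamma')
console.log(list('team-a-secret'), list('wrong-guess'))
```

The hashed prefix also spreads each group's keys across the keyspace, so the same element serves both the distribution and the access pattern.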

Just remember,

  • you'll want to get something with a hash in it into the segment before the / when you derive the key; that'll force distribution across all the keys you're writing.
  • And if you want to optimize for the reader, you'll need to bake that into the entire prefix ahead of each / you stick in the key.

Working within those two constraints you can optimize each index for your particular use case.

And there's more, but I'm tired of typing, and I think that the next set of things I write about this will include examples of cool things others are doing with what is already here. Happy hacking!

Much love to all the old Node.js database hackers who used to bounce around Oakland and Berlin and a hundred JavaScript conferences in the 2010s. I miss y'all and I forget how much cool stuff we figured out that we haven't shared with everyone who didn't read that code.

Owner
Mikeal Rogers
Creator of NodeConf and request.