A memory-efficient GeoJSON representation.

memory-geojson (experimental 🧪)

This is not a new format. It's not meant to be serialized, and it doesn't add any features on top of the GeoJSON format.

What it attempts to do is provide an in-memory representation of GeoJSON that uses TypedArrays to store flattened coordinates. The main benefits and goals are:

  • Reduce memory requirements of GeoJSON data.
  • Support transferable or shared array buffers, which make communication with Web Workers much faster.
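
As a rough illustration of the second goal, the sketch below moves a Float64Array's buffer using structuredClone with a transfer list, which applies the same transfer step that postMessage uses when talking to a worker. It assumes a runtime with a global structuredClone (Node 17+ or a modern browser); the variable names and sample values are made up for this example.

```javascript
// Flattened coordinates: [lon, lat, z, lon, lat, z, ...] (layout assumed for this sketch).
const coords = new Float64Array([-122.4, 37.8, 0, -73.9, 40.7, 0]);

// Transferring hands the underlying ArrayBuffer to the receiver without copying it.
// worker.postMessage(coords, [coords.buffer]) applies the same transfer step.
const received = structuredClone(coords, { transfer: [coords.buffer] });

console.log(received.length); // the receiver sees all six numbers
console.log(coords.length); // the sender's view is detached and now empty
```

The catch is that the sender loses access to the data after a transfer, which is part of why shared array buffers are interesting as an alternative.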

The GeoJSON format is almost perfect, but the way it represents coordinates with nested arrays can be a performance issue. This is an experiment to support flattened arrays.

Heavy inspiration taken from mapshaper, perhaps the only tool I know of that has a strategy for flattening those arrays.

Storage scheme

This might change; I'm just braindumping what's in the code right now.

This takes GeoJSON as input and stores it in four objects:

  1. An array of objects called "featureProperties". This is where the "properties" data goes, as well as any data like bounding boxes or arbitrary data attached to the Feature object.
  2. A Uint32Array of indexes, which informs the reader of the types of geometries and lengths of coordinate rings.
  3. A Float64Array of coordinates.
  4. A Uint32Array of lookup offsets into the indexes & coordinates arrays.
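
To make the four containers concrete, here is a minimal sketch of an encoder for Points and LineStrings only. It is not the repo's code: the function name `encode`, the exact layout of the indexes array (`[code]` for a Point, `[code, count]` for a LineString), the offsets layout (one index offset and one coordinate offset per feature), and the use of NaN for a missing z are all assumptions made for illustration. Only the geometry codes 0 and 2 come from the text.

```javascript
// Hypothetical encoder sketch; not the repo's actual implementation.
const POINT = 0; // geometry codes as given in the text
const LINESTRING = 2;

function encode(featureCollection) {
  const featureProperties = [];
  const indexes = [];
  const coords = [];
  const offsets = [];

  // Every position takes 3 slots; a missing z is stored as NaN (assumption).
  const pushPosition = ([x, y, z]) => coords.push(x, y, z === undefined ? NaN : z);

  for (const feature of featureCollection.features) {
    // Everything except the geometry stays a plain object.
    const { geometry, type, ...rest } = feature;
    featureProperties.push(rest);
    // Remember where this feature starts in both arrays (assumed layout).
    offsets.push(indexes.length, coords.length);

    if (geometry.type === "Point") {
      indexes.push(POINT);
      pushPosition(geometry.coordinates);
    } else if (geometry.type === "LineString") {
      indexes.push(LINESTRING, geometry.coordinates.length);
      geometry.coordinates.forEach(pushPosition);
    } else {
      throw new Error(`Not handled in this sketch: ${geometry.type}`);
    }
  }

  return {
    featureProperties,
    indexes: Uint32Array.from(indexes),
    coordinates: Float64Array.from(coords),
    offsets: Uint32Array.from(offsets),
  };
}

const encoded = encode({
  type: "FeatureCollection",
  features: [
    { type: "Feature", properties: { name: "a" }, geometry: { type: "Point", coordinates: [1, 2] } },
    { type: "Feature", properties: { name: "b" }, geometry: { type: "LineString", coordinates: [[3, 4, 5], [6, 7]] } },
  ],
});
console.log(encoded.indexes, encoded.coordinates, encoded.offsets);
```

This sketch collects values in ordinary arrays and converts them at the end, which sidesteps resizing a TypedArray while encoding.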

Basically it's an offset-based system. You read the arrays from the start, and let's say the first feature has a Point geometry. Point has a geometry code of 0, which informs the reader to read the first 3 numbers from the coordinate array and move on.

If the next geometry is a LineString (code 2), then the reader reads the next number from the indexes array, which contains the number of coordinates in the linestring. Given that information, it reads that number of coordinates and produces a LineString geometry.
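
The walk described above can be sketched as a sequential reader. This is an illustration under assumptions, not the repo's code: the name `readGeometries` is hypothetical, the indexes array is assumed to hold `[code]` for a Point and `[code, count]` for a LineString, and a missing z is assumed to be stored as NaN.

```javascript
// Hypothetical sequential reader for the layout described above.
const POINT = 0; // geometry codes as given in the text
const LINESTRING = 2;

function* readGeometries(indexes, coordinates) {
  let i = 0; // cursor into the indexes array
  let c = 0; // cursor into the coordinates array

  // Read one position (3 numbers) and drop the z slot when it is NaN.
  const readPosition = () => {
    const [x, y, z] = coordinates.subarray(c, c + 3);
    c += 3;
    return Number.isNaN(z) ? [x, y] : [x, y, z];
  };

  while (i < indexes.length) {
    const code = indexes[i++];
    if (code === POINT) {
      yield { type: "Point", coordinates: readPosition() };
    } else if (code === LINESTRING) {
      const count = indexes[i++]; // number of positions in the line
      const line = [];
      for (let n = 0; n < count; n++) line.push(readPosition());
      yield { type: "LineString", coordinates: line };
    } else {
      throw new Error(`Geometry code ${code} is not handled in this sketch`);
    }
  }
}

// One Point followed by a two-position LineString.
const indexes = Uint32Array.from([POINT, LINESTRING, 2]);
const coordinates = Float64Array.from([1, 2, NaN, 3, 4, 5, 6, 7, NaN]);
const geometries = [...readGeometries(indexes, coordinates)];
console.log(JSON.stringify(geometries));
```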

Discussion

This schema has tradeoffs.

  • Seeking was difficult at first: there was no affordance for randomly jumping to and extracting a geometry. It wasn't entirely clear that this was necessary, since skipping would be simple to implement, but something like an index-of-indexes could be constructed. Update: seeking to a specific feature is implemented!
  • It wasn't clear how to encode the z value, the 3rd item in a GeoJSON Position. Defaulting that 3rd item to 0 is not ideal: a coordinate with z=0 is not the same as a coordinate with no z value. The latter implies that the z value is unknown, not 0. Update: missing z values are encoded as NaN.
  • Could it be done with just one array, instead of separate arrays for indexes and coordinates?
  • Should coordinates be Float32? I suspect that, although this would make encoding lossy (JavaScript numbers are 64-bit floats with a 52-bit mantissa, while Float32 keeps 23 bits), it would easily satisfy geospatial accuracy needs while halving the space requirement.
  • Some data updates in this format would be very expensive, and updates would also require some fairly custom operations. For example, adding a new coordinate in the middle of a line, in a feature in the middle of the dataset, would probably require intelligently updating the LineString length, slicing the dataset into two TypedArrays, and plopping the new coordinate in the middle. It's doable, but makes updates much less obvious.
  • Is it useful, or beneficial, to get fancy with properties? GeoJSON files certainly tend to share property names and often values, so it's conceivable that a bunch of features with a value for a property like "x" could have their values of "x" encoded as a flat array, hence saving valuable object space. But doing this well and not accidentally increasing the memory requirements of some datasets seems like it would require compression-like logic.
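
For the Float32 question above, one way to get a feel for the rounding is Math.fround, which rounds a number to the nearest 32-bit float. The sample longitude and the rough figure of 111,320 metres per degree at the equator are assumptions chosen for illustration, not measurements from this project.

```javascript
// Round-trip a coordinate through 32-bit precision and measure the error.
const lon = 179.123456789; // sample longitude, chosen arbitrarily
const asFloat32 = Math.fround(lon);
const errorDegrees = Math.abs(asFloat32 - lon);

// Float32 has a 24-bit significand, so values in [128, 256) are spaced 2^-16 apart
// and rounding to nearest is off by at most half of that spacing.
const worstCaseDegrees = 2 ** -17;

// Very rough conversion: about 111,320 m per degree of longitude at the equator.
const metersPerDegree = 111320;
console.log(`error for this sample: ${errorDegrees} degrees`);
console.log(`worst case near the antimeridian: ~${(worstCaseDegrees * metersPerDegree).toFixed(2)} m`);
```

Whether an error of that size is acceptable depends on the dataset, so this is something to measure rather than assume.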

Running it

This repo uses Node's built-in test framework (available as of Node v18). So, have Node v18 and run `node --test`. No deps are required so far.

Future

The future of this would be to use it in Placemark, which would benefit from a more efficient memory encoding of features.

Prior art

Comments
  • Better to write to a TypedArray or write to an array and copy to a TypedArray?

    Note that the visgl example uses a normal array and converts to a TypedArray as the last step. Is that good? The resizing method here is a little troubling.

    opened by tmcw 0
  • Use Atomics?

    If this works with SharedArrayBuffer, then it may need to use Atomics in order to successfully write to a shared resource: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Atomics

    opened by tmcw 0
  • Think over interop

    It'd be nice to rely on the memory representation in the "important places". So, for example:

    • geojson-vt internally creates flat arrays for coordinates. Could this format handle those with zero-copy? https://github.com/mapbox/geojson-vt/blob/35f4ad75feed64e80ff2cd02994976c6335859cd/src/convert.js#L130
    • Maybe there's a way to implement turf-meta operations that makes it possible to run turf ports on this data structure without requiring a decode.
    opened by tmcw 0
  • Could `coordinates` arrays be TypedArray slices?

    This wouldn't work if this used integers and e6 encoding, but instead of creating Array objects, could coordinate arrays (2 or 3 elements) be slices into the TypedArray?

    opened by tmcw 1
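
On that last question, TypedArray.prototype.subarray returns a view onto the same buffer rather than a copy, so a position could in principle be handed out without allocating an Array. A small sketch, assuming 3 numbers per position; `positionAt` is a hypothetical helper, not part of the repo:

```javascript
const coordinates = Float64Array.from([1, 2, NaN, 3, 4, 5]);
const STRIDE = 3; // numbers per position, assumed for this sketch

// View of the n-th position; it shares memory with `coordinates`.
const positionAt = (n) => coordinates.subarray(n * STRIDE, (n + 1) * STRIDE);

const second = positionAt(1);
second[0] = 30; // writes through to the underlying buffer
console.log(coordinates[3]);
console.log(Array.isArray(second)); // consumers that check Array.isArray would reject a view
```

One tradeoff: a view is not a plain Array, so code that calls Array.isArray or JSON.stringify on coordinates would behave differently than it does with ordinary GeoJSON.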
Owner
Tom MacWright