datakit

About

A lightweight library/framework for data analysis in JavaScript.

Usage

npm install datakitjs --save

Documentation & Examples

Reading, Filtering, & Plotting Data

var dk = require('datakitjs');

//READ A CSV FILE

//file.csv
// COL1, COL2
// val11, val12
// val21, val22

dk.csv('file.csv', function(data) {
  console.log(data);
});

//Output:
//[{ COL1: 'val11', COL2: 'val12' }, { COL1: 'val21', COL2: 'val22' }]


//GET A COLUMN FROM AN ARRAY OF ROW OBJECTS
dk.csv('file.csv', function(data) {
  var c2 = dk.col(data, 'COL2');
  console.log(c2);
});

//Output:
//['val12', 'val22']

// By default, dk.csv reads every value as a string. To convert selected
// columns to numbers, pass the parsed data and an array of column names to 'dk.numeric'.

//file2.csv
// COL1, COL2
// val11, 1
// val21, 2

dk.csv('file2.csv', function(data) {
  // The third argument is the value used to fill blank cells; it defaults to 0.
  var d = dk.numeric(data, ['COL2'], 0);
  var c2 = dk.col(d, 'COL2');
  console.log(c2);
});

//Output:
//[1, 2]


//PLOT ARRAY(S) OF DATA

var chart = new dk.Chart({
  //optional config
  height: 500,
  width: 500,
  xLab: 'x-Axis Label',
  yLab: 'y-Axis Label'
});

chart.addDataSet({
  x: [1, 2, 3],
  y: [4, 5, 6],
  z: [2, 3, 5],
  colors: ['blue', 'green', 'red']
}).addDataSet({
  x: [1, 10],
  y: [2, -1],
  type: 'line'
}).addDataSet({
  x: [10, 5, 1],
  y: [4, 5, 2],
  labels: ["first", "second", "third"]
}).plot();
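
Returning to the CSV helpers earlier in this section: to make the row-object shape concrete, here is a minimal standalone sketch of what column extraction and numeric conversion amount to for an array of row objects. The helpers pluckColumn and toNumeric are hypothetical names written for illustration. They are not datakit's implementation, and how the library treats non-numeric strings is not specified above.

```javascript
// Hypothetical helpers for illustration only; not datakit's source code.

// Return the values of one column from an array of row objects.
function pluckColumn(rows, name) {
  return rows.map(function (row) { return row[name]; });
}

// Return copies of the rows with the named columns converted to numbers.
// Blank cells (null, undefined, or whitespace-only strings) become fillValue.
function toNumeric(rows, columns, fillValue) {
  const fill = fillValue === undefined ? 0 : fillValue;
  return rows.map(function (row) {
    const copy = Object.assign({}, row);
    columns.forEach(function (name) {
      const raw = copy[name];
      const blank = raw === undefined || raw === null || String(raw).trim() === '';
      copy[name] = blank ? fill : Number(raw);
    });
    return copy;
  });
}

// Rows shaped like the parsed contents of file2.csv, plus one blank cell.
const rows = [
  { COL1: 'val11', COL2: '1' },
  { COL1: 'val21', COL2: '2' },
  { COL1: 'val31', COL2: '' }
];

const numericRows = toNumeric(rows, ['COL2'], 0);
console.log(pluckColumn(numericRows, 'COL2')); // [ 1, 2, 0 ]
```

The sketch copies each row rather than mutating it, so the original string data stays available after conversion.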

Statistical Methods

var dk = require('datakitjs');

//MEAN OF AN ARRAY
dk.mean([1, 2, 3]); //returns 2

//STANDARD DEVIATION AND VARIANCE OF AN ARRAY
dk.sd([1, 2, 3]); //returns 1
dk.vari([1, 2, 3]); //returns 1

//COVARIANCE OF TWO ARRAYS
dk.cov([1, 2, 3], [3, 2, 1]); //returns -1

//SIMPLE LINEAR REGRESSION

var x = [1, 2, 3];
var y = [2, 1, 3];

var model = dk.reg(x, y);

// model.f is a function that returns the estimated y for an input x.
// The intercept a and slope b are fitted by ordinary least squares (OLS).
// model.f = function(x) {
//  return (a + b * x);
// };

// model.pts is an array of the estimated y for each element of x
// model.pts = [1.5, 2, 2.5];

// model.endPoints is an object with the coordinates of the boundary points
// model.endPoints = { x1: 1, x2: 3, y1: 1.5, y2: 2.5 };
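
The values quoted above are consistent with the sample (n - 1) definitions of variance and covariance and with ordinary least squares for a single predictor. The following standalone sketch reproduces them by hand. The function names are illustrative and are not datakit's internals.

```javascript
// Illustrative re-derivation; function names here are not part of datakit.

function mean(xs) {
  return xs.reduce(function (acc, v) { return acc + v; }, 0) / xs.length;
}

// Sample covariance (divides by n - 1). sampleCov(xs, xs) is the sample variance.
function sampleCov(xs, ys) {
  const mx = mean(xs);
  const my = mean(ys);
  let total = 0;
  for (let i = 0; i < xs.length; i++) {
    total += (xs[i] - mx) * (ys[i] - my);
  }
  return total / (xs.length - 1);
}

// Simple OLS: slope b = cov(x, y) / var(x), intercept a = mean(y) - b * mean(x).
function fitLine(xs, ys) {
  const b = sampleCov(xs, ys) / sampleCov(xs, xs);
  const a = mean(ys) - b * mean(xs);
  const f = function (x) { return a + b * x; };
  return { a: a, b: b, f: f, pts: xs.map(f) };
}

const xs = [1, 2, 3];
const ys = [2, 1, 3];
const fit = fitLine(xs, ys);

console.log(sampleCov(xs, xs));               // 1 (variance of [1, 2, 3])
console.log(sampleCov([1, 2, 3], [3, 2, 1])); // -1
console.log(fit.a, fit.b);                    // 1 0.5
console.log(fit.pts);                         // [ 1.5, 2, 2.5 ]
```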

Convenience Methods

var dk = require('datakitjs');

//GENERATE AN ARRAY WITH A SEQUENCE OF NUMBERS

dk.seq(1, 5); //returns [1, 2, 3, 4, 5]

dk.seq(0, 1, 0.25); //returns [0, 0.25, 0.5, 0.75, 1]

//GENERATE AN ARRAY WITH REPEATED VALUE

dk.rep(1, 5); //returns [1, 1, 1, 1, 1]

//CHECK IF NUMBERS ARE CLOSE
dk.isclose(0, Math.pow(10, -15)); //returns true

dk.isclose(0, Math.pow(10, -5)); //returns false

//SUM AN ARRAY OF NUMBERS
//uses Kahan summation

dk.sum([1, 2, 3]); //returns 6

//PRODUCT OF AN ARRAY OF NUMBERS
//implementation from 'Accurate Floating Point Product' - Stef Graillat

dk.prod([1, 2, 3]); //returns 6

//MAX AND MIN OF AN ARRAY
var x = [1, 2, 3];
dk.min(x); //returns 1
dk.max(x); //returns 3
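
For readers unfamiliar with Kahan (compensated) summation, which the comment on dk.sum names, here is a minimal sketch of the technique. It is a textbook version written for illustration under the hypothetical name kahanSum, not a copy of datakit's code.

```javascript
// Textbook Kahan summation; illustrative, not datakit's source.
function kahanSum(values) {
  let sum = 0;
  let compensation = 0; // running estimate of the low-order bits lost so far
  for (const value of values) {
    const adjusted = value - compensation;
    const next = sum + adjusted;
    // (next - sum) is what was actually added; subtracting `adjusted`
    // recovers the rounding error introduced by this addition.
    compensation = (next - sum) - adjusted;
    sum = next;
  }
  return sum;
}

console.log(kahanSum([1, 2, 3])); // 6
```

The benefit appears on long arrays of values that are not exactly representable in binary, such as many copies of 0.1, where a naive running total accumulates rounding error.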

Random Numbers

var dk = require('datakitjs');

//GET AN ARRAY OF EXPONENTIALLY DISTRIBUTED VALUES

dk.exp(3, 1); //returns e.g. [0.3584189321510761, 1.0466439500242446, 0.08887770301056963] (values differ on each call)


//GET AN ARRAY OF NORMALLY DISTRIBUTED VALUES

dk.norm(3, 0, 1); //returns e.g. [-1.709768103193772, 0.23530041388459744, 0.4431320382580479] (values differ on each call)

//GET AN ARRAY OF UNIFORMLY DISTRIBUTED VALUES

dk.uni(3); //returns e.g. [0.30658303829841316, 0.1601463456172496, 0.8538850131444633] (values differ on each call)
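
As a sketch of how samples like these can be generated from Math.random, the following uses inverse-transform sampling for the exponential case and the Box-Muller transform for the normal case. The function names, and the reading of the arguments as (count, rate) and (count, mean, standard deviation), are assumptions made for illustration. The text above does not state which algorithms or parameter meanings datakit uses.

```javascript
// Illustrative samplers; names, argument meanings, and algorithms are assumptions.

// Draw `count` exponential variates with the given rate (inverse-transform sampling).
function sampleExponential(count, rate) {
  const out = [];
  for (let i = 0; i < count; i++) {
    // 1 - Math.random() lies in (0, 1], so the logarithm is always finite.
    out.push(-Math.log(1 - Math.random()) / rate);
  }
  return out;
}

// Draw `count` normal variates with mean `mu` and standard deviation `sigma` (Box-Muller).
function sampleNormal(count, mu, sigma) {
  const out = [];
  for (let i = 0; i < count; i++) {
    const u1 = 1 - Math.random(); // in (0, 1], avoids log(0)
    const u2 = Math.random();
    const z = Math.sqrt(-2 * Math.log(u1)) * Math.cos(2 * Math.PI * u2);
    out.push(mu + sigma * z);
  }
  return out;
}

console.log(sampleExponential(3, 1)); // three non-negative numbers, different each run
console.log(sampleNormal(3, 0, 1));   // three finite numbers, different each run
```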

Testing

Run npm test to execute the test suite.

Contributing

Additional methods for random number generation, data filtering, convenience functions, and common statistical analyses are welcome. Add tests following the structure in spec/test/testSpec.js.

License

The MIT License (MIT)

Copyright (c) 2015 Nathan Epstein

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
