Machine-learning for Node.js

Overview

Limdu.js

Limdu is a machine-learning framework for Node.js. It supports multi-label classification, online learning, and real-time classification. These features make it especially well suited to natural language understanding in dialog systems and chat-bots.

Limdu is in an "alpha" state: some parts work (see this readme), but others are missing or untested. Contributions are welcome.

Limdu currently runs on Node.js 0.12 and later versions.

Installation

npm install limdu

Demos

You can run the demos from this project: limdu-demo.

Binary Classification

Batch Learning - learn from an array of input-output pairs:

var limdu = require('limdu');

var colorClassifier = new limdu.classifiers.NeuralNetwork();

colorClassifier.trainBatch([
	{input: { r: 0.03, g: 0.7, b: 0.5 }, output: 0},  // black
	{input: { r: 0.16, g: 0.09, b: 0.2 }, output: 1}, // white
	{input: { r: 0.5, g: 0.5, b: 1.0 }, output: 1}   // white
	]);

console.log(colorClassifier.classify({ r: 1, g: 0.4, b: 0 }));  // 0.99 - almost white

Credit: this example uses brain.js, by Heather Arthur.

Online Learning

var birdClassifier = new limdu.classifiers.Winnow({
	default_positive_weight: 1,
	default_negative_weight: 1,
	threshold: 0
});

birdClassifier.trainOnline({'wings': 1, 'flight': 1, 'beak': 1, 'eagle': 1}, 1);  // eagle is a bird (1)
birdClassifier.trainOnline({'wings': 0, 'flight': 0, 'beak': 0, 'dog': 1}, 0);    // dog is not a bird (0)
console.dir(birdClassifier.classify({'wings': 1, 'flight': 0, 'beak': 0.5, 'penguin':1})); // initially, penguin is mistakenly classified as 0 - "not a bird"
console.dir(birdClassifier.classify({'wings': 1, 'flight': 0, 'beak': 0.5, 'penguin':1}, /*explanation level=*/4)); // why? because it does not fly.

birdClassifier.trainOnline({'wings': 1, 'flight': 0, 'beak': 1, 'penguin':1}, 1);  // learn that penguin is a bird, although it doesn't fly 
birdClassifier.trainOnline({'wings': 0, 'flight': 1, 'beak': 0, 'bat': 1}, 0);     // learn that bat is not a bird, although it does fly
console.dir(birdClassifier.classify({'wings': 1, 'flight': 0, 'beak': 1, 'chicken': 1})); // now, chicken is correctly classified as a bird, although it does not fly.  
console.dir(birdClassifier.classify({'wings': 1, 'flight': 0, 'beak': 1, 'chicken': 1}, /*explanation level=*/4)); // why?  because it has wings and beak.

Credit: this example uses Modified Balanced Margin Winnow (Carvalho and Cohen, 2006).

The "explanation" feature is explained below.
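
To make the idea of online learning concrete, here is a minimal sketch of a learner that updates its weights one sample at a time. It assumes a plain perceptron update rule rather than limdu's Winnow, and TinyPerceptron is a hypothetical name that is not part of limdu, so its answers may differ from the ones shown above.

```javascript
// Hypothetical stand-in for an online binary classifier (a perceptron, not Winnow).
function TinyPerceptron() {
	this.weights = Object.create(null);  // feature name -> weight
	this.bias = 0;
}

// Weighted sum of the feature values, plus the bias.
TinyPerceptron.prototype.score = function(features) {
	var sum = this.bias;
	for (var name in features)
		sum += (this.weights[name] || 0) * features[name];
	return sum;
};

TinyPerceptron.prototype.classify = function(features) {
	return this.score(features) > 0 ? 1 : 0;
};

// Look at a single sample and adjust the weights only if it is misclassified.
TinyPerceptron.prototype.trainOnline = function(features, expected) {
	var error = expected - this.classify(features);  // -1, 0 or +1
	if (error === 0) return;
	this.bias += error;
	for (var name in features)
		this.weights[name] = (this.weights[name] || 0) + error * features[name];
};

var toyBirdClassifier = new TinyPerceptron();
toyBirdClassifier.trainOnline({'wings': 1, 'flight': 1, 'beak': 1, 'eagle': 1}, 1);
toyBirdClassifier.trainOnline({'wings': 0, 'flight': 0, 'beak': 0, 'dog': 1}, 0);
console.log(toyBirdClassifier.classify({'wings': 1, 'flight': 1, 'beak': 1, 'eagle': 1}));  // 1
console.log(toyBirdClassifier.classify({'wings': 0, 'flight': 0, 'beak': 0, 'dog': 1}));    // 0
```

The point of the sketch is that each call to trainOnline sees one sample only, so the classifier can keep learning while it is already in use.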

Binding

Using JavaScript's binding capabilities, you can create custom classes that combine existing classes with pre-specified parameters:

var MyWinnow = limdu.classifiers.Winnow.bind(0, {
	default_positive_weight: 1,
	default_negative_weight: 1,
	threshold: 0
});

var birdClassifier = new MyWinnow();
...
// continue as above
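
If the combination of bind and new looks unfamiliar, the following stand-alone sketch shows the mechanism without limdu. ToyClassifier is a hypothetical constructor that, like the classifiers above, takes a single options object.

```javascript
// Hypothetical constructor that takes a single options object.
function ToyClassifier(opts) {
	this.threshold = opts.threshold;
	this.retrain_count = opts.retrain_count || 1;
}

// bind() fixes the first constructor argument. The first parameter of bind (the
// "this" value, here 0) is ignored when the bound function is called with "new".
var MyToyClassifier = ToyClassifier.bind(0, {threshold: 0, retrain_count: 10});

var toyClassifier = new MyToyClassifier();
console.log(toyClassifier.threshold);                 // 0
console.log(toyClassifier.retrain_count);             // 10
console.log(toyClassifier instanceof ToyClassifier);  // true
```

This is why a bound type such as MyWinnow can be passed anywhere a classifier type is expected: calling it with new and no arguments still yields a fully configured instance.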

Explanations

Some classifiers can return "explanations" - additional information that explains how the classification result has been derived:

var colorClassifier = new limdu.classifiers.Bayesian();

colorClassifier.trainBatch([
	{input: { r: 0.03, g: 0.7, b: 0.5 }, output: 'black'}, 
	{input: { r: 0.16, g: 0.09, b: 0.2 }, output: 'white'},
	{input: { r: 0.5, g: 0.5, b: 1.0 }, output: 'white'},
	]);

console.log(colorClassifier.classify({ r: 1, g: 0.4, b: 0 }, 
		/* explanation level = */1));

Credit: this example uses code from classifier.js, by Heather Arthur.

The explanation feature is experimental and is supported differently for different classifiers. For example, for the Bayesian classifier it returns the probabilities for each category:

{ classes: 'white',
	explanation: [ 'white: 0.0621402182289608', 'black: 0.031460948468170505' ] }

For the Winnow classifier, in contrast, it returns the relevance (feature-value times feature-weight) of each feature:

{ classification: 1,
	explanation: [ 'bias+1.12', 'r+1.08', 'g+0.25', 'b+0.00' ] }

WARNING: The internal format of the explanations might change without notice. The explanations should be used for presentation purposes only (and not, for example, for extracting the actual numbers).
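
As an illustration of what "relevance" means for a linear classifier, here is a small sketch that multiplies each feature value by its weight and formats the result in a style similar to the output above. The weights are made-up numbers and toyExplain is a hypothetical helper, not limdu's implementation.

```javascript
// Hypothetical weights of a trained linear classifier (made-up numbers).
var toyWeights = {bias: 1.5, r: 2, g: 0.5, b: -1};

// Returns one string per feature: its name followed by its signed relevance
// (feature value times feature weight), sorted from most to least relevant.
function toyExplain(features, weights) {
	var withBias = {bias: 1};  // the bias acts as a feature whose value is always 1
	for (var name in features) withBias[name] = features[name];
	return Object.keys(withBias)
		.map(function(name) {
			return {name: name, relevance: withBias[name] * (weights[name] || 0)};
		})
		.sort(function(a, b) { return Math.abs(b.relevance) - Math.abs(a.relevance); })
		.map(function(item) {
			return item.name + (item.relevance >= 0 ? '+' : '') + item.relevance.toFixed(2);
		});
}

console.log(toyExplain({r: 1, g: 0.5, b: 0.1}, toyWeights));
// [ 'r+2.00', 'bias+1.50', 'g+0.25', 'b-0.10' ]
```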

Other Binary Classifiers

In addition to Winnow and NeuralNetwork, version 0.2 includes other binary classifiers, such as the Bayesian, SvmJs, SvmPerf and SvmLinear classifiers that appear elsewhere in this readme.

This library is still under construction, and not all features work for all classifiers. For a full list of the features that do work, see the "test" folder.

Multi-Label Classification

In binary classification, the output is 0 or 1. In multi-label classification, the output is a set of zero or more labels:

var MyWinnow = limdu.classifiers.Winnow.bind(0, {retrain_count: 10});

var intentClassifier = new limdu.classifiers.multilabel.BinaryRelevance({
	binaryClassifierType: MyWinnow
});

intentClassifier.trainBatch([
	{input: {I:1,want:1,an:1,apple:1}, output: "APPLE"},
	{input: {I:1,want:1,a:1,banana:1}, output: "BANANA"},
	{input: {I:1,want:1,chips:1}, output: "CHIPS"}
	]);

console.dir(intentClassifier.classify({I:1,want:1,an:1,apple:1,and:1,a:1,banana:1}));  // ['APPLE','BANANA']
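
The idea behind binary relevance is to train one independent binary classifier per label and to return every label whose classifier answers "yes". The sketch below shows this decomposition without limdu. It assumes that each training sample carries a single label, and it uses a deliberately simple hypothetical binary classifier (MeanDifferenceClassifier) instead of Winnow, so it illustrates the structure rather than limdu's actual code.

```javascript
// Hypothetical, deliberately simple binary classifier (not Winnow): the weight of a
// feature is its average value in positive samples minus its average in negative ones.
function MeanDifferenceClassifier() {
	this.weights = Object.create(null);
}

MeanDifferenceClassifier.prototype.trainBatch = function(dataset) {
	var weights = this.weights = Object.create(null);
	var counts = [0, 0];  // counts[0] = negative samples, counts[1] = positive samples
	dataset.forEach(function(sample) { counts[sample.output]++; });
	dataset.forEach(function(sample) {
		var sign = sample.output === 1 ? 1 : -1;
		for (var name in sample.input)
			weights[name] = (weights[name] || 0) + sign * sample.input[name] / counts[sample.output];
	});
};

MeanDifferenceClassifier.prototype.classify = function(features) {
	var score = 0;
	for (var name in features)
		score += (this.weights[name] || 0) * features[name];
	return score > 0 ? 1 : 0;
};

// Binary relevance: train one independent binary classifier per label.
function ToyBinaryRelevance(BinaryClassifierType) {
	this.BinaryClassifierType = BinaryClassifierType;
	this.classifiers = {};  // label -> binary classifier
}

ToyBinaryRelevance.prototype.trainBatch = function(dataset) {
	var self = this;
	var labels = {};
	dataset.forEach(function(sample) { labels[sample.output] = true; });
	Object.keys(labels).forEach(function(label) {
		// Samples with this label are positive (1); all others are negative (0).
		var binaryDataset = dataset.map(function(sample) {
			return {input: sample.input, output: sample.output === label ? 1 : 0};
		});
		self.classifiers[label] = new self.BinaryClassifierType();
		self.classifiers[label].trainBatch(binaryDataset);
	});
};

// The result is the set of labels whose binary classifier answers "yes".
ToyBinaryRelevance.prototype.classify = function(features) {
	var self = this;
	return Object.keys(this.classifiers).filter(function(label) {
		return self.classifiers[label].classify(features) === 1;
	});
};

var toyIntentClassifier = new ToyBinaryRelevance(MeanDifferenceClassifier);
toyIntentClassifier.trainBatch([
	{input: {I:1,want:1,an:1,apple:1}, output: "APPLE"},
	{input: {I:1,want:1,a:1,banana:1}, output: "BANANA"},
	{input: {I:1,want:1,chips:1}, output: "CHIPS"}
	]);

console.dir(toyIntentClassifier.classify({I:1,want:1,an:1,apple:1,and:1,a:1,banana:1}));  // [ 'APPLE', 'BANANA' ]
```

Because the binary classifiers are independent, the result can contain zero, one or several labels.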

Other Multi-label classifiers

In addition to BinaryRelevance, version 0.2 includes other multi-label classifier types; see the multilabel folder.

This library is still under construction, and not all features work for all classifiers. For a full list of the features that do work, see the "test" folder.

Feature engineering

Feature extraction - converting an input sample into feature-value pairs:

// First, define our base classifier type (a multi-label classifier based on winnow):
var TextClassifier = limdu.classifiers.multilabel.BinaryRelevance.bind(0, {
	binaryClassifierType: limdu.classifiers.Winnow.bind(0, {retrain_count: 10})
});

// Now define our feature extractor - a function that takes a sample and adds features to a given features set:
var WordExtractor = function(input, features) {
	input.split(" ").forEach(function(word) {
		features[word]=1;
	});
};

// Initialize a classifier with the base classifier type and the feature extractor:
var intentClassifier = new limdu.classifiers.EnhancedClassifier({
	classifierType: TextClassifier,
	featureExtractor: WordExtractor
});

// Train and test:
intentClassifier.trainBatch([
	{input: "I want an apple", output: "apl"},
	{input: "I want a banana", output: "bnn"},
	{input: "I want chips", output:    "cps"},
	]);

console.dir(intentClassifier.classify("I want an apple and a banana"));  // ['apl','bnn']
console.dir(intentClassifier.classify("I WANT AN APPLE AND A BANANA"));  // []

As you can see from the last example, by default feature extraction is case-sensitive. We will take care of this in the next example.

Instead of defining your own feature extractor, you can use those already bundled with limdu:

limdu.features.NGramsOfWords
limdu.features.NGramsOfLetters
limdu.features.HypernymExtractor

You can also set 'featureExtractor' to an array of several feature extractors, which are executed in the order in which you list them.
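
The following sketch shows what running several extractors in order amounts to: each extractor writes into the same features object. WordPairExtractor and extractFeatures are hypothetical helpers written for this illustration; limdu may combine extractors differently internally.

```javascript
// Adds one feature per word (same as the WordExtractor above).
var WordExtractor = function(input, features) {
	input.split(" ").forEach(function(word) {
		features[word] = 1;
	});
};

// Hypothetical second extractor: adds one feature per pair of adjacent words.
var WordPairExtractor = function(input, features) {
	var words = input.split(" ");
	for (var i = 0; i + 1 < words.length; i++)
		features[words[i] + " " + words[i + 1]] = 1;
};

// Runs each extractor in order; all of them write into the same features object.
function extractFeatures(input, extractors) {
	var features = {};
	extractors.forEach(function(extractor) {
		extractor(input, features);
	});
	return features;
}

console.dir(extractFeatures("I want chips", [WordExtractor, WordPairExtractor]));
// { I: 1, want: 1, chips: 1, 'I want': 1, 'want chips': 1 }
```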

Input Normalization

//Initialize a classifier with a feature extractor and a case normalizer:
intentClassifier = new limdu.classifiers.EnhancedClassifier({
	classifierType: TextClassifier,  // same as in previous example
	normalizer: limdu.features.LowerCaseNormalizer,
	featureExtractor: WordExtractor  // same as in previous example
});

//Train and test:
intentClassifier.trainBatch([
	{input: "I want an apple", output: "apl"},
	{input: "I want a banana", output: "bnn"},
	{input: "I want chips", output: "cps"},
	]);

console.dir(intentClassifier.classify("I want an apple and a banana"));  // ['apl','bnn']
console.dir(intentClassifier.classify("I WANT AN APPLE AND A BANANA"));  // ['apl','bnn'] 

Of course you can use any other function as an input normalizer. For example, if you know how to write a spell-checker, you can create a normalizer that corrects typos in the input.

You can also make 'normalizer' an array of several normalizers. These will be executed in the order you include them.
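
A normalizer is just a function from string to string, so chaining several of them means feeding the output of each one into the next. Here is a minimal sketch of that chaining. The two normalizers and the toyNormalize helper are hypothetical and are not part of limdu.

```javascript
// Two hypothetical normalizers: each takes a string and returns a string.
var lowerCase = function(input) { return input.toLowerCase(); };
var collapseSpaces = function(input) { return input.replace(/\s+/g, " ").trim(); };

// Applies the normalizers in order, passing the output of each one to the next.
function toyNormalize(input, normalizers) {
	return normalizers.reduce(function(text, normalizer) {
		return normalizer(text);
	}, input);
}

console.log(toyNormalize("  I WANT   an Apple ", [lowerCase, collapseSpaces]));  // i want an apple
```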

Feature lookup table - convert custom features to integer features

This example uses the quadratic SVM implementation svm.js, by Andrej Karpathy. This SVM (like most SVM implementations) works with integer features, so we need a way to convert our string-based features to integers.

var limdu = require('limdu');

// First, define our base classifier type (a multi-label classifier based on svm.js):
var TextClassifier = limdu.classifiers.multilabel.BinaryRelevance.bind(0, {
	binaryClassifierType: limdu.classifiers.SvmJs.bind(0, {C: 1.0})
});

// Initialize a classifier with a feature extractor and a lookup table:
var intentClassifier = new limdu.classifiers.EnhancedClassifier({
	classifierType: TextClassifier,
	featureExtractor: limdu.features.NGramsOfWords(1),  // each word ("1-gram") is a feature  
	featureLookupTable: new limdu.features.FeatureLookupTable()
});

// Train and test:
intentClassifier.trainBatch([
	{input: "I want an apple", output: "apl"},
	{input: "I want a banana", output: "bnn"},
	{input: "I want chips", output: "cps"},
	]);

console.dir(intentClassifier.classify("I want an apple and a banana"));  // ['apl','bnn']

The FeatureLookupTable takes care of the numbers, while you may continue to work with texts!
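
To see what such a table has to do, here is a minimal sketch of a lookup table that assigns consecutive integers to feature names the first time it sees them. ToyLookupTable and its method names are hypothetical and are not the API of limdu.features.FeatureLookupTable.

```javascript
// Hypothetical lookup table that maps feature names to consecutive integers.
function ToyLookupTable() {
	this.featureNameToIndex = Object.create(null);
	this.featureIndexToName = [];
}

// Returns the integer assigned to a feature name, assigning a new one if needed.
ToyLookupTable.prototype.indexOf = function(name) {
	if (!(name in this.featureNameToIndex)) {
		this.featureNameToIndex[name] = this.featureIndexToName.length;
		this.featureIndexToName.push(name);
	}
	return this.featureNameToIndex[name];
};

// Converts a hash of named features into an array indexed by feature number.
ToyLookupTable.prototype.hashToArray = function(features) {
	var array = [];
	for (var name in features)
		array[this.indexOf(name)] = features[name];
	// Features that this sample lacks get the value 0.
	for (var i = 0; i < this.featureIndexToName.length; i++)
		if (array[i] === undefined) array[i] = 0;
	return array;
};

var toyTable = new ToyLookupTable();
console.log(toyTable.hashToArray({I: 1, want: 1, an: 1, apple: 1}));  // [ 1, 1, 1, 1 ]
console.log(toyTable.hashToArray({I: 1, want: 1, chips: 1}));         // [ 1, 1, 0, 0, 1 ]
```

The table must be shared between training and classification, so that the same word always maps to the same integer.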

Serialization

Say you want to train a classifier on your home computer and use it on a remote server. To do this, convert the trained classifier to a string, send the string to the remote server, and deserialize it there.

You can do this with the "serialization.js" package:

npm install serialization

On your home machine, do the following:

var serialize = require('serialization');

// First, define a function that creates a fresh (untrained) classifier.
// This code should be stand-alone - it should include all the 'require' statements
//   required for creating the classifier.
function newClassifierFunction() {
	var limdu = require('limdu');
	var TextClassifier = limdu.classifiers.multilabel.BinaryRelevance.bind(0, {
		binaryClassifierType: limdu.classifiers.Winnow.bind(0, {retrain_count: 10})
	});

	var WordExtractor = function(input, features) {
		input.split(" ").forEach(function(word) {
			features[word]=1;
		});
	};
	
	// Initialize a classifier with a feature extractor:
	return new limdu.classifiers.EnhancedClassifier({
		classifierType: TextClassifier,
		featureExtractor: WordExtractor,
		pastTrainingSamples: [], // to enable retraining
	});
}

// Use the above function for creating a new classifier:
var intentClassifier = newClassifierFunction();

// Train and test:
var dataset = [
	{input: "I want an apple", output: "apl"},
	{input: "I want a banana", output: "bnn"},
	{input: "I want chips", output: "cps"},
	];
intentClassifier.trainBatch(dataset);

console.log("Original classifier:");
intentClassifier.classifyAndLog("I want an apple and a banana");  // ['apl','bnn']
intentClassifier.trainOnline("I want a doughnut", "dnt");
intentClassifier.classifyAndLog("I want chips and a doughnut");  // ['cps','dnt']
intentClassifier.retrain();
intentClassifier.classifyAndLog("I want an apple and a banana");  // ['apl','bnn']
intentClassifier.classifyAndLog("I want chips and a doughnut");  // ['cps','dnt']

// Serialize the classifier (convert it to a string)
var intentClassifierString = serialize.toString(intentClassifier, newClassifierFunction);

// Save the string to a file, and send it to a remote server.

On the remote server, do the following:

// retrieve the string from a file and then:

var intentClassifierCopy = serialize.fromString(intentClassifierString, __dirname);

console.log("Deserialized classifier:");
intentClassifierCopy.classifyAndLog("I want an apple and a banana");  // ['apl','bnn']
intentClassifierCopy.classifyAndLog("I want chips and a doughnut");  // ['cps','dnt']
intentClassifierCopy.trainOnline("I want an elm tree", "elm");
intentClassifierCopy.classifyAndLog("I want doughnut and elm tree");  // ['dnt','elm']

CAUTION: Serialization was not tested for all possible combinations of classifiers and enhancements. Test well before use!
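
The reason for passing a function that creates a fresh classifier is that the string needs to carry only the learned state, while the code is rebuilt on the other side. The sketch below illustrates this division of labour with a toy classifier. All names in it are hypothetical, and it is an illustration of the idea under that assumption, not a description of how the serialization package works internally.

```javascript
// Hypothetical toy classifier that remembers the label of each exact input.
function MemoryClassifier() {
	this.memory = {};
}
MemoryClassifier.prototype.trainOnline = function(input, output) {
	this.memory[input] = output;
};
MemoryClassifier.prototype.classify = function(input) {
	return Object.prototype.hasOwnProperty.call(this.memory, input) ? this.memory[input] : null;
};
MemoryClassifier.prototype.toJSON = function() { return this.memory; };
MemoryClassifier.prototype.fromJSON = function(json) { this.memory = json; };

// Creates a fresh (untrained) classifier, playing the role of newClassifierFunction.
function newMemoryClassifier() {
	return new MemoryClassifier();
}

// Serialize: keep only the learned state, as a string.
function toyToString(classifier) {
	return JSON.stringify(classifier.toJSON());
}

// Deserialize: build a fresh classifier, then load the learned state into it.
function toyFromString(string, createNewClassifier) {
	var classifier = createNewClassifier();
	classifier.fromJSON(JSON.parse(string));
	return classifier;
}

var toyOriginal = newMemoryClassifier();
toyOriginal.trainOnline("I want an apple", "apl");

var toyString = toyToString(toyOriginal);
var toyCopy = toyFromString(toyString, newMemoryClassifier);
console.log(toyCopy.classify("I want an apple"));  // apl
```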

Cross-validation

// create a dataset with a lot of input-output pairs:
var dataset = [ ... ];

// Decide how many folds you want in your k-fold cross-validation:
var numOfFolds = 5;

// Define the type of classifier that you want to test:
var IntentClassifier = limdu.classifiers.EnhancedClassifier.bind(0, {
	classifierType: limdu.classifiers.multilabel.BinaryRelevance.bind(0, {
		binaryClassifierType: limdu.classifiers.Winnow.bind(0, {retrain_count: 10})
	}),
	featureExtractor: limdu.features.NGramsOfWords(1),
});

var microAverage = new limdu.utils.PrecisionRecall();
var macroAverage = new limdu.utils.PrecisionRecall();

limdu.utils.partitions.partitions(dataset, numOfFolds, function(trainSet, testSet) {
	console.log("Training on "+trainSet.length+" samples, testing on "+testSet.length+" samples");
	var classifier = new IntentClassifier();
	classifier.trainBatch(trainSet);
	limdu.utils.test(classifier, testSet, /* verbosity = */0,
		microAverage, macroAverage);
});

macroAverage.calculateMacroAverageStats(numOfFolds);
console.log("\n\nMACRO AVERAGE:"); console.dir(macroAverage.fullStats());

microAverage.calculateStats();
console.log("\n\nMICRO AVERAGE:"); console.dir(microAverage.fullStats());
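
If you want to see what the partitioning and the two kinds of average do, here is a stand-alone sketch. toyPartitions is a hypothetical helper that assigns samples to folds by index, which may differ from what limdu.utils.partitions does. The averaging part assumes the usual meaning of the terms: a macro average computes a statistic per fold and then averages, while a micro average pools the counts of all folds first. The counts are made-up numbers.

```javascript
// Calls callback(trainSet, testSet) once per fold. Fold number f uses the samples
// whose index i satisfies i % numOfFolds === f as the test set.
function toyPartitions(dataset, numOfFolds, callback) {
	for (var fold = 0; fold < numOfFolds; fold++) {
		var trainSet = [], testSet = [];
		dataset.forEach(function(sample, index) {
			(index % numOfFolds === fold ? testSet : trainSet).push(sample);
		});
		callback(trainSet, testSet);
	}
}

var toyDataset = [];
for (var i = 0; i < 10; i++)
	toyDataset.push({input: "sample " + i, output: "label"});

toyPartitions(toyDataset, 5, function(trainSet, testSet) {
	console.log("Training on "+trainSet.length+" samples, testing on "+testSet.length+" samples");
});
// prints "Training on 8 samples, testing on 2 samples" five times

// Made-up per-fold counts of true positives (TP) and false positives (FP).
var foldCounts = [{TP: 9, FP: 3}, {TP: 1, FP: 3}];

// Macro average: compute the precision of each fold, then average the results.
var macroPrecision = foldCounts
	.map(function(c) { return c.TP / (c.TP + c.FP); })
	.reduce(function(sum, p) { return sum + p; }, 0) / foldCounts.length;

// Micro average: pool the counts of all folds, then compute a single precision.
var totalTP = foldCounts.reduce(function(sum, c) { return sum + c.TP; }, 0);
var totalFP = foldCounts.reduce(function(sum, c) { return sum + c.FP; }, 0);
var microPrecision = totalTP / (totalTP + totalFP);

console.log(macroPrecision);  // 0.5
console.log(microPrecision);  // 0.625
```

The two averages differ whenever the folds differ in size or difficulty, which is why the example above reports both.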

Back-classification (aka Generation)

Use this option to get the list of all samples with a given class.

var intentClassifier = new limdu.classifiers.EnhancedClassifier({
	classifierType: limdu.classifiers.multilabel.BinaryRelevance.bind(0, {
		binaryClassifierType: limdu.classifiers.Winnow.bind(0, {retrain_count: 10})
	}),
	featureExtractor: limdu.features.NGramsOfWords(1),
	pastTrainingSamples: [],
});

// Train and test:
intentClassifier.trainBatch([
	{input: "I want an apple", output: "apl"},
	{input: "I want a banana", output: "bnn"},
	{input: "I really want an apple", output: "apl"},
	{input: "I want a banana very much", output: "bnn"},
	]);

console.dir(intentClassifier.backClassify("apl"));  // [ 'I want an apple', 'I really want an apple' ]
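
Conceptually, back-classification needs only the stored training samples, which is why the example above sets pastTrainingSamples. The following sketch shows the idea with a hypothetical ToyBackClassifier; limdu's own implementation may store and look up the samples differently.

```javascript
// Hypothetical class that only remembers its training samples.
function ToyBackClassifier() {
	this.pastTrainingSamples = [];
}

ToyBackClassifier.prototype.trainBatch = function(dataset) {
	this.pastTrainingSamples = this.pastTrainingSamples.concat(dataset);
};

// Returns the inputs of all past training samples that have the given label.
ToyBackClassifier.prototype.backClassify = function(label) {
	return this.pastTrainingSamples
		.filter(function(sample) { return sample.output === label; })
		.map(function(sample) { return sample.input; });
};

var toyBackClassifier = new ToyBackClassifier();
toyBackClassifier.trainBatch([
	{input: "I want an apple", output: "apl"},
	{input: "I want a banana", output: "bnn"},
	{input: "I really want an apple", output: "apl"},
	{input: "I want a banana very much", output: "bnn"},
	]);

console.dir(toyBackClassifier.backClassify("apl"));  // [ 'I want an apple', 'I really want an apple' ]
```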

SVM wrappers

The native svm.js implementation takes a lot of time to train - quadratic in the number of training samples. Two common packages, SVM-Perf and LibLinear, can be trained in time linear in the number of training samples.

The limdu.js package provides wrappers for these implementations. In order to use the wrappers, you must have the binary file used for training in your path, that is: svm_perf_learn for SvmPerf, or liblinear_train for SvmLinear.

Once you have any one of these installed, you can use the corresponding classifier instead of any binary classifier used in the previous demos, as long as you have a feature-lookup-table. For example, with SvmPerf:

var intentClassifier = new limdu.classifiers.EnhancedClassifier({
	classifierType: limdu.classifiers.multilabel.BinaryRelevance.bind(0, {
		binaryClassifierType: limdu.classifiers.SvmPerf.bind(0, 	{
			learn_args: "-c 20.0" 
		})
	}),
	featureExtractor: limdu.features.NGramsOfWords(1),
	featureLookupTable: new limdu.features.FeatureLookupTable()
});

and similarly with SvmLinear.

See the files classifiers/svm/SvmPerf.js and classifiers/svm/SvmLinear.js for documentation of the options.

Undocumented features

Some advanced features are working but not documented yet. If you need any of them, open an issue and I will try to document them.

  • Custom input normalization, based on regular expressions.
  • Input segmentation for multi-label classification - both manual (with regular expressions) and automatic.
  • Feature extraction for model adaptation.
  • Spell-checker features.
  • Hypernym features.
  • Classification based on a cross-lingual language model.
  • Format conversion - ARFF, JSON, svm-light, TSV.

License

LGPL

Contributions

Code contributions are welcome. Reasonable pull requests, with appropriate documentation and unit-tests, will be accepted.

Do you like limdu? Remember that you can star it :-)

Comments
  • SVM wrapper doesn't work

    Hi. I'm trying to use the external SVM wrappers you suggest, such as svm_perf_learn, but even after adding the executable to my PATH, I get this error: Cannot find the executable 'svm_perf_learn'. Please download it from the SvmPerf website, and put a link to it in your path. C:\Users\a\node_modules\limdu\classifiers\svm\SvmPerf.js:29 throw new Error(msg); This is quite strange, because the check in SvmPerf.js is always true. Thanks

    opened by corradodebari 10
  • limdu.utils.test does not exist.

    The cross-validation example uses this, and the utils/index.js has it commented out and trainAndTest.js has never been committed. Could you commit a copy of it (or post it here) so I can try to add it to the library?

    opened by gburtini 9
  • svmlinear fix

    The svmlinear doesn't work. Since I don't have permission to push diff, I just post here:

    diff --git a/classifiers/svm/SvmLinear.js b/classifiers/svm/SvmLinear.js
    index 4688a71..8d2d272 100644
    --- a/classifiers/svm/SvmLinear.js
    +++ b/classifiers/svm/SvmLinear.js
    @@ -16,46 +16,41 @@
      *  <li>multiclass - if true, the 'classify' function returns an array [label,score]. If false (default), it returns only a score.
      */
     
    +var util  = require('util')
    +  , child_process = require('child_process')
    +  , exec = require('child_process').exec
    +  , fs   = require('fs')
    +  , svmcommon = require('./svmcommon')
    +  , _ = require('underscore')._
    +
    +var FIRST_FEATURE_NUMBER=1;  // in lib linear, feature numbers start with 1
    +
     function SvmLinear(opts) {
    -	if (!SvmLinear.isInstalled()) {
    -		var msg = "Cannot find the executable 'liblinear_train'. Please download it from the LibLinear website, and put a link to it in your path.";
    -		console.error(msg)
    -		throw new Error(msg);
    -	}
     	this.learn_args = opts.learn_args || "";
     	this.model_file_prefix = opts.model_file_prefix || null;
     	this.bias = opts.bias || 1.0;
     	this.multiclass = opts.multiclass || false;
     	this.debug = opts.debug||false;
    -  	this.train_command = opts.train_command || 'liblinear_train'
    -  	this.test_command = opts.test_command || 'liblinear_test'
    -  	this.timestamp = ""
    +	this.train_command = opts.train_command || 'liblinear_train'
    +	this.test_command = opts.test_command || 'liblinear_test'
    +	this.timestamp = ""
     
     	if (!SvmLinear.isInstalled()) {
    -                var msg = "Cannot find the executable 'liblinear_train'. Please download it from the LibLinear website, and put a link to it in your path.";
    -                console.error(msg)
    -                throw new Error(msg);
    -        }
    +			var msg = "Cannot find the executable 'liblinear_train'. Please download it from the LibLinear website, and put a link to it in your path.";
    +			console.error(msg)
    +			throw new Error(msg);
    +	}
     }
     
     SvmLinear.isInstalled = function() {
     	try {
    -	    var result = execSync(this.train_command);
    +	    var result = child_process.execSync(this.train_command);
     	} catch (err) {
     	    return false
     	}
     	return true
     };
     
    -var util  = require('util')
    -  , child_process = require('child_process')
    -  , exec = require('child_process').exec
    -  , fs   = require('fs')
    -  , svmcommon = require('./svmcommon')
    -  , _ = require('underscore')._
    -
    -var FIRST_FEATURE_NUMBER=1;  // in lib linear, feature numbers start with 1
    -
     
     SvmLinear.prototype = {
     		trainOnline: function(features, expected) {
    
    opened by jc-fireball 8
  • This package is no longer usable because "brain" library has been removed

    Hey !

    First thank you for your awesome module :)

    I'm the founder of Gladys, an open-source home automation assistant written in Node.js.

    We are using limdu as a dependency, and since this week the installation of Gladys is broken because the dependency brain that is used in limdu has been deprecated and is no longer possible to install.

    I saw that there is a community fork called brain.js, I don't know if it supports all features used in limdu, but if yes maybe limdu could switch from the deprecated "brain" to "brain.js".

    If you think that's a good idea, I can submit a PR to help you make this module work again :)

    opened by Pierre-Gilles 7
  • Serialization & Deserialization

    I'm having issues with serialization and deserialization when combining the limdu and serialization npm packages.

    Normally, we would serialize into a .json file. When using serialize.toString(), I don't know where the serialized .json file is.

    Q1. Maybe limdu/serialization does not generate the .json file?

    Can anyone enlighten me? Thanks.

    opened by kafechew 7
  • How do I know the progress of Training?

    First of all, great work with this library. It works for small data sets, but I wanted to do the same for large datasets. It looks like it is stuck in trainBatch, and I am not sure how much of the training has been completed so far.

    How do I know how many records have been trained so far, or get some feedback that it's processing?

    opened by tvvignesh 6
  • Languages supported?

    Which languages are supported for limdu classification? I assume all languages that use the a-z alphabet.

    What about Hebrew, Chinese, Hindi, Korean... which do not use the a-z alphabet?

    Thanks :-)

    opened by kafechew 5
  • Visualize Correlations between data?

    Hi I'm using the Online Learning module. I want to see somehow how data is connected, I'm happy to make charts/diagrams in D3 etc, but just working out what kind of output I can get,

    Any advice would be great, Thankyou. Vince.

    opened by vince-lynch 3
  • Security Notice & Bug Bounty - Command Injection - huntr.dev

    Overview

    limdu is a machine learning framework for Node.js. Supports multi-level classification and online learning.

    This package is vulnerable to Command Injection through the trainBatch function.

    Bug Bounty

    We have opened up a bounty for this issue on our bug bounty platform. Want to solve this vulnerability and get rewarded 💰? Go to https://huntr.dev/

    We will submit a pull request directly to your repository with the fix as soon as possible. Want to learn more? Go to https://github.com/418sec/huntr 📚

    Automatically generated by @huntr-helper...

    opened by huntr-helper 2
  • work in browser?

    can limdu work in a browser? i get this error inside webpack

    ERROR Failed to compile with 10 errors 14:49:43

    These dependencies were not found:

    • fs in C:/Users/franc/Documents/pror/~/limdu/classifiers/kNN/kNN.js, C:/Users/franc/Documents/pror/~/limdu/classifiers/svm/SvmPerf.js and 5 others
    • child_process in C:/Users/franc/Documents/pror/~/limdu/classifiers/svm/SvmPerf.js, C:/Users/franc/Documents/pror/~/limdu/classifiers/svm/SvmLinear.js and 1 other

    To install them, you can run: npm install --save fs child_process

    opened by francescoagati 2
  • online training behaviour for missclassification

    I'm making a simple search where the user can evaluate each result as useful or not. Each time a result is evaluated, I call "trainOnline" on the server. If the user changes their mind, it is possible to change the evaluation.

    The problem is that sometimes the classify function returns an empty value. Why does this happen? It looks like the online classifier is confused (just guessing).

    Like:

    birdClassifier.trainOnline({'wings': 1, 'flight': 1, 'beak': 1, 'eagle': 1}, 1); 
    birdClassifier.trainOnline({'wings': 0, 'flight': 0, 'beak': 0, 'dog': 1}, 0); 
    

    Then:

    birdClassifier.trainOnline({'wings': 1, 'flight': 1, 'beak': 1, 'eagle': 1},0); 
    birdClassifier.trainOnline({'wings': 0, 'flight': 0, 'beak': 0, 'dog': 1}, 1);  
    

    It starts to get confused, obviously, and sometimes "classify" returns nothing. I want to know why.

    opened by calebebrim 2
  • Feature suggestions

    • Randomising the partitions (e.g.: like how train-test-split does it for training/test sets or tvt-split for training/validation/test sets)?
    • SVG/Canvas representation of the model (or any other way to visualise the model being used)
    opened by Berkmann18 3
  • Confusion in README

    I've noticed that the cross-validation example uses macroAverage:

    var macroAverage = new limdu.utils.PrecisionRecall();
    
    limdu.utils.partitions.partitions(dataset, numOfFolds, function(trainSet, testSet) {
    	console.log("Training on "+trainSet.length+" samples, testing on "+testSet.length+" samples");
    	var classifier = new IntentClassifier();
    	classifier.trainBatch(trainSet);
    	limdu.utils.test(classifier, testSet, /* verbosity = */0,
    		microAverage, macroAverage);
    });
    
    macroAverage.calculateMacroAverageStats(numOfFolds);
    console.log("\n\nMACRO AVERAGE:"); console.dir(macroAverage.fullStats());
    

    But utils.testAndTrain's test function uses macroSum which is confusing. Is it meant to be macroSum in the README or is the function not using the right term?

    Also, not related to this but would it be a good idea to add (an optional) randomization to the partitions (e.g.: like how train-test-split does it)?

    opened by Berkmann18 0
  • Incorrect Accuracy

    After doing some checks on my limdu wrapper, I saw that the Accuracy fields don't return what one would expect given the values of TP, TN and count (for both microAverage and macroAverage). Why is the TRUE field used for calculating Accuracy when it's not equal to TP? This means it's not using the standard (TP + TN) / count formula, which applies to both 2-class and multi-class classification.

    opened by Berkmann18 0
  • Missed fields on cross-validation

    After following the cross-validation example in a project, I've noticed that both microAverage and macroAverage have some fields left empty or at whichever value was the default in the PrecisionRecall constructor.

    Here's an example (taken from ac-learn)

    const Learner = require('./ac-learn')
    const learner = new Learner(); //creates an object where the classifier is the `intentClassifier` from the examples
    learner.crossValidate(/*folds=*/ 5);
    

    Which outputs this (seemingly incomplete) object:

    {
      macroAvg: {
        count: 79,
        TP: 69,
        TN: 0,
        FP: 3.6,
        FN: 10,
        TRUE: 68.2,
        startTime: 2019-05-02T11:17:15.451Z,
        dep: {},
        confusion: {},
        macroPrecision: 0,
        macroRecall: 0,
        macroF1: 0,
        Accuracy: 0.8632911392405064,
        HammingLoss: 0.17215189873417722,
        HammingGain: 0.8278481012658228,
        Precision: 0.9235869565217393,
        Recall: 0.8734177215189874,
        F1: 0.89154213836478,
        endTime: 'Thu May 02 2019 11:17:15 GMT+0000 (GMT)', //Shouldn't this be in the EPOCH time format like the rest?
        timeMillis: 8.6,
        timePerSampleMillis: 0.1088607594936709,
        shortStatsString: 'Accuracy=86% HammingGain=83% Precision=92% Recall=87% F1=89% timePerSample=0[ms]'
      },
      microAvg: {
        count: 395,
        TP: 345,
        TN: 0,
        FP: 18,
        FN: 50,
        TRUE: 341,
        startTime: 2019-05-02T11:17:15.451Z,
        dep: {},
        confusion: {},
        macroPrecision: 0,
        macroRecall: 0,
        macroF1: 0,
        Accuracy: 0.8632911392405064,
        HammingLoss: 0.17215189873417722,
        HammingGain: 0.8278481012658228,
        Precision: 0.9504132231404959,
        Recall: 0.8734177215189873,
        F1: 0.9102902374670185,
        endTime: 2019-05-02T11:17:16.325Z,
        timeMillis: 874,
        timePerSampleMillis: 2.212658227848101,
        shortStatsString: 'Accuracy=341/395=86% HammingGain=1-68/395=83% Precision=95% Recall=87% F1=91% timePerSample=2[ms]'
      }
    }
    

    I had a glance at the code and it seems that labels, dep, confusion, macroPrecision, macroRecall and macroF1 should be filled instead of being {} or 0 so I was wondering if it was a bug?

    opened by Berkmann18 0
  • Dataset mutation

    After running unit tests on a project that uses limdu, I noticed that once the classifier's train method is called, the dataset is mutated. What I mean is that I have a class Learner that has a dataset field and a classifier field (which is similar to intentClassifier in the examples). dataset has the [ { input: 'string', output: 'category string' }, ...] structure. After train() is called on, say, Learner.classifier, the outputs in the dataset (in both the training and testing sets) become arrays containing the strings.

    I'm not sure if it's intended or if the format (post-mutation) is what should be used instead of what's in the docs.

    Ref: https://github.com/all-contributors/ac-learn/tree/limdu

    opened by Berkmann18 0
  • Big multi label classifier on db

    Hi. Can a multi-label classifier trained on big data be written to an SQL DB and streamed during the classification phase, or must the classifier always work in memory? For a big classifier, is it best to split the data among many classifiers, serialize them, and call them sequentially or in parallel?

    opened by francescoagati 5
Owner
Erel Segal-Halevi
Lecturer at Ariel University