A speech recognition library running in the browser thanks to a WebAssembly build of Vosk

Overview

Vosk-Browser

A somewhat opinionated speech recognition library for the browser using a WebAssembly build of Vosk

This library picks up the work done by Denis Treskunov and packages an updated Vosk WebAssembly build as an easy-to-use browser library.

Note: WebAssembly builds can target NodeJS, the browser's main thread, or web workers. This library explicitly compiles Vosk to be used in a WebWorker context. If you want to use Vosk in a NodeJS application, it is recommended to use the official node bindings.

Live Demo

Check out the demo running in-browser speech recognition of microphone input or audio files in 13 languages.

Installation

You can install vosk-browser as a module:

$ npm i vosk-browser

You can also use a CDN like jsDelivr to add the library to your page; it will then be accessible via the global variable Vosk:
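For example, a minimal script tag might look like the following (a sketch only: the dist/vosk.js path matches the package layout seen in the stack traces further down this page, but check jsDelivr for the exact bundle URL and pin a version in production):

<!-- Sketch: verify the exact file path and version on jsDelivr before use -->
<script src="https://cdn.jsdelivr.net/npm/vosk-browser/dist/vosk.js"></script>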


Usage

See the README in ./lib for API reference documentation, or check out the examples folder for some ways of using the library.

Basic example

A minimal example that assumes vosk-browser is loaded via a script tag. It loads the model named model.tar.gz located in the same path as the script and starts listening to the microphone. Recognition results are logged to the console.

async function init() {
    // Load the model archive and create a recognizer that emits recognition events.
    const model = await Vosk.createModel('model.tar.gz');

    const recognizer = new model.KaldiRecognizer();
    recognizer.on("result", (message) => {
        console.log(`Result: ${message.result.text}`);
    });
    recognizer.on("partialresult", (message) => {
        console.log(`Partial result: ${message.result.partial}`);
    });

    // Request a mono microphone stream.
    const mediaStream = await navigator.mediaDevices.getUserMedia({
        video: false,
        audio: {
            echoCancellation: true,
            noiseSuppression: true,
            channelCount: 1,
            sampleRate: 16000
        },
    });

    // Pipe microphone audio into the recognizer through a ScriptProcessorNode.
    const audioContext = new AudioContext();
    const recognizerNode = audioContext.createScriptProcessor(4096, 1, 1);
    recognizerNode.onaudioprocess = (event) => {
        try {
            recognizer.acceptWaveform(event.inputBuffer);
        } catch (error) {
            console.error('acceptWaveform failed', error);
        }
    };
    const source = audioContext.createMediaStreamSource(mediaStream);
    source.connect(recognizerNode);
}

window.onload = init;
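Note: browsers often ignore the sampleRate constraint requested above and run the AudioContext at the hardware rate (typically 44100 or 48000 Hz). If recognition quality is poor, one variation worth trying (a sketch, not part of the original example) is to pass the context's actual sample rate to the recognizer, mirroring the new model.KaldiRecognizer(48000) usage that appears in the issues further down this page:

// Sketch only: assumes the KaldiRecognizer constructor accepts a sample rate,
// as in the `new model.KaldiRecognizer(48000)` call shown in the issues below.
const audioContext = new AudioContext();
const recognizer = new model.KaldiRecognizer(audioContext.sampleRate);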

Todos

  • Support for word/phrase lists in KaldiRecognizer
  • Add example with word/phrase list
  • Write tests
  • Automate npm publish
  • Update to OpenFST 1.8.0
Comments
  • Unable to load model

    Hi,

    Thanks for this work. I am using Chrome. The model file model.tar.gz is placed in the same folder. It never moves past the "Loading..." message!

    how do you start the demo locally?

    Navigated to modern-vanilla directory and launched python3 -m http.server

    can you share the output of the browser console?

    ERROR (VoskAPI:Model():src/model.cc:122) Folder '/vosk/model_tar_gz' does not contain model files. Make sure you specified the model path properly in Model constructor. If you are not sure about relative path, use absolute path specification. put_char @ 82049aad-16de-4cf3-9fcf-0c277f01fe02:41

    opened by raghavendrajain 8
  • Build broken by kaldi repo

    The kaldi repo no longer has an upstream-1.8.0 branch nor a revision 75ecaef39 (thanks, git, for allowing history to be erased). Right now, vosk-browser doesn't build because of these issues.

    opened by Yahweasel 8
  • Online demo created

    Not sure if this is of any use, but I created a small online demo using this tool when I was experimenting with it. You can view it online at

    https://captioner.richardson.co.nz/

    And the source code for it is at: https://github.com/Rodeoclash/captioner

    It might be possible to adapt this for an official demo if you're interested (although it is lacking a few things at the moment, i.e. it only works on video and currently the videos have no audio when playing).

    opened by Rodeoclash 7
  • Recognizer.removeEventListener

    I am currently using Vue.js to run vosk-browser and managed to call the ASR model and Kaldi recognizer by using

    this.recognizer.on("result", (message) => {
        const result = message.result;
        this.full.textContent += result.text + " "
    })
    

    The model is working well, however, I am trying to remove the event listener by using:

    this.recognizer.removeEventListener("result", (message) => {
        const result = message.result;
        this.full.textContent += result.text + " "
    })
    

    Is this the way of doing it?
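    For reference (not from the thread): removeEventListener only removes a listener when it is given the exact same function reference that was registered, so passing a new inline arrow function has no effect. A minimal sketch, assuming the recognizer exposes the addEventListener/removeEventListener pair used elsewhere on this page:

    // Keep a named reference so the identical function can be detached later.
    const onResult = (message) => {
        this.full.textContent += message.result.text + " ";
    };
    this.recognizer.addEventListener("result", onResult);

    // ...later, when the listener is no longer needed:
    this.recognizer.removeEventListener("result", onResult);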

    opened by stevenlimcorn 7
  • Vosk model

    I am new to JavaScript. I wanted to see how the vosk-browser script works using the sample script. I downloaded a Vosk model, zipped it as tar.gz and put it in the same folder as the script. I tried to check for errors using a button onclick event on an HTML page. I got this in Visual Studio Code:

    Setting up persistent storage at /vosk null/4ccd8af6-9ac1-407c-9f6a-436d83146d69:147
    File system synced from host to runtime null/4ccd8af6-9ac1-407c-9f6a-436d83146d69:40

    Am I supposed to create a folder named "vosk"? I really do not understand. Thank you for responding.

    opened by temitopefunmi 6
  • Webpage is not loading

    I have little coding experience, but I followed all the guidelines to launch the demo app from the examples/react folder. I ran npm install, npm build and a few other commands to resolve errors for webpack 5. However, when I finally ran npm run start, vosk-browser failed to launch even though no errors were reported. The page is empty.

    C:\Users\CNata\Downloads\vosk-browser-master\examples\react>npm run start
    
    > [email protected] start
    > react-scripts start
    
    (node:17120) [DEP_WEBPACK_DEV_SERVER_ON_AFTER_SETUP_MIDDLEWARE] DeprecationWarning: 'onAfterSetupMiddleware' option is deprecated. Please use the 'setupMiddlewares' option.
    (Use `node --trace-deprecation ...` to show where the warning was created)
    (node:17120) [DEP_WEBPACK_DEV_SERVER_ON_BEFORE_SETUP_MIDDLEWARE] DeprecationWarning: 'onBeforeSetupMiddleware' option is deprecated. Please use the 'setupMiddlewares' option.
    Starting the development server...
    Compiled successfully!
    
    You can now view vosk-browser-react-demo in the browser.
    
      Local:            http://localhost:3000/vosk-browser
      On Your Network:  http://192.168.56.1:3000/vosk-browser
    
    Note that the development build is not optimized.
    To create a production build, use npm run build.
    
    webpack compiled successfully
    Files successfully emitted, waiting for typecheck results...
    Issues checking in progress...
    No issues found.
    
    opened by Nata0801 5
  • Result event not triggered on file upload

    Hello, I am working on a way to pass audio file to the recognizer all at once.

    I took the React example and edited file-upload.tsx to send the whole file as a buffer to the AudioStreamer "_write" method. The problem is that the "result" event of the recognizer is not fired after processing. The "partialresult" event is called with every word but misses timestamps.

    Here is the implementation of the "onChange" function in file-upload.tsx:

    const onChange = useCallback(
        async ({ file }: UploadChangeParam<UploadFile<any>>) => {
    
          if (
            recognizer &&
            file.originFileObj &&
            file.percent === 100
          ) {
            const fileUrl = URL.createObjectURL(file.originFileObj);
            const _audioContext = audioContext ?? new AudioContext();
            const arr = await fetch(fileUrl).then((res) => res.arrayBuffer());
    
            _audioContext.decodeAudioData(arr, (buffer) => {
              let audioStreamer = new AudioStreamer(recognizer);
              audioStreamer._write(buffer, {
                objectMode: true,
              }, () => {
                console.log('done')
              });
            });
          }
        },
        [audioContext, recognizer]
      );
    

    I have also noticed that when uploading a second file it works well; the result event is triggered and includes data from both files.

    What am I missing? Is there a way to dispatch a "result" event?

    opened by Clement-mim 4
  • Recognizer listens before the event 'result' or 'partialresult' is added

    Hello! If I say "Hello" and then run the code below, I get the result "Hello".

    this.recognizer.addEventListener('partialresult', this.getPartialResult);
    this.recognizer.addEventListener('result', this.getResult);
    

    Expected: the recognizer should start listening only once the event listener is added.

    I am creating a feature in which users press and speak.

    I thought I would disable the audio track when users don't need the microphone, like this, but then the recognizer just pauses.

    this.mediaStream.getAudioTracks().forEach(track => {
        track.enabled = false;
    });
    

    So if the user presses my button again after a long time, the code will run track.enabled = true and the recognizer will continue recognizing the previous (not the current) audio.
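    One alternative worth considering (not from the thread): instead of toggling track.enabled, disconnect the media-stream source from the recognizer node while the button is released, so the recognizer receives no audio at all until it is reconnected. A minimal sketch, assuming a source and recognizerNode wired up as in the README example; pauseRecognition and resumeRecognition are hypothetical helper names:

    // Hypothetical helpers built on the Web Audio graph rather than track.enabled.
    function pauseRecognition() {
        source.disconnect(recognizerNode); // stop feeding audio to the recognizer
    }
    function resumeRecognition() {
        source.connect(recognizerNode); // resume feeding audio
    }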

    Tested on Vue.js

    opened by timpuyda 4
  • Failed to sync file system: Error: FS error

    I am getting the following error in both Chrome and Firefox...

    Failed to sync file system: Error: FS error
    (anonymous) @ fcedf841-34f4-40cb-8bb0-17f857a1d44c:127
    Promise.catch (async)
    handleMessage @ fcedf841-34f4-40cb-8bb0-17f857a1d44c:126
    (anonymous) @ fcedf841-34f4-40cb-8bb0-17f857a1d44c:107
    

    fcedf841-34f4-40cb-8bb0-17f857a1d44c:127 links to the following code:

        class RecognizerWorker {
            constructor() {
                this.recognizers = new Map();
                ctx.addEventListener("message", (event) => this.handleMessage(event));
            }
            handleMessage(event) {
                const message = event.data;
                if (!message) {
                    return;
                }
                if (ClientMessage.isLoadMessage(message)) {
                    console.debug(JSON.stringify(message));
                    const { modelUrl } = message;
                    if (!modelUrl) {
                        ctx.postMessage({
                            error: "Missing modelUrl parameter",
                        });
                    }
                    this.load(modelUrl)
                        .then((result) => {
                        ctx.postMessage({ event: "load", result });
                    })
                        .catch((error) => {                                                       // --- IT'S THIS ERROR THAT IS CATCHING  
                        console.error(error);
                        ctx.postMessage({ error: error.message });
                    });
                    return;
    
    ... etc
    

    Do let me know if more details to reproduce the error are needed.

    Thank you!!

    opened by mattmegarry 4
  • Unable to load model in nodejs

    When I run the following code:

    let Vosk = require("vosk-browser");
    let url = "model.tar.gz";
    async function init() {
      const model = await Vosk.createModel(url);
    }
    init();
    

    I get this error:

    this.worker.addEventListener("message", (event) => this.handleMessage(event));
                       ^
    
    TypeError: this.worker.addEventListener is not a function
        at EventTarget.initialize (/Users/bobby/Desktop/vosk-browser/node_modules/vosk-browser/dist/vosk.js:238:25)
        at new Model (/Users/bobby/Desktop/vosk-browser/node_modules/vosk-browser/dist/vosk.js:235:18)
        at Object.<anonymous> (/Users/bobby/Desktop/vosk-browser/node_modules/vosk-browser/dist/vosk.js:354:27)
        at Generator.next (<anonymous>)
        at /Users/bobby/Desktop/vosk-browser/node_modules/vosk-browser/dist/vosk.js:28:75
        at new Promise (<anonymous>)
        at __awaiter (/Users/bobby/Desktop/vosk-browser/node_modules/vosk-browser/dist/vosk.js:24:16)
        at Object.createModel (/Users/bobby/Desktop/vosk-browser/node_modules/vosk-browser/dist/vosk.js:353:16)
        at init (/Users/bobby/Desktop/vosk-browser/index.js:5:28)
        at Object.<anonymous> (/Users/bobby/Desktop/vosk-browser/index.js:7:1)
    

    My folder structure is

    |
    |-- index.js
    |-- model.tar.gz
    |-- node_modules/
    

    So I would think the program could load the model, but I also get the same error when I set the url to be complete gibberish.

    Thanks for your help

    opened by LittleRobertTables 4
  • Attribution difficult

    The NOTICES file doesn't include all dependent software, but every piece of dependent software requires attribution. This makes it extremely difficult for anyone to put together a correct (and legally mandatory) attribution and license notice. I put this one together, which I believe includes all dependencies: https://raw.githubusercontent.com/Yahweasel/ennuicastr/master/src/vosk-browser-license.js .

    Moreover, I was surprised to find GSL in the mix. GSL is under the GPL (not the LGPL), so if it's being used, then vosk-browser as a whole is licensed under the GPL. That's no problem for my use, but it should be documented somewhere. Weirdly, though, as far as I can tell it's not actually using GSL. The kaldi patch seems to add GSL to the configure script, but doesn't add any uses of GSL. If it was some experiment (perhaps from the original porter of vosk?) it should just be removed, to fix this licensing snafu.

    opened by Yahweasel 4
  • information available in the User Agent string will be reduced

    A page or script is accessing at least one of navigator.userAgent, navigator.appVersion, and navigator.platform. Starting in Chrome 101, the amount of information available in the User Agent string will be reduced. To fix this issue, replace the usage of navigator.userAgent, navigator.appVersion, and navigator.platform with feature detection, progressive enhancement, or migrate to navigator.userAgentData. Note that for performance reasons, only the first access to one of the properties is shown

    opened by praksun 0
  • Delays when transcribing streaming audio

    First of all, excellent work. Vosk is great as it is, and this library makes it even better.

    I am experiencing a heavy delay in transcription (both partial and full results) when pulling in a stream from WebRTC.

    I suspect maybe it is because of the deprecated "createScriptProcessor" and "onaudioprocess" pieces, but I am unsure.

    Here is how I am processing things. If you have any ideas as to why things would be delayed, please let me know. Thank you.

    this.recognizeSpeech = async () => {
        console.log("starting recognizeSpeech");
        let audioContext = this.remoteAudioContext;
        let remoteStream = this.incomingAudioStream;
        //
        const recognizerNode = audioContext.createScriptProcessor(4096, 1, 1);
        const model = await createModel("./softphone/model.tar.gz");
        const recognizer = new model.KaldiRecognizer(48000);
        recognizer.setWords(true);
        recognizer.on("partialresult", function (message) {
          console.log("PARTIAL: " + message.result.partial);
        });
        recognizerNode.onaudioprocess = async (event) => {
          try {
            recognizer.acceptWaveform(event.inputBuffer);
          } catch (error) {
            console.error("acceptWaveform failed", error);
          }
        };
        this.remoteTrack.connect(recognizerNode).connect(audioContext.destination);
      };
    
    opened by scott-vector 3
  • Build output location

    I am able to get the build to complete (when using the modification made in #56), but I cannot find the output files. I run the build by running make in the vosk-browser directory. Where does the build output its files? Do the output files need to be manually extracted from the Docker container?

    opened by stevennyman 2
  • How to create an example of the X-vector of the speaker (voice fingerprint)?

    Hello. First of all, a very big thank you for this project.

    I am trying to create an example with a speaker model to get the X-vector of the speaker (voice fingerprint).

    I am using this example: https://github.com/ccoreilly/vosk-browser/blob/master/examples/words-vanilla/index.js

    const model = await Vosk.createModel('vosk-model-small-en-in-0.4.tar.gz');
    const speakerModel = await Vosk.createSpeakerModel('vosk-model-spk-0.4.zip');
    
    ...
    
    const recognizer = new model.KaldiRecognizer(sampleRate, JSON.stringify(['[unk]', 'encen el llum', 'apaga el llum']));
    recognizer.setSpkModel(speakerModel);
    recognizer.on("result", (message) => {
    	const result = message.result;
    	if(result.hasOwnProperty('spk'))
    		console.info("X-vector:", result.spk);
    });
    

    Speaker identification model: https://alphacephei.com/vosk/models/vosk-model-spk-0.4.zip

    Node.js example: https://github.com/alphacep/vosk-api/blob/master/nodejs/demo/test_speaker.js

    Could you offer some advice, please:

    1. How to load vosk-model-spk-0.4.zip
    2. How to implement methods createSpeakerModel and setSpkModel
    3. How to fetch the X-vector of the speaker (voice fingerprint)?

    Thank you for your answer.

    opened by arbdevml 5
  • AudioWorklet support via SEPIA Web Audio?

    Hi everybody,

    I just saw this project and thought it was very interesting and fits quite well with a library I've just released :slightly_smiling_face:. For my SEPIA Open Assistant project I've built the SEPIA Web Audio Library, which can handle custom audio pipelines with AudioWorklet and Web Worker support. There is pretty good WASM support as well, since the resampler, for example, can use Speex via a WASM module.

    The library has a module that interfaces with Vosk via the SEPIA STT-Server (a WebSocket streaming STT server). Currently I prefer to host Vosk on a Raspberry Pi 4 instead of running it on the client, but I'm pretty sure much of the code could be reused :smiley: .

    Let me know if this sounds interesting to you and I can help to get started!

    opened by fquirin 6