Browser-compatible JS library for running language models

Overview

Huggingface Transformers Running in the Browser

This library enables you to run Hugging Face transformer models directly in the browser. It does so by executing the models with the ONNX Runtime JavaScript API and by implementing its own JavaScript-only tokenization library.

At the moment it is compatible with Google's T5 models, but it was designed to be extended. I hope to support GPT-2, RoBERTa, and InCoder in the future.

Live Demo

https://transformers-js.praeclarum.org

This demo is a static website hosted on Azure Static Web Apps. No code is executed on the server. Instead, the neural network is downloaded and executed in the browser.

See the demo rule in the Makefile for how the demo is built.

Usage

This example shows how to use the library to load the T5 neural network to translate from English to French.

// Load the tokenizer and model.
const tokenizer = await AutoTokenizer.fromPretrained("t5-small", "/models");
const model = await AutoModelForSeq2SeqLM.fromPretrained("t5-small", "/models");

// Translate "Hello, world!"
const english = "Hello, world!";
const inputTokenIds = tokenizer.encode("translate English to French: " + english);
const outputTokenIds = await model.generate(inputTokenIds, { maxLength: 50, topK: 10 });
const french = tokenizer.decode(outputTokenIds, true);
console.log(french); // "Bonjour monde!"

To run this example, you need to have converted the model to ONNX format using the Model Converter (described below):

python3 tools/convert_model.py t5-small models
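
The converter writes its output into the models directory; serve that directory with the rest of your site so that the /models path used in the example above resolves.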

Library

The library contains several components:

  1. Tokenizers to load and execute pretrained tokenizers from their Hugging Face JSON representation.
  2. Transformers to load and execute pretrained models from their ONNX representation.
  3. Model Converter to convert Hugging Face models to ONNX so they can be served by your static web server.

Tokenizers

tokenizers.js
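
The tokenizer can also be used on its own. A minimal sketch of an encode/decode round trip, reusing the calls shown in the Usage example above:

const tokenizer = await AutoTokenizer.fromPretrained("t5-small", "/models");
const ids = tokenizer.encode("Hello, world!"); // token ids for the input text
const text = tokenizer.decode(ids, true);      // back to a string
console.log(ids, text);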

Transformers

transformers.js

Models

Currently only the T5 network is supported.

Sampling

The neural network outputs the logarithm of the probability of each token in the vocabulary. To produce an output token, a sample has to be drawn from that distribution. The following algorithms are implemented:

  • Greedy: Take the token with the highest probability.
  • Top-k: Sample from the k tokens with the highest probability.
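
As an illustration only (not the library's internal code), here is a sketch of how greedy and top-k selection can be implemented over an array of logits; the helper names are hypothetical:

// Greedy: index of the largest logit.
function sampleGreedy(logits) {
  let best = 0;
  for (let i = 1; i < logits.length; i++) {
    if (logits[i] > logits[best]) best = i;
  }
  return best;
}

// Top-k: keep the k largest logits, softmax them, and sample from the result.
function sampleTopK(logits, k) {
  const indexed = logits.map((logit, index) => ({ logit, index }));
  indexed.sort((a, b) => b.logit - a.logit);
  const top = indexed.slice(0, k);
  const maxLogit = top[0].logit;
  const weights = top.map(t => Math.exp(t.logit - maxLogit)); // numerically stable softmax
  const total = weights.reduce((sum, w) => sum + w, 0);
  let r = Math.random() * total;
  for (let i = 0; i < top.length; i++) {
    r -= weights[i];
    if (r <= 0) return top[i].index;
  }
  return top[top.length - 1].index;
}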

Model Converter

The ONNX Runtime for the Web is used to run models in the browser, so models first have to be converted from their Hugging Face format to ONNX.

You can run the conversion from the command line:

python3 tools/convert_model.py <modelid> <outputdir> <quantize> <testinput>
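
Here modelid is the Hugging Face model id (for example, t5-small), outputdir is where the ONNX files are written, quantize (true or false) controls whether the exported model is quantized, and testinput is a prompt used to test the converted model.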

For example:

python3 tools/convert_model.py praeclarum/cuneiform ./models true "Translate Akkadian to English: lugal"

Or you can run it from Python:

from convert_model import t5_to_onnx

onnx_model = t5_to_onnx("t5-small", output_dir="./models", quantized=True)

Developer Note: The model conversion script is a thin wrapper over the amazing fastT5 library by @Ki6an. The wrapper exists because I hope to support more model types in the future.
