A DFA-based lexical analyzer written in JavaScript with support for multi-language extension

Overview

lexer

lexer is an open-source, DFA-based lexical analyzer written in JavaScript that supports multi-language extension. For a quick introduction and a hands-on trial, see the English document and the online website.



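1. Background

Most common lexical analyzers are tightly coupled to a particular language and have large codebases, which makes it hard to focus on how a lexical analyzer fundamentally works. That prompted the idea of writing a compact lexical analyzer that is fully decoupled from any language, so that attention can stay on the working principles of lexical analysis rather than on the minor differences between languages. The result is this lexer project.

lexer decouples the lexical analyzer from the language through the following two JS files:

  • lexer.js is the core of the lexical analyzer. It consists mainly of the ISR (input stream reader) and the DFA (deterministic finite automaton), and its code is kept under 300 lines.
  • lang/{lang}-define.js is the extension point of the lexical analyzer. It lets different languages plug in, for example lang/c-define.js.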
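To make the split between a language-agnostic core and a per-language definition concrete, here is a minimal sketch of the idea. It is not the code of lexer.js: the names `tinyLang`, `createReader`, and `tokenize`, the shape of the definition object, and the token fields `type` and `value` are all hypothetical and chosen only for illustration.

```javascript
// Hypothetical language definition: everything language-specific lives here.
const tinyLang = {
  // Map one character to an input class understood by the DFA.
  classify(ch) {
    if (ch === undefined) return "eof";
    if (/[0-9]/.test(ch)) return "digit";
    if (/[A-Za-z_]/.test(ch)) return "letter";
    if (/\s/.test(ch)) return "space";
    return "other";
  },
  // transitions[state][inputClass] -> next state; a missing entry ends the token.
  transitions: {
    start: { digit: "number", letter: "ident", space: "space", other: "symbol" },
    number: { digit: "number" },
    ident: { letter: "ident", digit: "ident" },
    space: { space: "space" },
    symbol: {},
  },
  // Token type emitted when the DFA stops in a state (null means "skip").
  accept: { number: "Number", ident: "Identifier", symbol: "Symbol", space: null },
};

// Input stream reader: hands out characters one at a time.
function createReader(text) {
  let pos = 0;
  return {
    peek: () => text[pos],
    next: () => text[pos++],
    done: () => pos >= text.length,
  };
}

// Language-agnostic DFA driver: it only follows the tables in `lang`.
function tokenize(text, lang) {
  const reader = createReader(text);
  const tokens = [];
  while (!reader.done()) {
    let state = "start";
    let value = "";
    for (;;) {
      const next = lang.transitions[state][lang.classify(reader.peek())];
      if (next === undefined) break; // no transition: the current token ends
      value += reader.next();
      state = next;
    }
    const type = lang.accept[state];
    if (type) tokens.push({ type, value });
  }
  return tokens;
}

// Token types for this input: Identifier, Identifier, Symbol, Number, Symbol
const tokens = tokenize("int a = 10;", tinyLang);
```

In this sketch, supporting another language means supplying a different definition object; the reader and the driver stay unchanged.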

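2. Features

(1) Complete lexical analysis

From reading the input character sequence to producing tokens once analysis finishes, lexer provides complete lexical analysis. For example, the built-in C version of lexer supports 11 token types.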


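(2) Multi-language extension

lexer can be extended to other languages such as Python and Go, so that it can perform lexical analysis for them. See the Contributing section for how to add an extension. Lexical analysis is currently supported for the following languages:

  • C: a fairly low-level programming language; click to view its lexical analysis
  • SQL: a database query language; click to view its lexical analysis
  • Goal: the Goal Parser problem from LeetCode; click to view its lexical analysis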

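(3) State transition recording

The core mechanism of a lexical analyzer is DFA-based state transitions. lexer therefore records detailed state transition information to support the following needs of its users:

  • Debug mode
  • Automatic generation of DFA state transition diagrams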

3. Getting the project

After fetching the project with git clone, there are no dependencies to install and no additional setup steps.

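4. Usage

(1) Using lexer in code

To use lexer in your own code (for example, in a web-based code editor), include the following files in order:

  • /lang/{lang}-define.js
  • lexer.js

Then access the lexer variable directly to get the lexical analyzer object. The tokens are available at lexer.DFA.result.tokens.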

// 1. The code to analyze
let stream = "int a = 10;";

// 2. Start lexical analysis
lexer.start(stream);

// 3. After analysis finishes, get the generated tokens
let parsedTokens = lexer.DFA.result.tokens;

// 4. Do whatever you need with them
parsedTokens.forEach((token) => {
    // ... ...
});

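The state transition recording described under Features is available through flowModel.result.paths, which holds the details of every state transition made by lexer's internal state machine. The data format is as follows: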

[
    {
        state: 0, // current state
        ch: "a", // character just read
        nextSstate: 2, // next state
        match: true, // whether the character matched
        end: false, // whether this is the last character
    },
    // ... ...
]
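As one way to use these records for the state transition diagram mentioned under Features, the sketch below converts an array in the format above into Graphviz DOT text. The function name `pathsToDot` and the sample data are hypothetical, and the sketch simply draws every record as an edge; it is not how lexer itself generates diagrams. The field names, including the spelling `nextSstate`, are taken from the format shown above.

```javascript
// Convert recorded transitions into Graphviz DOT text.
// Field names (state, ch, nextSstate) follow the record format shown above.
function pathsToDot(paths) {
  // Group characters by edge so repeated transitions share one arrow.
  const edges = new Map(); // "from->to" -> Set of characters
  for (const p of paths) {
    const key = `${p.state}->${p.nextSstate}`;
    if (!edges.has(key)) edges.set(key, new Set());
    edges.get(key).add(p.ch);
  }

  const lines = ["digraph DFA {", "  rankdir=LR;"];
  for (const [key, chars] of edges) {
    const [from, to] = key.split("->");
    // JSON.stringify yields a double-quoted, escaped label string.
    const label = JSON.stringify([...chars].join(","));
    lines.push(`  ${from} -> ${to} [label=${label}];`);
  }
  lines.push("}");
  return lines.join("\n");
}

// Hypothetical sample data in the documented format.
const samplePaths = [
  { state: 0, ch: "a", nextSstate: 2, match: true, end: false },
  { state: 2, ch: "b", nextSstate: 2, match: true, end: false },
  { state: 2, ch: "c", nextSstate: 2, match: true, end: true },
];

const dot = pathsToDot(samplePaths);
```

The resulting text can be rendered with any Graphviz-compatible tool. With the real library, `flowModel.result.paths` would take the place of `samplePaths`.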

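(2) Visual preview and testing

lexer's automated tests run automatically before the page opens. Open the browser console to see the detailed test results.

To watch lexer work in real time, and to make development and testing easier, there is an index.html file in the project root. Open it directly in a browser and enter some code; the tokens produced by lexer are output automatically. An example input: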

int a = 10;
int b =20;
int c = 20;

float f = 928.2332;
char b = 'b';

if(a == b){
    printf("Hello, World!");
}else if(b!=c){
    printf("Hello, World! Hello, World!");
}else{
    printf("Hello!");
}


Alternatively, visit the online website.

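5. Contributing

  • Contribute /lang/{lang}-define.js files for more languages
  • For an analysis of the source code and how to add extensions for other languages, see the implementation principles document
  • If you run into problems or have questions, feedback is welcome: click to submit an issue

6. License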

GitHub

Releases(v1.8.2)
  • v1.8.2(Sep 19, 2022)

  • 1.8.1(Dec 19, 2021)

  • 1.8.0(Oct 16, 2021)

    Version: 1.8.0

    Release date: 2021-10-16

    Description: Added a linefeed token and website support for real-time parsing

    Features

    • Feat: add a new linefeed token type
    • Feat: support real-time parsing on the website
  • 1.6.1(Oct 13, 2021)

    Version: 1.6.1

    Release date: 2021-10-13

    Description: Renamed properties and removed the goal_lexer property of chain-lexer, the NPM package

    Features

    • Refactor: rename properties of chain-lexer.
      • c_lexer: cLexer
      • sql_lexer: sqlLexer
    • Remove: remove the goal_lexer property of chain-lexer
  • 1.6.0(Oct 12, 2021)

    Version: 1.6.0

    Release date: 2021-10-12

    Description: Published the project to NPM under the name chain-lexer

    Features

    • NPM support: you can use lexer (chain-lexer) in your project via npm
  • 1.5.0(Oct 9, 2021)

    Version: 1.5.0

    Release date: 2021-10-10

    Description: Major upgrades to the project structure, such as package, shell, and test. You only need to import a /package/{lang}-lexer.min.js file in your project.

    Features

    • Add pack feature: pack /src/lexer.js and /src/lang/{lang}-define.js
    • Update testing: decouple the tests from the /src/* files
  • 1.0.0(Oct 3, 2021)

    Version: 1.0.0

    Release date: 2021-09-31

    Description: The first version of lexer, 1.0.0, is released.

    Features

    • Complete lexical analysis
    • Support for multi-language extension
    • State flow log