Functional JS Arithmetic Expression Compiler

Compiler exercise for education. Parsing infix arithmetic expressions via LL(1) recursive descent, adhering to functional approaches.

Features

Pure functions only: no side effects (apart from STDOUT), no mutations, all const etc.
Uses daggy for tagged unions (aka sum types)
- Enables type checking e.g. Token.Number.is(x) and Token.is(x)
- Enables simple pattern matching / catamorphisms e.g. someToken.cata({ Mul: () => true, Div: () => false }) with automatic errors in case of match failure
Uses Immutable.js List for functional collections
- Enables efficient unshift, push, and slice
- Immutability provides strong reasoning benefits

Use

yarn # installs
yarn test # passes
yarn start '-9 * 2 / -(3 + 7) + ((-4 * 1/2) - -21)' # outputs RPN string
yarn start '-9 * 2 / -(3 + 7) + ((-4 * 1/2) - -21)' --eval # outputs num

Supports

Integers
Negation
Addition
Subtraction
Multiplication
Division
Grouping (parens)

Note that a CST / PT preserves all or almost all of the syntax as represented by the language's grammar, whereas an Abstract Syntax Tree (AST) reduces the information to the bare minimum necessary for a given use case. Currently our CST simplifies parenthetical expressions, though we could make it preserve parens.

Generator

Dispatches based on node type to recursively process the parse tree. Two generators included:

an infix -> RPN compiler
a numerical evaluator

The latter generator can be chosen from the command line by appending the --eval flag.

Grammar

A grammar is the set of production rules for a language; that is, substitutions which eventually generate any valid string in the language. A simple grammar for expressions (starting from E) is:

E -> (E)
E -> E + E
E -> E * E
E -> number

Unfortunately, while the above grammar is valid (producing sentences in the language), it is also ambiguous (could be represented by different parse trees, affecting the semantic meaning of the expression). The following is an unambiguous grammar:

E -> E + T
E -> T
T -> T ∗ F
T -> F
F -> (E)
F -> number

Unfortunately, this grammar is left-recursive, which makes it impossible to parse via LL recursive descent (one of the easiest parsers to implement). The following refactors the grammar to be parsed via an LL(1) algorithm:

E  ->   T T2
T2 -> + T T2
T2 -> ε
T  ->   F F2
F2 -> * F F2
F2 -> ε
F  -> (E)
F  -> number

A formal EBNF notation of a similar grammar is included in the docs folder, with the augmentation that we allow for multiplication, subtraction, and negative factors.

Resources

Stanford CS143 Notes on Parsing (PDF)
Online RPN <-> Infix Compiler / Evaluator (useful for double-checking)

Name		Name	Last commit message	Last commit date
Latest commit History 14 Commits
.github		.github
docs		docs
src		src
.editorconfig		.editorconfig
.eslintrc.yml		.eslintrc.yml
.gitignore		.gitignore
CODE_OF_CONDUCT.md		CODE_OF_CONDUCT.md
LICENSE		LICENSE
README.md		README.md
package.json		package.json
yarn.lock		yarn.lock

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

.github

.github

docs

docs

src

src

.editorconfig

.editorconfig

.eslintrc.yml

.eslintrc.yml

.gitignore

.gitignore

CODE_OF_CONDUCT.md

CODE_OF_CONDUCT.md

LICENSE

LICENSE

README.md

README.md

package.json

package.json

yarn.lock

yarn.lock

Repository files navigation

Functional JS Arithmetic Expression Compiler

Features

Use

Supports

Contents

Lexer

Parser

Generator

Grammar

Resources

About

Releases

Packages

Languages

License

glebec/functional-math-compiler

Folders and files

Latest commit

History

Repository files navigation

Functional JS Arithmetic Expression Compiler

Features

Use

Supports

Contents

Lexer

Parser

Generator

Grammar

Resources

About

Topics

Resources

License

Code of conduct

Stars

Watchers

Forks

Languages