tokenizer

A grammar describes the syntax of a programming language and is often written in Backus-Naur form (BNF). A lexer performs lexical analysis, turning text into tokens. A parser takes those tokens and builds a data structure such as an abstract syntax tree (AST). The parser is concerned with context: does the sequence of tokens fit the grammar? The front end of a compiler combines a lexer and a parser built for a specific grammar.
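To make the split between lexer and parser concrete, here is a minimal sketch in Python. It assumes a toy arithmetic grammar invented for this illustration; the token names, the `tokenize` and `parse` functions, and the tuple-based AST shape are all hypothetical and are not taken from any repository listed here.

```python
import re
from typing import NamedTuple

# Toy grammar (an assumption for this sketch), in BNF-like notation:
#   expr   ::= term   ("+" term)*
#   term   ::= factor ("*" factor)*
#   factor ::= NUMBER | "(" expr ")"


class Token(NamedTuple):
    kind: str  # token category, e.g. "NUMBER" or "PLUS"
    text: str  # the source text that was matched


# Hypothetical token set; SKIP matches whitespace, which the lexer drops.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("PLUS", r"\+"),
    ("STAR", r"\*"),
    ("LPAREN", r"\("),
    ("RPAREN", r"\)"),
    ("SKIP", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(source: str) -> list:
    """Lexer: turn text into a flat list of tokens, with no regard to order."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at offset {pos}")
        if match.lastgroup != "SKIP":
            tokens.append(Token(match.lastgroup, match.group()))
        pos = match.end()
    return tokens


class Parser:
    """Recursive-descent parser: checks token order against the grammar."""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self) -> str:
        return self.tokens[self.pos].kind if self.pos < len(self.tokens) else "EOF"

    def take(self, kind: str) -> Token:
        if self.peek() != kind:
            raise SyntaxError(f"expected {kind}, found {self.peek()}")
        token = self.tokens[self.pos]
        self.pos += 1
        return token

    def expr(self):
        node = self.term()
        while self.peek() == "PLUS":
            self.take("PLUS")
            node = ("+", node, self.term())
        return node

    def term(self):
        node = self.factor()
        while self.peek() == "STAR":
            self.take("STAR")
            node = ("*", node, self.factor())
        return node

    def factor(self):
        if self.peek() == "LPAREN":
            self.take("LPAREN")
            node = self.expr()
            self.take("RPAREN")
            return node
        return ("num", int(self.take("NUMBER").text))


def parse(source: str):
    """Lex, then parse; the result is an AST made of nested tuples."""
    parser = Parser(tokenize(source))
    tree = parser.expr()
    if parser.peek() != "EOF":
        raise SyntaxError(f"unexpected trailing token {parser.peek()}")
    return tree
```

With these definitions, `tokenize("1 + + 2")` succeeds because every character belongs to some token, while `parse("1 + + 2")` raises `SyntaxError` because the token sequence does not fit the grammar. That difference is the division of labor described above.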

Here are 1,075 public repositories matching this topic...

vinhkhuc / Twitter-Tokenizer

huangsam / e2cprog

roy-a / Roy_VnTokenizer

lahaxearnaud / laravel-token

SumeetSinha / Tokenizer.cpp

romruben / TFM

tqtg / nlp-tokenizer

felipensp / liblex

duytri / SplitAndTokenization

polygonplanet / Chiffon

ows-ali / languageTranslator

maheshambiga / Token-based-NodeJS-Login-App

duytri / Docs2WordsJava

poojithansl / POS_Tagging

accraze / text2token

rishabhindoria / USC-Artificial-Intelligence

zambonin / rltools

oroszgy / spaCy-tokenizer-benchmark

ThrusterIO / tokenizer

hanusri / Tokenization-and-Stemming
