Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Custom template tokenizers #148

Merged
merged 18 commits into from Apr 14, 2022
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
23 changes: 23 additions & 0 deletions README.md
Expand Up @@ -194,6 +194,29 @@ If set to `true`, to parse expressions in `v-bind` CSS functions inside `<style>

See also to [here](https://github.com/vuejs/rfcs/blob/master/active-rfcs/0043-sfc-style-variables.md).

### parserOptions.templateTokenizer

You can use `parserOptions.templateTokenizer` property to specify custom tokenizers to parse `<template lang="...">` tags.

For example to enable parsing of pug templates:

```jsonc
{
"parser": "vue-eslint-parser",
"parserOptions": {
"templateTokenizer": {
// template tokenizer for `<template lang="pug">`
"pug": "vue-eslint-parser-template-tokenizer-pug",
}
}
}
```

This option is only intended for plugin developers. **Be careful** when using this option directly, as it may change behaviour of rules you might have enabled.
If you just want **pug** support, use [eslint-plugin-vue-pug](https://github.com/rashfael/eslint-plugin-vue-pug) instead, which uses this option internally.

See [implementing-custom-template-tokenizers.md](./docs/implementing-custom-template-tokenizers.md) for information on creating your own template tokenizer.

## 🎇 Usage for custom rules / plugins

- This parser provides `parserServices` to traverse `<template>`.
Expand Down
70 changes: 70 additions & 0 deletions docs/implementing-custom-template-tokenizers.md
@@ -0,0 +1,70 @@
# Implementing custom template tokenizers

A custom template tokenizer needs to create two types of tokens from the text it is given:

- Low level [tokens](https://github.com/vuejs/vue-eslint-parser/blob/master/src/ast/tokens.ts), which can be of an [existing HTML type](https://github.com/vuejs/vue-eslint-parser/blob/master/src/html/tokenizer.ts#L59) or even new types.
- Intermediate tokens, which **must** be of type `StartTag`, `EndTag`, `Text` or `Mustache` (see [IntermediateTokenizer](https://github.com/vuejs/vue-eslint-parser/blob/master/src/html/intermediate-tokenizer.ts#L33)).

Token ranges and locations must count from the start of the document. To help with this, custom tokenizers are initialized with a starting line and column.

## Interface

```ts
class CustomTokenizer {
/**
* The tokenized low level tokens, excluding comments.
*/
tokens: Token[]
/**
* The tokenized low level comment tokens
*/
comments: Token[]
errors: ParseError[]

/**
* Used to control tokenization of {{ expressions }}. If false, don't produce VExpressionStart/End tokens
*/
expressionEnabled: boolean = true

/**
* The current namespace. Set and used by the parser. You probably can ignore this.
*/
namespace: string = "http://www.w3.org/1999/xhtml"

/**
* The current tokenizer state. Set by the parser. You can probably ignore this.
*/
state: string = "DATA"

/**
* The complete source code text. Used by the parser and set via the constructor.
*/
text: string

/**
* Initialize this tokenizer.
* @param templateText The contents of the <template> tag.
* @param text The complete source code
* @param {startingLine, startingColumn} The starting location of the templateText. Your token positions need to include this offset.
*/
constructor (templateText: string, text: string, { startingLine: number, startingColumn: number }) {
this.text = text
}

/**
* Get the next intermediate token.
* @returns The intermediate token or null.
*/
nextToken (): IntermediateToken | null {

}
}
```

## Behaviour

When the html parser encounters a `<template lang="...">` tag that matches a configured custom tokenizer, it will initialize a new instance of this tokenizer with the contents of the template tag. It will then call the `nextToken` method of this tokenizer until it returns `null`. After having consumed all intermediate tokens it will copy the low level tokens, comments and errors from the tokenizer instance.

## Examples

For a working example, see [vue-eslint-parser-template-tokenizer-pug](https://github.com/rashfael/vue-eslint-parser-template-tokenizer-pug/).
2 changes: 2 additions & 0 deletions src/common/parser-options.ts
Expand Up @@ -40,6 +40,8 @@ export interface ParserOptions {

// others
// [key: string]: any

templateTokenizer?: { [key: string]: string }
}

export function isSFCFile(parserOptions: ParserOptions) {
Expand Down
68 changes: 66 additions & 2 deletions src/html/parser.ts
Expand Up @@ -15,6 +15,7 @@ import type {
VDocumentFragment,
VElement,
VExpressionContainer,
VLiteral,
} from "../ast"
import { NS, ParseError } from "../ast"
import { debug } from "../common/debug"
Expand Down Expand Up @@ -51,6 +52,8 @@ import {
getScriptParser,
getParserLangFromSFC,
} from "../common/parser-options"
import sortedIndexBy from "lodash/sortedIndexBy"
import sortedLastIndexBy from "lodash/sortedLastIndexBy"

const DIRECTIVE_NAME = /^(?:v-|[.:@#]).*[^.:@#]$/u
const DT_DD = /^d[dt]$/u
Expand Down Expand Up @@ -474,6 +477,48 @@ export class Parser {
}
}

/**
* Process the given template text token with a configured template tokenizer, based on language.
* @param token The template text token to process.
* @param lang The template language the text token should be parsed as.
*/
private processTemplateText(token: Text, lang: string): void {
// eslint-disable-next-line @typescript-eslint/no-require-imports
const TemplateTokenizer = require(this.baseParserOptions
.templateTokenizer![lang])
const templateTokenizer = new TemplateTokenizer(
token.value,
this.text,
{
startingLine: token.loc.start.line,
startingColumn: token.loc.start.column,
},
)

// override this.tokenizer to forward expressionEnabled and state changes
const rootTokenizer = this.tokenizer
this.tokenizer = templateTokenizer

let templateToken: IntermediateToken | null = null
while ((templateToken = templateTokenizer.nextToken()) != null) {
;(this as any)[templateToken.type](templateToken)
}

this.tokenizer = rootTokenizer

const index = sortedIndexBy(
this.tokenizer.tokens,
token,
(x) => x.range[0],
)
const count =
sortedLastIndexBy(this.tokenizer.tokens, token, (x) => x.range[1]) -
index
this.tokenizer.tokens.splice(index, count, ...templateTokenizer.tokens)
this.tokenizer.comments.push(...templateTokenizer.comments)
this.tokenizer.errors.push(...templateTokenizer.errors)
}

/**
* Handle the start tag token.
* @param token The token to handle.
Expand Down Expand Up @@ -575,11 +620,12 @@ export class Parser {
const lang = langAttr?.value?.value

if (elementName === "template") {
this.expressionEnabled = true
if (lang && lang !== "html") {
// It is not an HTML template.
this.tokenizer.state = "RAWTEXT"
this.expressionEnabled = false
}
this.expressionEnabled = true
} else if (this.isSFC) {
// Element is Custom Block. e.g. <i18n>
// Referred to the Vue parser. See https://github.com/vuejs/vue-next/blob/cbaa3805064cb581fc2007cf63774c91d39844fe/packages/compiler-sfc/src/parse.ts#L127
Expand Down Expand Up @@ -639,8 +685,26 @@ export class Parser {
*/
protected Text(token: Text): void {
debug("[html] Text %j", token)

const parent = this.currentNode
if (
token.value &&
parent.type === "VElement" &&
parent.name === "template" &&
parent.parent.type === "VDocumentFragment"
) {
const langAttribute = parent.startTag.attributes.find(
(a) => a.key.name === "lang",
)
const lang = (langAttribute?.value as VLiteral)?.value
if (
lang &&
lang !== "html" &&
this.baseParserOptions.templateTokenizer?.[lang]
) {
this.processTemplateText(token, lang)
return
}
}
parent.children.push({
type: "VText",
range: token.range,
Expand Down
3 changes: 2 additions & 1 deletion src/index.ts
Expand Up @@ -108,13 +108,14 @@ function parseAsSFC(code: string, options: ParserOptions) {
const scripts = rootAST.children.filter(isScriptElement)
const template = rootAST.children.find(isTemplateElement)
const templateLang = getLang(template) || "html"
const hasTemplateTokenizer = options?.templateTokenizer?.[templateLang]
const concreteInfo: AST.HasConcreteInfo = {
tokens: rootAST.tokens,
comments: rootAST.comments,
errors: rootAST.errors,
}
const templateBody =
template != null && templateLang === "html"
template != null && (templateLang === "html" || hasTemplateTokenizer)
? Object.assign(template, concreteInfo)
: undefined

Expand Down
10 changes: 10 additions & 0 deletions test/ast.js
Expand Up @@ -186,6 +186,16 @@ describe("Template AST", () => {
const services = fs.existsSync(servicesPath)
? JSON.parse(fs.readFileSync(servicesPath, "utf8"))
: null
if (parserOptions.templateTokenizer) {
parserOptions.templateTokenizer = Object.fromEntries(
Object.entries(parserOptions.templateTokenizer).map(
([key, value]) => [
key,
path.resolve(__dirname, "../", value),
],
),
)
}
const options = Object.assign(
{ filePath: sourcePath },
PARSER_OPTIONS,
Expand Down