enh(parser) multi-class in a single mode #3081

Merged
merged 20 commits into from
Apr 4, 2021
Changes from 13 commits
8 changes: 7 additions & 1 deletion CHANGES.md
@@ -1,8 +1,14 @@
## Version next

Parser:

- enh(parser) support multi-class matchers (#3081) [Josh Goebel][]
- enh(parser) Detect comments based on English-like text, rather than keyword list [Josh Goebel][]

Grammars:

- chore(properties) disable auto-detection #3102 [Josh Goebel][]
- fix(properties) fix incorrect handling of non-alphanumeric keys #3102 [Egor Rogov][]
- enh(parser) Detect comments based on English-like text, rather than keyword list [Josh Goebel][]
- enh(shell) add alias ShellSession [Ryan Mulligan][]
- enh(shell) consider one space after prompt as part of prompt [Ryan Mulligan][]
- fix(nginx) fix bug with $ and @ variables [Josh Goebel][]
28 changes: 26 additions & 2 deletions docs/mode-reference.rst
@@ -154,16 +154,40 @@ for one thing like string in single or double quotes.
begin
^^^^^

- **type**: regexp
- **type**: regexp or array of regexp

Regular expression starting a mode. For example, a single quote for strings or two forward slashes for C-style comments.
If absent, ``begin`` defaults to a regexp that matches anything, so the mode starts immediately.


You can also pass an array when you need to individually highlight portions of the match with different classes:

::

{
begin: [
/function!/,
/\s+/,
hljs.IDENT_RE
],
className: {
1: "keyword",
3: "title"
},
}

This would highlight ``function!`` as a ``keyword`` while highlighting the name
of the function as ``title``. The space(s) between would be matched, but not
highlighted.

*Note: Internally this magic happens by wrapping each regex within its own
capture group. If your regexes include any groups they must all be
non-capturing, i.e.* ``(?:regex)``.
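
The note about non-capturing groups can be illustrated with a small standalone sketch. It does not use highlight.js itself: `wrapAndJoin` is a hypothetical helper that mimics the wrapping described above, and `/[a-zA-Z]\w*/` stands in for `hljs.IDENT_RE`. It shows how a single inner capturing group shifts every later group index, so a `className` key of `3` would no longer point at the function name.

```javascript
// Hypothetical helper (not part of highlight.js): wraps each piece in its own
// capture group and joins the pieces into one regex, as the note describes.
function wrapAndJoin(pieces) {
  const source = pieces
    .map((re) => `(${typeof re === "string" ? re : re.source})`)
    .join("");
  return new RegExp(source);
}

// Inner group is non-capturing, so group 3 is the identifier.
const good = wrapAndJoin([/(?:function|function!)/, /\s+/, /[a-zA-Z]\w*/]);
const goodMatch = good.exec("function! UnComment");

// Inner group is capturing, so it becomes group 2 and shifts the rest by one:
// group 3 is now the whitespace and the identifier moves to group 4.
const bad = wrapAndJoin([/(function|function!)/, /\s+/, /[a-zA-Z]\w*/]);
const badMatch = bad.exec("function! UnComment");

console.log(goodMatch[3]); // "UnComment"
console.log(JSON.stringify(badMatch[3]), badMatch[4]); // " " and "UnComment"
```

With the capturing variant, a mapping such as `{ 1: "keyword", 3: "title" }` would apply `title` to the whitespace rather than to the name.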

match
^^^^^

- **type**: regexp
- **type**: regexp or array of regexp

This is simply syntactic sugar for a ``begin`` when no ``end`` expression is
necessary. It may not be used with ``begin`` or ``end`` keys (that would make
31 changes: 25 additions & 6 deletions src/highlight.js
@@ -247,10 +247,29 @@ const HLJS = function(hljs) {
}

/**
* @param {Mode} mode - new mode to start
* @param {Mode} mode
* @param {RegExpMatchArray} match
*/
function startNewMode(mode) {
if (mode.className) {
function emitMultiClass(mode, match) {
let i = 1;
while (match[i]) {
const klass = mode.className[i];
const text = match[i];
if (klass) { emitter.addKeyword(text, klass); } else { emitter.addText(text); }
i++;
}
}

/**
* @param {CompiledMode} mode - new mode to start
* @param {RegExpMatchArray} match
*/
function startNewMode(mode, match) {
if (mode.isMultiClass) {
// at this point modeBuffer should just be the match
modeBuffer = "";
emitMultiClass(mode, match);
} else if (mode.className) {
emitter.openNode(language.classNameAliases[mode.className] || mode.className);
}
top = Object.create(mode, { parent: { value: top } });
@@ -340,7 +359,7 @@ const HLJS = function(hljs) {
modeBuffer = lexeme;
}
}
startNewMode(newMode);
startNewMode(newMode, match);
// if (mode["after:begin"]) {
// let resp = new Response(mode);
// mode["after:begin"](match, resp);
@@ -373,7 +392,7 @@ const HLJS = function(hljs) {
}
}
do {
if (top.className) {
if (top.className && !top.isMultiClass) {
emitter.closeNode();
}
if (!top.skip && !top.subLanguage) {
@@ -385,7 +404,7 @@ const HLJS = function(hljs) {
if (endMode.endSameAsBegin) {
endMode.starts.endRe = endMode.endRe;
}
startNewMode(endMode.starts);
startNewMode(endMode.starts, match);
}
return origin.returnEnd ? 0 : lexeme.length;
}
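
To make the `emitMultiClass` loop in this hunk easier to follow, here is a minimal standalone sketch. The recording emitter is a hypothetical stand-in for the real token emitter (only `addKeyword` and `addText` are modelled), and the emitter is passed in as a parameter rather than taken from the enclosing scope as in the diff.

```javascript
// Hypothetical stand-in emitter: records tokens instead of building HTML.
function createRecordingEmitter() {
  const tokens = [];
  return {
    tokens,
    addKeyword(text, kind) { tokens.push({ text, kind }); },
    addText(text) { tokens.push({ text, kind: null }); }
  };
}

// Same loop shape as emitMultiClass in the diff: walk the capture groups in
// order and emit each one with its mapped class, or as plain text if unmapped.
function emitMultiClass(emitter, mode, match) {
  let i = 1;
  while (match[i]) {
    const klass = mode.className[i];
    if (klass) {
      emitter.addKeyword(match[i], klass);
    } else {
      emitter.addText(match[i]);
    }
    i++;
  }
}

const mode = { className: { 1: "keyword", 3: "title" } };
const match = /(function!)(\s+)([a-zA-Z]\w*)/.exec("function! UnComment");
const emitter = createRecordingEmitter();
emitMultiClass(emitter, mode, match);

console.log(emitter.tokens);
```

Because the loop condition is `match[i]`, a group that matches the empty string or does not participate in the match ends the loop early in this sketch, so every piece is assumed to consume at least one character.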
12 changes: 9 additions & 3 deletions src/languages/vim.js
@@ -99,12 +99,18 @@ export default function(hljs) {
begin: /[bwtglsav]:[\w\d_]+/
},
{
className: 'function',
beginKeywords: 'function function!',
begin: [
/\b(?:function|function!)/,
/\s+/,
hljs.IDENT_RE
],
className: {
1: "keyword",
3: "title"
},
end: '$',
relevance: 0,
contains: [
hljs.TITLE_MODE,
{
className: 'params',
begin: '\\(',
27 changes: 27 additions & 0 deletions src/lib/ext/multi_class.js
@@ -0,0 +1,27 @@
/* eslint-disable no-throw-literal */
import * as logger from "../../lib/logger.js";
import * as regex from "../regex.js";

const MultiClassError = new Error();

/**
*
* @param {CompiledMode} mode
*/
export function MultiClass(mode) {
if (!Array.isArray(mode.begin)) return;

if (mode.skip || mode.excludeBegin || mode.returnBegin) {
logger.error("skip, excludeBegin, returnBegin not compatible with multi-class");
throw MultiClassError;
}

if (typeof mode.className !== "object") {
logger.error("className must be object");
throw MultiClassError;
}

const items = mode.begin.map(x => regex.concat("(", x, ")"));
mode.begin = regex.concat(...items);
mode.isMultiClass = true;
}
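
The compatibility checks in this file can be exercised in isolation. The sketch below is an assumption-laden stand-in, not the extension itself: `multiClassProblems` is a hypothetical function that returns a list of messages instead of logging and throwing, and it additionally rejects a `null` className, which the `typeof` check in the diff would let through.

```javascript
// Hypothetical checker (not the highlight.js API): collects problems rather
// than logging and throwing as the extension in the diff does.
function multiClassProblems(mode) {
  const problems = [];
  // Only an array-valued `begin` opts in to multi-class handling.
  if (!Array.isArray(mode.begin)) return problems;

  const incompatible = ["skip", "excludeBegin", "returnBegin"]
    .filter((key) => mode[key]);
  if (incompatible.length > 0) {
    problems.push(`${incompatible.join(", ")} not compatible with multi-class`);
  }
  // Stricter than the diff: null is also rejected here.
  if (typeof mode.className !== "object" || mode.className === null) {
    problems.push("className must be object");
  }
  return problems;
}

const ok = multiClassProblems({
  begin: [/func/, /\(\)/],
  className: { 1: "keyword", 2: "params" }
});
const broken = multiClassProblems({
  begin: [/func/, /\(\)/],
  className: "function",
  returnBegin: true
});

console.log(ok);     // no problems reported
console.log(broken); // one message per failed check
```

Returning a list is only a convenience for experimenting; the extension in the diff stops at the first failed check.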
4 changes: 3 additions & 1 deletion src/lib/mode_compiler.js
@@ -2,6 +2,7 @@ import * as regex from './regex.js';
import { inherit } from './utils.js';
import * as EXT from "./compiler_extensions.js";
import { compileKeywords } from "./compile_keywords.js";
import { MultiClass } from "./ext/multi_class.js";

// compilation

@@ -284,7 +285,8 @@ export function compileLanguage(language, { plugins }) {
[
// do this early so compiler extensions generally don't have to worry about
// the distinction between match/begin
EXT.compileMatch
EXT.compileMatch,
MultiClass
].forEach(ext => ext(mode, parent));

language.compilerExtensions.forEach(ext => ext(mode, parent));
1 change: 1 addition & 0 deletions test/api/index.js
@@ -15,4 +15,5 @@ describe('hljs', function() {
require('./unregisterLanguage');
require('./starters');
require('./underscoreIdent');
require('./multiClassMatch');
});
80 changes: 80 additions & 0 deletions test/api/multiClassMatch.js
@@ -0,0 +1,80 @@
'use strict';

const hljs = require('../../build');

describe('multi-class matchers', () => {
before(() => {
const grammar = function() {
return {
contains: [
{
begin: ["a", "b", "c"],
className: {
1: "a",
3: "c"
},
contains: [
{
match: "def",
className: "def"
}
]
},
{
className: "carrot",
begin: /\^\^\^/,
end: /\^\^\^/,
contains: [
{
begin: ["a", "b", "c"],
className: {
1: "a",
3: "c"
}
}
]
},
{
match: [
/func/,
/\(\)/,
/{.*}/
],
className: {
1: "keyword",
2: "params",
3: "body"
}
}
]
};
};
hljs.registerLanguage("test", grammar);
});
after(() => {
hljs.unregisterLanguage("test");
});
it('should support begin', () => {
const code = "abcdef";
const result = hljs.highlight(code, { language: 'test' });

result.value.should.equal(`<span class="hljs-a">a</span>b<span class="hljs-c">c</span><span class="hljs-def">def</span>`);
result.relevance.should.equal(2);
});
it('basic functionality', () => {
const code = "func(){ test }";
const result = hljs.highlight(code, { language: 'test' });

result.value.should.equal(`<span class="hljs-keyword">func</span><span class="hljs-params">()</span><span class="hljs-body">{ test }</span>`);
result.relevance.should.equal(1);
});
it('works inside a classified parent mode', () => {
const code = "^^^what abc now^^^";
const result = hljs.highlight(code, { language: 'test' });

result.value.should.equal(
`<span class="hljs-carrot">^^^what ` +
`<span class="hljs-a">a</span>b<span class="hljs-c">c</span>` +
` now^^^</span>`);
})
});
2 changes: 1 addition & 1 deletion test/markup/vim/default.expect.txt
@@ -6,7 +6,7 @@
<span class="hljs-keyword">set</span> autoindent

<span class="hljs-comment">&quot; switch on highlighting</span>
<span class="hljs-function"><span class="hljs-keyword">function</span> <span class="hljs-title">UnComment</span><span class="hljs-params">(fl, ll)</span></span>
<span class="hljs-keyword">function</span> <span class="hljs-title">UnComment</span><span class="hljs-params">(fl, ll)</span>
<span class="hljs-keyword">while</span> idx &gt;= <span class="hljs-variable">a:ll</span>
<span class="hljs-keyword">let</span> srclines=<span class="hljs-built_in">getline</span>(idx)
<span class="hljs-keyword">let</span> dstlines=<span class="hljs-keyword">substitute</span>(srclines, <span class="hljs-variable">b:comment</span>, <span class="hljs-string">&quot;&quot;</span>, <span class="hljs-string">&quot;&quot;</span>)
3 changes: 2 additions & 1 deletion types/index.d.ts
@@ -206,7 +206,8 @@ type CompiledMode = Omit<Mode, 'contains'> &
endRe: RegExp
illegalRe: RegExp
matcher: any
isCompiled: true
isCompiled: true,
isMultiClass?: boolean,
Member

Suggested change
isMultiClass?: boolean,
isMultiClass?: true,

If there's no time it'll ever be false, should the type be true instead of boolean?

Member Author

Perhaps technically, but that bugs me, though I can't explain it. This isn't the same as isCompiled IMHO.

starts?: CompiledMode
parent?: CompiledMode
}