Add support for MDX comment syntax #1368

3w36zj6 · 2024-03-12T11:43:26Z

What version of textlint are you using?

v14.0.3

What file type (Markdown, plain text, etc.) are you using?

MDX via @textlint/textlint-plugin-markdown

What did you do? Please include the actual source code causing the issue.

The comment syntax in Markdown is the same as in HTML, but MDX does not support this comment syntax and instead uses {/* */}.

@textlint/textlint-plugin-markdown does not recognize this as a comment, and to treat it as a comment syntax, must do as follows:

{/* <!-- --> */}

What did you expect to happen?

I expect the MDX comment syntax to be supported and that the range enclosed by the MDX comment syntax would not be subject to linting.

What actually happened? Please include the actual, raw output from textlint.

The range enclosed by the MDX comment syntax is subject to linting.

The text was updated successfully, but these errors were encountered:

azu · 2024-03-12T14:06:40Z

I think the best way to make a textlint-plugin-mdx.
Actually, MDX and Markdown are not completely compatible.

https://github.com/micromark/mdx-state-machine#72-deviations-from-markdown

textlint-plugin-markdown treat <p>{/* .... */}</p> as HTML fragments.

https://textlint.github.io/astexplorer/#/snippet/woXCqHBhcnNlcklEwrh0ZXh0bGludDptxINrZG93bi10by1hc3TCqMSFdHTEkGdzwoHEisSMxI7EkMSSxJRyxJbEmMSaxJzEnsSgw4DCqHbEhnNpb27EqMSqxI3Ej8SRxJPElcSXxJnEm8SdxJ90wqYxNC4wLjPCqGZpbGVuYW1lwrBzb3VyY2UudW5kZcWUbmVkwqRjb8Wmwr88ZGl2PgogIHsvKiA8IS0tIMW_PiAqL30KPC_FssW0}

textlint-filter-rule-comments extract HTML comment from the fragment and filter it.

https://github.com/textlint/textlint-filter-rule-comments/blob/755c4402cc4995cd7ca37764c138840811d53515/src/textlint-filter-rule-comments.js#L40-L55

If textlint-plugin-mdx parse <p>{/* .... */}</p> as <HTML><Comment /></HTML>, textlint-filter-rule-comments just works without workaround.

However, It's hard to make a plugin for MDX, so there may be a better way.

azu · 2024-03-12T14:21:14Z

I think it is possible to improve the regexp here.

https://github.com/textlint/textlint-filter-rule-comments/blob/755c4402cc4995cd7ca37764c138840811d53515/src/parse-comment.js#L3

~~However, it is necessary to verify a little whether there is false-negative.~~
(I don't think many users will write such comments {/* textlint-disable */} unexpectlly, There is no need to worry.)

Actually, {/* */} is not a comment in HTML, so I don't know if we should change this regexp.

3w36zj6 · 2024-03-12T15:53:26Z

Indeed, my goal is to use textlint-filter-rule-comments with MDX comment syntax only. Therefore, I considered making the comment format customizable from .textlintrc (3w36zj6/textlint-filter-rule-comments@8ae4e5a) or making the regular expression representing the comment syntax a union of Markdown and MDX (3w36zj6/textlint-filter-rule-comments@45c4e5b), but neither gave me the desired result.

azu · 2024-03-13T03:44:41Z

Probabely, 3w36zj6/textlint-filter-rule-comments@45c4e5b reuse g ann it is something wrong.

Maybe, we can create a RegeExp dynamically like this.

            const regExp = new RegExp([
                String.raw`<!--\s*${enablingComment}*(?:.|\s)*?-->`,
                String.raw`<!--\s*${disablingComment}*(?:.|\s)*?-->`,
                String.raw`{/\*\s*${enablingComment}*(?:.|\s)*?\*/}`,
                String.raw`{/\*\s*${disablingComment}*(?:.|\s)*?\*/}`
            ].join("|"), "g");
            const matches = nodeValue.matchAll(regExp);
            const comments = Array.from(matches, m => {
                return m[0].replace(/<!--|-->|{\/\*|\*\/}/g, "")
            });
            comments.forEach(commentValue => {
                if (commentValue.indexOf(enablingComment) !== -1) {
                    const configValue = commentValue.replace(enablingComment, "");
                    statusManager.enableReporting(node, parseRuleIds(configValue));
                } else if (commentValue.indexOf(disablingComment) !== -1) {
                    const configValue = commentValue.replace(disablingComment, "");
                    statusManager.disableReporting(node, parseRuleIds(configValue));
                }
            });

3w36zj6 · 2024-03-13T06:59:57Z

I have reverted the changes to HTML_COMMENT_REGEXP and implemented the suggested changes (3w36zj6/textlint-filter-rule-comments@5ec2076).
All tests in Markdown pass, but in MDX, comments continue to be subject to lint, and rule disabling is not performed.

3w36zj6 · 2024-03-25T20:00:43Z

Thank you for your continuous support. We considered creating mdx-to-ast and textlint-plugin-mdx and conducted the following simple experiments.

First, I added remark-mdx to the Processor.

textlint/packages/@textlint/markdown-to-ast/src/parse-markdown.ts:

import unified from "unified";
// @ts-ignore
import autolinkLiteral from "mdast-util-gfm-autolink-literal/from-markdown";
// FIXME: Disable auto link literal transforms that break AST node
// https://github.com/remarkjs/remark-gfm/issues/16
// Need to override before import gfm plugin
autolinkLiteral.transforms = [];
// Load plugins
import remarkGfm from "remark-gfm";
+import remarkMdx from "remark-mdx";
import remarkParse from "remark-parse";
import frontmatter from "remark-frontmatter";
import footnotes from "remark-footnotes";
import type { Node } from "unist";

-const remark = unified().use(remarkParse).use(frontmatter, ["yaml"]).use(remarkGfm).use(footnotes, {
+const remark = unified().use(remarkParse).use(frontmatter, ["yaml"]).use(remarkGfm).use(remarkMdx).use(footnotes, {
    inlineNotes: true
});
export const parseMarkdown = (text: string): Node => {
    return remark.parse(text);
};

Next, I added processing for MDX-specific mdast nodes.

textlint/packages/@textlint/markdown-to-ast/src/index.ts:

import { SyntaxMap } from "./mapping/markdown-syntax-map";
import type { TxtDocumentNode } from "@textlint/ast-node-types";
import { ASTNodeTypes } from "@textlint/ast-node-types";
import traverse from "traverse";
import debug0 from "debug";
import { parseMarkdown } from "./parse-markdown";

const debug = debug0("@textlint/markdown-to-ast");

export { ASTNodeTypes as Syntax };

/**
 * parse Markdown text and return ast mapped location info.
 * @param {string} text
 */
export function parse(text: string): TxtDocumentNode {
    // remark-parse's AST does not consider BOM
    // AST's position does not +1 by BOM
    // So, just trim BOM and parse it for `raw` property
    // textlint's SourceCode also take same approach - trim BOM and check the position
    // This means that the loading side need to consider BOM position - for example fs.readFile and text slice script.
    // https://github.com/micromark/micromark/blob/0f19c1ac25964872a160d8b536878b125ddfe393/lib/preprocess.mjs#L29-L31
    const hasBOM = text.charCodeAt(0) === 0xfeff;
    const textWithoutBOM = hasBOM ? text.slice(1) : text;
    const ast = parseMarkdown(textWithoutBOM);
    traverse(ast).forEach(function (node) {
        // eslint-disable-next-line no-invalid-this
+        const isMdx =
+            node &&
+            Object.prototype.hasOwnProperty.call(node, "type") &&
+            /^(mdxJsxFlowElement|mdxJsxTextElement|mdxjsEsm|mdxFlowExpression|mdxTextExpression)$/.test(node.type);
        if (this.notLeaf) {
+            if (isMdx) {
+                if (node.type === "mdxJsxFlowElement") {
+                    node.type = ASTNodeTypes.Html;
+                } else if (node.type === "mdxJsxTextElement") {
+                    node.type = ASTNodeTypes.Paragraph;
+                } else if (
+                    node.type === "mdxjsEsm" ||
+                    node.type === "mdxFlowExpression" ||
+                    node.type === "mdxTextExpression"
+                ) {
+                    if (Object.prototype.hasOwnProperty.call(node, "value") && /^(\/\*(.|\s)*\*\/)$/.test(node.value)) {
+                        node.type = ASTNodeTypes.Comment;
+                        node.value = node.value.replace(/^(\/\*)/, "").replace(/\*\/$/, "");
+                    } else {
+                        node.type = node.type === "mdxTextExpression" ? ASTNodeTypes.Code : ASTNodeTypes.CodeBlock;
+                    }
+                }
+                if (Object.prototype.hasOwnProperty.call(node, "attributes")) {
+                    delete node.attributes;
+                }
+                if (Object.prototype.hasOwnProperty.call(node, "name")) {
+                    delete node.name;
+                }
+                if (Object.prototype.hasOwnProperty.call(node, "data")) {
+                    delete node.data;
+                }
-            if (node.type) {
+            } else if (node.type) {
                const replacedType = SyntaxMap[node.type as keyof typeof SyntaxMap];
                if (!replacedType) {
                    debug(`replacedType : ${replacedType} , node.type: ${node.type}`);
                } else {
                    node.type = replacedType;
                }
            }
            // map `range`, `loc` and `raw` to node
            if (node.position) {
                const position = node.position;
                // line start with 1
                // column start with 0
                const positionCompensated = {
                    start: { line: position.start.line, column: Math.max(position.start.column - 1, 0) },
                    end: { line: position.end.line, column: Math.max(position.end.column - 1, 0) }
                };
                const range = [position.start.offset, position.end.offset] as const;
                node.loc = positionCompensated;
                node.range = range;
                node.raw = textWithoutBOM.slice(range[0], range[1]);
                // Compatible for https://github.com/syntax-tree/unist, but it is hidden
                Object.defineProperty(node, "position", {
                    enumerable: false,
                    configurable: false,
                    writable: false,
                    value: position
                });
            }
        }
    });
    return ast as TxtDocumentNode;
}

This approach will likely succeed. However, the latest remark-mdx only supports ESM, so I believe updates to configuration files and dependencies will be necessary before proceeding.

(I updated the configuration files and dependencies and confirmed operation on local environment, I could not figure out how to make both the TypeScript transpile and the test with mocha succeed)

azu added the help wanted Extra attention is needed label Mar 12, 2024

3w36zj6 mentioned this issue Mar 24, 2024

feat: add support for JS comment syntax for MDX #1372

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add support for MDX comment syntax #1368

Add support for MDX comment syntax #1368

3w36zj6 commented Mar 12, 2024

azu commented Mar 12, 2024 •

edited

azu commented Mar 12, 2024

3w36zj6 commented Mar 12, 2024

azu commented Mar 13, 2024

3w36zj6 commented Mar 13, 2024

3w36zj6 commented Mar 25, 2024

Add support for MDX comment syntax #1368

Add support for MDX comment syntax #1368

Comments

3w36zj6 commented Mar 12, 2024

azu commented Mar 12, 2024 • edited

azu commented Mar 12, 2024

3w36zj6 commented Mar 12, 2024

azu commented Mar 13, 2024

3w36zj6 commented Mar 13, 2024

3w36zj6 commented Mar 25, 2024

azu commented Mar 12, 2024 •

edited