Add position information to tokens #52

Open
segevfiner opened this issue Sep 11, 2023 · 5 comments

Comments

@segevfiner

It could be cool to add position information as specified by https://github.com/syntax-tree/unist to the resulting HAST.

@wooorm
Owner

wooorm commented Sep 14, 2023

Heya!

What do you plan on using that for?

To do that, we’d have to add it only to the text nodes, because the text is present in the original file, but we wouldn’t add it to the elements, as those aren’t written in the original source. In unist terms, the elements are generated.
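
A minimal sketch of that idea, for illustration only: walk a highlighted tree in document order, keep a running point, and attach a unist-style position to the text nodes while leaving the elements without one. The helper name `addTextPositions` and the local types are hypothetical and not part of lowlight. The sketch assumes the tree's text adds up to the original source, that line endings are `\n`, and that columns count UTF-16 code units.

```typescript
// Minimal hast-like types, declared locally so the sketch is self-contained.
interface UnistPoint { line: number; column: number; offset: number }
interface UnistPosition { start: UnistPoint; end: UnistPoint }
interface HastText { type: 'text'; value: string; position?: UnistPosition }
interface HastElement {
  type: 'element'
  tagName: string
  properties: Record<string, unknown>
  children: Array<HastElement | HastText>
}
interface HastRoot { type: 'root'; children: Array<HastElement | HastText> }

// Hypothetical helper (not a lowlight API): give every text node a position
// with 1-indexed line and column and a 0-indexed offset.
// Elements get no position, so they stay "generated" in unist terms.
function addTextPositions(tree: HastRoot): HastRoot {
  let line = 1
  let column = 1
  let offset = 0

  const visit = (node: HastRoot | HastElement | HastText): void => {
    if (node.type !== 'text') {
      node.children.forEach((child) => visit(child))
      return
    }
    const start: UnistPoint = { line, column, offset }
    for (let i = 0; i < node.value.length; i++) {
      if (node.value[i] === '\n') {
        line++
        column = 1
      } else {
        column++
      }
      offset++
    }
    node.position = { start, end: { line, column, offset } }
  }

  visit(tree)
  return tree
}

// Hand-written tree shaped like highlighter output for the source "let a\nb".
const exampleTree: HastRoot = {
  type: 'root',
  children: [
    {
      type: 'element',
      tagName: 'span',
      properties: { className: ['hljs-keyword'] },
      children: [{ type: 'text', value: 'let' }]
    },
    { type: 'text', value: ' a\nb' }
  ]
}
addTextPositions(exampleTree)
```

After the call, the `let` text node spans 1:1 to 1:4 and the trailing text node spans 1:4 to 2:2, while the `span` element still has no `position` field.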

@smith-kyle

smith-kyle commented Dec 12, 2023

This would be helpful for me too! I'd be able to apply diff highlights to the tree if I knew the source positions.

@wooorm
Owner

wooorm commented Dec 12, 2023

Can you explain?

@smith-kyle

smith-kyle commented Dec 13, 2023

Sure thing, not sure how much context you need so I'll lean towards oversharing:

I'm building a notebook diff review tool and use lowlight to do syntax highlighting in the cells. Prior to syntax highlighting, the diff is calculated using the raw source code. The diffs are a list of range mappings indicating the start line, end line, start column, and end column in the raw source code. To find out where these diff highlights belong in the HAST, I need to know the source position of each node.

Specifically for markdown rendering, I use this chain:

  const processor = unified()
    .use(remarkParse, { position: true })
    .use(remarkRehype, { allowDangerousHtml: true })
    .use(raw, { position: true })

I'd like to add syntax highlighting for the code cells within the markdown using lowlight, but when I do so, the resulting nodes have no position information. Without position information I don't know what source code the nodes map to so I can't apply the diff highlights.

I have a workaround where I split the code HAST into lines and then count the characters, but it'd be a big help if position was an option.
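
To make the use case concrete, here is a rough sketch of how a diff range could be matched against text nodes once they carry positions. Everything in it is an assumption for illustration: the helper `textNodesInRange`, the local types, and the range convention (1-indexed lines and columns, end exclusive) are hypothetical and are not taken from lowlight or from the tool described above.

```typescript
// Local types so the sketch is self-contained; positions use line and column only.
interface SourcePoint { line: number; column: number }
interface SourceSpan { start: SourcePoint; end: SourcePoint }
interface PositionedText { type: 'text'; value: string; position?: SourceSpan }
interface SpanElement {
  type: 'element'
  tagName: string
  children: Array<SpanElement | PositionedText>
}
interface HighlightRoot { type: 'root'; children: Array<SpanElement | PositionedText> }

// Negative if a comes before b, zero if they are equal, positive if a comes after b.
function comparePoints(a: SourcePoint, b: SourcePoint): number {
  return a.line - b.line || a.column - b.column
}

// Hypothetical helper: collect the text nodes whose position overlaps the range.
// Nodes without a position (generated nodes) are skipped.
function textNodesInRange(tree: HighlightRoot, range: SourceSpan): PositionedText[] {
  const found: PositionedText[] = []

  const visit = (node: HighlightRoot | SpanElement | PositionedText): void => {
    if (node.type !== 'text') {
      node.children.forEach((child) => visit(child))
      return
    }
    const position = node.position
    if (!position) return
    // Overlap test: the node starts before the range ends and ends after the range starts.
    const startsBeforeEnd = comparePoints(position.start, range.end) < 0
    const endsAfterStart = comparePoints(position.end, range.start) > 0
    if (startsBeforeEnd && endsAfterStart) found.push(node)
  }

  visit(tree)
  return found
}

// Hand-written tree for the source "let a\nb", with positions already on the text nodes.
const positionedTree: HighlightRoot = {
  type: 'root',
  children: [
    {
      type: 'element',
      tagName: 'span',
      children: [
        {
          type: 'text',
          value: 'let',
          position: { start: { line: 1, column: 1 }, end: { line: 1, column: 4 } }
        }
      ]
    },
    {
      type: 'text',
      value: ' a\n',
      position: { start: { line: 1, column: 4 }, end: { line: 2, column: 1 } }
    },
    {
      type: 'text',
      value: 'b',
      position: { start: { line: 2, column: 1 }, end: { line: 2, column: 2 } }
    }
  ]
}

// A diff range covering only the "a" on line 1.
const touched = textNodesInRange(positionedTree, {
  start: { line: 1, column: 5 },
  end: { line: 1, column: 6 }
})
```

Here `touched` holds only the ` a\n` text node. A real tool would still need to split that node at the range boundaries to highlight exactly the changed characters.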

@wooorm
Owner

wooorm commented Dec 14, 2023

The notebook diff tool, with just lowlight, that sounds like an okay reason to have this.

But the second part, using lowlight inside unified for markdown -> html and transforming the result: in that case the positional info should not be there.
Positions refer to places in the source file, but the <span>s being generated are not in the source file. Across unified, when tools generate nodes, those nodes don’t have positional info.
From earlier: “In unist terms, the elements are generated”.
