-
-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Standard YAML frontmatter syntax causes lists to not be detected #5
Comments
So first, for the code sandbox: Frontmatter starts at the first line. You can test this on GH: adding blank lines before the YAML doesn’t work. Start with: const sourceMarkdown = `--- …and it works. |
What doesn’t work? I believe it works. That’s the default. That’s what 1.6m people per month use the plugin for. And it’s well tested |
Here’s a reduced test case: import {fromMarkdown} from 'mdast-util-from-markdown'
import {frontmatter} from 'micromark-extension-frontmatter'
import {frontmatterFromMarkdown} from 'mdast-util-frontmatter'
const sourceMarkdown = `---
title: a
* b
`;
const tree = fromMarkdown(sourceMarkdown, {
extensions: [frontmatter()],
mdastExtensions: [frontmatterFromMarkdown()]
})
console.log(tree) // Contains a paragraph with `* b` as text, not a list. The reason is that this looks a lot like frontmatter. So it continues on checking all those lines for the ending. When doing that, it prevents the lists from being checked. Afterwards, it can’t do those lists again. |
Hi! This was marked as ready to be worked on! Note that while this is ready to be worked on, nothing is said about priority: it may take a while for this to be solved. Is this something you can and want to work on? Team: please use the |
Thank you for your swift reply! You are right that I accidentally added a newline there. However, when I remove that, the list is still not properly being detected. Here's a screenshot from the updated sandbox:
I am sorry if I came across as ungrateful or anything — I am a big fan of your work and appreciate everything you do. I did some extensive checking this morning to find the bug, and ended up finding this oddity. I am, however, not sure what kind of interaction effect is (I believe that each plugin itself works fine, absolutely, but since we have this weird bug, I suspect some interaction – that's also why I'm not sure if this emanates from here, or from the Regarding your reduced case, what is the intention in leaving out the ending fence of the frontmatter here? In other words, why did you not write the following? ---
title: a
---
* b One additional thing I noted, but which I believe can be fixed when this bug is sorted out, is that if the list is not being properly detected, the parser "eats" the indentation spaces in front of the second-order list item. You may note that there is this I greatly appreciate any guidance in this regard. Thank you! |
No need, no worries, you didn’t come across as such. See my later response for a reduced case. You are correct, there is a bug.
Because that’s exactly what’s going on here: if an opening but no closing fence is found, lists don’t work. You had For the second bug, whitespace being eaten, I think it’s unrelated to this, probably. It’s because whitespace in paragraphs is removed. E.g.:
That positional info might be lost somewhere though. As the text node is
Maybe some positional info is lost somewhere |
Ahhhh, I see what you mean, yes! Thank you for the clarification! |
Hey @wooorm, just as an update: I have now figured out that the fault in the options also prevents the Gfm parser ( |
Yep, block quotes, lists, and footnote definitions! All “containers” that fail on this :( |
Hey, thanks for your quick reply! I can absolutely understand if you can't spend time on it right now. I would be happy to prepare a PR, since I think I found the cause, but it would be ideal if you could point me to a few resources so that I can make the fix work properly and stick to the unified conventions. Here's what I found: Because both frontmatters (The Pandoc style – {
flow: {
'45': [
tokenizer, // Parses --- / ...
tokenizer // Parses --- / ---
]
}
} This makes the engine go through the document twice, once for the There are two possible solutions: If the engine allows one to abort parsing anything if the given frontmatter type is not detected, then we could add a break condition so that the parser that doesn't fit does not attempt to detect anything. The other solution would be to merge the openings and closings, i.e., allow for an array of strings in the open/close options that configure the plugin. I've seen that the What are your thoughts? I can get a PR ready for review today so that it's fixed! |
See my small reproduction above: #5 (comment). This issue exists for normal frontmatter. A single format. I think it’s going to get super complex: the engine does not really support aborting. Because normal markdown doesn’t have that: once there’s a line that opens fenced code, anything after it is fenced code, for example. |
Ouh … alright. This explains why I wasn't getting anywhere with my attempts at fixing in this regard … :/ |
Initial checklist
Affected packages and versions
micromark-extension-frontmatter v1.0.0
Link to runnable example
https://codesandbox.io/s/hopeful-matan-nybskz?file=%2Fsrc%2Findex.js
Steps to reproduce
I have a workload where I need to transform Markdown documents that can also include frontmatters to a syntax tree, using
parse()
on the remark processor.It breaks when that said YAML frontmatter uses Jekyll-style delimiters (three dashes both on top and on the bottom of the frontmatter). It works, however, when I use Pandoc-style delimiters (three dashes on top, three dots on the bottom). Usage of the plugin then disables the processor's ability to detect lists properly.
Here are the steps to reproduce. Above I have linked a sandbox that is already properly set up. In there, simply exchange the end-fence for the frontmatter between dashes and dots and observe the difference in produced syntax trees.
Processor setup
Note that while I am being verbose in explicitly stating the delimiters, exchanging the second definition to simply
yaml
does not change the effect.Test document
Use this document and run it through said pipeline. You will notice that, when you use three dashes to end the frontmatter, the list is not correctly detected, whereas, when you exchange that with three dots (Pandoc-style frontmatter), it will be correctly detected.
Expected behavior
Correct behavior with Pandoc-style frontmatter
If you exchange the three dashes with three dots in the test-document, it works as expected.
This is the correct Syntax Tree:
Actual behavior
Incorrect behavior with Jekyll-style frontmatter
If you just run the above test document (with three dashes), it produces a wrong syntax tree. Observe how it does not detect the list properly.
Runtime
Node v16, Other (please specify in steps to reproduce)
Package manager
yarn v1
OS
macOS
Build and bundle tools
Webpack
The text was updated successfully, but these errors were encountered: