New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Preserves em and strong markers in the AST #373
Conversation
break; | ||
} | ||
case 'em': { | ||
yield '_'; | ||
yield n.attributes.marker ?? '*'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we changing _
to *
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any harm in leaving it as _
(as a default convention)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, so the issue is that _
is treated like regular text and not an emphasis delimiter when it's used inside of a text node, e.g. foo_bar_baz
is rendered literally whereas foo*bar*baz
treats the *bar*
part as emphasized. Because the underscore behaves differently in this specific case, it's not safe to convert foo*bar*baz
to foo_bar_baz
—it changes the meaning of the content. So using the asterisk is a safer universal default than the underscore. It's a little unfortunate, I'd prefer the underscore otherwise.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahhh, makes sense.
}; | ||
|
||
export const em: Schema = { | ||
render: 'em', | ||
children: ['strong', 's', 'link', 'code', 'text', 'tag'], | ||
attributes: { | ||
marker: { type: String, render: false }, | ||
}, | ||
}; | ||
|
||
export const s: Schema = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we add it for Strikethrough (s
) too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like strikethrough always uses the exact same marker syntax, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, right. Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PTAL @rpaul-stripe
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we can remove some of the character escape code I wrote now that we have captured the marker. Can you take a look at that briefly @rpaul-stripe (not a blocker for merging)?
This PR addresses issue #369. It makes the parser capture the actual symbols used for
em
andstrong
nodes and preserve that information in the AST so that it can be used in the formatter to reproduce the content accurately.em
andstrong
nodes and apply it to a node attribute calledmarker
marker
attributemarker
attribute to thestrong
andem
node schemasem
andstrong
nodes using the original emphasis marker from the source content and falls back on*
or**
when not availableCloses #369