Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't treat numbers as lists. #334

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

gravypod
Copy link

Sometimes a document will contain text that looks like this:

Software version:

20230101.

Currently cmark-gfm will output the following:

<ol start=20230101>
<li></li>
</ol>

This isn't what the author had intended. The author intended for the markdown to render as:

<p>20230101.</p>

If the blank item would be added to an existing list it is preserved. For example:

1. Hello
2.
3. World

Renders as:

<ol>
<li>Hello</li>
<li></li>
<li>World</li>
</ol>

Sometimes a document will contain text that looks like this:

```
Software version:

20230101.
```

Currently cmark-gfm will output the following:

```
<ol start=20230101>
<li></li>
</ol>
```

This isn't what the author had intended. The author intended for the
markdown to render as:

```
<p>20230101.</p>
```

If the blank item would be added to an existing list it is preserved.
For example:

```
1. Hello
2.
3. World
```

Renders as:

```
<ol>
<li>Hello</li>
<li></li>
<li>World</li>
</ol>
```
@wooorm
Copy link

wooorm commented May 31, 2023

I don’t work for GH but my guess is that this likely won’t be accepted: a) this project isn’t maintained except for security vulnerabilities, b) this is a fork of CommonMark, which adds some extra features, but it doesn’t otherwise break with the CM spec, which your PR does.

This isn't what the author had intended. The author intended for the markdown to render as:

This is how markdown works. It’s documented in the spec. If the author doesn’t want this, use an escape:

a
123\. b

a
123. b

Alternatively, “20230101” looks a lot like a date, perhaps you can use 2023-01-01 instead.

@gravypod
Copy link
Author

I don’t work for GH but my guess is that this likely won’t be accepted: a) this project isn’t maintained except for security vulnerabilities,

That's fine, we have our own internal fork and we just like to upstream things which are of general interest to the community.

b) this is a fork of CommonMark, which adds some extra features, but it doesn’t otherwise break with the CM spec, which your PR does.

Does this break the cmark spec? My reading of https://spec.commonmark.org/0.30/#ordered-list doesn't lead me to believe there is anything in the spec which discusses bare numbers being treated as a list.

This repo has a set of tests which validate all of the examples in the spec render as expected and those tests are passing at this change.

If there is some part of the spec which covers these cases and says they should be lists please let me know:

*
-
1.
1)

Alternatively, “20230101” looks a lot like a date, perhaps you can use 2023-01-01 instead.

My kingdom for a unified release versioning scheme.

@wooorm
Copy link

wooorm commented May 31, 2023

Does this break the cmark spec?

Somewhat pedantically, every change would break CommonMark: the grammar of markdown is, just like HTML, expressed as .* in BNF: every character is valid and produces something. Changing the meaning of things, is SemVer breaking.

Whether it’s practically breaking? Yes: as someone who makes popular markdown parsers, I have seen these cases expressed in documents by users, and I have written code to handle this and tests to confirm it.

If there is some part of the spec which covers these cases and says they should be lists please let me know:

*
-
1.
1)

Yes, all those cases are lists. See earlier, 5.2 List items for more info. Particularly see point 3 “Item starting with a blank line” of lists (after https://spec.commonmark.org/0.30/#example-277), especially example 284 https://spec.commonmark.org/0.30/#example-284.

@gravypod
Copy link
Author

@wooorm this code supports 277 + the requirement that you can start lists with a blank line. See the code adding support for this.

I can see https://spec.commonmark.org/0.30/#example-284 applying more here but this code only applies to numbers. I see this section talks both about "lists" preferring to unordered and ordered list.

I can definitely see how we could be running afoul of that here.

I'm going to leave the PR open just in case others are interested in this modification. Users seem to like this on our internal corpus.

@kevinbackhouse
Copy link
Collaborator

@gravypod: Have you tried submitting this change to cmark? It's in a part of the code that's mostly the same in both codebases, so I think it would probably work. cmark-gfm is a fork of cmark that adds the GFM extensions so if they accept the change then I think it would make sense for us to accept it too.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants