Better link syntax for cross-references #548

Open
asmeurer opened this issue Apr 12, 2022 · 10 comments
Labels
enhancement New feature or request

Comments

@asmeurer
Contributor

Describe the problem/need and solution

I've been using MyST for a bit and it's quite nice being able to use Markdown instead of RST. However, a major pain point is the cross-referencing syntax. The {ref}`target` and {ref}`name <target>` forms feel like I am just using a slightly modified version of RST. They aren't very Markdownic, if that is a word. For me, they fail the basic smell tests of good Markdown syntax:

  • Easy to remember: The link syntax is basically the same as the RST syntax, except with brackets instead of colons. The RST syntax is notoriously hard to remember.
  • Composable: It's impossible to format a link as code (or if it is possible, only with some trick that I haven't yet figured out), because it reuses backticks.
  • Joy to use: Every time I have to make a cross-reference I feel the same slight pain I feel whenever I use RST. It's even worse if I haven't used it in a while and have to go look up the syntax (and I think I've mentioned this a dozen times before but I'll mention it again: please use the word "link" in the docs to refer to link syntax. No one knows what a "role" is).

Basically, it feels like RST syntax that has been shoved into Markdown rather than the way Markdown would actually implement such a thing.

MyST does let you write [name](target) and even [](target), and these are both great. But this only works when the target can be accessed with any. This unfortunately virtually never works for my use-case (cross-referencing functions with autodoc in the SymPy documentation).

I would propose extending the usual Markdown link syntax so that you can add the target type before the target name somehow. My suggestion would be to use a colon, like [name](func:target), but if that can't work for whatever reason there could be other options. Another suggestion, which is less syntactically nice but would at least make sense logically, would be to allow {ref}`target` inside of a Markdown link, like [name]({ref}`target`) (IMO this should be done regardless of whether any other new syntax is added).
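As a rough illustration of the [name](func:target) idea, here is a minimal sketch of the naive approach: rewriting the link destination into the existing role syntax. The function name `link_to_role`, the regex, and the set of excluded URL schemes are all assumptions made for this example; none of them are part of MyST.

```python
import re
from typing import Optional

# Hypothetical pattern: "role:target", where the target does not start with "/"
# (so "https://..." is not mistaken for a role called "https").
_ROLE_DEST = re.compile(r"^(?P<role>[A-Za-z][\w-]*):(?P<target>[^/].*)$")

# Assumed list of schemes to leave alone, since colons also appear in URLs.
_URL_SCHEMES = {"http", "https", "ftp", "mailto"}


def link_to_role(text: str, dest: str) -> Optional[str]:
    """Rewrite a proposed [text](role:target) link as a MyST role.

    Returns None when the destination looks like an ordinary URL or path.
    """
    m = _ROLE_DEST.match(dest)
    if m is None or m["role"].lower() in _URL_SCHEMES:
        return None
    role, target = m["role"], m["target"]
    if text:
        return f"{{{role}}}`{text} <{target}>`"
    return f"{{{role}}}`{target}`"


print(link_to_role("name", "func:target"))
print(link_to_role("", "func:target"))
print(link_to_role("site", "https://example.com"))
```

The scheme check is where the colon ambiguity shows up: any role name that collides with a URL scheme would need special handling.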

I would also suggest implementing this in a way so that the ~. style works so that something like [name](func:~.target) works (see #468), i.e., get the target for :func:`target` and then convert that into a link rather than just rewriting it to :func:`name <target>`.

Guide for implementation

No response

Tasks and updates

No response

@asmeurer
Contributor Author

> Ugh, look I'm not ruling anything out, but... I really think you need to try coding some of this, to understand how feasible some of it is.

Conversely, I would suggest for you to write some cross-reference heavy text to understand how difficult the current syntax is to use.

> and this is because it exactly is that; it's taking the role name and content, and just passing it on to docutils/sphinx to handle

The syntax itself feels just copied from RST. Obviously the semantics need to be there, but it could have been anything that had those three parts. Why, for instance, does MyST use backticks for this? RST uses backticks because in RST backticks are what are used to denote cross-references. In Markdown, backticks mean code. And the <> part for the target is just copied straight from the RST. It's completely different from the usual Markdown way of making a link.

Reusing the triple backticks at least makes some sense because that's the only kind of "block" syntax in Markdown (although even that is somewhat annoying because my editor thinks every directive is a code block).

> I disagree that {rolename}`content` is a difficult syntax to remember; I feel you are conflating other aspects of RST syntax.

How can you "disagree" that something is hard to remember? I'm telling you that I've had a hard time remembering it (and every time I've looked it up I've had a hard time even finding it because the docs use terminology that I don't expect). You not believing me is not very productive.

It seems like every time I interact with you I have to deal with this same sort of thing, and frankly, it's getting tiring.

> Colons are a core aspect of standard URL syntax, so this would not be a good idea.

The colons were just a suggestion to try to prompt discussion. I guessed that they probably wouldn't work for some reason. The point was to give an idea of the sort of thing that might be simpler.

> In fact I don't feel that we should be trying to implement any kind of bespoke syntax/regexes inside (); that's not very "Markdownic"

Isn't allowing cross-reference inside of the parentheses already sort of a special case?

Anyway, I can tell you that the first thing I ever tried when I wanted to make a custom link was the [name]({ref}`target`) syntax I suggested in the other issue. If you want to aim for "principle of least surprise" that's a good place to start.

> What is Markdownic are attributes, as stipulated by the creator of Markdown: johnmacfarlane.net/beyond-markdown.html#attributes (who I was talking with recently commonmark/commonmark-spec#702 (comment))

I haven't seen these before. Wouldn't a header attribute like the one described in the first link be preferable to the (target)= syntax currently used by MyST? (That's obviously unrelated to this discussion, but if that syntax could be replaced with something that Markdown parsers actually understood, that would be awesome.)

> This is a good example; the ~. style is purely a Python domain specific thing: sphinx-doc/sphinx@b4276ed/sphinx/domains/python.py#L75, it does not work generically for all references

You're misunderstanding the point of this. I don't think MyST should know about ~.. All I'm suggesting is to make the link separately from the parsing of the reference. Basically parse in two steps. I don't know if that's feasible.

Alternately Sphinx or docutils itself could be fixed in this regard. I don't really care how it gets fixed, but this is an annoying pain point.
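To make the "parse in two steps" idea concrete, here is a minimal sketch. Step one derives a display title and a lookup target from something like `~.solve`; step two resolves the target and only then applies a user-supplied name. It loosely mirrors how the Python domain treats the `~` and `.` prefixes, but everything here (`ParsedXref`, `parse_py_target`, `resolve`, and the dictionary standing in for an object inventory) is hypothetical and not Sphinx or MyST API.

```python
from typing import Dict, NamedTuple, Optional, Tuple


class ParsedXref(NamedTuple):
    title: str          # text shown when no explicit name is given
    target: str         # name handed to the resolver
    suffix_match: bool  # True when the raw target began with "."


def parse_py_target(raw: str) -> ParsedXref:
    """Step 1: split a raw target such as "~.solve" into title and target."""
    title = raw.lstrip(".")
    target = raw.lstrip("~")
    if title.startswith("~"):
        # "~" asks for only the last dotted component as the title.
        title = title[1:].lstrip(".").rsplit(".", 1)[-1]
    suffix_match = target.startswith(".")
    if suffix_match:
        target = target[1:]
    return ParsedXref(title, target, suffix_match)


def resolve(parsed: ParsedXref, inventory: Dict[str, str],
            name: str = "") -> Optional[Tuple[str, str]]:
    """Step 2: look the target up, then attach the link text."""
    if parsed.suffix_match:
        matches = [key for key in inventory
                   if key == parsed.target or key.endswith("." + parsed.target)]
    else:
        matches = [key for key in inventory if key == parsed.target]
    if len(matches) != 1:
        return None  # missing or ambiguous
    return (name or parsed.title, inventory[matches[0]])


# Assumed, made-up inventory entry for illustration.
inventory = {"sympy.solvers.solve": "modules/solvers.html#sympy.solvers.solve"}
print(resolve(parse_py_target("~.solve"), inventory, name="the solver"))
```

Because the name is applied only after resolution, the `~.` handling never has to be mixed with the explicit-title syntax.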

@choldgraf
Member

Hey all - thanks both for sharing your thoughts and perspectives here. I agree with both of you about the pain-points (both on the syntax side, and on the implementation side). I think it's important that we hear each other out about our perspectives and focus on constructive conversation and debate.

In my opinion, it is useful to have design-level conversations (e.g. what would be the best user experience?) separately from implementation-level conversations (e.g., what is realistic given our limited development and maintenance resources). I think we should then consider both of them in coming up with a proposed path forward, but the pros/cons of one do not invalidate the other, they should simply be considered together. So, thanks @asmeurer for being open about your pain points here from a user's perspective, and to @chrisjsewell for providing an implementation-level dose of realism here :-) .

Context on why this is hard to implement in Sphinx

Quick context from what @chrisjsewell was saying. I believe the challenge with the <> syntax is that Sphinx itself (and extensions in Sphinx) hard-code that syntax as a part of their content parsers. E.g. the Sphinx CrossReference class uses a regex to search for it here:

https://github.com/sphinx-doc/sphinx/blob/31eba1a76dd485dc633cae48227b46879eda5df4/sphinx/util/docutils.py#L462-L466

So in Sphinx it's not really a part of the "parsing stage", the <> is just "part of the content block for a role" and is a convention that many extensions use because reStructuredText sort-of implicitly defines this syntax as a part of external hyperlinks.
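For readers unfamiliar with this convention, the following is a simplified sketch of how a role's content string can be split into a title and a target after parsing. The pattern is an approximation written for this example, not the exact regex Sphinx uses, and `split_role_content` is a hypothetical name.

```python
import re

# Simplified, assumed pattern: "title <target>" with the target at the very end.
_EXPLICIT_TITLE = re.compile(r"^(.+?)\s*<([^<>]*)>$", re.DOTALL)


def split_role_content(content: str):
    """Return (has_explicit_title, title, target) for a role's raw content."""
    m = _EXPLICIT_TITLE.match(content)
    if m:
        return True, m.group(1), m.group(2)
    # No "<...>" suffix: the whole content is both title and target.
    return False, content, content


print(split_role_content("name <target>"))
print(split_role_content("target"))
```

The point is that the split happens on the role's content string, inside the role implementation, so a Markdown parser never sees `<target>` as structure.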

MyST spec repo for discussion?

This also feels relevant to a few other conversations we've had over the months about how to extend the "role syntax" to include things like options:

And I think more generally, something like this would be a good topic for discussion / conversation in the myst-spec repository, where we are trying to define the specification more formally: https://github.com/executablebooks/myst-spec

@asmeurer
Contributor Author

Just to be clear here, do you want me to open an issue for this on the myst-spec repo? You can also move this issue to that repo if you feel it would be better to live there.

@fperez

fperez commented Aug 16, 2022

Leaving to @choldgraf the decision whether to move this issue over to the spec repo (probably a good idea IMO), I'll comment here... Thx @asmeurer for this writeup! I have just begun to use these aspects more, and your perspective here is very valuable.

I think it's worth really looking at the user experience aside from the sphinx/docutils-imposed constraints: ultimately that is an internal implementation layer that could in the long run change. The reason so many of us moved from ReST to md was precisely user experience, and that was the entire reason why we had the original impetus for MyST way back when. We should continue looking for that fluid, joyful experience while writing and sharing content.

@john-hen
Contributor

john-hen commented Aug 16, 2022

For what it's worth, I've always felt that the most intuitive Markdown link syntax extension for MyST would work as follows:

  • Inline links work just the same as in standard Markdown. So anything of the sort [title](target) (i.e., with parentheses) works as it would otherwise. The target would usually be an explicit link, like https://example.com, but may also be a relative link inside the project. There could be some magic there, along the lines of what GitHub does, like mapping the target index.md to index.html, but nothing that surprises users the way the implicit download role tends to do (in my opinion).
  • Reference links, on the other hand, should do the actual magic. These use brackets for the target: [title][target], where target is looked up elsewhere.

The look-up for reference targets would essentially do what Python does when it looks up the name of an identifier:

  • Check the local scope. Here, the current Markdown file. If [target]: is defined elsewhere in the same document, use that. I.e., standard Markdown behavior.
  • If not that, go one level up: See what Sphinx would find for target with the any role.
  • If not that, go to the global scope: Look up target with Intersphinx.

The any look-up might be ambiguous, so maybe allow specifying Sphinx roles such as func: or download: (or :func:, :download:) as a prefix for target.
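The three-level look-up described above could be sketched with Python's own `collections.ChainMap`, which searches a sequence of mappings in order. The dictionaries and the `build_lookup` helper are placeholders invented for this example; a real implementation would query Sphinx's environment and the Intersphinx inventories instead.

```python
from collections import ChainMap


def build_lookup(local_defs, project_objects, intersphinx_objects):
    """Chain the scopes so that earlier ones shadow later ones."""
    return ChainMap(local_defs, project_objects, intersphinx_objects)


# Hypothetical data for illustration only.
local_defs = {"intro": "#intro"}                     # [target]: definitions in this file
project_objects = {"intro": "guide.html#intro",      # what the any role would find
                   "solve": "api.html#solve"}
intersphinx_objects = {"numpy.array": "https://example.org/numpy.array"}

lookup = build_lookup(local_defs, project_objects, intersphinx_objects)

print(lookup["intro"])        # found in the local scope first
print(lookup["solve"])        # falls through to the project scope
print(lookup.get("missing"))  # not defined anywhere
```

A role prefix such as func: would narrow the middle scope before the look-up; that part is left out of the sketch.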

I know this would break a lot of stuff that's already working somewhat differently in MyST. But I feel(!?) this should be easy on the implementation side, once refactored, and line up better with what users would expect a "multi-document Markdown renderer" to do.

@rowanc1
Member

rowanc1 commented Aug 17, 2022

Just as a note, I have opened up an issue here which is aiming to track various places where there are suggestions/problems/improvements around cross references (including this issue). Another potential avenue that has been discussed is to adopt pandoc citation/reference syntax (starting with an @).

@chrisjsewell
Member

chrisjsewell commented Aug 31, 2022

> These use brackets for the target: [title][target], where target is looked up elsewhere.

Just to note, @john-hen, this is not particularly easy, because any standard CommonMark parser will only recognise these as links if it can match them to a target, i.e. this would break CommonMark compliance.
(I've already thought about this before, and asked on the commonmark spec 😅 commonmark/commonmark-spec#702)

@john-hen
Contributor

john-hen commented Sep 2, 2022

@chrisjsewell Yeah, okay. But it seems that, between you and John MacFarlane, you both agree that that particular example from the CommonMark spec isn't all that useful. And I would add that it's certainly esoteric. I don't think a lot of people are writing things like [foo][bar][baz] in their Markdown documents and have high expectations as to how that would be parsed.

@chrisjsewell
Member

@john-hen I'm afraid that if you don't agree with the CommonMark specification, then you need to petition for that to change; it is not the business of MyST to go against that. MyST uses a CommonMark-compliant parser, so it would not be trivial to implement something that went against that anyhow.


6 participants