Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Internal cross-references MEP #10

Merged
merged 18 commits into from
Mar 17, 2023
Merged

Internal cross-references MEP #10

merged 18 commits into from
Mar 17, 2023

Conversation

rowanc1
Copy link
Member

@rowanc1 rowanc1 commented Jan 10, 2023

📖 Read the MEP Here

📆 MEP will close Monday March 13th, 2023, PT 📆

The current draft with most comments/todos resolved, the Companion issue for the MEP can host overall discussions.

Outstanding issues / questions:

  • agree on using three "schemes"/protocols (project/path/myst). There is some redundancy here and we can possibly just simplify to myst, but we should make sure we don't loose anything. Alternatively we could go in the other direction, adding download instead of path? (removed)
  • Some implications of the "scheme" choice are on downloads and how to download a markdown file easily, for example.
  • agree on syntax for interpolating text/title into a link template (currently suggested as %t, which is similar to Fig. %s). In sphinx this is {name}, which can also be supported, but maybe not mentioned in docs, etc. See @choldgraf response here - choosing %t.
  • update python?py:class --> python:py:class (removed intersphinx)

Doc updates / TODOs

  • Review "search order specificity" and "implicit" targets in a document
  • Add some information on the crossReference spec AST changes.
  • Add examples (I think these can be pretty light, as they are mostly laid out in the doc)

Current Status: DRAFT. Changing to ACTIVE needs to address above points, do a final pass, update the name of the doc and the ID.

We will aim to change this to active for review by wider stakeholders and review (aiming to take @mmcky through in part on Monday) in the next few days and start to get their review.

@welcome
Copy link

welcome bot commented Jan 10, 2023

Thanks for submitting your first pull request! You are awesome! 🤗

If you haven't done so already, check out EBP's Code of Conduct and our Contributing Guide, as this will greatly help the review process.

Welcome to the EBP community! 🎉

chrisjsewell added a commit to executablebooks/MyST-Parser that referenced this pull request Jan 10, 2023
This aligns the treatment of `[](#target)` style links for docutils with sphinx,
such that they are linked to a heading slug.

The core behaviour for sphinx is not changed,
except that failed reference resolution
now emits a `myst.xref_missing` warning (as opposed to a `std.ref` one), with a clearer warning message.
Also on failure, the reference is still created,
for people who wish to suppress the warning (see e.g. #677)

This is another step towards executablebooks/myst-enhancement-proposals#10
@rowanc1
Copy link
Member Author

rowanc1 commented Feb 10, 2023

@chrisjsewell from a few comments you seemed to have strong opinions about the myst url scheme - do you think that is the only thing holding this PR from moving into ACTIVE?

My thoughts on that are in the future we want a scheme to do more than what inv does (e.g. enumeration and dynamic linking of content), this is a core part of what "structured" means to me in MyST, and allows for rich interlinking between projects in a way that goes beyond the inv file format.

That being said, I think there is a lot of great work in the thinking behind links and we are already adopting it in our various parsers.

What I propose:

  1. change myst to inv in this iteration, this hopefully removes any sticking points of this MEP
  2. agree that myst: scheme will be a scheme based on the myst-spec for resolving references, but that can be hashed out in a future MEP

If that sounds good, I can take a pass on cleaning up the final things in this PR and we can get it over to ACTIVE for a wider/final review.

@chrisjsewell
Copy link
Member

chrisjsewell commented Feb 10, 2023 via email

@rowanc1
Copy link
Member Author

rowanc1 commented Feb 10, 2023

Nice. All I am proposing at this stage is that we kick the conversation about the myst: scheme down the road a bit and focus on getting a smaller MEP out the door.

For the rest of the MEP: are there outstanding parts you know/suspect are not backwards compatible? Perhaps we could take a similar approach. That is, make the MEP smaller and focus on shipping it and starting to turn these links on by default in various places!

@choldgraf
Copy link
Member

FWIW, I'll leave it to you two to decide the best path forward, but I agree that smaller and more tightly scoped proposals are easier to explain, discuss, and decide on. We can always make future progress in a new iteration.

chrisjsewell added a commit to executablebooks/MyST-Parser that referenced this pull request Feb 21, 2023
This PR moves myst-parser in the direction of executablebooks/myst-enhancement-proposals#10, in a relatively back compatible manner, and in a way that supports both docutils and sphinx.

It expands the capability of `[](#id)` to more than just linking to heading slugs, with the order of specificity being:

1. If it matches a local (to that document) "explicit" `std:ref` target, then link to that and stop
   - Note, currently only `std:ref` domain/type are supported, e.g. not `math` etc. That's more difficult and can come later
2. If it matches a local (to that document) "implicit" heading slug, then link to that and stop
3. If using docutils (i.e. single-page) build, then stop here and emit a `myst.xref_missing` warning
4. If using sphinx then create a `pending_xref` node, and hand-off to sphinx's "any" resolver, which takes effect once all documents have been read:
    - This first tries to resolve against a local (to the project) reference and, if matching, stops
    - Otherwise also try to match against any intersphinx reference
    - Otherwise emit a ~~`myst.ref`~~ `myst.xref_missing`  warning
 
If the text is explicit, e.g. `[text](#id)`, that text is used, otherwise a determination of implicit text is attempted, e.g. based on the section title or figure caption.
@rowanc1 rowanc1 mentioned this pull request Feb 24, 2023
4 tasks
@rowanc1
Copy link
Member Author

rowanc1 commented Feb 24, 2023

Ok, I have removed the following content, which is all things about intersphinx. I am opting to just kick that over to another MEP that can come in at any time. That is where most of the questions were, and I think the rest is actually already implemented in all parsers (maybe not released yet).

@chrisjsewell can you take a look soon, hopefully nothing left that is outstanding, then we can get this in and turn it to active and get a bit wider review.

This content was removed:

| `` {py:class}`zipfile.ZipFile` ``                 | `[](#zipfile.ZipFile)`                     |
| `` {external+python:py:class}`zipfile.ZipFile` `` | `[](myst:python?py:class#zipfile.ZipFile)` |


- `` {external+python:py:class}`zipfile.ZipFile` `` - intersphinx
- Custom `extlinks` through simple [Sphinx configuration](https://www.sphinx-doc.org/en/master/usage/extensions/extlinks.html)

 Use the query string to filter matches by `domain:object_type`. Use `*` to match any value, e.g. `*:term` will match term in any domain. For example, `[text](?std:label#api/main)` searches only in the `std:label` domain.



Note that the `domain:object_type` syntax is specific to Sphinx, and may change or depend on the `key` that is being referenced. For example, targeting a [JATS document](https://jats.nlm.nih.gov/archiving/tag-library/1.3/attribute/ref-type.html), the reference kind can be `fig`, `sec`, `chem` etc. and doesn't have a concept of a domain. The `myst:` query syntax is designed to be flexible and extensible to these use cases.

For example, `[my custom text](myst:python#zipapp-specifying-the-interpreter)` becomes:

```yaml
- type: link
  url: https://docs.python.org/3.7/library/zipapp.html#zipapp-specifying-the-interpreter
  urlSource: myst:python#zipapp-specifying-the-interpreter
  children:
    - type: text
      value: my custom text
  data:
    title: Specifying the Interpreter
    inv: https://docs.python.org/3.7/objects.inv
    version: '3.7'
    enumerator: '2.3'
    lang: en
  internal: false
  protocol: myst
```

**InterSphinx Protocol**
: There was [discussion](https://github.com/executablebooks/MyST-Parser/pull/613#discussion_r968548359) on what protocol to use for cross-project links (we settled on `myst:`). Other possibilities included `inv:`, we opted for `myst` to reinforce the branding of the project and ensure the concept could be extended to other, richer, content links in the future that were not ".inv" files specifically.

@rowanc1 rowanc1 changed the title Cross-references MEP Internal cross-references MEP Feb 24, 2023
@rowanc1 rowanc1 requested a review from fwkoch February 24, 2023 18:06
Copy link
Member

@chrisjsewell chrisjsewell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, See comments 😄

meps/mep-cross-references.md Outdated Show resolved Hide resolved
meps/mep-cross-references.md Outdated Show resolved Hide resolved
meps/mep-cross-references.md Outdated Show resolved Hide resolved
meps/mep-cross-references.md Outdated Show resolved Hide resolved
meps/mep-cross-references.md Outdated Show resolved Hide resolved
meps/mep-cross-references.md Outdated Show resolved Hide resolved
meps/mep-cross-references.md Outdated Show resolved Hide resolved
meps/mep-cross-references.md Outdated Show resolved Hide resolved
meps/mep-cross-references.md Outdated Show resolved Hide resolved
meps/mep-cross-references.md Outdated Show resolved Hide resolved
fwkoch
fwkoch previously approved these changes Feb 24, 2023
meps/mep-cross-references.md Outdated Show resolved Hide resolved
meps/mep-cross-references.md Outdated Show resolved Hide resolved
meps/mep-cross-references.md Outdated Show resolved Hide resolved
meps/mep-cross-references.md Outdated Show resolved Hide resolved
meps/mep-cross-references.md Outdated Show resolved Hide resolved
@rowanc1
Copy link
Member Author

rowanc1 commented Feb 25, 2023

Ok, incorporated all of the reviews except the %s or %e thing for the number.

I agree that it is legacy python -- but that isn't a bad thing, it was in use for ~25 years and it already is existing in MyST. I think leaning on that rather than introducing another thing to remember is better. Anything we come up with is going to be pretty arbitrary.

I am opting to not change what is already there in MyST/RST for the roles, and leave the %s as it is. We have to support these roles anyways, so it seems strange to deviate.

@chrisjsewell
Copy link
Member

chrisjsewell commented Feb 25, 2023

Ok, incorporated all of the reviews except the %s or %e thing for the number.
I think leaning on that rather than introducing another thing to remember is better.

But what does s signify?
In the MEP you mention "string", but then what is the title if its not a string?
I think %s vs %t is confusing; one you are referring to by the "type" of the variable, the other you are referring to by the "name" of the variable

@rowanc1
Copy link
Member Author

rowanc1 commented Feb 25, 2023

But what does s signify?

Nothing. This isn't new behaviour it is what Sphinx/RST/MyST already supports for the numref role, and is just documenting it here. The current behaviour for numref is documented here:
https://www.sphinx-doc.org/en/master/usage/restructuredtext/roles.html#role-numref

As an alternative to move this forward, we could say in the enumeration section:

We continue to use and support the enumeration and name interpolation of the {numref} role, which is now also available in the markdown link.

And also paraphrase the existing behaviour.

@chrisjsewell
Copy link
Member

The current behaviour for numref is documented here:

Sphinx specifically marks in the code that this is the "old style" (i.e. legacy) format: https://github.com/sphinx-doc/sphinx/blob/30f347226fa4ccf49b3f7ef860fd962af5bdf4f0/sphinx/domains/std.py#L902

I just don't feel we should base our solution on something that sphinx has already deemed to be subpar, and is essentially only supporting for back-compatibility

I also don't think, for a "new" part of the spec, we should be already be fragmenting into different ways to do the same thing. We should really just pick one and go with that.

But basically; if you change %s to %e, then I would sign off on this


I would also highlight a key difference with Sphinx numref; there the text is just that (plain text), whereas in MyST [text](link) is not actually text, its inline syntax, which makes things a bit more "fiddly".
With these placeholders, I do want to stress, you are introducing a new, non-CommonMark, syntax token. We should be very careful about that.


After talking here, I actually feel using only the {name} and {number} placeholders is perhaps better (or even {title}and {enum});
its a lot more intuitive what it is, without having to read the spec:
[{number}](#link) vs [%s](#link)

this could also be extensible, if we / other implementations wanted to add additional placeholders

The key hesitation I have though, is that braces {} are already used for roles/directives (and maybe now attributes)

@choldgraf
Copy link
Member

choldgraf commented Feb 25, 2023

I think that if we can't get to consensus on this quickly, then we should just remove the "title and number interpolation syntax" aspect of the MEP, and leave it to implementations to define this (and perhaps use numref as an example and note well visit this in the future). We can make this a targeted MEP in the future if it's important to standardize. If y'all think it's crucial to block the MEP on this topic then that's fine too, but IMO it doesn't feel like something worth blocking on if it requires ongoing discussion.

@rowanc1
Copy link
Member Author

rowanc1 commented Feb 27, 2023

I actually feel using only the {name} and {number} placeholders is perhaps better

I am good with that, this is exactly the same as the "new" sphinx syntax, so we aren't introducing anything new here.
I will update the text accordingly.

@chrisjsewell
Copy link
Member

I am good with that, this is exactly the same as the "new" sphinx syntax, so we aren't introducing anything new here.
I will update the text accordingly.

Thanks @rowanc1.
As I mentioned the only thing I'm a little worried about is the "proliferation" of braces lol.

For example, you could theoretically get something like this [{name} {name}`this is actually a role`{.a c="this is an inline attribute"}](#link)

Do you think thats not a bit eurgh?

Also to clarify (and maybe update this in the mep), does {number} go to e.g. Section 2.1.2 or 2.1.2?
and for whatever one it is not, should there be another placeholder, that can give you it?

@rowanc1
Copy link
Member Author

rowanc1 commented Feb 27, 2023

I have clarified that {number} is just the enumerator. I don't think that we need one for interpolating the default "Figure 1", but we could certainly add that down the line if there is demand.

I agree that all the brackets that could be get visually messy, but I don't think that it will be common in practice. Most of the time people will use the default auto-reference (i.e. if coming from \autoref in latex). In my experience when people over-ride that default, I haven't ever come across anything complicated (e.g. "see here" and "in figure {number}" are common, sometimes with some bold text).

@chrisjsewell
Copy link
Member

I agree that all the brackets that could be get visually messy, but I don't think that it will be common in practice.

yep it is not the end of the world, but something I did want to emphasize.

(i.e. if coming from \autoref in latex

I always used http://tug.ctan.org/tex-archive/macros/latex/contrib/cleveref/cleveref.pdf 😉 , shame we can't do reference "concatenation", but let's not get ahead of ourselves

Almost there although I just noticed something....

@stevejpurves
Copy link
Member

A few comments and questions from coming into this fresh, nothing to stop this going ahead as is though:

  • the new syntax is very nice and simplifies things enormously. Seems easier to reason about overall even though various subtleties are being hidden from the user
  • the range of things you can do inside the link is really powerful - including the string interpolation on {number} and {name}. The syntax choice there is natural as well, especially for python users
  • Q: I understand the introduction of two new schemes -- but from the MEP I'm not clear what they are for other than signally internal myst logic should handle those schemes -- I understand project: will conduct a search over the project files in the toc, does this mean that path is only ever used in the case of downloads? in particular it takes a while to understand that use of path: in the "File Download (explicit)" in the table here - is just to explicitly remove the ambiguity between that and the the "Project Document" example. i.e. if the file is in the TOC but we want to have a download link then the path: scheme is included? is that correct?
  • will the current MyST syntax (i.e. [](my-label)) be deprecated in favor of [](#my-label)? is that what issuing the xref_legacy warning achieves/signals?

Overall though this is a great enhancement and a well thought though MEP 👏🏼 - definitely I don't want to be writing or thinking about any of the other cross reference variants - []() is the way!

@stevejpurves stevejpurves self-requested a review March 6, 2023 22:56
| External URL | `<https://example.com>` | `[](https://example.com)` |
| Local file download | `<path:file.txt>` | `[](file.txt)` |
| File download (explicit) | `<path:file.md>` | `[](path:file.md)` |
| Project document | `<project:file.md>` | `[](file.md)` |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

from the table here I found some of the ambiguity confusing. The syntax is fine though, it was just the way it was expressed in the MEP -- but reading on, the Downloads section below does make it clearer what is going on between [](path:file.md) and [](file.md) which is (i) download file that is also in the TOC and (ii) link to a file in the TOC(?)

So I think that is fine.

@chrisjsewell
Copy link
Member

chrisjsewell commented Mar 7, 2023

will the current MyST syntax (i.e. ) be deprecated in favor of ? is that what issuing the xref_legacy warning achieves/signals?

yes 👍

Copy link
Member

@fwkoch fwkoch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes since my last review look good!

Only one small comment in the Specification AST section to make it slightly more clear we are changing the existing myst-spec link (and a pointer to where we have already implemented the changes).

docs/meps/mep-0002.md Outdated Show resolved Hide resolved
Co-authored-by: Franklin Koch <franklin@curvenote.com>
docs/meps/mep-0002.md Outdated Show resolved Hide resolved
Copy link
Member

@gregcaporaso gregcaporaso left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This all sounds great to me. I think the new syntax will be extremely convenient.

Copy link

@fperez fperez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have two minor suggestions - I'd appreciate the 2nd in particular being addressed, but don't consider ti a blocker (the first is trivial).

docs/meps/mep-0002.md Outdated Show resolved Hide resolved
docs/meps/mep-0002.md Outdated Show resolved Hide resolved
@fperez
Copy link

fperez commented Mar 9, 2023

All good from my end now, thanks! 🚀

@rowanc1
Copy link
Member Author

rowanc1 commented Mar 9, 2023

Thank you @fwkoch @fperez @gregcaporaso @stevejpurves @chrisjsewell for the reviews! 🚀 I have brought all of the changes in that you suggested.

@stevejpurves I have updated the first time we introduced project/path schemes with a bit of explanatory text, I don't think that this materially changes anything, but does give the reader of the MEP a bit more of an idea of what is to come. @chrisjsewell @fwkoch can you review the change to make sure it is consistent please (added the middle few lines)!

The links are defined by a scheme, which can be standard protocols (http:, mailto:).
Here we propose two new schemes, project and path:
the project scheme allows for cross-referencing pages, sections, equations, figures, or other components of a MyST project;
the path scheme allows for referencing files outside of the project or explicitly downloading the source of a document in the project.
The schemes are an extensibility point specifically described by CommonMark
and are used to indicate that the link should be resolved by MyST specific logic, and follows standard URI syntax

I am hoping that this high-level clarification also helps in some of the challenges you had @stevejpurves with understanding the downloads section. Please let us know if that is the case!

docs/meps/mep-0002.md Outdated Show resolved Hide resolved
Co-authored-by: Franklin Koch <franklin@curvenote.com>
@chrisjsewell
Copy link
Member

can you review the change to make sure it is consistent please

yep all good cheers 👍

@rowanc1
Copy link
Member Author

rowanc1 commented Mar 9, 2023

Awesome, thanks @chrisjsewell @fwkoch and @jstac as well!

I think that we are meeting all criteria from the MEP process for this being accepted! 🎉

📆 MEP will close Monday March 13th, PT 📆

As this is our first MEP running through the process, I propose we leave this open until Monday March 13th, PT (i.e. an additional five working days) and then if no other @executablebooks/core-team members ask for changes in that time, this is ✅✅✅✅✅✅ ACCEPTED.

For reference:

A MEP may be accepted when all of the following conditions are met:

  • More than five (5) weekdays have passed since the proposal was marked as Active.
  • At least two PR Approvals from core team members.
  • No Request Changes from a core team member.

Thanks again to every one participating as we bootstrap this process. 🚀

Approvers included:
- @jstac
- @gregcaporaso
- @fwkoch
- @stevejpurves
- @chrisjsewell
- @rowanc1

Early and other reviews by:
- @mmcky
- @choldgraf
- @agoose77
- @fperez
@rowanc1
Copy link
Member Author

rowanc1 commented Mar 17, 2023

Hi all - a bit delayed on hitting accept on this MEP, but that did give another 4 days beyond the deadline to have any other feedback. There hasn't been any, and we have the required conditions to merge the MEP!

I am going to squash merge this in now, and will coordinate a plan to update myst-spec over the next few weeks to reflect this MEP!

Thank you again for everyone participating in this process and helping bootstrap these MEPs into existence, this is the first of many enhancements that will improve MyST over the coming years. 🚀

@rowanc1 rowanc1 merged commit 107b38d into main Mar 17, 2023
@rowanc1 rowanc1 deleted the cross-references branch March 17, 2023 19:35
@welcome
Copy link

welcome bot commented Mar 17, 2023

Congrats on your first merged pull request in this project! 🎉
congrats

Thank you for contributing, we are very proud of you! ❤️

@choldgraf
Copy link
Member

Wooo! Thanks Rowan for all of your work spearheading this one, and to everybody else for writing, commenting, discussing, etc.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet