Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

📥 Load TOC from toc entry #1188

Open
wants to merge 33 commits into
base: main
Choose a base branch
from

Conversation

agoose77
Copy link
Collaborator

@agoose77 agoose77 commented May 7, 2024

This PR adds a new toc field to the project frontmatter and export list items, which explicitly outlines a table of contents (TOC) according to #1151.

To-Do

  • Add documentation for toc
  • Implement pattern loading
  • Add end-to-end tests for toc in exports
  • Fix level depth (see failing TeX export)
    - [ ] Implement url loading
  • Implement --write-toc support
  • Better error for non-file root entry (or relax this)

The existing internal representation for pages is less feature-rich than this TOC. So, we need to follow up this PR to extend our internals.

@agoose77 agoose77 force-pushed the agoose77/feat-load-toc-from-frontmatter branch 2 times, most recently from d2935cd to a85a2c1 Compare May 7, 2024 16:11
@agoose77 agoose77 self-assigned this May 7, 2024
@agoose77
Copy link
Collaborator Author

agoose77 commented May 8, 2024

@rowanc1 UX-wise, how do you feel about myst init --write-toc modifying myst.yml vs spitting out a YAML fragment to stdout?

@rowanc1
Copy link
Member

rowanc1 commented May 8, 2024

Yep, I think that should be the plan. It is annoying that that by default removes comments. Not sure if there is an easy way around that?

@agoose77
Copy link
Collaborator Author

agoose77 commented May 8, 2024

Yeah, that's a common problem among most YAML readers. I think we should just assume that the myst.yml isn't super weird, and manually insert a hunk. Maybe we can also update existing TOCs (by matching indentation), will need to think on that.

@agoose77
Copy link
Collaborator Author

agoose77 commented May 9, 2024

@rowanc1 @fwkoch this PR is making good progress.

When looking at TeX exports, it occurs to me that the current design of the TOC is export agnostic; it simply encodes a tree structure. For sites, we just use the TOC as-is, whilst for exports to e.g. LaTeX, the user should have some agency in controlling whether they are given chapter or section.

I think the proper solution to this is to add a configuration option on the export for TeX, i.e. top-level: chapters. Do you have any thoughts here?

@agoose77 agoose77 force-pushed the agoose77/feat-load-toc-from-frontmatter branch from 6bca7c1 to 81f81a9 Compare May 9, 2024 18:03
@agoose77
Copy link
Collaborator Author

I've removed the url TOC entry type. We used to have it in Jupyter Book, but I think it's mixing concerns; MyST TOCs should be local content, that doesn't prioritise site exports over any other; we could conceive of having external content pulled in, but I don't think the ergonomics / attribution story is very compelling.

So, external URLs need to be in the articles themselves, or the theme. @choldgraf can I get a sense check on this from your side; can you reason about whether Jupyter Book having the ability to set URL links in the TOC is important enough to keep?

@agoose77
Copy link
Collaborator Author

@fwkoch keeping you in the loop -- I'm stuck on what the best resolution for choosing between chapter and section in TeX exports.

I am thinking that we should generalise for all exports with a new export field level which sets the level origin. We already have this conceptually for the article member of export. I imagine that the export level would take precedence (be inherited) for articles with unset levels.

@fwkoch
Copy link
Member

fwkoch commented May 15, 2024

Hey @agoose77 - I haven't done an in-depth code review yet, but just trying to respond to some of the comments.

First, from my perspective, dropping url is fine, simply because MyST currently doesn't support it. Later, we can decide if adding it is necessary for jupyterbook parity - but that's a totally independent feature.

Regarding the level question, it's a bit dicey. In addition to chapter and section, there is also an even broader level part. Just to summarize: Currently, if you are using a _toc.yml, the decision for which of these is your "highest level" (erm... lowest number? but largest in breadth...) comes from the top-level key: parts/chapters/sections. If you are using "myst-site-style" article list, you can use levels -1 and 0 (for part and chapter, respectively), then sections start at 1. Then, separate from what the user defined there, if the export is not tex-based, parts and chapters are meaningless, and everything is bumped up to start at 1.

Now, in the new TOC, users are no longer required to specify parts/chapters/sections or level - they just do nice simple nested children. So that comes all the way back around to what you were saying - we need another way to specify this, and possibly it should be tied to the tex export rather than the toc.

So, yeah, I think your suggestion is a great place to start:

---
exports:
  - format: tex
    toc:
      - file: ...
    level: chapters // default is 'sections' maybe, and would this be more clear as 'top_level' or something?
---

That gives us clearer validation at the export level (e.g. "typst export does not support 'parts'"), and it allows for easier re-use of tocs without ambiguous behaviour (e.g. "what happens to my 'chapters' in a site" - chapters just don't exist until you are in tex land).

Not sure how helpful all that is, pretty sure all I said is "I agree with you" 😆 but let me know if I can lend a hand with implementation or anything.

@agoose77
Copy link
Collaborator Author

I did a quick search:

It looks lie ~10% of _toc.yml files in the wild use URLs. This is more than I had thought. I don't feel comfortable dropping URLs from the schema with that in mind, even though this PR throws an error if it encounters them.

I don't really know what a URL would mean in the context of a TeX or Typst export. I suppose we could embed them as links - that's a valid thing to do. So, I'll leave in for now.

@agoose77
Copy link
Collaborator Author

@fwkoch would you be able to plumb in the top_level export option? I am getting some fatigue on this PR and would appreciate a boost!

@choldgraf
Copy link
Member

Hey all - just a quick note from me because I missed the URL thing (@agoose77 if you'd like feedback from me please ping me in Discord or Slack, I get way too many github notifications and sometimes miss @ mentions here)

In my experience, URLs are pretty common to have as part of the left sidebar in a Jupyter Book. This is usually used in order to link to an external site that's related to your book. For example, 2i2c's catalyst project documentation includes an external link for the instructor training.

CleanShot 2024-05-17 at 06 17 42@2x

This might be less necessary with other ways to expose external links in MyST e.g., resolving this one would help a lot

You could also imagine a dedicated section for external links in the left navbar that comes before or after the other links.

@fwkoch fwkoch force-pushed the agoose77/feat-load-toc-from-frontmatter branch from d250fee to 817ef38 Compare May 23, 2024 08:37
@fwkoch
Copy link
Member

fwkoch commented May 23, 2024

Ok - I've made a number of updates to this PR and I think it is ready for another look and a little more discussion.

First, the updates I made:

  • plumbed in top_level into the exports, as requested, and added some tests
  • myst init --write-toc will upgrade an existing _toc.yml file to a new myst.yml toc - This means when existing users update their myst version and see a new "legacy toc encountered" warning, they are given a call to action to fix it, rather than just getting stuck with the warning and no path forward.
  • Addressed a bug where export would prefer _toc.yml over myst.yml toc
  • Re-enabled export specifying _toc.yml file for toc - we don't want to break this functionality with the new toc implementation
  • URLs in the toc no longer cause the build to crash - they just log an error and are ignored (for comparison, URLs in _toc.yml files were silently ignored)
    • We can pick up the discussion on what to do with these URLs later.
  • Removed warning when encountering a nested _toc.yml file when building the project structure. There is currently no way to reproduce this functionality with the new toc, so we cannot really warn that it's an error...

Now, a few points of discussion:

  • URLs in the toc - Is it fine to warn and ignore URLs for now?
    • I think this is fine for now and we can open an issue to discuss how URLs should be treated in the future
  • Nested TOCs - this feature helps keep constantly changing projects organized - one folder may be allowed to autogenerate the toc from contents while another folder is strictly structured. With tocs only defined in project config, this feature is not supported; a nested myst.yml with a new toc will spawn an entirely new myst project. We need a way to define a folder-specific toc within a project. Can we just keep the legacy nested _toc.yml functionality for now?
    • We can open an issue to address sub-folder config, including "sub-tocs"
  • Implicit index file - To me, this was a peeve of the _toc.yml format; the structure required a single index file. This led to ambiguous behaviour when building multi-page exports: Ignoring the index file was required to have parts/chapters work as expected... but then what if you don't want that file ignored. Now, in the new-style toc, there is no longer a need for an index file; the top level of the toc could theoretically be a title with children. However, there is still a check in the code that requires the first toc entry to be a file, basically reinstating the index file requirement but in an implicit way, where it is just toc[0] rather than toc.index. This feels bad to me, there are now export cases where toc[0] just gets popped off and ignored...
    Addressing this will require some extra work on the site-project side, since sites need an index. Maybe we enforce the index page for projects but not for exports...? Or maybe we relax the project index page rule somehow...? Maybe this one is also fine to kick down the road. Is the it sufficient to say for now we are matching _toc.yml behavior, including the non-ideal implicit index, and removing this limitation will be a separate feature?
  • And, to that end, I think just a high level review of the scope of this PR, looking back to the comments here: Add a new table of contents format #1102 - In retrospect, does this PR provide an "MVP" for a new toc that fits the requirements of jupyterbook migration without "closing any important doors?" From my perspective, aside from the points I raised above, I think it does...?

@fwkoch fwkoch marked this pull request as ready for review May 23, 2024 18:39
@stevejpurves
Copy link
Member

@fwkoch just one question for now - on the

"This feels bad to me, there are now export cases where toc[0] just gets popped off and ignored..."

does this mean that a user could silently(?) lose the page of a multi-article pdf export for example, with no means to fix it?

@agoose77
Copy link
Collaborator Author

@fwkoch @rowanc1 and I discussed breaking _toc.yml exports in order to keep things simple (reverting 8eeb770). However, you've already written the code and I don't have super strong feelings.

Nested TOCs - this feature helps keep constantly changing projects organized - one folder may be allowed to autogenerate the toc from contents while another folder is strictly structured. With tocs only defined in project config, this feature is not supported; a nested myst.yml with a new toc will spawn an entirely new myst project. We need a way to define a folder-specific toc within a project. Can we just keep the legacy nested _toc.yml functionality for now?

Is this really something that we need? JB doesn't have this feature, for example. My personal preference is to have a single source of truth, rather than distributing it throughout the project file system.

However, there is still a check in the code that requires the first toc entry to be a file, basically reinstating the index file requirement but in an implicit way,

Yes, I don't love this. For me, the crux of the issue is whether the TOC is a site thing or a project thing. It's sort of both; it represents the heirarchy in the site, but it also represents the list of things that we want to build. Moreover, I don't think we need to impose this constraint; we just need a way to identify the index page. Perhaps we want to indicate the index page via a part?

URLs in the toc - Is it fine to warn and ignore URLs for now?

From @choldgraf and looking at projects in the wild, I think URLs in the site navigation are common-enough that we need to support them. As pointed out above, the TOC is both an allow-list and a site-map. In this case, it's the site-map part we're talking about.

To my mind, I think it's easy enough to say "drop URLs for non-web exports". I don't love the mixing of concerns, but it is simple enough to reason about.

@choldgraf
Copy link
Member

choldgraf commented May 24, 2024

tl;dr: If this PR doesn't introduce any of the sub-optimal behavior you're describing, I feel that it's fine to merge in and iterate from there. We can treat our current goal as alpha-level MVP, while some missing features might be needed for a beta-level MVP.

More detailed thoughts below:

Is it fine to warn and ignore URLs for now?

If the current implementation does not support URLs, and this PR adds new functionality but does not deprecate pre-existing URL functionality, then I think it's fine to merge this in and iterate from there. I agree we do want to have URL supported in the TOC in the future, but as long as we aren't removing anything with this PR I think it's fine.

We need a way to define a folder-specific toc within a project. Can we just keep the legacy nested _toc.yml functionality for now?

Could you explain the use-case that you have in mind? Why would somebody define multiple folders with nested _toc.yml files instead of just define a single myst.yml file with a toc that had all the nested folders in it?

That said, I don't have clear opinions on this one, I'm fine deferring to your judgment on how to handle this in the short term.

Is the it sufficient to say for now we are matching _toc.yml behavior, including the non-ideal implicit index, and removing this limitation will be a separate feature?

I'd say "if this PR doesn't remove functionality and just mimics current suboptimal behavior, then let's just get it in and iterate". That is basically the theme of my thoughts in general haha.

By the way - I am confused about the placement of the Table of Contents documentation. It's currently nested under websites, which suggests this feature only makes sense for websites. That's not true though, right? It is also meaningful for non-website exports, so maybe we should move this into a different part of the documentation?

does this PR provide an "MVP" for a new toc that fits the requirements of jupyterbook migration without "closing any important doors?"

As long as this PR isn't introducing a degradation of behavior, I think we should merge it in and iterate.

For an alpha-level MVP I think it'd be fine to not have URL support. I suspect that a beta-level MVP would need URL support, but I think our goal right now is to get to "good enough to recommend that power users give it a shot" and I think the current behavior is fine for that.

@choldgraf
Copy link
Member

choldgraf commented May 24, 2024

I took the liberty of making some content changes to this PR, with the goal of making it a bit easier to follow. Here are the main highlights, I'm happy if you want to revert any of them:

  • I moved the "TOC structure" to the top. I found it confusing that we described the TOC, but then started by describing the case that a TOC is implicitly defined (AKA, not in myst.yml). I feel that we should lead with the case that we think is most likely to be used.
  • I added subsections so that the main ideas are more glanceable from the in-page table of contents.
  • As part of this, I also cleaned up the xref documents, moved it under websites (instead of reference) and added a short section on the .json that exists for any page.

However, in doing this i also discovered that the Netlify build is not working. It is erroring with the following log:

CleanShot 2024-05-23 at 21 32 54@2x

I can't tell if this is related to the commit I just made in 812d900 - but if it's a bug that i've introduced, I also think this is a case where we should surface a more useful error message than a javascript error.

@agoose77 agoose77 force-pushed the agoose77/feat-load-toc-from-frontmatter branch from 9e0bed5 to e8a20c5 Compare May 24, 2024 15:00
@agoose77
Copy link
Collaborator Author

Thanks @choldgraf, that's a wonderful change. I've moved a subset of the changes to a separate PR as they aren't specifically TOC related (#1234). I've fixed the bug in MyST (unrelated to this PR) in #1233.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants