Create tutorials and documentation meant for developers that wish to build on the Sphinx stack #837

Open
choldgraf opened this issue Oct 6, 2022 · 9 comments
Labels
documentation Improvements or additions to documentation enhancement New feature or request

Comments

@choldgraf
Member

choldgraf commented Oct 6, 2022

Context

The Python implementation of MyST and Jupyter Book depends heavily on Sphinx, and the ways to extend Jupyter Book will require an understanding of how Sphinx works. However, Sphinx and docutils are both very complex, and both of them lack complete and accessible documentation. This makes it hard for people to learn about how our projects work and how to contribute, and also makes it harder for them to extend and build on top of the stack.

Proposal

We should have a dedicated space in the documentation to help people in a few ways to make it easier for them to learn and contribute. Here are a few that came to mind immediately:

  • get started with the docutils/sphinx stack
  • understand how Jupyter Book and its components use that stack
  • find pointers for taking first steps towards extending and building on that stack.

References

I think this is also related to:

@kai-tub

kai-tub commented Oct 6, 2022

Thanks for opening an issue!
I know that this is a slippery slope, but I really think that there are some areas where 'quick' tutorials would really help external developers. Sphinx alone is a huge beast that took me forever to get the basics running, and I still have no clear idea how to test it.

I guess my point is that I know that it seems like a never-ending task to write tutorials about so many tools, but if I can get away with looking at a short mini-tutorial that links to the relevant source code section for more information, I am already a lot better off!
I personally don't even mind looking at the source code but with so many projects, it is sometimes hard to track what repositories are even relevant.

I am quite sure that everybody is aware of these projects, but IMHO these are projects with the best documentation about their "stack":

  • FastAPI or SQLModel (or anything from Sebastián really)
  • furo for its clear navigation (see customisation section)
  • And not yet out, but still on many people's radar: textual
  • .... Ok, I am blanking. There are a couple of other good examples for an architectural overview, but those are the 'heavy' hitters IMHO

PS: Slightly off-topic. I also think that this article from Sebastian about the future of education and art could be nicely targeted towards the executable book audience. Especially with many users in educational roles, maybe this could become a platform to invite "artists for education"?

@chrisjsewell
Member

chrisjsewell commented Oct 8, 2022

Heya, I would suggest this needs to be decided as an off-shoot of high-level strategy (see e.g. #839)

In particular, should we be encouraging users to work within the sphinx/docutils space, if (potentially) the high-level strategy is to move to a JavaScript implementation (see e.g. #838)?
Naturally, any work by developers or users within this space (e.g. making sphinx extensions) could then be an inefficient use of time and resources.

We should also make sure we are not duplicating upstream efforts and/or trying to upstream any work we do. There is already some good work done, primarily by the RTD guys, to add sphinx tutorials: sphinx-doc/sphinx#9165

I would also note here executablebooks/jupyter-book#1673, i.e. a push to expose more of sphinx, which again seems at odds with a strategy of moving away from sphinx

@NickleDave

NickleDave commented Oct 8, 2022

Hope it's okay that I chime in as an outsider.
I totally get the reasons for moving to js (I think). And so it makes sense to not develop docs that would then take time and energy from the main thing you are working on.

But there's a world of Python users that do not have any of the insight you all have built up from working with docutils and sphinx. When I heard @choldgraf @pradyunsg et al. on Talk Python describing the internals, I was really wishing that I could read about that somewhere.

And clearly there are still people in Python land thinking about the tooling
sphinx-doc/sphinx#8039
reStructuredText/startup#1

Would it be worth writing some summary of "lessons learned about Sphinx/docutils internals" with a big disclaimer at the top that says "we are focusing on js for reasons x, y, z, please see these other doc pages/blog posts/etc"? Like, a post mortem (you can give it a more polite title).

Of course in general duplication of effort is bad. But as you all know better than me there's a bit of a divide between Jupyter users and contributors because of JavaScript. If you at least write down what you learned in a way that's a little more readable for the general public, maybe other Python devs could develop complementary tooling for all the users that aren't going to master TypeScript.

I'd love to see a post with diagrams like in this presentation
https://docs.google.com/presentation/d/168yre5u_D2wQpeySrrDqV3cM9qE85YiaRTT8tMpjcGo/htmlpresent
that are not trapped inside an html-published slide deck from 2015 that no one can download easily,
maybe using the awesome mermaid directive that some brilliant EB developers came up with 😁

@chrisjsewell
Member

Heya thanks @NickleDave

But as you all know better than me there's a bit of a divide between Jupyter users and contributors because of javascript

Yeh I think that's the same kind of thing here, and more generally it's just this trade-off that JavaScript is not particularly nice to use (Python is certainly better) but it has been the only game in town for web/browser-based development.
Now we have e.g. WASM (-> Pyodide etc) but still all the tooling is not yet there to really use this to replace JS.

So that's why JS.

Then even if we could use Python, the problem with Sphinx is that what we really want is a "reference implementation" for https://github.com/executablebooks/myst-spec, and it's basically impossible to do this on top of a framework that we don't control, and certainly not one as messy and complicated as sphinx.
(Sphinx as a framework also has some quite "severe" limitations when it comes to building something for the browser.)

Sphinx does still have a certain degree of method to its madness though lol, I for instance added the order of events list here: https://www.sphinx-doc.org/en/master/extdev/appapi.html#sphinx-core-events, and it is possible to "master"

@NickleDave

NickleDave commented Oct 8, 2022

So that's why JS.

All makes sense to me

yeh I guess I was the et al here lol

In fact you were, sorry! Hadn't made the connection, I blame common names like "Chris". Almost as bad as "David" 🙄 😼

Thank you for taking the time to share the links to the order of execution and what you had written up previously, both are definitely informative to me.

I can totally understand not wanting to sound like you are criticizing the really hard-working Sphinx and docutils devs/maintainers who have their own project goals/histories/trade-offs, but something like this as a blog post I do think could be of wide interest

Sphinx is quite tied to having files exist on a file system

Can you say a little more about this point? I saw it in other places too.
I think what's implied here is that processing steps fail when files don't exist? And the alternative would be to have an intermediate step that works with e.g., strings, so one can do something like load, loads, dump, dumps? (Forgive my possibly very naive way of describing it)

edit: read comment here: sphinx-doc/sphinx#10894 (comment)
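
To make the `load`/`loads` analogy concrete, here is a minimal sketch of the layering being asked about: a core function that works purely on strings, plus a thin wrapper that is the only part touching the file system. Nothing here is Sphinx or docutils API; `parse_string`, `parse_file`, and the toy "parser" (it only collects lines starting with `# `) are hypothetical.

```python
from pathlib import Path
import tempfile


def parse_string(text):
    """Core step: works on a string only (toy parser: collect '# ' headings)."""
    return [line[2:].strip() for line in text.splitlines() if line.startswith("# ")]


def parse_file(path):
    """Thin wrapper: the only place that touches the file system."""
    return parse_string(Path(path).read_text(encoding="utf-8"))


if __name__ == "__main__":
    # String in, result out: no files needed.
    print(parse_string("# Title\n\nSome text\n# Another\n"))

    # The file-based entry point just delegates to the string-based one.
    with tempfile.TemporaryDirectory() as tmp:
        page = Path(tmp) / "page.md"
        page.write_text("# From disk\n", encoding="utf-8")
        print(parse_file(page))
```

With this split, tests and in-browser use can call the string-based function directly, and only the wrapper cares whether a file exists.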

@chrisjsewell
Member

Thanks!

I blame common names like "Chris". Almost as bad as "David"

Indeed 😆

Sphinx is quite tied to having files exist on a file system
Can you say a little more about this point? I saw it in other places too.

Yeh, so ideally, I would say, you want some kind of interface layer, whereby you could supply a "virtual file system" for sphinx to work on, using e.g. the python pathlib.Path API.
In fact, as an example I would give my own https://github.com/aiidateam/archive-path, whereby you could have sphinx read a zip file.
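
A minimal sketch of that interface-layer idea, using only the standard library: a reader written against the small part of the `pathlib.Path` API that `zipfile.Path` also implements (`iterdir`, `is_file`, `name`, `read_text`). Sphinx does not accept such objects today; `collect_sources` and `demo` are hypothetical helpers, and archive-path itself is not used here.

```python
import io
import tempfile
import zipfile
from pathlib import Path


def collect_sources(root, suffix=".md"):
    """Return {name: text} for files directly under `root`.

    `root` may be any object offering iterdir()/is_file()/name/read_text(),
    e.g. pathlib.Path or zipfile.Path.
    """
    return {
        entry.name: entry.read_text(encoding="utf-8")
        for entry in root.iterdir()
        if entry.is_file() and entry.name.endswith(suffix)
    }


def demo():
    # Same content, once on disk and once inside an in-memory zip archive.
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / "index.md").write_text("# Hello\n", encoding="utf-8")
        from_disk = collect_sources(Path(tmp))

    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("index.md", "# Hello\n")
    with zipfile.ZipFile(buffer) as archive:
        from_zip = collect_sources(zipfile.Path(archive))

    return from_disk, from_zip
```

Because `collect_sources` never asks where the bytes live, the caller decides whether the "file system" is a directory, a zip archive, or something else with the same methods.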

@kai-tub

kai-tub commented Oct 8, 2022

@chrisjsewell thank you for providing such a complete overview of your experience with Sphinx/Docutils!
I agree with @NickleDave that it would be quite helpful to have this 'formalized' into a blog post to point others toward.

@choldgraf
Member Author

Hey all - just a quick note that I provide some explanation of my perspective on the Jupyter Book / MyST JS differentiation and plan here:

For this issue of tutorials etc, I agree completely that we should offload as much team knowledge about Sphinx and Docutils as possible into the broader community. In my opinion this is where we should direct a significant amount of our capacity on the current grant, as it is crucial to spread knowledge of this stack across multiple people.

I'm indifferent as to whether we do this in an Executable Books-specific place, or via upstream contributions to other places. I think the most important thing is to get the information out there because this is where the community is currently bottlenecked. We should take whatever path has the least resistance for now, and decide what to do with the information later.

@kai-tub

kai-tub commented Nov 22, 2022

I would like to follow up on this a bit.
While reading various independent discussions in this repository, it is quite hard to get motivated to dive deeper than necessary into Sphinx (from a developer's perspective), as it seems like the Sphinx ecosystem is quite unapproachable and doesn't look like it will be the future of the MyST ecosystem.
Though, I still think it would be nice to have some of your knowledge put into short tutorials.

I think that there is almost no question that the executable books project is among the most experienced "outside" sphinx users/developers (maybe along with the readthedocs team) and that you are quite familiar with its shortcomings and limitations.

I am aware that this has been discussed in a Sphinx issue (sphinx-doc/sphinx#9165) and that there has been some work to onboard new users, but I am still missing a good introduction for testing Sphinx.
From my perspective, this seems to be the most elusive component and is only solvable by searching through various GitHub issues and reading through code (and this still seems to be quite unhelpful as Sphinx apparently has "special" handling for their own test suite?).

I know that from a testing perspective, the MyST ecosystem had to fight quite a bit with directive parsing and post-transforms, but even a trivial 'CLI' guided testing section would be tremendously helpful for writing 'simple' tests.
Again, I am aware that we probably can all agree that it would be nice if there were a string-based (non-disk file-focused) API, but given that it looks like this will be a task for a different documentation framework....
I would really appreciate it if the executable books ecosystem tried to upstream its "testing" knowledge for the "Sphinx way" (a tutorial and a couple of quick how-to guides a la https://documentation.divio.com/introduction/; PS: Just to show you that I try to read all of your posts across all projects :D ) and wrote a separate discussion/review blog post that describes the shortcomings of the current implementation.
I assume that your idea would be to write a 'comparison' with the custom JS implementation vs. Sphinx and discuss the various trade-offs, but I would argue that it would already be quite handy to have a general/conceptual discussion on what a 'new' framework could learn from Sphinx. Especially as this would be quite interesting for people that are interested in the 'abstract' knowledge and maybe not necessarily in the implementation details of Sphinx.
Again, I know that this knowledge is somewhat already present in various issue discussions, but it is quite inaccessible and hard to find for later review.

TL;DR: IMHO, it would be nice if the executable books project could fill in a couple of holes in the Sphinx documentation regarding testing, as I think this project has the most in-depth knowledge on this subject aside from the core maintainers. I know that the testing methodology of Sphinx may not be how many would like it to be, but a general tutorial for others that will have to use Sphinx would be very helpful for the wider community. Also, given the experience, it would be nice to have a single blog post, accumulating the experience and giving a single reference to what an alternative solution could do differently.
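
As one possible starting point for the 'CLI'-guided style of test described above, the sketch below writes a throwaway project and runs `python -m sphinx` on it in a subprocess. It assumes Sphinx and pytest are installed in the test environment; the helper names (`write_minimal_project`, `sphinx_command`, `test_html_build`) and the project contents are illustrative, not taken from any project's test suite.

```python
import subprocess
import sys
from pathlib import Path


def write_minimal_project(src):
    """Create the two files a Sphinx build needs: conf.py and a root document."""
    src = Path(src)
    src.mkdir(parents=True, exist_ok=True)
    (src / "conf.py").write_text('project = "demo"\nextensions = []\n', encoding="utf-8")
    (src / "index.rst").write_text("Demo\n====\n\nHello from a test.\n", encoding="utf-8")
    return src


def sphinx_command(src, out, builder="html"):
    """Build the argv for `python -m sphinx`; -q keeps the output quiet."""
    return [sys.executable, "-m", "sphinx", "-b", builder, "-q", str(src), str(out)]


def test_html_build(tmp_path):
    # `tmp_path` is pytest's built-in temporary-directory fixture.
    src = write_minimal_project(tmp_path / "src")
    out = tmp_path / "out"
    result = subprocess.run(sphinx_command(src, out), capture_output=True, text=True)
    assert result.returncode == 0, result.stderr
    assert "Hello from a test." in (out / "index.html").read_text(encoding="utf-8")
```

This treats Sphinx as a black box and checks only the generated HTML. Sphinx also ships its own pytest fixtures in `sphinx.testing.fixtures`, which this sketch deliberately avoids.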

4 participants