Merge in upstream changes from mkdocs-material #338
Conversation
7b53c8b to 593856d
This replaces the modified Sphinx search implementation with the lunr.js-based search implementation from the upstream mkdocs-material theme.
I hope you don't mind if I push some commits to your branch. I won't force push anything, so you can keep squashing the history as you see fit.
Yes, please go ahead!
Maybe there's a more elegant way to approach this, but I'm just hacking this into something passable for CI
I lied. I screwed up the typing on a commit I didn't run through mypy first...
add info to doc and mention the `hide-edit-link` metadata for overriding per page
I noticed you fixed some lint issues in the search plugin (which I copied unmodified from upstream, except for replacing mkdocs with mkdocs_compat) --- unless they are necessary bug fixes, it would be better to revert them to make it easier to merge changes in the future. Similarly, we can just check for the HTML builder in our own search extension.
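A minimal sketch of that builder check (the `app.builder.format` attribute is real Sphinx API; the hook body and function names here are placeholders, not the extension's actual code):

```python
# Hedged sketch: gate search-index generation on the HTML builder, as
# suggested above. Only `builder.format` is real Sphinx API; the rest
# of the extension wiring is omitted.
def want_search_index(app) -> bool:
    """True only for HTML-family builders (html, dirhtml, ...)."""
    return getattr(app.builder, "format", None) == "html"

def on_build_finished(app, exception) -> None:
    # Skip search-index output entirely for latex/texinfo/etc. builds.
    if exception is not None or not want_search_index(app):
        return
    # ... write out the lunr.js search index here ...
```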
probably a relic from the merge script rebase
also ran `npm audit fix` which bumped `node_modules/tar` and `tar` in package-lock
I bumped mermaid to 10.7.0 (as used in upstream). This fixed the mermaid problem and resolves #328. I also ran
Bumping dependencies is fine in general; the one tricky thing is that we have to be careful about cssnano/postcss-related deps, because newer versions result in OOM (this also applies to the upstream theme) and I don't yet know of a solution. The only way I got this merge to work was to copy the lockfile from the upstream theme and then run npm install to integrate our additional deps.
Honestly, this works well enough for me as it is now.
I'm going to focus on the feat: upstream issues though.
I haven't fully wrapped my head around the ported search implementation. My main confusion comes from unfamiliarity with lunr.js (or any search implementation for that matter). But I can try to help if this PR gets too stale.
FWIW, I really don't use the search functionality on websites unless I'm in a hurry or completely lost for more than 3-5 minutes. I typically find the right page and use Ctrl+F instead.
A few other things I've noticed (none of which bother me):
{% if page.meta.git_revision_date_localized %}
  {% set updated = page.meta.git_revision_date_localized %}
{% elif page.meta.revision_date %}
  {% set updated = page.meta.revision_date %}
I just noticed that we've been feeding Sphinx's `last_updated` context into this template:
`"meta": {"hide": [], "revision_date": context.get("last_updated"), "meta": []},`
But this doesn't actually describe the last revision date: every page gets the last build date, even if the doc wasn't actually altered on the date it was built. This inaccuracy is not caused by the shallow checkout used in CI builds; I'm seeing the same behavior locally.
From yesterday's RTD build for this PR (in task_lists.html):
This template feature would be better served by #341, but I guess this behavior would suffice since it has been this way. We could instead manually get the info using `time.strftime("%x", time.localtime(os.stat(doc2path(pagename)).st_mtime))`, but again, the checkout date is not the same as the last modified date, since `git checkout` doesn't persist file attributes.
Git doesn't store any times at the per-file level, and `git checkout` / `git clone` set the modification time to the time of the checkout (for an existing work tree, only the modified files get updated time stamps). Therefore we have to explicitly query git to find the most recent commit that modified the file, as in #341, because the only timestamps stored by git are commit timestamps. For that we will need a non-shallow checkout, but that probably isn't too big of a deal in practice.
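That git query can be sketched as follows (a minimal illustration, not the #341 implementation; `%aI` asks `git log` for the strict-ISO author date of the newest commit touching the path, and the function name is an assumption):

```python
# Hedged sketch: query git for the last commit that modified a file,
# since checkout mtimes are not commit times.
import subprocess
from datetime import datetime
from typing import Optional

def last_commit_date(path: str) -> Optional[datetime]:
    """Return the author date of the newest commit that modified `path`."""
    out = subprocess.run(
        ["git", "log", "-1", "--format=%aI", "--", path],
        capture_output=True, text=True, check=True,
    ).stdout.strip()
    # %aI prints strict ISO-8601, which fromisoformat() parses directly.
    return datetime.fromisoformat(out) if out else None
```

In a shallow clone this can silently report the truncation boundary's commit, which is why the comment above calls for a non-shallow checkout.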
I ran across a reference in the upstream docs that recommends sparse checkouts. But I didn't see it get used in their CI (at least in the community edition docs build).
I think the usefulness of search may depend on a lot of factors (including how good the search results are!), but for API documentation in particular a common use case is that you want to find the documentation for a specific symbol, and for that search is very useful. Suppose I want to find out about `IndexDomain`. Compare the current search: https://jbms.github.io/sphinx-immaterial/?q=IndexDomain with the new search results: https://sphinx-immaterial--338.org.readthedocs.build/en/338/?q=IndexDomain As for how search works, basically:
To improve results I think we should:
It is possible we may need other changes to make API documentation searching work well, but those things would be a start.
Superb explanation! 🚀 Doing another code dive...
Just a gut reaction here: to avoid any deviation from the inherited search plugin, we could instead derive from it and make any API changes that way (see `sphinx_immaterial/search.py`, lines 57 to 58 at 89076eb).
To compensate for this complexity, our search.py module could be organized as a sub-package. Also, support for other features (like tags) would then be contained/scoped to search-specific implementations.
Yeah, potentially we could indeed monkey patch some of the classes (e.g. SearchIndex and Parser) rather than modifying the file itself --- once we know what modifications are needed, we can figure out the best way to make them. Note that the upstream search plugin is already used in a hacky way from
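The monkey-patching route could look roughly like this. A stand-in `Parser` class keeps the sketch self-contained; the real target would be the vendored plugin's parser class, and the HTML rewrite shown is purely illustrative:

```python
class Parser:
    """Stand-in for the vendored search plugin's HTML parser."""
    def feed(self, html: str) -> str:
        return html

# Patch the one method we need instead of editing the vendored file,
# keeping future upstream merges clean.
_original_feed = Parser.feed

def _patched_feed(self, html: str) -> str:
    # Illustrative tweak: rewrite object-description markup into sections
    # before the upstream parser sees it.
    html = html.replace('<dl class="py">', "<section>")
    return _original_feed(self, html)

Parser.feed = _patched_feed
```

The tradeoff versus subclassing is that a patch applies everywhere the upstream class is instantiated, which matters here because the plugin is driven through the compat shim rather than constructed by our own code.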
Still investigating an adequate approach, so this post is just an organization of my thoughts/findings.
I found that Sphinx aggregates a […] To supplement this info into our […]
It would be nicer to have a 1-to-1 map of HTML-to-doctree to robustly set the boost for API descriptions. Initially, I don't think we need to monkeypatch anything to achieve this. It could probably be done in a separate function that […] I'm partial to opening another PR that merges into this branch, so we don't have to play around with this branch's git history to keep my proposal's reviews/changes "clean". The remaining goals blocking this PR are not simple objectives. PS - I also thought about leveraging upstream's tag plugin with Sphinx domains, but I don't see that being very feasible, since tags are really specific to an entire page (not sections of a page).
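For the boost itself, each search-index entry could carry a per-section weight along these lines (the field names and boost value are assumptions for illustration, not the actual index schema):

```python
# Hedged sketch: attach a boost to search-index entries that correspond
# to API object descriptions, so symbol docs outrank narrative prose.
def make_search_entry(location, title, text, objtype=None):
    entry = {"location": location, "title": title, "text": text}
    if objtype is not None:
        # e.g. objtype == "class" for a py:class object description
        entry["boost"] = 10
    return entry
```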
I think we can map back to the Sphinx object description info from the HTML in a similar way to how the old […]
We can first iterate over all objects to compute a map of […]
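That iteration could build on Sphinx's domain API; the tuple layout yielded by `Domain.get_objects()` is real Sphinx API, while the function shape and the keying by docname/anchor are a sketch of one possible map:

```python
# Hedged sketch: map output page + anchor to the owning domain and
# object type, so API sections can be identified when building the
# search index.
from collections import defaultdict

def object_anchor_map(env):
    anchors = defaultdict(dict)  # docname -> {anchor: (domain, objtype)}
    for domain_name, domain in env.domains.items():
        # get_objects() yields (name, dispname, type, docname, anchor, prio)
        for name, dispname, objtype, docname, anchor, prio in domain.get_objects():
            anchors[docname][anchor] = (domain_name, objtype)
    return anchors
```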
Note that most object descriptions do not result in a section based on the current section parsing logic in the search plugin --- we would have to change the parsing, or modify the HTML before feeding it to the parser, to fix that. With apigen we do get a section because there is a separate page, but I think otherwise we don't.
Just as a quick hack, I added
Maintaining the existing custom search was proving to be untenable. Using the upstream search implementation greatly reduces the size of the diff and should make it much easier to incorporate future changes.
The downsides are:
Remaining work before merging this: