
Update for sphinx v7.3.x #346

Open: wants to merge 10 commits into base branch main
Conversation

2bndy5 (Collaborator) commented Apr 18, 2024

resolves #345

2bndy5 marked this pull request as draft April 18, 2024 09:36

2bndy5 (Collaborator, Author) commented Apr 18, 2024

I think there's a regression in consecutive builds using cpp.apigen. I now get warnings about documents not being included in any toctree when rebuilding the docs (without flushing old build artifacts):

```
C:\dev\sphinx-immaterial\docs\cpp_apigen_generated/cpp_apigen_demo.Array-ce8a1651-06f1923e.rst: WARNING: document isn't included in any toctree
C:\dev\sphinx-immaterial\docs\cpp_apigen_generated/cpp_apigen_demo.DataOrder-62e6a0a2-a428c0b6.rst: WARNING: document isn't included in any toctree
C:\dev\sphinx-immaterial\docs\cpp_apigen_generated/cpp_apigen_demo.IndexInterval-562a996b-19da0973.rst: WARNING: document isn't included in any toctree

...

C:\dev\sphinx-immaterial\sphinx_immaterial.apidoc.cpp.apigen:1: WARNING: toctree contains reference to nonexisting document 'cpp_apigen_generated/cpp_apigen_demo.IndexInterval-562a996b'
C:\dev\sphinx-immaterial\sphinx_immaterial.apidoc.cpp.apigen:1: WARNING: toctree contains reference to nonexisting document 'cpp_apigen_generated/cpp_apigen_demo.DataOrder-62e6a0a2'
C:\dev\sphinx-immaterial\sphinx_immaterial.apidoc.cpp.apigen:1: WARNING: toctree contains reference to nonexisting document 'cpp_apigen_generated/cpp_apigen_demo.Array-ce8a1651'
```

I also think there's undesired behavior specific to Windows (or any case-insensitive file system) in which consecutive builds eventually fail after a certain number of rebuilds. It seems that a hash is appended to the file name each time a build is invoked, so at some point the path to the generated file becomes too long. This happens with both python apigen and cpp apigen.
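A minimal sketch of the failure mode described above (the function name and hashing scheme here are illustrative, not the theme's actual implementation): if the uniquified docname is cached in build state and then re-uniquified on every incremental build, a new suffix accumulates each time and the path grows without bound:

```python
import hashlib

def make_unique_docname(docname: str, case_insensitive: bool) -> str:
    """Illustrative: append a short hash so docnames that differ only in
    casing map to distinct files on case-insensitive filesystems."""
    if not case_insensitive:
        return docname
    digest = hashlib.sha256(docname.encode()).hexdigest()[:8]
    return f"{docname}-{digest}"

# If the result is cached and fed back in on the next incremental build,
# the name grows by one hash suffix per rebuild:
name = "cpp_apigen_demo.Array"
for _ in range(3):
    name = make_unique_docname(name, case_insensitive=True)
# After three rebuilds the docname carries three accumulated suffixes.
```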

2bndy5 marked this pull request as ready for review April 19, 2024 00:49

Commit: fix tooling ignore comments
jbms (Owner) commented Apr 19, 2024

We may need to investigate exactly how the previous state is reloaded when sphinx does an incremental build.

cpp apigen stuffs some extra attributes into app.env, and then defines a `_builder_inited` callback that modifies them in place to add the hashes.

I think we similarly add attributes to app.env elsewhere in this theme, and in all cases we don't want that data to be persisted to the next build.

Therefore we may need to change where it is stored; possibly the app object itself would be better.
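A hedged sketch of that suggestion (the attribute name here is made up): state kept on the app object is recreated fresh on each run, whereas attributes stuffed into app.env risk being pickled into the incremental-build cache and carried over to the next build:

```python
class FakeApp:
    """Stand-in for sphinx.application.Sphinx in this sketch."""

def builder_inited(app):
    # Recreate transient apigen state on every build instead of persisting
    # it in app.env, which Sphinx pickles for incremental rebuilds.
    app._apigen_state = {"hashed_page_names": set()}

app = FakeApp()
builder_inited(app)
app._apigen_state["hashed_page_names"].add("cpp_apigen_demo.Array")
builder_inited(app)  # the next build starts from a clean slate
```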

2bndy5 (Collaborator, Author) commented Apr 20, 2024

I think I found the problem.

```python
page_name = entity.get("page_name")
if page_name is None:
    continue
entity["page_name"] = apigen_utils.make_unique_docname(
    page_name, writer.case_insensitive_filesystem
)
```

`entity["page_name"]` gets another hash appended every time `_builder_inited()` runs.

I think the fix is to consolidate the for loop linked above with the for loop located just afterward in the locally-scoped function:

```python
def get_entities():
    for entity_id in api_data.get_documented_entities():
        entity = api_data.entities[entity_id]
        content = sphinx_utils.format_directive(
            "cpp-apigen-entity-page",
            entity_id,
        )
        object_name = api_data.get_entity_object_name(entity)
        docname = api_data.get_entity_page_path(entity)
        yield (docname, object_name, content)

writer.write_files(get_entities())
```
My idea is to calculate the hash (which is based on `os.path.basename(entity["page_name"])`) without caching the hash-appended file name. Honestly though, I think hashing the bytes written to the file might be a bit more of a cache buster.
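The content-hash idea can be sketched as follows (the helper name is hypothetical): deriving the suffix from the rendered bytes rather than from the cached, possibly already-suffixed page name is idempotent across rebuilds, since unchanged content always yields the same docname:

```python
import hashlib

def content_docname(page_name: str, content: bytes) -> str:
    """Hypothetical: suffix derived from rendered content, not from the
    (possibly already-suffixed) cached page name."""
    return f"{page_name}-{hashlib.sha256(content).hexdigest()[:8]}"

# Same content on two consecutive builds -> same docname, no accumulation.
first = content_docname("cpp_apigen_demo.Array", b"directive body")
second = content_docname("cpp_apigen_demo.Array", b"directive body")
```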

2bndy5 (Collaborator, Author) commented Apr 20, 2024

Well I was able to use the file content (in bytes) as the hash seed, and it did prevent writing new files when the content hasn't changed (for both python and cpp apigen). See f97200f

Now I'm noticing that the site index is generated for 99 pages on a fresh doc build, but only for 63 pages on a subsequent build.

jbms (Owner) commented Apr 20, 2024

> Well I was able to use the file content (in bytes) as the hash seed, and it did prevent writing new files when the content hasn't changed (for both python and cpp apigen). See f97200f
>
> Now, I'm noticing that the site index is generated for 99 pages on a fresh doc build, but the site index is only generated for 63 files on a subsequent build.

We don't want to use the file content, because the purpose is in fact not cache busting but the opposite: to ensure a stable URL even as the content changes when the documentation updates. The hash is there to deal with case-insensitive filesystems, where we may have two document names that differ only in casing.

I think there may have been a change in how incremental rebuild state is saved. We want to make sure this information is not saved across rebuilds, so I think we should move it out of app.env if app.env is getting pickled.
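For contrast, here is a sketch of what the hash is actually for (the function is illustrative, not the theme's code): only docnames that collide case-insensitively get a suffix, and the suffix is derived from the name's exact casing, so the URL stays stable even when the page content changes:

```python
import hashlib

def disambiguate(docnames):
    """Illustrative: suffix a docname with a hash of its exact casing only
    when another docname collides case-insensitively."""
    by_lower = {}
    for name in docnames:
        by_lower.setdefault(name.lower(), []).append(name)
    out = {}
    for group in by_lower.values():
        for name in group:
            if len(group) == 1:
                out[name] = name  # no collision: keep the plain docname
            else:
                suffix = hashlib.sha256(name.encode()).hexdigest()[:8]
                out[name] = f"{name}-{suffix}"
    return out

mapping = disambiguate(["Foo", "foo", "Bar"])
```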

```python
    from sphinx.domains.python._annotations import _parse_annotation  # type: ignore[import-not-found] # pylint: disable=import-error,no-name-in-module
else:
    from sphinx.domains.python import _parse_annotation
```

jbms (Owner) commented:

Hmm, I realize that I steered you wrong on this: since we monkey-patch this, we can run into problems by importing it. Instead I guess we should just create an alias for the module. We will also need to verify that our monkey patches still work; e.g., if something in the Python domain itself imports this symbol, then we have to replace the imported symbol as well as the original one.
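The pitfall described here can be reproduced in miniature (module names are fabricated): a `from a import helper` in another module copies the reference, so patching the original module alone leaves the importer holding the old function:

```python
import types

# Module `a` defines the original function.
a = types.ModuleType("a")
a.helper = lambda: "original"

# Module `b` did `from a import helper`, so it holds its own reference.
b = types.ModuleType("b")
b.helper = a.helper

def patched():
    return "patched"

a.helper = patched                    # patches lookups through `a` only
unaffected = b.helper()               # b still calls the old function
b.helper = patched                    # the imported copy must be patched too
now_patched = b.helper()
```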

2bndy5 (Collaborator, Author) commented:

Lately, I've been wondering if it would be easier to have our own "better domains" that somehow share/merge data with the vanilla domains (where applicable).

2bndy5 (Collaborator, Author) commented May 16, 2024

Importing under an alias seemed to be an over-complication; some of our modules do need to use other APIs of the python domain. Also, mypy doesn't like it when you import a module and then import it again under an alias (even if the alias is only imported conditionally).

I adjusted the unit test to assert that all modules in sphinx.domains.python which import `_parse_annotation()` are consistently monkey-patched.
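Such a test might look roughly like this (module and function names are stand-ins, not the actual test code): apply the patch to every module that imported the symbol, then assert they all resolve to the same patched function:

```python
import types

# Stand-ins for the real sphinx.domains.python modules.
domain = types.ModuleType("pydomain")
submodule = types.ModuleType("pydomain_sub")

def _parse_annotation():
    return "vanilla"

domain._parse_annotation = _parse_annotation
submodule._parse_annotation = _parse_annotation  # simulated `from ... import`

def apply_patch(fn):
    # Patch every module known to hold a reference to the symbol.
    for mod in (domain, submodule):
        mod._parse_annotation = fn

apply_patch(lambda: "patched")
results = [m._parse_annotation() for m in (domain, submodule)]
```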

2bndy5 (Collaborator, Author) commented May 20, 2024

I'm not sure why, but libclang (v18.1.1) on macOS is failing to parse the cpp.apigen demo sources. It seems related to sighingnow/libclang#71.

I don't have a mac I could use to test/debug this one.

Successfully merging this pull request may close these issues.

sphinx~=7.3.0 breaks sphinx-immaterial