Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixing broken references on multi-repository site build #7634

Closed
utzig opened this issue May 8, 2020 · 5 comments · Fixed by #9685
Closed

Fixing broken references on multi-repository site build #7634

utzig opened this issue May 8, 2020 · 5 comments · Fixed by #9685

Comments

@utzig
Copy link
Contributor

utzig commented May 8, 2020

I am trying to fix some issues with broken links when building this site: https://developer.nordicsemi.com/nRF_Connect_SDK/doc/latest/nrf/index.html

The build steps are given here: https://developer.nordicsemi.com/nRF_Connect_SDK/doc/latest/nrf/doc_build.html

It basically consists on cloning many different open-source repositories, where the documentation is distributed among them. The details are not relevant, and are handled automatically by tooling. Since those projects consist primarily of C (or C++) source code and doxygen documentation, breathe is used to build from it, and intersphinx is used to resolve references between different projects, which seems to be the source of the issue we're having.

One example of a broken link in a document can be seen here: https://developer.nordicsemi.com/nRF_Connect_SDK/doc/latest/nrf/include/bluetooth/gatt_dm.html#_CPPv4N15bt_gatt_dm_attr4uuidE. If trying to read the documentation for bt_uuid it goes to a broken link. At the same time links that were not generated from source code , eg: https://developer.nordicsemi.com/nRF_Connect_SDK/doc/latest/nrf/samples/nrf9160/lwm2m_carrier/README.html, going to "From Zephyr" and clicking on "MQTT" goes to the correct site.

I did some digging and found a few things. We currently use relative paths in our intersphinx_mapping, and one obvious solution would be to use a URL and everything would work, with a few new issues, like documentation not working locally, versioned documentation requiring extra handling on builds, etc.

Trying to find the reason for why some relative path based links work, and some don't, I found out that intersphinx resolves links on its missing_reference callback, for relative paths it happens here: https://github.com/sphinx-doc/sphinx/blob/3.x/sphinx/ext/intersphinx.py#L312. It uses the refdoc value (a document path) to deduce how many directories it has to move up before joining with the path given in intersphinx_mapping. This works correctly for documentation generated from .rst files, eg :ref:'MQTT <zephyr:mqtt_socket_interface>', the working example I linked above. For the broken links, which are originated from source code parsed by the cpp domain parser, refdoc is not available, which prevents intersphinx from correctly resolving the paths.

To fix the issue I made something that feels like a terrible hack, but seems to work correctly here. I changed the transformation code where references are resolved with the following diff:

--- a/sphinx/transforms/post_transforms/__init__.py
+++ b/sphinx/transforms/post_transforms/__init__.py
@@ -74,7 +74,9 @@ class ReferencesResolver(SphinxPostTransform):
 
             typ = node['reftype']
             target = node['reftarget']
-            refdoc = node.get('refdoc', self.env.docname)
+            if not 'refdoc' in node:
+                node['refdoc'] = self.env.docname
+            refdoc = node['refdoc']
             domain = None
 
             try:

Although this seems wrong to do (and modifying a parameter!), adding the refdoc there fixes the links.

Since I am not very acquainted with the sphinx projects' source code just yet, it would be nice to hear opinions on how to properly fix the issue.

@utzig utzig added the type:bug label May 8, 2020
@tk0miya
Copy link
Member

tk0miya commented May 9, 2020

Thank you for the detailed report. AFAIK, Sphinx core generates refdoc attribute automatically on cross-reference. So I guess the reference nodes not having it are generated by breathe extension in this case. Could you report this into their project, please?

@utzig
Copy link
Contributor Author

utzig commented May 9, 2020

Thank you for the detailed report. AFAIK, Sphinx core generates refdoc attribute automatically on cross-reference. So I guess the reference nodes not having it are generated by breathe extension in this case. Could you report this into their project, please?

AFAIK breathe only reads doxygen XML output, which doesn't really have the full path of the original file, so there's no way to add a refdoc property (I might be wrong!). I thought this would be done either in the domain parsers or or in a hacky way as I suggested. I will open a similar PR there, and investigate a bit further.

@utzig
Copy link
Contributor Author

utzig commented May 9, 2020

Ok, so I took a look at the doxygen generated XML files, and they do have the location (aka path) of the file where a reference is found, so technically it looks like it is possible to fix this on breathe. Thanks!

@tk0miya
Copy link
Member

tk0miya commented May 9, 2020

which doesn't really have the full path of the original file

No, refdoc is a document name where the reference node are written. It is not related to the XML output. And doxygen* directives know it. Therefore, they can fill it on converting XML to reference nodes. I think this is the responsibility of the module that generates a reference node.

@jakobandersen
Copy link
Contributor

@tk0miya, I believe this is a problem in Sphinx, at least in the C++ domain, but I think it is in all domains.
The given example is misleading as the MQTT reference is generated by a role, while the bt_uuid is generated as part of a signature. The only place I could find in Sphinx where refdoc is set is exactly in the XRefRole base class: https://github.com/sphinx-doc/sphinx/blob/3.x/sphinx/roles.py#L152.
Basically, I think that all pending_xref that Sphinx creates other than in this place are broken with respect to intersphinx. I have confirmed it with C++ with

.. cpp:function:: template<typename T> \
                  T f()

	:cpp:any:`T`

where the T in the second line becomes a pending_xref without refdoc, but the T in the last line has it.
(Is there a nice one-liner du print the complete doctree from a pickle file?)

Similarly, are there other attributes of pending_xref that are sort of mandatory?

@tk0miya tk0miya added this to the 4.3.0 milestone Oct 3, 2021
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Nov 23, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants