New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Sphinx AutoSummary to automate API documentation #5082
Conversation
working and hopefully illustrative doc skeleton update requirements WIP Change log: - Upload new PyMC logo - Updated conf.py to pymc instead of pymc3 - Start configuring and populating index page - Start configuring and populating learning page - Start configuring and populating api page - Updated API to follow the docs/Architecture.png layout - Start configuring and populating learning page - Fix typo in filename of "index legacy" add community page, other minor changes Continue restructuring the API Delete conflicting files example doc restructure update requirements update requirements and gitignore some fixes add changes introduced by pre-commit checks fix community page title update api for distributions update gaussian process api fix end of file more changes sponsor logos sponsor logos cleanup, other changes
Codecov Report
@@ Coverage Diff @@
## doc_skeleton_update #5082 +/- ##
======================================================
Coverage ? 78.41%
======================================================
Files ? 131
Lines ? 24464
Branches ? 0
======================================================
Hits ? 19184
Misses ? 5280
Partials ? 0 |
There are a few rST files in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After taking a look at the issue and PR it looks like that will soon be available via sphinx core so generating an extension for such a short time is probably an overkill. Let's keep it like this for now and integrate with sphinx when possible.
There also might not be an alternative to the parent api page having only pymc
linked there but we might be able to make that page an orphan and link directly to api/pymc
from the navbar in the homepage toctree 🤔
autosummary_generate = True | ||
|
||
## Overwrite the autosummary generator function with custom one | ||
generate.generate_autosummary_content = custom_generate.generate_autosummary_content |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't see that line in the stackoverflow answer nor on the linked example, why is it needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sphinx autosummary extension uses the function generate_autosummary_content
to generate documentation.
I am more interested to show documentation of functions/classes listed in __all__
of each python module. So, I have a custom implementation of this function in the file docs/source/sphinx_autosummary/custom_generate.py
. And I overwrite the default function at run time.
Definitely not a good practice, but it serves our purpose till the time the related sphinx PR gets merged.
@@ -12,6 +12,13 @@ | |||
# See the License for the specific language governing permissions and | |||
# limitations under the License. | |||
|
|||
""" | |||
This submodule contains various mathematical functions. | |||
Most of them are imported directly from aesara.tensor (see there for more details). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we link to aesara.tensor
?
Hi! Thank you Sayam! Regarding the TODOs:
You mean toctree depth, right?
And this would mean adding all the information from the rST files that we don't want to lose in the python module docstrings so that autosummary picks it up right? So I'm thinking next steps would be, based on the v4 changelog:
|
I think @Sayam753 did that already
I agree someone should double check, but I think docstrings have been kept mostly up to date as they are right side by side with the actual functions. I would also check them outside this PR as that should probably be done by the person who wrote the functions or someone who knows about it, docstring changes automatically reflect on the website |
Thanks for the review Oriol and Martina. I'll get back to completing the remaining TODOs in the coming week :) |
Just realized there is a potential issue with this approach. What are the correct import paths for each function? Is everything available directly as For example |
I think the recursive parsing implemented in autosummary extension does not care if functions/classes are exposed any level up in package hierarchy. As an example, the sample function is implemented in pymc/sampling.py. It is also exposed at pymc namespace. For current stable docs, even the sample function is mentioned as pymc.sampling.sample. But all the times, we use pymc.sample So, the problem is where to show documentation when functions/classes are also exposed at different namespaces. In this PR, the docs are shown where the functions and classes are implemented. I think the stable versioned docs also follow the same. |
Nothing is changed. Everything will be accessed the same way it was accessed before.
No. Just look up |
Thinking more of this -
I have a simple solution. The documentation of functions/classes are shown where they are implemented i.e. at complete paths. We can mention in API quickstart notebook that commonly used functions/classes are exposed at pymc namespace. Thoughts? |
@Sayam753 Yes, that seems like the only logical solution to me. |
I hadn't realized that our current docs do that. In my opinion, the number one priority of documentation is serving the users, and users do not care about where things are implemented, only how to use them. Automating everything is a good argument to document everything where it is implemented, but we should then try and document which functions are available as part of the pymc namespace. (looping over I think now with v4 it's particularly important to document what is available directly in pymc. as we have changed many things that are quite common like moving logp as a method to pymc.logp |
Great opinions @OriolAbril . I also agree to user-serving purpose of the documentation. So, I'll switch to show documentation for classes/functions at pymc namespace and look how to use sphinx for this. |
We might be able to combine the two approaches, have an extra page called 'pymc namespace"or something like this that only lists functions and links to their actual api page. My main concern is that if we only have complete import paths, what lives in the pymc namespace becomes undocumented, even though I think the main reason we expose everything there is to make user's life easier. As long as we manage to document it it's fine, it doesn't need to be necessarily showing the recommended import paths on the api page |
Just realized an extra drawback of documenting on where things are implemented and not where we expose them. I am setting up sphinx-codeautolink for pymc-examples, which automatically adds links from code blocks to their respective API pages. That means that all pymc funcitions will be clickable and will link to API docs. However, we use If you take a look at the preview at https://pymc-examples--239.org.readthedocs.build/en/239/splines/spline.html for example, you see how all libraries for which we have intersphinx configured (arviz, numpy, pandas, matplotlib) have their functions and classes linked (not methods though) but pymc objects don't. |
a6e8ad3
to
2c65116
Compare
I have been thinking a bit but don't really understand how i works yet and can't think of a good alternative. I did realize though that https://pymc-examples.readthedocs.io/en/latest/gaussian_processes/GP-Heteroskedastic.html#heteroskedastic-gp-latent-variance-model for example has working codeautolink for the gp.cov.ExpQuad which I think is a good example of what we could get if documenting everything at the recommended import path (note that the link isn't working right now but this has already been fixed in the development version of codeautolink). Would it be possible for example to use the exploring code to define the hierarchy but to document things only on some user provided set of modules? Those could be I also don't want anyone to dedicate too much time to this as writing those api pages is a bit of work, but is generally a once per major version thing that doesn't need to be updated unless new functions are added, and when it is updated is 1 line, 1 function name worth of characters only |
I need to lookup how recursive parsing is done in autosummary extension to answer this. |
Hi. I tried looking into autosummary codebase to figure out sidebar issues. I think it will require a lot more time for me to wrap my head around autosummary and sphinx. |
@OriolAbril What do you think we should do here? |
Now, I think more straight forward and less time consuming way would be to update the rST files manually with all the new stuff that got into v4. I can take a stab doing that early next week. And we can revisit here once if needed in future. |
Agreed, I can probably update #5059 directly with manual API files and make sure we use "recommended" import path. I might need help knowing what the "recommended" import path is in some cases though. |
closing for now, we'll discuss in the doc meeting how to divide the todos left from #5059 which include api docs |
Right now, we have many rST files here that use references to classes/functions to generate API documentation. It is hard to regularly sync these files with the codebase. So, this PR uses sphinx autosummary extension to automatically generate API documentation.
Details -
__all__
of any python module.__all__
to expose stuff, find here the Feature Request Issue and related PR. This feature will be available in next sphinx release. Till the time, I have added the PR changes tosphinx_autosummary/custom_generate.py
and its working.__all__
is not present in python module, then default behaviour is to expose all the functions/classes present in that module.docs/source/api
are deleted..gitignore
Note - This PR is targeted to
doc_skeleton_update
branch that @martinacantaro is actively working on. I thought to directly build on top of Martina's work.TODOs