Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Incomplete documentation for env_version #12237

Open
3 of 7 tasks
mgeier opened this issue Apr 7, 2024 · 7 comments
Open
3 of 7 tasks

Incomplete documentation for env_version #12237

mgeier opened this issue Apr 7, 2024 · 7 comments

Comments

@mgeier
Copy link
Contributor

mgeier commented Apr 7, 2024

Describe the bug

https://www.sphinx-doc.org/en/master/extdev/index.html#extension-metadata documents env_version.

How to Reproduce

Look at https://www.sphinx-doc.org/en/master/extdev/index.html#extension-metadata

@picnixz
Copy link
Member

picnixz commented Apr 10, 2024

@chrisjsewell You introduced the typed interface for that one so perhaps you could take that one?

@chrisjsewell
Copy link
Member

you introduced the typed interface for that one so perhaps you could take that one?

Well, that's one of the reasons I introduced it; because for a long time I had no idea env_version was even an option 😅
But indeed thank you for detailing this @mgeier, it would be good to make the documentation around it clearer

alternatively, a Domain could be used to store data in the environment.

I would note here, that I feel the purpose of a domain is more around handling referencing, and then the storage of data here is more of a "side-effect" of that; to store pointers to targets, etc
I don't believe you should use it to store arbitrary data.

@mgeier
Copy link
Contributor Author

mgeier commented Apr 10, 2024

I don't believe you should use it to store arbitrary data.

This kinda says the opposite: #9003 (comment)

Would you suggest adding arbitrary attributes to app.env instead?
Or do you have something even better?

And it's not only about storing data, it's also about making it work with parallel builds, which is actually hard, see e.g. #12251.

And to be fancy, it's about removing the appropriate data if a source file is removed (which the "todo" extension does but the "recipe" extension doesn't, AFAICT).

For a concrete example, see mgeier/sphinx-last-updated-by-git#66. I'm not sure if that's better or worse, BTW!

@chrisjsewell
Copy link
Member

This kinda says the opposite

well that is not consistent with what the actual source says:

class Domain:
"""
A Domain is meant to be a group of "object" description directives for
objects of a similar nature, and corresponding roles to create references to
them. Examples would be Python modules, classes, functions etc., elements
of a templating language, Sphinx roles and directives, etc.
Each domain has a separate storage for information about existing objects
and how to reference them in `self.data`, which must be a dictionary. It
also must implement several functions that expose the object information in
a uniform way to parts of Sphinx that allow the user to reference or search
for objects in a domain-agnostic way.
About `self.data`: since all object and cross-referencing information is
stored on a BuildEnvironment instance, the `domain.data` object is also
stored in the `env.domaindata` dict under the key `domain.name`. Before the
build process starts, every active domain is instantiated and given the
environment object; the `domaindata` dict must then either be nonexistent or
a dictionary whose 'version' key is equal to the domain class'
:attr:`data_version` attribute. Otherwise, `OSError` is raised and the
pickled environment is discarded.
"""

And it's not only about storing data, it's also about making it work with parallel builds, which is actually hard

We deal with this in sphinx-needs with parallel builds for large documentation sets, and it works quite well: https://github.com/useblocks/sphinx-needs/blob/0f71ecde47a3f7c3245c5bcf545ef5281376768c/sphinx_needs/needs.py#L266-L270

But I agree that it would be nice to have a "consistent" workflow for storing data (outside of references)

@mgeier
Copy link
Contributor Author

mgeier commented Apr 15, 2024

that is not consistent with what the actual source says

Well, the docstring you are quoting is nearly 15 years old, so maybe the usage recommendations of this have changed slightly in the meantime?

Maybe the docstring should be updated? I have added this to the task list above.

Is there anything except that docstring that makes you think that it is not appropriate to store arbitrary data in a Domain (as compared to adding arbitrary attributes to app.env)?

We deal with this in sphinx-needs with parallel builds for large documentation sets, and it works quite well: https://github.com/useblocks/sphinx-needs/blob/0f71ecde47a3f7c3245c5bcf545ef5281376768c/sphinx_needs/needs.py#L266-L270

Thanks for this example, this is really interesting, since it seems to be a perfect fit for a Domain!

The SphinxNeedsData class looks like it could be perfectly derived from Domain:
https://github.com/useblocks/sphinx-needs/blob/0f71ecde47a3f7c3245c5bcf545ef5281376768c/sphinx_needs/data.py#L399-L597

Now the merging and invalidation of data is scattered over different places, a Domain would allow everything to be implemented in one place.

As an added benefit, this domain could then be used to namespace the (many) roles/directives to get something like needs:table instead of needtable, which would arguably a bit cleaner.

I'm of course not suggesting to do that now, because it would be too disruptive, but for a new project I would definitely consider it.

Ironically, the extension is not returning env_version, which makes Sphinx believe that nothing is stored in app.env, AFAIU: https://github.com/useblocks/sphinx-needs/blob/0f71ecde47a3f7c3245c5bcf545ef5281376768c/sphinx_needs/needs.py#L302-L306
Adding env_version has already been suggested in useblocks/sphinx-needs#852 (comment)

@chrisjsewell
Copy link
Member

Oh I forgot there is something more in-line with what you want:

class EnvironmentCollector:

and can be added with

def add_env_collector(self, collector: type[EnvironmentCollector]) -> None:

you see it just links methods to event hooks though, ao nothing you couldn't implement on your own

so maybe the usage recommendations of this have changed slightly in the meantime?
Is there anything except that docstring that makes you think that it is not appropriate to store arbitrary data in a Domain

The whole of the Domain API is designed around storing and resolving references, this is the only thing it is used for in core sphinx, and there will likely be even more logic being added in this direction in the future: #8929

In addition, all roles are expected to be references to domain objects, and all directives are expected to create targets.

As I mentioned already, storing data is just a "side-effect" of reference resolution, not a primary goal of the Domain.

If all you want to do is store data, then you are bringing in a whole lot of things you probably don't need;
for example you literally have to implement a concrete resolve_any_xref method just to get it to work.

Adding env_version has already been suggested in useblocks/sphinx-needs#852 (comment)

yep by me 😄

@picnixz
Copy link
Member

picnixz commented Apr 16, 2024

Just my 2 cents:

  • I don't store information in domains, and rather store everything in the Environment.

Would you suggest adding arbitrary attributes to app.env instead?

In my own extension, I have a single dictionary which is storing every information I need. It works a bit like the registry on MS-DOS systems where you have namespaces + key/values and any other "component" can put whatever they want in that registry using their own registry key (a bit like the Stash in pytest actually).

I create that dictionary as a single attribute in app.env upon 'builder-inited' and merge information in 'env-merge-info'. I can easily access whatever information is being stored using something like app.env.registry[namespace][key].

does not set parallel_read_safe but I guess there is no reason for this extension to disable parallelization, right

Probably because we didn't want the tutorial to add some unknown logic.

... but it gives no indication how/why to do that.

There is actually a docstring about what the environment version is meant for:

# This is increased every time an environment attribute is added
# or changed to properly invalidate pickle files.
ENV_VERSION = 61

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants