Proposal: Document each member at the lowest-level submodule, with higher level modules pointing at submodule's primary declaration without duplicating them. #679

Open
aaronsteers opened this issue Apr 10, 2024 · 2 comments

Comments

aaronsteers commented Apr 10, 2024

Problem Description

I want to be able to import classes and functions into higher-level modules, for developer convenience, but without massive redundancy in where those classes are documented.

In other words, I want to reduce reader fatigue by giving the authoritative reference for any class or function in a single location - while other references point to that location.

Proposal

Whenever a class is imported from a submodule into a parent module, the parent module will "point to" the lower-level module, without redundantly documenting it.

Alternatives

I'm not aware of any alternative - although perhaps someone has something very clever in custom rendering templates...

Additional context

Take SQLAlchemy Engine - it can be imported as `from sqlalchemy import Engine` or `from sqlalchemy.engine import Engine`. One can make a case that classes which are used extremely often should be easily importable at higher levels in the module hierarchy.

However, with pdoc today, there isn't a way (that I'm aware of at least) to have an index of members at the level they are imported, while still reserving the deep and full documentation of those members for the submodules where they are declared.

One final layer of context: in modern Python applications, SDKs, and libraries, typing is increasingly a required practice. It is no longer sufficient to know only which functions or classes we call directly; we also need to know how to type the objects that are passed to and returned from those interfaces. Hence, convenience must be balanced with practicality and intuitive module design - and it is increasingly unlikely that each package member will be exposed at exactly one import path in any library. Rather than put full documentation at every node in the hierarchy, it would be a better end-user experience to have a single 'authoritative' location for each member, at the lowest level, with other convenience imports pointing at that member rather than redundantly describing it.

Implementation Option

To implement, a check could be added: if a member that would be documented in a parent module will also be rendered in one of its submodules, then rather than document the entire member, simply declare it and link to the submodule's reference.

This logic can be repeated recursively, down to the lowest-level declared and documented submodule (excluding private or un-rendered submodules).
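
As a rough illustration of the check described above (not pdoc's implementation), the declaring module of a class or function can usually be read from its `__module__` attribute. The helper name `canonical_location` is hypothetical, and the sketch assumes that a submodule is "public" when none of its path components starts with an underscore. The standard library's `json` package is used only as a convenient example of a parent module that re-exports members from submodules.

```python
from __future__ import annotations

import json
from types import ModuleType


def canonical_location(parent: ModuleType, name: str) -> str | None:
    """Return the dotted path of the submodule declaration that the parent's
    member `name` could point to, or None if the parent should document it.

    Hypothetical helper: relies on `__module__` and `__qualname__`, which
    classes and functions carry but plain constants do not.
    """
    obj = getattr(parent, name)
    declared_in = getattr(obj, "__module__", None)
    qualname = getattr(obj, "__qualname__", None)
    if declared_in is None or qualname is None:
        return None  # modules, constants, etc.: nothing to point at

    prefix = parent.__name__ + "."
    if not declared_in.startswith(prefix):
        return None  # declared in the parent itself or in another package

    # Assumption: a leading underscore marks a private, un-rendered submodule.
    relative_parts = declared_in[len(prefix):].split(".")
    if any(part.startswith("_") for part in relative_parts):
        return None

    return f"{declared_in}.{qualname}"


# `json.JSONDecoder` is declared in `json.decoder`; `json.dumps` in `json`.
print(canonical_location(json, "JSONDecoder"))
print(canonical_location(json, "dumps"))
```

Because `__module__` already names the module where the object was defined, no explicit recursion through intermediate re-exports is needed in this sketch.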

mhils (Member) commented Apr 10, 2024

This doesn't sound completely unreasonable, but it's also more complexity than I would like to adopt. You could maybe do something like this:

from . import engine

Engine = engine.Engine
"""@alias sqlalchemy.engine.Engine"""

and then do some custom template hackery that specializes on docstrings with @alias? I'm afraid though this is out of scope for pdoc itself.
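
For illustration only, the docstring side of that idea might be handled with two small helpers that a custom template could call. Both `parse_alias` and `alias_href` are hypothetical names, not part of pdoc. The sketch assumes the `@alias` marker is the entire docstring, that the last dotted component is the member name, and that each module is rendered to `<package>/<module>.html` with the member name as the anchor.

```python
from __future__ import annotations

import re

# Matches a docstring consisting solely of "@alias dotted.path.Name".
_ALIAS_RE = re.compile(r"^\s*@alias\s+([A-Za-z_][\w.]*)\s*$")


def parse_alias(docstring: str | None) -> str | None:
    """Return the alias target if the docstring is an @alias marker, else None."""
    if not docstring:
        return None
    match = _ALIAS_RE.match(docstring)
    return match.group(1) if match else None


def alias_href(target: str) -> str:
    """Build a relative link for a dotted target (assumed URL layout)."""
    module, _, member = target.rpartition(".")
    return f"{module.replace('.', '/')}.html#{member}"


target = parse_alias("@alias sqlalchemy.engine.Engine")
if target is not None:
    print(alias_href(target))  # sqlalchemy/engine.html#Engine
```

A template could then render a one-line pointer for members where `parse_alias` returns a target, and fall back to the normal rendering otherwise.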

aaronsteers (Author) commented Apr 12, 2024

@mhils - Thanks very much for your reply.

What do you think about a more conservative approach, where a top-level module could opt out of the "full" rendering of its classes and modules entirely - or opt to document all of its functions/classes as pointers to their import location?

Sample below from our PyAirbyte Docs implementation, where all of these imports are declared in __all__ to reduce verbosity of imports, but all are actually declared in public submodules. Also, in this case, I'm rendering README.md, which I can use to add other notes or references.

Defining classes and functions as aliases would be helpful here, in case the user doesn't know the authoritative lower-level import path. But again, we don't really want or need to give the full definitions here in this file, since we'll also be documenting them in the submodules.

Details

# Copyright (c) 2024 Airbyte, Inc., all rights reserved.
"""PyAirbyte brings Airbyte ELT to every Python developer.

.. include:: ../README.md

## API Reference

"""
from __future__ import annotations

from airbyte import caches, cloud, datasets, documents, exceptions, results, secrets, sources
from airbyte.caches.bigquery import BigQueryCache
from airbyte.caches.duckdb import DuckDBCache
from airbyte.caches.util import get_default_cache, new_local_cache
from airbyte.datasets import CachedDataset
from airbyte.records import StreamRecord
from airbyte.results import ReadResult
from airbyte.secrets import SecretSourceEnum, get_secret
from airbyte.sources import registry
from airbyte.sources.base import Source
from airbyte.sources.registry import get_available_connectors
from airbyte.sources.util import get_source


__all__ = [
    # Modules
    "cloud",
    "caches",
    "datasets",
    "documents",
    "exceptions",
    "records",
    "registry",
    "results",
    "secrets",
    "sources",
    # Factories
    "get_available_connectors",
    "get_default_cache",
    "get_secret",
    "get_source",
    "new_local_cache",
    # Classes
    "BigQueryCache",
    "CachedDataset",
    "DuckDBCache",
    "ReadResult",
    "SecretSourceEnum",
    "Source",
    "StreamRecord",
]

__docformat__ = "google"

Or is the technical challenge here that pdoc doesn't necessarily know whether a specific class/method is imported from a different module - and if so, from which one? If so, even an option to render submodules without classes/methods could be helpful, I think. It would then be up to me to create markdown content that lists/links them all - which is not too bad of a solution in my case.
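
A minimal sketch of that fallback, assuming the member list comes from `__all__` and that `__module__` identifies where each class or function was defined. The function name `member_index` and the output wording are hypothetical, and the standard library's `json` package stands in for a package with re-exported members. The generated text could be pasted into, or included from, the module docstring.

```python
from __future__ import annotations

import json
from types import ModuleType


def member_index(module: ModuleType) -> str:
    """Build a markdown bullet list pointing each `__all__` entry at the
    module where it was defined (hypothetical helper, not part of pdoc)."""
    lines = []
    for name in getattr(module, "__all__", []):
        obj = getattr(module, name, None)
        if isinstance(obj, ModuleType):
            # Submodules are listed by their full dotted name.
            lines.append(f"- `{name}` (module): see `{obj.__name__}`")
            continue
        source = getattr(obj, "__module__", None)
        if source and source != module.__name__:
            qualname = getattr(obj, "__qualname__", name)
            lines.append(f"- `{name}`: see `{source}.{qualname}`")
        else:
            # Defined here, or origin unknown (e.g. a plain constant).
            lines.append(f"- `{name}`")
    return "\n".join(lines)


print(member_index(json))
```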
