Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Error if an extensition has been compiled for several python version. #407

Open
nennigb opened this issue Jul 21, 2022 · 6 comments
Open
Labels
bug Something isn't working

Comments

@nennigb
Copy link

nennigb commented Jul 21, 2022

Expected Behavior

I am documenting a package containing fortran extension build with f2py/gfortran/gcc on linux.
Once the package is installed the root folder contains mypkg.cpython-39-x86_64-linux-gnu.so which is imported in the __init__.py file.
If I run pdoc, it works fine ;-)

# Here I use py39
pdoc3 --force --config latex_math=True --html pypolsys
/home/xyz/anaconda3/envs/work/lib/python3.9/site-packages/pdoc/__init__.py:754: UserWarning: Couldn't read PEP-224 variable docstrings from <Module 'pypolsys.polsys'>: source code not available
  m = Module(import_module(fullname),
html/pypolsys/index.html
html/pypolsys/polsys.html
html/pypolsys/test.html
html/pypolsys/utils.html
html/pypolsys/version.html

I am using several python version (the package are installed in editable mode) and I can have also mypkg.cpython-38-x86_64-linux-gnu.so or mypkg.cpython-310-x86_64-linux-gnu.so in the same place. Then pdoc crashes.

# Here I use py39
pdoc3 --force --config latex_math=True --html pypolsys
/home/xyz/anaconda3/envs/work/lib/python3.9/site-packages/pdoc/__init__.py:754: UserWarning: Couldn't read PEP-224 variable docstrings from <Module 'pypolsys.polsys'>: source code not available
  m = Module(import_module(fullname),
Traceback (most recent call last):
  File "/home/xyz/anaconda3/envs/work/lib/python3.9/site-packages/pdoc/__init__.py", line 222, in import_module
    module = importlib.import_module(module_path)
  File "/home/xyz/anaconda3/envs/work/lib/python3.9/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1030, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
  File "<frozen importlib._bootstrap>", line 981, in _find_and_load_unlocked
ModuleNotFoundError: No module named 'pypolsys.polsys.cpython-310-x86_64-linux-gnu'; 'pypolsys.polsys' is not a package

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/xyz/anaconda3/envs/work/bin/pdoc3", line 8, in <module>
    sys.exit(main())
  File "/home/xyz/anaconda3/envs/work/lib/python3.9/site-packages/pdoc/cli.py", line 534, in main
    modules = [pdoc.Module(module, docfilter=docfilter,
  File "/home/xyz/anaconda3/envs/work/lib/python3.9/site-packages/pdoc/cli.py", line 534, in <listcomp>
    modules = [pdoc.Module(module, docfilter=docfilter,
  File "/home/xyz/anaconda3/envs/work/lib/python3.9/site-packages/pdoc/__init__.py", line 754, in __init__
    m = Module(import_module(fullname),
  File "/home/xyz/anaconda3/envs/work/lib/python3.9/site-packages/pdoc/__init__.py", line 224, in import_module
    raise ImportError(f'Error importing {module!r}: {e.__class__.__name__}: {e}')
ImportError: Error importing 'pypolsys.polsys.cpython-310-x86_64-linux-gnu': ModuleNotFoundError: No module named 'pypolsys.polsys.cpython-310-x86_64-linux-gnu'; 'pypolsys.polsys' is not a package

It is noteworthy that arbitrary .so file like abcd.so is also a problem.
We could expect that pdoc ignore such additional .so file (filter by python version, just pick one, ...)

Steps to Reproduce

  1. Pick a python package with fortran extension
  2. Copy and rename the .so with another python version to add a second .so file
  3. Call pdoc3 mypkg

Additional info

  • pdoc version:
    pdoc 0.10.0
@kernc
Copy link
Member

kernc commented Jul 21, 2022

Since you seem to have already investigated it some, care to propose a fix here:

pdoc/pdoc/__init__.py

Lines 196 to 235 in 2cce30a

def import_module(module: Union[str, ModuleType],
*, reload: bool = False) -> ModuleType:
"""
Return module object matching `module` specification (either a python
module path or a filesystem path to file/directory).
"""
@contextmanager
def _module_path(module):
from os.path import abspath, dirname, isfile, isdir, split
path = '_pdoc_dummy_nonexistent'
module_name = inspect.getmodulename(module)
if isdir(module):
path, module = split(abspath(module))
elif isfile(module) and module_name:
path, module = dirname(abspath(module)), module_name
try:
sys.path.insert(0, path)
yield module
finally:
sys.path.remove(path)
if isinstance(module, Module):
module = module.obj
if isinstance(module, str):
with _module_path(module) as module_path:
try:
module = importlib.import_module(module_path)
except Exception as e:
raise ImportError(f'Error importing {module!r}: {e.__class__.__name__}: {e}')
assert inspect.ismodule(module)
# If this is pdoc itself, return without reloading. Otherwise later
# `isinstance(..., pdoc.Doc)` calls won't work correctly.
if reload and not module.__name__.startswith(__name__):
module = importlib.reload(module)
# We recursively reload all submodules, in case __all_ is used - cf. issue #264
for mod_key, mod in list(sys.modules.items()):
if mod_key.startswith(module.__name__):
importlib.reload(mod)
return module

or here:

pdoc/pdoc/__init__.py

Lines 719 to 768 in 2cce30a

# If the module is a package, scan the directory for submodules
if self.is_package:
def iter_modules(paths):
"""
Custom implementation of `pkgutil.iter_modules()`
because that one doesn't play well with namespace packages.
See: https://github.com/pypa/setuptools/issues/83
"""
from os.path import isdir, join
for pth in paths:
for file in os.listdir(pth):
if file.startswith(('.', '__pycache__', '__init__.py')):
continue
module_name = inspect.getmodulename(file)
if module_name:
yield module_name
if isdir(join(pth, file)) and '.' not in file:
yield file
for root in iter_modules(self.obj.__path__):
# Ignore if this module was already doc'd.
if root in self.doc:
continue
# Ignore if it isn't exported
if not _is_public(root) and not _is_whitelisted(root, self):
continue
if _is_blacklisted(root, self):
self._skipped_submodules.add(root)
continue
assert self.refname == self.name
fullname = f"{self.name}.{root}"
try:
m = Module(import_module(fullname),
docfilter=docfilter, supermodule=self,
context=self._context, skip_errors=skip_errors)
except Exception as ex:
if skip_errors:
warn(str(ex), Module.ImportWarning)
continue
raise
self.doc[root] = m
# Skip empty namespace packages because they may
# as well be other auxiliary directories
if m.is_namespace and not m.doc:
del self.doc[root]
self._context.pop(m.refname)

?

@kernc kernc added the bug Something isn't working label Jul 21, 2022
@kernc
Copy link
Member

kernc commented Jul 21, 2022

It looks like something in iter_modules(). inspect.getmodulename(file) recognizes those .so files as valid modules, so the function offers them for importing ... 🤔

@nennigb
Copy link
Author

nennigb commented Jul 21, 2022

yes, module_name = inspect.getmodulename(file) returns a module name
in this function there is a list of all possible suffix given by importlib.machinery.all_suffixes()

['.py', '.pyc', '.cpython-39-x86_64-linux-gnu.so', '.abi3.so', '.so']

when processing the true extension, the suffix '.cpython-39-x86_64-linux-gnu.so' is detected and remove, but when we have the wrong python version, the '.so' is caught and only the '.so' is removed leading to a wrong module name...

Perhaps we may add a extra check to see if the module_name is valid. For instance here it contains .. Not sure if it should be always true at this stage although it is a loop module files...
Another approach is to silent the path import error while keeping invalid python module import error...

I also made a mistake since an arbitrary abcd.so also raise an error (the file was originality in the wrong place :-( I edit my previous message).

@kernc
Copy link
Member

kernc commented Jul 21, 2022

I'd rather something more strict like:

if '.cpython-' in module_name:
    continue

PR welcome.
I see with --skip-errors CLI switch, one can already turn this into a warning.

pdoc/pdoc/__init__.py

Lines 753 to 761 in 2cce30a

try:
m = Module(import_module(fullname),
docfilter=docfilter, supermodule=self,
context=self._context, skip_errors=skip_errors)
except Exception as ex:
if skip_errors:
warn(str(ex), Module.ImportWarning)
continue
raise

@kernc
Copy link
Member

kernc commented Jul 21, 2022

Can you show example of an error with abcd.so file?

@nennigb
Copy link
Author

nennigb commented Jul 21, 2022

if '.cpython-' in module_name:
   continue

Agree, it will strongly limit the side effect, but I not sure if it will work on other platform. I will have a look.

If abcd.so is an empty file, the error is

    raise ImportError(f'Error importing {module!r}: {e.__class__.__name__}: {e}')
ImportError: Error importing 'pypolsys.abcd': ImportError: /home/xyz/Recherche/CODE/polsysplp/pypolsys/abcd.so: file too short

if abscd.so is a copy of a valid module extension with bad naming convention, the error is

    raise ImportError(f'Error importing {module!r}: {e.__class__.__name__}: {e}')
ImportError: Error importing 'pypolsys.abcd': ImportError: dynamic module does not define module export function (PyInit_abcd)

if the so file is a valid, non extension so file

    raise ImportError(f'Error importing {module!r}: {e.__class__.__name__}: {e}')
ImportError: Error importing 'pypolsys.libstdc++': ModuleNotFoundError: No module named 'pypolsys.libstdc++'

I will be happy to contribute.

ps : I miss the --skip-errors switch !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Development

No branches or pull requests

2 participants