Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add an option to put libraries into a library sub directory #5841

Open
wants to merge 5 commits into
base: develop
Choose a base branch
from

Conversation

mozbugbox
Copy link
Contributor

@mozbugbox mozbugbox commented May 18, 2021

Try to do issue #1048, #2468.

Add a option --lib-subdir libs option to pyi-makespec.

Known Issues:

  • Only tested on Linux
  • Cannot put the dir of a module into the library subdir. When numpy/ was put into the lib subdir, it cannot be properly imported. Don't know why.
  • libpythonXX.so and base_library.zip are still in the home dir.
  • the DYLD_LIBRARY_PATH is not touched on MacOS. It seemed there are some special way to treat DLL loading on Mac.

So basically, most dynamic modules are put into a lib sub directory. packages are left to the home directory.

@bwoodsend
Copy link
Member

Woah hoah, that's a bold one!

Only tested on Linux

Well we have CI/CD for that. Github have added this approve and run feature which means that, as you're a first time contributor, every time you want to run CI I have to click yes do it. Normally this is useful but I think it's going to be a pain here. You might find it easier to enable actions on your own fork then temporarily create a pull request to your fork's develop branch. That way, you CI will run on your fork without having to wait for me to tell it that it's ok to run.

Cannot put the dir of a module into the library subdir. When numpy/ was put into the lib subdir, it cannot be properly imported. Don't know why.

Is this specifically numpy or does it not work for even really simple libraries containing the odd data file? You'd need to add the subdirectory to sys.path at runtime (which could be done in here) but other than that it should be moveable.

the DYLD_LIBRARY_PATH is not touched on MacOS. It seemed there are some special way to treat DLL loading on Mac.

They are treated weirdly but I'd rather that they weren't. Every DLL gets monkey-patched in a way I can't remember (cc @rokm help me out here) and never really understood anyway.

@rokm
Copy link
Member

rokm commented May 18, 2021

This is indeed a bold move (especially for the first contribution), but the situation with shared libraries in PyInstaller is rather complicated...


So basically, most dynamic modules are put into a lib sub directory. packages are left to the home directory.

That's going to wreak havoc on macOS, because library paths of all shared libraries AND extensions (including those that remain in the package directories) are rewritten to be relative to the application root.

EDIT: or rather, if A (shared library or extension, potentially located anywhere) is linked against B (shared library collected in the application's root), A's link path to B is modified so that it points to B in application's root (_MEIPASS) but as relative path to the A's location.

That is indeed really ugly and inconvenient, but I think it was designed this way to deal with shared libraries that get pulled from non-package directories in site-packages (e.g., site-packages/blspy.libs). So the path rewriting code will need to be adjusted for the new location, but this can be done only if all libraries (except Python) are moved to the new location. (As a side note, we should eventually upgrade this path rewriting procedure with an ability to infer actual library locations from the TOC - then we could break free from the assumption that everything is in _MEIPASS, or in some other pre-determined location).


I suspect it will also break ctypes and possibly cffi in frozen application, because those also have special provisions for looking for dynamic libraries in the _MEIPASS .

And I wouldn't be surprised if there were other obscure corners of PyInstaller that depend on shared libraries being in the application's root directory...


I'm not particularly fond of having this as an option, either. Having two possible deployment layouts with potentially subtly different behaviors will be a nightmare to support. So either we keep collecting stuff into _MEIPASS, or we decide to revamp that, and live with it. Point in the case: the current batch of tests on CI is probably running without the new relocation, so no new issues will be uncovered.


All in all, I'd prefer if this started smaller. Perhaps with trying to relocate just the .pyd extensions on Windows. On linux and macOS, we already divert the python extensions coming from python's lib-dynload directory into _MEIPASS/lib-dynload subdirectory. Perhaps something similar could be done for .pyd extensions coming from python's DLLs directory? (Maybe move them into lib-dynload as well?)

@mozbugbox
Copy link
Contributor Author

The code strictly follows the steps described in issue #1048. So the changes are straight forward and small. It just did:

  • Add a lib_subdir option to EXE() and COLLECT(). In the COLLECT.assemble(), prepend the sub_dir to tofnm filename if the type of the TOC is either EXTENSION or BINARY.
  • In the bootloader, Adjust LD_LIBRARY_PATH and sys.path if the lib_subdir is set.
  • Add the cmdline option to the pyi-makespec and the related template file.

That's all about it.

As for the _MEIPASS. I think it's a one-file thing. The intend of this patch is for cleanup the home directory for the one-folder mode. Does _MEIPASS has much impact on one-folder mode?

The option does not have to take effect on Mac if it is too much a hassle.

@mozbugbox
Copy link
Contributor Author

And for the problem with relocating directories like numpy/ extension package, I've traced it to the FrozenImporter.exec_module() in pyimod03_importers.py:

# If `submodule_search_locations` is not None, this is a package;
# set __path__.
if spec.submodule_search_locations is not None:
# Since PYTHONHOME is set in bootloader, 'sys.prefix' points to
# the correct path where PyInstaller should find bundled dynamic
# libraries. In one-file mode it points to the tmp directory where
# bundled files are extracted at execution time.
#
# __path__ cannot be empty list because 'wx' module prepends
# something to it. It cannot contain value 'sys.prefix' because
# 'xml.etree.cElementTree' fails otherwise.
#
# Set __path__ to point to 'sys.prefix/package/subpackage'.
module.__path__ = [pyi_os_path.os_path_dirname(module.__file__)]

That __path__ thing is passed from the parent module to find_spec() of meta_path finders to find submodules. Since python modules are frozen and given the path of dirname(module.__file__), which for numpy means the home/module_path directory. Therefore the submodules of numpy are always searched under the home directory.

Change the __path__ to include home/lib_subdir/module_path makes relocate packages to a sub dir work alright:

            module_dir = pyi_os_path.os_path_dirname(module.__file__)
            path = pyi_os_path.os_path_join(sys.prefix, "libs")
            path = pyi_os_path.os_path_join(path, module_dir[len(sys.prefix)+1:])
            module.__path__ = [module_dir, path]

The "libs" is hard coded here. Is there anyway in the python bootloader part to fetch EXE OPTIONS? like the pyi_arch_get_option() rom the binary bootloader? Else the pyi-lib-subdir has to be passed through sys module?

@rokm
Copy link
Member

rokm commented May 20, 2021

The "libs" is hard coded here. Is there anyway in the python bootloader part to fetch EXE OPTIONS? like the pyi_arch_get_option() rom the binary bootloader? Else the pyi-lib-subdir has to be passed through sys module?

This would probably be best passed through sys module, same as how it is done for sys._MEIPASS (which denotes application's top-level directory in both onedir and onefile applications).

Copy link
Member

@htgoebel htgoebel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By chance I stepped over this pull-request and left some minor comments.

Please also mind adding documentation for the new command-line option and for sys.pyi_runtime_options.

if (typ in ('EXTENSION', 'BINARY') and self.lib_subdir and
not libpython_pat.match(inm) and not inm_dir):
tofnm = os.path.join(self.name, self.lib_subdir, inm)
todir = os.path.dirname(tofnm)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same code as above, line 745

@@ -748,6 +764,7 @@ def assemble(self):
upx_exclude=self.upx_exclude,
dist_nm=inm)
if typ != 'DEPENDENCY':
logger.debug(f"Copy file: {fnm} => {tofnm}")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don;t use f-strings for logging, as this will format the string even if the message is never logged. Use

                 logger.debug("Copy file: %r => %r", fnm, tofnm)

@@ -349,6 +349,8 @@ def __add_options(parser):
"binaries during compression. "
"FILE is the filename of the binary without path. "
"This option can be used multiple times.")
g.add_argument("--lib-subdir", dest="lib_subdir",
help="put library into a sub directory")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

library - singular?

Please be more verbose and ecplaing which libraries.

@@ -445,7 +447,7 @@ def __add_options(parser):


def main(scripts, name=None, onefile=None,
console=True, debug=None, strip=False, noupx=False, upx_exclude=None,
console=True, debug=None, strip=False, noupx=False, upx_exclude=None, lib_subdir=None,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Long line.

@@ -567,6 +569,7 @@ def main(scripts, name=None, onefile=None,
'upx': not noupx,
'upx_exclude': upx_exclude,
'runtime_tmpdir': runtime_tmpdir,
'lib_subdir': '"%s"' % lib_subdir if lib_subdir else lib_subdir,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is bound to fail as soon as the value contains a quote-character itself.
Why is the values put into quotes here at all? If this is requried for serialization, it should be done at the very place of serialization.

@@ -498,6 +531,9 @@ pyi_pylib_start_python(ARCHIVE_STATUS *status)
SetErrorMode(0);
#endif

VS("LOADER: Saving runtime options\n");
pyi_pylib_save_runtime_opts(status);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good idea. Anyho, I suggest moving this into a pull-request of its own and — more important — have a changelog entry for it.

value = space + 1;
} else {
key = strdup(ptoc->name);
value = "";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAIR, an option without value is a boolean option. So the option being present in TOC means the option is set. Thus I suggest setting the value to True (or at least to a non-empty string).

Comment on lines +541 to +543
path = []
module_dir = pyi_os_path.os_path_dirname(module.__file__)
path.append(module_dir)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

module_dir = pyi_os_path.os_path_dirname(module.__file__)
path = [module_dir]

Comment on lines +545 to +547
sub_path = sys.pyi_runtime_options["pyi-lib-subdir"]
sub_path = pyi_os_path.os_path_join(SYS_PREFIX, sub_path)
sub_path = pyi_os_path.os_path_join(sub_path, module_dir[SYS_PREFIXLEN:])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMHO this is easier to understand:

                sub_path = pyi_os_path.os_path_join(SYS_PREFIX,
                               pyi_os_path.os_path_join(sys.pyi_runtime_options["pyi-lib-subdir"], 
                                   module_dir[SYS_PREFIXLEN:])

if "pyi-lib-subdir" in sys.pyi_runtime_options:
sub_path = sys.pyi_runtime_options["pyi-lib-subdir"]
sub_path = pyi_os_path.os_path_join(SYS_PREFIX, sub_path)
sub_lib_dynload = pyi_os_path.os_path_join(sub_path, "lib-dynload")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here

@ImLvna
Copy link

ImLvna commented May 19, 2023

Is there any update on the progress of this?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants