os.environ didn't return unicode after compiled #7989

Open
T-256 opened this issue Oct 7, 2023 · 18 comments
@T-256

T-256 commented Oct 7, 2023

image

My app depends on the APPDATA environment variable, and it crashes on systems that have a Unicode username:

  File "C:\Users\0254~1\Desktop\pathlib.py", line 1166, in resolve
  File "C:\Users\0254~1\Desktop\pathlib.py", line 205, in resolve
OSError: [WinError 123] The filename, directory name, or volume label syntax is incorrect: 'C:\\Users\\???\\AppData\\Roaming'

Note that it passes question marks instead of the Unicode characters.

@T-256 T-256 added the triage Please triage and relabel this issue label Oct 7, 2023
@T-256 T-256 changed the title os.environ don't return unicode after compiled os.environ didn't return unicode after compiled Oct 7, 2023
@rokm
Member

rokm commented Oct 7, 2023

It seems to work as expected on my test Windows 10 system, with Python 3.12.0 (32-bit installer from Python.org) and PyInstaller 5.13.2 (since the end of your build log suggests you are using PyInstaller < 6.0.0). I copied your username ممد from the log in the (unrelated) Nuitka issue that you opened.

Screenshot from 2023-10-07 11-56-22

Can you run my version of the program

program.py
import os
import sys
import pathlib

appdata = os.environ["APPDATA"]
print(f"appdata = {appdata!r}")
print(f"appdata len = {len(appdata)!r}")
print(f"appdata encoded = {appdata.encode()!r}")
print(f"appdata encoded len = {len(appdata.encode())!r}")

print("")
print(f"sys.flags = {sys.flags!r}")
print(f"sys.getdefaultencoding() = {sys.getdefaultencoding()!r}")
print(f"sys.getfilesystemencoding() = {sys.getfilesystemencoding()!r}")
print(f"sys.getfilesystemencodeerrors() = {sys.getfilesystemencodeerrors()!r}")
 
print("")
appdata_path = pathlib.Path(appdata).resolve()
print(f"contents of {appdata_path!r}")
for entry in appdata_path.iterdir():
    print(f" {entry!r}")

so we can see the detailed information?

@rokm
Member

rokm commented Oct 7, 2023

Also, does building with PyInstaller v6 make any difference?

@rokm rokm added the state:need info Need more information for solve or help. label Oct 7, 2023
@T-256
Author

T-256 commented Oct 7, 2023

I'm using win10-1507, official cpython 3.12 32bit

C:\Users\ممد\Python312-32>python -m PyInstaller app.py
453 INFO: PyInstaller: 6.0.0
453 INFO: Python: 3.12.0
484 INFO: Platform: Windows-10-10.0.10240-SP0
500 INFO: wrote C:\Users\ممد\Python312-32\app.spec
500 INFO: Extending PYTHONPATH with paths
['C:\\Users\\ممد\\Python312-32']
859 INFO: checking Analysis
859 INFO: Building Analysis because Analysis-00.toc is non existent
859 INFO: Initializing module dependency graph...
859 INFO: Caching module graph hooks...
890 INFO: Analyzing base_library.zip ...
2827 INFO: Loading module hook 'hook-heapq.py' from 'C:\\Users\\ممد\\Python312-32\\Lib\\site-packages\\PyInstaller\\hooks'...
3000 INFO: Loading module hook 'hook-encodings.py' from 'C:\\Users\\ممد\\Python312-32\\Lib\\site-packages\\PyInstaller\\hooks'...
6452 INFO: Loading module hook 'hook-pickle.py' from 'C:\\Users\\ممد\\Python312-32\\Lib\\site-packages\\PyInstaller\\hooks'...
8749 INFO: Caching module dependency graph...
8937 INFO: Running Analysis Analysis-00.toc
8937 INFO: Looking for Python shared library...
8937 INFO: Using Python shared library: C:\Users\ممد\Python312-32\python312.dll
8937 INFO: Analyzing C:\Users\ممد\Python312-32\app.py
8953 INFO: Processing module hooks...
8984 INFO: Looking for ctypes DLLs
8984 INFO: Analyzing run-time hooks ...
9000 INFO: Including run-time hook 'C:\\Users\\ممد\\Python312-32\\Lib\\site-packages\\PyInstaller\\hooks\\rthooks\\pyi_rth_inspect.py'
9000 INFO: Looking for dynamic libraries
9187 INFO: Extra DLL search directories (AddDllDirectory): []
9187 INFO: Extra DLL search directories (PATH): []
9312 INFO: Warnings written to C:\Users\ممد\Python312-32\build\app\warn-app.txt
9343 INFO: Graph cross-reference written to C:\Users\ممد\Python312-32\build\app\xref-app.html
9375 INFO: checking PYZ
9375 INFO: Building PYZ because PYZ-00.toc is non existent
9375 INFO: Building PYZ (ZlibArchive) C:\Users\ممد\Python312-32\build\app\PYZ-00.pyz
9750 INFO: Building PYZ (ZlibArchive) C:\Users\ممد\Python312-32\build\app\PYZ-00.pyz completed successfully.
9765 INFO: checking PKG
9765 INFO: Building PKG because PKG-00.toc is non existent
9765 INFO: Building PKG (CArchive) app.pkg
9781 INFO: Building PKG (CArchive) app.pkg completed successfully.
9796 INFO: Bootloader C:\Users\ممد\Python312-32\Lib\site-packages\PyInstaller\bootloader\Windows-32bit-intel\run.exe
9796 INFO: checking EXE
9796 INFO: Building EXE because EXE-00.toc is non existent
9796 INFO: Building EXE from EXE-00.toc
9796 INFO: Copying bootloader EXE to C:\Users\ممد\Python312-32\build\app\app.exe
9796 INFO: Copying icon to EXE
9890 INFO: Copying 0 resources to EXE
9890 INFO: Embedding manifest in EXE
10015 INFO: Appending PKG archive to EXE
10015 INFO: Fixing EXE headers
10343 INFO: Building EXE from EXE-00.toc completed successfully.
10343 INFO: checking COLLECT
10343 INFO: Building COLLECT because COLLECT-00.toc is non existent
10343 INFO: Building COLLECT COLLECT-00.toc
10515 INFO: Building COLLECT COLLECT-00.toc completed successfully.

C:\Users\ممد\Python312-32>python app.py
appdata = 'C:\\Users\\ممد\\AppData\\Roaming'
appdata len = 28
appdata encoded = b'C:\\Users\\\xd9\x85\xd9\x85\xd8\xaf\\AppData\\Roaming'
appdata encoded len = 31

sys.flags = sys.flags(debug=0, inspect=0, interactive=0, optimize=0, dont_write_bytecode=0, no_user_site=0, no_site=0, ignore_environment=0, verbose=0, bytes_warning=0, quiet=0, hash_randomization=1, isolated=0, dev_mode=False, utf8_mode=1, warn_default_encoding=0, safe_path=False, int_max_str_digits=4300)
sys.getdefaultencoding() = 'utf-8'
sys.getfilesystemencoding() = 'utf-8'
sys.getfilesystemencodeerrors() = 'surrogatepass'

contents of WindowsPath('C:/Users/ممد/AppData/Roaming')
 WindowsPath('C:/Users/ممد/AppData/Roaming/Adobe')
 WindowsPath('C:/Users/ممد/AppData/Roaming/Microsoft')

C:\Users\ممد\Python312-32>dist\app\app.exe
appdata = 'C:\\Users\\???\\AppData\\Roaming'
appdata len = 28
appdata encoded = b'C:\\Users\\???\\AppData\\Roaming'
appdata encoded len = 28

sys.flags = sys.flags(debug=0, inspect=0, interactive=0, optimize=0, dont_write_bytecode=1, no_user_site=1, no_site=1, ignore_environment=1, verbose=0, bytes_warning=0, quiet=0, hash_randomization=1, isolated=1, dev_mode=False, utf8_mode=0, warn_default_encoding=0, safe_path=True, int_max_str_digits=4300)
sys.getdefaultencoding() = 'utf-8'
sys.getfilesystemencoding() = 'utf-8'
sys.getfilesystemencodeerrors() = 'surrogatepass'

contents of WindowsPath('C:/Users/???/AppData/Roaming')
Traceback (most recent call last):
  File "app.py", line 20, in <module>
    for entry in appdata_path.iterdir():
  File "pathlib.py", line 1057, in iterdir
OSError: [WinError 123] The filename, directory name, or volume label syntax is incorrect: 'C:\\Users\\???\\AppData\\Roaming'
[3048] Failed to execute script 'app' due to unhandled exception!
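
The two runs above differ only in the three non-ASCII characters of the username. The following is a minimal sketch of what the numbers suggest, under the assumption (not confirmed by the logs) that the frozen build received the value through a lossy conversion to a legacy code page. The choice of cp437 is only an example of a code page without Arabic letters.

```python
# Path as reported by the non-frozen interpreter in the log above.
appdata = "C:\\Users\\ممد\\AppData\\Roaming"

# UTF-8 needs two bytes for each of the three Arabic letters: 28 + 3 = 31.
utf8_bytes = appdata.encode("utf-8")

# Assumption: the target code page has no Arabic letters (cp437 is an example).
# Python's "replace" error handler substitutes "?" for unencodable characters,
# which resembles what a lossy Windows conversion produces.
lossy_bytes = appdata.encode("cp437", errors="replace")

print(len(appdata), len(utf8_bytes), len(lossy_bytes))
print(lossy_bytes)
```

Once the characters have been replaced by "?", the original name cannot be recovered from the string, which is consistent with the frozen run reporting 28 for both lengths.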

@T-256
Author

T-256 commented Oct 7, 2023

With utf8_mode=1 forced in app.spec:

# -*- mode: python ; coding: utf-8 -*-


a = Analysis(
    ['app.py'],
    pathex=[],
    binaries=[],
    datas=[],
    hiddenimports=[],
    hookspath=[],
    hooksconfig={},
    runtime_hooks=[],
    excludes=[],
    noarchive=False,
)
pyz = PYZ(a.pure)

exe = EXE(
    pyz,
    a.scripts,
    [('X utf8_mode=1', None, 'OPTION')],
    exclude_binaries=True,
    name='app',
    debug=False,
    bootloader_ignore_signals=False,
    strip=False,
    upx=True,
    console=True,
    disable_windowed_traceback=False,
    argv_emulation=False,
    target_arch=None,
    codesign_identity=None,
    entitlements_file=None,
)
coll = COLLECT(
    exe,
    a.binaries,
    a.datas,
    strip=False,
    upx=True,
    upx_exclude=[],
    name='app',
)

it seems that utf8_mode still remains 0:

C:\Users\ممد\Python312-32>python -m PyInstaller app.spec
469 INFO: PyInstaller: 6.0.0
469 INFO: Python: 3.12.0
515 INFO: Platform: Windows-10-10.0.10240-SP0
515 INFO: Extending PYTHONPATH with paths
['C:\\Users\\ممد\\Python312-32']
890 INFO: checking Analysis
890 INFO: Building Analysis because Analysis-00.toc is non existent
890 INFO: Initializing module dependency graph...
890 INFO: Caching module graph hooks...
937 INFO: Analyzing base_library.zip ...
3219 INFO: Loading module hook 'hook-encodings.py' from 'C:\\Users\\ممد\\Python312-32\\Lib\\site-packages\\PyInstaller\\hooks'...
4359 INFO: Loading module hook 'hook-heapq.py' from 'C:\\Users\\ممد\\Python312-32\\Lib\\site-packages\\PyInstaller\\hooks'...
6125 INFO: Loading module hook 'hook-pickle.py' from 'C:\\Users\\ممد\\Python312-32\\Lib\\site-packages\\PyInstaller\\hooks'...
8515 INFO: Caching module dependency graph...
8703 INFO: Running Analysis Analysis-00.toc
8703 INFO: Looking for Python shared library...
8703 INFO: Using Python shared library: C:\Users\ممد\Python312-32\python312.dll
8703 INFO: Analyzing C:\Users\ممد\Python312-32\app.py
8719 INFO: Processing module hooks...
8750 INFO: Looking for ctypes DLLs
8766 INFO: Analyzing run-time hooks ...
8766 INFO: Including run-time hook 'C:\\Users\\ممد\\Python312-32\\Lib\\site-packages\\PyInstaller\\hooks\\rthooks\\pyi_rth_inspect.py'
8781 INFO: Looking for dynamic libraries
8984 INFO: Extra DLL search directories (AddDllDirectory): []
8984 INFO: Extra DLL search directories (PATH): []
9125 INFO: Warnings written to C:\Users\ممد\Python312-32\build\app\warn-app.txt
9156 INFO: Graph cross-reference written to C:\Users\ممد\Python312-32\build\app\xref-app.html
9203 INFO: checking PYZ
9203 INFO: Building PYZ because PYZ-00.toc is non existent
9203 INFO: Building PYZ (ZlibArchive) C:\Users\ممد\Python312-32\build\app\PYZ-00.pyz
9687 INFO: Building PYZ (ZlibArchive) C:\Users\ممد\Python312-32\build\app\PYZ-00.pyz completed successfully.
9703 INFO: checking PKG
9703 INFO: Building PKG because PKG-00.toc is non existent
9703 INFO: Building PKG (CArchive) app.pkg
9718 INFO: Building PKG (CArchive) app.pkg completed successfully.
9734 INFO: Bootloader C:\Users\ممد\Python312-32\Lib\site-packages\PyInstaller\bootloader\Windows-32bit-intel\run.exe
9734 INFO: checking EXE
9750 INFO: Building EXE because EXE-00.toc is non existent
9750 INFO: Building EXE from EXE-00.toc
9750 INFO: Copying bootloader EXE to C:\Users\ممد\Python312-32\build\app\app.exe
9765 INFO: Copying icon to EXE
9891 INFO: Copying 0 resources to EXE
9891 INFO: Embedding manifest in EXE
10000 INFO: Appending PKG archive to EXE
10000 INFO: Fixing EXE headers
10437 INFO: Building EXE from EXE-00.toc completed successfully.
10437 INFO: checking COLLECT
10437 INFO: Building COLLECT because COLLECT-00.toc is non existent
10437 INFO: Building COLLECT COLLECT-00.toc
10594 INFO: Building COLLECT COLLECT-00.toc completed successfully.

C:\Users\ممد\Python312-32>dist\app\app.exe
appdata = 'C:\\Users\\???\\AppData\\Roaming'
appdata len = 28
appdata encoded = b'C:\\Users\\???\\AppData\\Roaming'
appdata encoded len = 28

sys.flags = sys.flags(debug=0, inspect=0, interactive=0, optimize=0, dont_write_bytecode=1, no_user_site=1, no_site=1, ignore_environment=1, verbose=0, bytes_warning=0, quiet=0, hash_randomization=1, isolated=1, dev_mode=False, utf8_mode=0, warn_default_encoding=0, safe_path=True, int_max_str_digits=4300)
sys.getdefaultencoding() = 'utf-8'
sys.getfilesystemencoding() = 'utf-8'
sys.getfilesystemencodeerrors() = 'surrogatepass'

contents of WindowsPath('C:/Users/???/AppData/Roaming')
Traceback (most recent call last):
  File "app.py", line 20, in <module>
    for entry in appdata_path.iterdir():
  File "pathlib.py", line 1057, in iterdir
OSError: [WinError 123] The filename, directory name, or volume label syntax is incorrect: 'C:\\Users\\???\\AppData\\Roaming'
[312] Failed to execute script 'app' due to unhandled exception!

C:\Users\ممد\Python312-32>

@rokm
Member

rokm commented Oct 7, 2023

with force utf8_mode=1 to app.spec:

Yeah, I just noticed the examples in documentation are wrong - it should be just utf8 instead of utf8_mode:

    [('X utf8=1', None, 'OPTION')],

or even just

    [('X utf8', None, 'OPTION')],

Does that make a difference?
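
To confirm whether the option actually took effect in the built application, a small check like the one below could be added to the program. It is only a sketch: the function name `runtime_encoding_report` is made up for illustration, and it reports the same values the diagnostic script in this thread already prints.

```python
import sys


def runtime_encoding_report():
    """Collect the interpreter settings relevant to this issue."""
    return {
        # PyInstaller sets sys.frozen inside the bundled application.
        "frozen": bool(getattr(sys, "frozen", False)),
        # 1 when UTF-8 mode was enabled (for example via the X utf8 option).
        "utf8_mode": bool(sys.flags.utf8_mode),
        "filesystem_encoding": sys.getfilesystemencoding(),
    }


print(runtime_encoding_report())
```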

@T-256
Author

T-256 commented Oct 7, 2023

Does that make a difference?

C:\Users\ممد\Python312-32>dist\app\app.exe
appdata = 'C:\\Users\\???\\AppData\\Roaming'
appdata len = 28
appdata encoded = b'C:\\Users\\???\\AppData\\Roaming'
appdata encoded len = 28

sys.flags = sys.flags(debug=0, inspect=0, interactive=0, optimize=0, dont_write_bytecode=1, no_user_site=1, no_site=1, ignore_environment=1, verbose=0, bytes_warning=0, quiet=0, hash_randomization=1, isolated=1, dev_mode=False, utf8_mode=1, warn_default_encoding=0, safe_path=True, int_max_str_digits=4300)
sys.getdefaultencoding() = 'utf-8'
sys.getfilesystemencoding() = 'utf-8'
sys.getfilesystemencodeerrors() = 'surrogatepass'

contents of WindowsPath('C:/Users/???/AppData/Roaming')
Traceback (most recent call last):
  File "app.py", line 20, in <module>
    for entry in appdata_path.iterdir():
  File "pathlib.py", line 1057, in iterdir
OSError: [WinError 123] The filename, directory name, or volume label syntax is incorrect: 'C:\\Users\\???\\AppData\\Roaming'
[3024] Failed to execute script 'app' due to unhandled exception!

@rokm
Member

rokm commented Oct 7, 2023

Can you upload your build somewhere so I can try running it in my test environment? This way, we can rule out differences in the build, which would likely mean that the different behavior is caused by some local system setting.

@rokm
Member

rokm commented Oct 7, 2023

Also, do you explicitly enable utf8 mode in your python? Do you have PYTHON* environment variables set?

@T-256
Author

T-256 commented Oct 7, 2023

Can you upload your build somewhere so I can try running it in my test environment?

app.zip

which would likely mean that the different behavior is caused by some local system setting.

I think it's because of that OS. I tested the exact same build on win11-22h2 and it works with these flags (no other changes):

sys.flags = sys.flags(debug=0, inspect=0, interactive=0, optimize=0, dont_write_bytecode=1, no_user_site=1, no_site=1, ignore_environment=1, verbose=0, bytes_warning=0, quiet=0, hash_randomization=1, isolated=1, dev_mode=False, utf8_mode=1, warn_default_encoding=0, safe_path=True, int_max_str_digits=4300)
sys.getdefaultencoding() = 'utf-8'
sys.getfilesystemencoding() = 'utf-8'
sys.getfilesystemencodeerrors() = 'surrogatepass'

Also, do you explicitly enable utf8 mode in your python? Do you have PYTHON* environment variables set?

No. It's a clean install on a VM. Env vars:

C:\Users\ممد\Python312-32>set
ALLUSERSPROFILE=C:\ProgramData
APPDATA=C:\Users\ممد\AppData\Roaming
CommonProgramFiles=C:\Program Files\Common Files
CommonProgramFiles(x86)=C:\Program Files (x86)\Common Files
CommonProgramW6432=C:\Program Files\Common Files
COMPUTERNAME=DESKTOP-8VBGTVS
ComSpec=C:\Windows\system32\cmd.exe
HOMEDRIVE=C:
HOMEPATH=\Users\ممد
LOCALAPPDATA=C:\Users\ممد\AppData\Local
LOGONSERVER=\\DESKTOP-8VBGTVS
NUMBER_OF_PROCESSORS=4
OS=Windows_NT
Path=C:\Windows\system32;C:\Windows;C:\Windows\System32\Wbem;C:\Windows\System32\WindowsPowerShell\v1.0\
PATHEXT=.COM;.EXE;.BAT;.CMD;.VBS;.VBE;.JS;.JSE;.WSF;.WSH;.MSC
PROCESSOR_ARCHITECTURE=AMD64
PROCESSOR_IDENTIFIER=AMD64 Family 25 Model 80 Stepping 0, AuthenticAMD
PROCESSOR_LEVEL=25
PROCESSOR_REVISION=5000
ProgramData=C:\ProgramData
ProgramFiles=C:\Program Files
ProgramFiles(x86)=C:\Program Files (x86)
ProgramW6432=C:\Program Files
PROMPT=$P$G
PSModulePath=C:\Windows\system32\WindowsPowerShell\v1.0\Modules\
PUBLIC=C:\Users\Public
SESSIONNAME=Console
SystemDrive=C:
SystemRoot=C:\Windows
TEMP=C:\Users\0254~1\AppData\Local\Temp
TMP=C:\Users\0254~1\AppData\Local\Temp
USERDOMAIN=DESKTOP-8VBGTVS
USERDOMAIN_ROAMINGPROFILE=DESKTOP-8VBGTVS
USERNAME=ممد
USERPROFILE=C:\Users\ممد
windir=C:\Windows

@rokm
Member

rokm commented Oct 7, 2023

I think it's because of that OS. I tested the exact same build on win11-22h2 and it works with these flags (no other changes):

Indeed, it works for me as well. Up-to-date win10 (19045.3516).

What's the version of the system where it doesn't work?

@T-256
Author

T-256 commented Oct 7, 2023

Indeed, it works for me as well.

So, if the issue is with the OS, the main question is: why does the Python interpreter work on it?

What's the version of the system where it doesn't work?

Windows 10 1507 (Build 10240), 64-bit.
It's old, and I think it's the first version of Windows 10.

@rokm
Member

rokm commented Oct 7, 2023

So, if the issue is with the OS, the main question is: why does the Python interpreter work on it?

No idea, really. I'll have to test with 1507, and if I can reproduce the issue, see what is actually going on.

@T-256
Author

T-256 commented Oct 7, 2023

FWIW, I tried interacting directly with kernel32:

import os
import sys
import ctypes
import pathlib


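def get_env(name):
    # Read the variable through the wide-character (UTF-16) Win32 API
    # instead of going through os.environ.
    kernel32 = ctypes.windll.kernel32
    # With a NULL buffer, the call returns the required size in characters,
    # including the terminating NUL; 0 is returned on failure
    # (e.g. when the variable does not exist).
    n = kernel32.GetEnvironmentVariableW(name, None, 0)
    if n == 0:
        return None
    buf = ctypes.create_unicode_buffer(n)
    kernel32.GetEnvironmentVariableW(name, buf, n)
    return buf.value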


appdata = os.environ["APPDATA"]
print(f"appdata = {appdata!r}")
print(f"appdata len = {len(appdata)!r}")
print(f"appdata encoded = {appdata.encode()!r}")
print(f"appdata encoded len = {len(appdata.encode())!r}")

print()
print(f"sys.flags = {sys.flags!r}")
print(f"sys.getdefaultencoding() = {sys.getdefaultencoding()!r}")
print(f"sys.getfilesystemencoding() = {sys.getfilesystemencoding()!r}")
print(f"sys.getfilesystemencodeerrors() = {sys.getfilesystemencodeerrors()!r}")

print()
appdata = get_env("APPDATA")
print(f"appdata = {appdata!r}")
print(f"appdata len = {len(appdata)!r}")
print(f"appdata encoded = {appdata.encode()!r}")
print(f"appdata encoded len = {len(appdata.encode())!r}")

print()
appdata_path = pathlib.Path(appdata).resolve()
print(f"contents of {appdata_path!r}")
for entry in appdata_path.iterdir():
    print(f" {entry!r}")

image

I don't understand why the characters are displayed differently, but the good news is that it works.
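
Building on the kernel32 experiment above, an application-side workaround could fall back to the wide-character API only when the value from os.environ looks damaged. This is a hedged sketch, not something proposed by the maintainers: the helper names `looks_lossy`, `get_env_wide` and `getenv_robust` are hypothetical, and the "?" heuristic relies on "?" not being a valid character in Windows file names (apart from the `\\?\` extended-length prefix).

```python
import ctypes
import os
import sys


def looks_lossy(value):
    """Return True if value contains '?' (ignoring an extended-length prefix)."""
    if value.startswith("\\\\?\\"):
        value = value[4:]
    return "?" in value


def get_env_wide(name):
    """Read a variable through GetEnvironmentVariableW (Windows only)."""
    func = ctypes.WinDLL("kernel32").GetEnvironmentVariableW
    func.argtypes = [ctypes.c_wchar_p, ctypes.c_wchar_p, ctypes.c_uint32]
    func.restype = ctypes.c_uint32
    size = func(name, None, 0)  # required size, including the NUL
    if size == 0:
        return None
    buf = ctypes.create_unicode_buffer(size)
    func(name, buf, size)
    return buf.value


def getenv_robust(name):
    """Prefer os.environ, but re-read suspicious values via the Win32 API."""
    value = os.environ.get(name)
    if sys.platform == "win32" and value is not None and looks_lossy(value):
        wide = get_env_wide(name)
        if wide is not None:
            return wide
    return value


print(getenv_robust("APPDATA"))
```

On platforms other than Windows the function simply returns the os.environ value, so the same code can run unchanged everywhere.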

@T-256
Author

T-256 commented Oct 7, 2023

I also tried to decode b'C:\\Users\\\xd9\x85\xd9\x85\xd8\xaf\\AppData\\Roaming' with all available encodings and compared the results visually to see which ones print the same output as the compiled app. Here is the result:
cp720, cp860, cp861, cp863, oem

Perhaps something in the compiled app is using one of these encodings.
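
The brute-force comparison described above can be sketched roughly as follows. Taking the codec list from `encodings.aliases` is an assumption about what "all available encodings" means; platform-specific codecs such as `oem` may need to be added by hand, and the function name `decode_with_all_codecs` is made up for illustration.

```python
import encodings.aliases

# UTF-8 bytes of the path, as printed by the non-frozen interpreter.
raw = b"C:\\Users\\\xd9\x85\xd9\x85\xd8\xaf\\AppData\\Roaming"


def decode_with_all_codecs(data):
    """Try every codec known to the alias table and keep the successes."""
    results = {}
    for codec in sorted(set(encodings.aliases.aliases.values())):
        try:
            results[codec] = data.decode(codec)
        except (LookupError, ValueError):
            # Not a text encoding, not available on this platform,
            # or the bytes are invalid in this encoding.
            continue
    return results


decoded = decode_with_all_codecs(raw)
for codec in ("utf_8", "cp437"):
    print(codec, repr(decoded[codec]))
```

The resulting dictionary can then be compared against the garbled output by eye, or filtered programmatically for a specific rendering.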

UPDATE

By executing chcp, I found that the shell uses the OEM encoding with code page 437.

Where does the compiled app decide to use the OEM encoding? I tried printing sys.std*.encoding, and all of them were set to utf-8.

I also tried chcp 65001, which fixed the visual noise in the get_env output, but os.environ still doesn't return the Unicode characters.

@rokm
Member

rokm commented Oct 7, 2023

From what I can tell, the behavior difference between Python and the frozen application comes from the way the bootloader is compiled (which makes sense, because the bootloader itself does not touch APPDATA or other system environment variables).

The long story is, I set up a 1507 machine for test, and could not install VS2022 in it ("Unsupported Windows version"), so I decided to go with msys2; there, I had to decide between MINGW64 and UCRT64 toolchain, and ended up testing with both. If I build bootloader with MINGW64 (which supposedly uses the legacy msvcrt runtime), I can reproduce the issue. If I build bootloader with UCRT64 (which uses the new ucrt runtime), the encoding issue is gone. And it's likely that some Windows upgrade later on fixed the msvcrt runtime to behave more like ucrt, or something along those lines, so you get different behavior only on old Windows 10 version(s).

So I'll need to check which runtime the bootloader ends up using when built with MSVC (and what options we should set in wscript to use ucrt, if possible).

@T-256
Author

T-256 commented Oct 7, 2023

https://www.msys2.org/docs/environments:

MSVCRT vs UCRT

These are two variants of the C standard library on Microsoft Windows.

MSVCRT (Microsoft Visual C++ Runtime) is available by default on all Microsoft Windows versions, but due to backwards compatibility issues is stuck in the past, not C99 compatible and is missing some features.

  • It isn't C99 compatible, for example the printf() function family, but...
  • mingw-w64 provides replacement functions to make things C99 compatible in many cases
  • It doesn't support the UTF-8 locale
  • Binaries linked with MSVCRT should not be mixed with UCRT ones because the internal structures and data types are different. (More strictly, object files or static libraries built for different targets shouldn't be mixed. DLLs built for different CRTs can be mixed as long as they don't share CRT objects, e.g. FILE*, across DLL boundaries.) Same rule is applied for MSVC compiled binaries because MSVC uses UCRT by default (if not changed).
  • Works out of the box on every version of Microsoft Windows.

UCRT (Universal C Runtime) is a newer version which is also used by Microsoft Visual Studio by default. It should work and behave as if the code was compiled with MSVC.

  • Better compatibility with MSVC, both at build time and at run time.
  • It only ships by default on Windows 10 and for older versions you have to provide it yourself or depend on the user having it installed.

I think the reason is that msvcrt doesn't support the UTF-8 locale(?)
But does switching to UCRT mean we will have problems for Windows 7/8 users?
Which runtime does CPython use? (AFAIK CPython < 3.9 supports Windows 7.)

@rokm
Member

rokm commented Oct 7, 2023

Actually, it might be a matter of ucrt being linked statically vs. dynamically. It seems we are using the default (static), but the python executable is dynamically linked against ucrt (that's why, if you analyze python.exe in Dependency Walker, you see it linked against api-ms-win-*.dll), whereas our bootloader (and consequently the frozen application) is not.
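
Without Dependency Walker at hand, a rough first look at which C runtime DLLs a binary refers to can be had by scanning the file for the DLL names. This is a crude heuristic sketch rather than a real import-table parser: it can report false positives for names that merely appear as data, and the marker list and the function name `find_crt_markers` are assumptions made for illustration.

```python
# DLL name fragments that hint at how a binary gets its C runtime.
CRT_MARKERS = {
    b"api-ms-win-crt": "ucrt (dynamic, via API sets)",
    b"ucrtbase.dll": "ucrt (dynamic)",
    b"vcruntime140": "vcruntime (dynamic)",
    b"msvcrt.dll": "legacy msvcrt (dynamic)",
}


def find_crt_markers(data):
    """Return labels for every known CRT DLL name found in the raw bytes."""
    lowered = data.lower()  # DLL names are case-insensitive on Windows
    return sorted(
        label for marker, label in CRT_MARKERS.items() if marker in lowered
    )


# Example with synthetic bytes standing in for a real executable.
sample = b"\x00API-MS-WIN-CRT-runtime-l1-1-0.dll\x00VCRUNTIME140.dll\x00"
print(find_crt_markers(sample))
```

For a real file, the bytes would come from reading the executable in binary mode, for example `open(path, "rb").read()`. An empty result is consistent with a statically linked CRT but does not prove it.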

@rokm
Member

rokm commented Oct 9, 2023

Yeah, this is going to be a pain one way or another...

When building with MSVC, we specify neither /MT nor /MD, so /MT (static linking against the CRT) is applied. Python and its shared libraries (and extensions) are of course linked against the CRT dynamically.

And mixing both modes is discouraged, as it can lead to subtle issues, such as the one we've seen here (where either the state of the private copy of CRT used by bootloader is out of sync with the global/shared state used by python shared library, or there is some other sort of incompatibility between the bootloader-private copy of CRT and the global/shared one).

Now of course, the correct way to fix this would be to build with /MD. But that adds dynamic dependency to both ucrt (api-ms-*.dll) and vcruntime (vcruntime140.dll).

FWIW, I am quite fine with users having to install UCRT (which is part of the OS in Windows 10 and 11) on older systems. When building the frozen application, whether api-ms-*.dll files are collected greatly depends on the build environment (i.e., if the DLLs are resolvable, which is not the case by default; they are resolvable in Anaconda environments (which ship their own copies) or if user has Windows SDK installed and in path); so chances are that UCRT needs to be available on the system anyway. And we can add a section to the documentation describing which KB needs to be installed on Windows 7 and on Windows 8. (And then we can perhaps stop collecting api-ms-*.dll files altogether).

The VC runtime, however, is not a part of the OS even on 10, and judging by the issues with vcruntime140.dll not being properly extracted from onefile builds with splash screen, it seems that running onefile executables on minimal Windows 10 systems is quite a common practice.

We can go half-way, though, by doing "semi-static" linking: link vcruntime statically, while forcing ucrt to be linked dynamically (the UCRT64 msys2 toolchain I mentioned earlier seems to do the same; it dynamically links only against ucrt). This seems sufficient to fix the environment variable encoding issue we are dealing with here; which means that the offending part lies in ucrt, not vcruntime (and as far as I can tell, we depend on a limited set of functions from vcruntime anyway).
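
For reference, the "semi-static" combination described above is often achieved with MSVC by keeping /MT and swapping only the ucrt library at link time. The sketch below writes those options down as plain Python data; it is an assumption about how such a build could be configured, not a copy of what the proof-of-concept branch does, and the function name `hybrid_crt_options` and the dictionary keys are hypothetical.

```python
def hybrid_crt_options():
    """Return MSVC options for static vcruntime plus dynamic ucrt (a sketch)."""
    return {
        # /MT selects the static CRT libraries by default.
        "cflags": ["/MT"],
        # Drop the static ucrt and link the import library for the ucrt DLL.
        "linkflags": ["/NODEFAULTLIB:libucrt.lib", "/DEFAULTLIB:ucrt.lib"],
    }


options = hybrid_crt_options()
print(" ".join(options["cflags"] + options["linkflags"]))
```

With this arrangement the executable would depend on the system ucrt at run time while still carrying its own copy of the vcruntime code.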

Here is a proof-of-concept branch, if you want to test it. I've included rebuilt bootloaders for easier testing:
https://github.com/rokm/pyinstaller/tree/msvc-building-changes

You can install it directly with

pip uninstall pyinstaller

followed by

pip install git+https://github.com/rokm/pyinstaller@msvc-building-changes

or

pip install https://github.com/rokm/pyinstaller/archive/refs/heads/msvc-building-changes.zip

@rokm rokm added bug and removed state:need info Need more information for solve or help. triage Please triage and relabel this issue labels Oct 10, 2023