apidoc template rst_t file does not use utf-8 at reading and writing #8477

Closed
panhaoyu opened this issue Nov 23, 2020 · 0 comments

Describe the bug
apidoc does not use UTF-8 when reading rst_t template files or when writing the generated rst files; it falls back to the system default encoding instead.

To Reproduce

Here is a pytest case. It passes on my computer, but I expect it to fail on machines whose default encoding is not GBK.

import os
import tempfile

import pytest
from sphinx.ext.apidoc import main as apidoc_main


def test_utf8_template():
    TEMPLATE = 'There are some Chinese characters below.\n这是一行中文字符\n'

    # First, create three temporary directories: template, Python source, and rst destination.
    with tempfile.TemporaryDirectory() as template_dir:
        with tempfile.TemporaryDirectory() as src_dir:
            with tempfile.TemporaryDirectory() as dst_dir:
                # Write a module template (rst_t) file with UTF-8 encoding.
                # Passing the encoding explicitly guarantees the template is UTF-8.
                with open(os.path.join(template_dir, 'module.rst_t'), mode='w', encoding='utf-8') as f:
                    f.write(TEMPLATE)

                # Create a python file in the python source directory.
                with open(os.path.join(src_dir, 'main.py'), mode='w', encoding='utf-8') as f:
                    f.write('import os\ndef test():\n    print(123)')

                # Run apidoc to generate the rst file; the file is generated successfully.
                apidoc_main(['-o', dst_dir, '-t', template_dir, src_dir])
                assert os.path.exists(os.path.join(dst_dir, 'main.rst'))

                # Reading the file as UTF-8 does not work,
                # because it was not written as UTF-8 at all!
                with pytest.raises(UnicodeError, match="'utf-8' codec can't decode"):
                    with open(os.path.join(dst_dir, 'main.rst'), mode='r', encoding='utf-8') as f:
                        print(f.read())

                # Reading the file as GBK works!
                with open(os.path.join(dst_dir, 'main.rst'), mode='r', encoding='gbk') as f:
                    content = f.read()
                    assert content == TEMPLATE[:-1]

                # That is to say, UTF-8 is not the default encoding on my computer.
                # My computer runs Windows 10 and I'm in China,
                # so `open('a.txt', 'w')` behaves like `open('a.txt', mode='w', encoding='gbk')`.
                # On such systems, `encoding='utf-8'` must be specified explicitly to get UTF-8.

                # In fact, when we Chinese learn Python file I/O,
                # the first thing we are taught is to always use `encoding='utf-8'`.
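
The encoding mismatch described in the comments above can be reproduced without Sphinx. The following is only a minimal sketch: the `roundtrip` helper and the `sample.rst` file name are made up for illustration, and GBK is passed explicitly to stand in for the default encoding of a Chinese Windows 10 machine.

```python
import locale
import os
import tempfile

TEXT = 'There are some Chinese characters below.\n这是一行中文字符\n'


def roundtrip(write_encoding, read_encoding):
    """Write TEXT with one encoding and read it back with another (hypothetical helper)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, 'sample.rst')
        with open(path, mode='w', encoding=write_encoding) as f:
            f.write(TEXT)
        with open(path, mode='r', encoding=read_encoding) as f:
            return f.read()


# The encoding that open() falls back to when no encoding argument is given.
print('default encoding:', locale.getpreferredencoding(False))

# Matching encodings round-trip cleanly.
assert roundtrip('utf-8', 'utf-8') == TEXT
assert roundtrip('gbk', 'gbk') == TEXT

# A file written as GBK cannot be decoded as UTF-8.
try:
    roundtrip('gbk', 'utf-8')
except UnicodeDecodeError as exc:
    print('mismatch:', exc)
```

On a machine whose default encoding is already UTF-8 the first line prints a UTF-8 name, which is why the problem would not show up there.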

Expected behavior
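apidoc should handle rst_t template files that contain non-ASCII characters, reading the template and writing the generated rst file as UTF-8.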
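
As a rough sketch of that expectation, a renderer would pass the encoding explicitly on both the read and the write side. This is not Sphinx's actual code: `render_template_file` is a hypothetical helper, and `string.Template` stands in for the real template engine only to keep the example self-contained.

```python
import os
import tempfile
from string import Template


def render_template_file(template_path, dest_path, context):
    """Hypothetical helper: render a template file, using UTF-8 for input and output."""
    # Read the template as UTF-8 regardless of the system default encoding.
    with open(template_path, mode='r', encoding='utf-8') as f:
        source = f.read()
    rendered = Template(source).safe_substitute(context)
    # Write the result as UTF-8 as well.
    with open(dest_path, mode='w', encoding='utf-8') as f:
        f.write(rendered)


with tempfile.TemporaryDirectory() as tmp:
    template_path = os.path.join(tmp, 'module.rst_t')
    dest_path = os.path.join(tmp, 'main.rst')
    with open(template_path, mode='w', encoding='utf-8') as f:
        f.write('$basename\n这是一行中文字符\n')

    render_template_file(template_path, dest_path, {'basename': 'main'})

    # The output decodes as UTF-8 on any platform.
    with open(dest_path, mode='r', encoding='utf-8') as f:
        result = f.read()
    print(result)
```

With both `open()` calls pinned to UTF-8, the output no longer depends on the locale of the machine that runs the tool.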

Your project
https://github.com/panhaoyu/sphinx

Screenshots
I think the test case is enough, but I can provide a screenshot if needed.

Environment info

  • OS: Win10
  • Python version: 3.7.5
  • Sphinx version: 3.x
  • Sphinx extensions: no
  • Extra tools: no
@panhaoyu panhaoyu changed the title <what happen when you do on which document project> apidoc template rst_t file does not use utf-8 at reading and writing Nov 23, 2020
@tk0miya tk0miya added this to the 3.4.0 milestone Nov 23, 2020
@tk0miya tk0miya closed this as completed Nov 23, 2020
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jul 17, 2021