Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Setup-cfg-fmt doesn't respect newline convention of source file nor read/write it with the correct encoding #94

Open
CAM-Gerlach opened this issue Aug 13, 2021 · 2 comments

Comments

@CAM-Gerlach
Copy link

Setup-cfg-fmt does not respect the EOL of the setup.cfg, and instead converts it to a platform-dependent default. I'm not aware of a case where that would be the desirable behavior, whereas many/most projects standardize on one newline convention (typically \n) for all source files, which this breaks. Furthermore, if using an EoL fixer with pre-commit, it will catch this but aside from extra noise in the output, if it isn't sequenced after this hook in the config file, it will cause the next run to fail too.

Fixing this is very simple—just check the current EoL character when reading in the file, and pass that explicitly on writing it. So at the top of format_file(), on the line after read()ing the file, you'd just add newline = f.newlines, and then pass newline=newline in the call to open() when writing it,

In addition, neither open() call specifies the correct encoding to use, which means the platform and environment-dependent locale encoding will be assumed. As setup.cfg is required to be UTF-8, this means loading will fail on Windows if any non-ASCII characters are present (common in e.g. authors' names), or on any platform that doesn't have UTF-8 mode enabled, and even if loading were to succeed, the file would be written in the incorrect encoding. Furthermore, not specifying an encoding when opening in text mode will be an optional warning in Python 3.10, with a DeprecationWarning planned to follow in later versions. To fix this, simply explicitly pass encoding="utf-8" to both calls.

Happy to submit a PR to fix both of these if you'd like. Thanks!

@asottile
Copy link
Owner

adding encoding='utf-8' seems fine, as for picking a line ending I'd lean towards always using \n -- I don't think your suggestion of newline = f.newlines works, that value is always None and determining mixed newlines or the proper ending seems out of scope and/or difficult (and/or better handled by a tool that actually is intended to solve that problem)

@CAM-Gerlach
Copy link
Author

CAM-Gerlach commented Aug 16, 2021

I don't think your suggestion of newline = f.newlines works, that value is always None

That attribute is lazy and you have to do a .read() first, or it will be None as you observed, which is why my suggestion added it it after the existing read().

determining mixed newlines or the proper ending seems out of scope and/or difficult (and/or better handled by a tool that actually is intended to solve that problem)

Good point on the mixed newlines; since that's a relatively uncommon edge case I hadn't considered it. However, in that case (where per the docs, f.newlines will return a tuple of line endings, so if f.newlines is not a str instance, it can simply fall back to \n as you suggest.

Other than cases of mixed EOLs where the file must be conformed to something, I'd think arbitrarily changing line ending, generally a per-project default handled by other hooks and orthogonal to the standards and conventions used in setup.cfg, would be out of scope here.

as for picking a line ending I'd lean towards always using \n

While hardcoding it would work for my particular application, it won't solve this problem for any projects that use another line ending, and potentially create a new one for those that use the platform default. So I'd suggest respecting the existing line ending for the common/simple case, and only conforming if its non-trivial.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants