Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't convert windows style \r\n to \n in file-contents-sorter #959

Closed
wants to merge 1 commit into from
Closed

Don't convert windows style \r\n to \n in file-contents-sorter #959

wants to merge 1 commit into from

Conversation

Maetveis
Copy link
Contributor

@Maetveis Maetveis commented Sep 6, 2023

Instead preserve the original line endings.

With the --unique flag identical lines with different line endings are collapsed to the first entry.

Fixes: #958

Instead preserve the original line endings.

With the `--unique` flag identical lines with different line endings
are collapsed to the first entry.
@asottile
Copy link
Member

asottile commented Sep 6, 2023

this is a really complicated change. I'd rather keep the existing behavior and just say windows newlines are unsupported

@asottile asottile closed this Sep 6, 2023
@Maetveis
Copy link
Contributor Author

Maetveis commented Sep 6, 2023

this is a really complicated change. I'd rather keep the existing behavior and just say windows newlines are unsupported

I understand, yet I think this is going to cause headaches for windows users or even more to people who don't regularly use windows for development, but have CI pipelines that also run pre-commit on windows (like I did).

What about keeping only the guess_eol function to just use the line separator that first appears in the original file? It wouldn't support mixed line endings, but that's bad practice anyways (and there's a hook to fix it).

@asottile
Copy link
Member

asottile commented Sep 6, 2023

imo it's 2023 and most tools support posix newlines -- we should just let windows newlines die rather than adding more complexity to support them

@asottile
Copy link
Member

asottile commented Sep 6, 2023

that is to say -- if you're testing both posix and windows it might make sense to configure git to be more posixlike on windows

@swaldhoer
Copy link

@asottile this pull request and the accompanying issue are old, but I still want to have a discussion about this topic.

I start with some background: We have several large repositories that share their quality assurance tooling.
These tools are own developments and mostly directly implement in Python or have Python wrappers.
The tool around the tools, i.e., applying them to repository, is also part of the own developments.
However, this tool is not that nice and for sure has way more undiscovered bugs than pre-commit.
So, I was quite happy when I discovered pre-commit and after some playing around, I concluded, that pre-commit is exactly what we need.

Right before starting the actual work (unfortunately, at scale it is not that easy to just add the pre-commit configuration file and all is just good) I discovered this discussion.

imo it's 2023 and most tools support posix newlines -- we should just let windows newlines die rather than adding more complexity to support them

Reading this, I was somewhat disappointed, not because I prefer the one or other operating system - I really just don't care - but the state it leaves me as potential user:

  • pre-commit is implement in a OS agnostic language.
  • pre-commit is (more or less) a wrapper for other tools.

So why does this tool force OS specific things onto the user, when they are on a platform that behaves different?
Just looking at the tool, this just isn't the goal of this tool.

Linux is like that, POSIX is like that, and Windows is like that and so on.
In my opinion, this tool does not need to make a point pro or contra OS specificities or idiosyncrasies.

Apart from any discussion what is or is not correct: There are lots of domains where Windows is the default development environment, and you need to play by the operating system rules. Yeah, these domains are not the modern and shiny Internet, AI, etc. things, but automotive, generally software for embedded systems and many more.
So, if you are working in these domains you are forced to this platform and sometimes strictly adhere to the OS rules (whether it makes sense or not) and this is not something that will change in the next 10 years, maybe never, not for us and many other users.

Why the large picture? It's not about the tool itself, but the spirit it lives in the default hooks:

  • Yes, the tool is open source, and I can re-do everything the way I like it etc.
    But I would rather not fix something (that is not broken) and add some fragmentation that is not needed and in contrast to contributing something to this project.
  • Yes, I could just re-implement that single hook. No big deal, and then I re-implement the next hook...and the next hook, and in the end, I start redoing everything. I think this problem is real, as there are are tools, e.g., like the json-formatter that could also just replace CRLF - because why not?

My question is (sorry I know this sounds snappy, but it is not meant that way): Is the plan of this project to focus on the hooking functionality for git and providing basic hooks without an opinion what the correct operating behavior or is the plan to be more opinionated (e.g., like black)?

I would greatly appreciate, if you have the time to discuss this a bit.



I want to add these side two notes, because they show that whatever the correct line separator is, CRLF is widely accept as exactly that and there might be no reason to force anything against that general understanding.

  • Even in the Internet world, where Linux and POSIX are specificities dominate.
    See for example https://www.rfc-editor.org/rfc/rfc7468: I can download a certificate on Windows (with CRLF) and perfectly validate it on Linux with openssl.
  • I have never encountered, e.g., that GCC on Linux has spluttered when compiling sources that use CRLF (code generators for embedded systems are basically always only available on Windows, that’s why I am seeing that a lot).

@asottile
Copy link
Member

I'm not reading all of that. if you want to add the complexity and maintain it by all means fork the repository to do what you want

@pre-commit pre-commit locked as off-topic and limited conversation to collaborators Feb 26, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Development

Successfully merging this pull request may close these issues.

file-contents-sorter changes all newlines to unix-style '\n'
3 participants