Why we have not considered Line Endings for Different OS(Linux, Windows, MacOS) #137

Open
deepanshukhanna opened this issue Feb 15, 2023 · 1 comment

Comments

@deepanshukhanna

I came across an issue where files I have saved in git differ only in their line endings. git diff and the git UI handle this, because git normalizes line endings to LF (\n) on Unix-based systems and to CRLF (\r\n) on Windows-based systems. My point is that since git handles this in its diff and UI, we never see these differences, and they don't matter as far as the actual file changes are concerned. Can we configure diff-match-patch, or make some change to it, so that it ignores line endings when evaluating differences? I ask because it would be very difficult to account for this after diff-match-patch has already computed the diff.

@dmsnell

dmsnell commented Feb 15, 2023

@deepanshukhanna the easiest thing might be to normalize those line endings before running the texts through diff-match-patch. I'm not sure what language you are using, but a regular expression pattern should be translatable to pretty much anything, or you could manually scan the text.

replace( text_a, search: "/\r\n/g", replace: "\n" )

This would prevent those differences from appearing in a diff. The other side of this coin is that if you normalize to hide those differences, the resulting diff describes the normalized texts rather than the original inputs, so you won't be able to use it to move from one original text to the other: the indices change during normalization.
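As a rough illustration of the normalization step in Python (an assumption on my part, since the language in use wasn't stated), something like the following would do it. The function name `normalize_line_endings` and the sample strings are made up for the example, and it only handles the `\r\n` case from the pattern above, not a lone `\r`.

```python
import re


def normalize_line_endings(text: str) -> str:
    """Replace every CRLF pair with a single LF (hypothetical helper)."""
    return re.sub(r"\r\n", "\n", text)


# Two made-up inputs that differ only in their line endings.
text_a = "first line\r\nsecond line\r\n"
text_b = "first line\nsecond line\n"

# After normalization the texts are identical, so a diff of the
# normalized texts would contain no line-ending changes.
assert normalize_line_endings(text_a) == normalize_line_endings(text_b)
```

The normalized strings are what you would then hand to diff-match-patch. Keep in mind the caveat above: offsets in the result refer to the normalized text, not to `text_a` as it was originally stored.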

Alternatively, it may not be as hard as you imagine to post-process. We're only expecting the addition or deletion of \r, so you can skip over those in whatever display you are using: if you find a diff operation that consists only of \r, skip it as if it didn't exist. If you're working with the patch format, which only states how many characters to delete, you can still analyze it against the input text to remove the display of those line-ending changes.
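A minimal sketch of that post-processing idea in Python, assuming the diff is a list of `(operation, text)` tuples with `-1` for delete, `1` for insert and `0` for equal, as in the Python port of diff-match-patch. The helper name `hide_cr_only_changes` is hypothetical, and the sample diff is written by hand for illustration rather than produced by the library.

```python
# Operation codes, mirroring the constants used by the Python port.
DIFF_DELETE, DIFF_INSERT, DIFF_EQUAL = -1, 1, 0


def hide_cr_only_changes(diffs):
    """Drop insert/delete operations whose text consists only of '\\r'."""
    visible = []
    for op, text in diffs:
        if op != DIFF_EQUAL and text and set(text) == {"\r"}:
            continue  # a pure carriage-return change: skip it for display
        visible.append((op, text))
    return visible


# Hand-written example: one CR-only deletion plus one real change.
diffs = [
    (DIFF_EQUAL, "first line"),
    (DIFF_DELETE, "\r"),
    (DIFF_EQUAL, "\nsecond "),
    (DIFF_DELETE, "line"),
    (DIFF_INSERT, "row"),
]

for op, text in hide_cr_only_changes(diffs):
    print(op, repr(text))
```

This only filters what gets displayed; the original diff is left untouched, so it can still be used against the original inputs. It does not merge the equal runs on either side of a skipped operation, and it does not handle an operation that mixes `\r` with other changed characters.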
