Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix ripgrep trailing newline #1496

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

ippsav
Copy link
Contributor

@ippsav ippsav commented Jul 30, 2023

Description

Fix trailing newline issue in code highlights (#1487 )

The ripgrep JSON output includes a trailing newline at the end of each match. However, the code highlighting in delta does not expect a trailing newline, which was causing an issue when the match contains a newline at the end.

To fix this, We check if the trailing newline is part of the actual match text and it's at the end of the string. If not, the trailing newline is removed.

Unit tests:

ripgrep_json.rs:

  • test_match_text_ends_with_newline
  • test_match_text_without_newline

.data
.submatches
.iter()
.any(|m| code.ends_with(&m._match.text) && m._match.text.ends_with('\n'));
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we just be looking at the last match here rather than all of them? (I might be getting confused).

Copy link
Contributor Author

@ippsav ippsav Aug 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

He might have multiple matches and it wouldn't always be the last match or am i missing something ?

[{"match":{"text":"\n\n\n"},"start":60,"end":63}}, {"match":{"text":"config"},"start":50,"end":55}}]

This is not an actual example

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK sorry I'll try to find time to think through this properly! Your PR looks great, the only reason I've been slow is because I was slightly confused about exactly what the condition should be for us stripping newline.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No worries ! in any case keep me updated ! thanks

@dandavison
Copy link
Owner

Could you give an example of a toy input file and a ripgrep command that crashes delta without this fix and behaves correctly with this fix?

@ippsav
Copy link
Contributor Author

ippsav commented Aug 6, 2023

Here is an example @dandavison :

example file:

{"type":"match","data":{"path":{"text":"src/cli.rs"},"lines":{"text":"    fn from_clap_and_git_config(config\n"},"line_number":null,"absolute_offset":35837,"submatches":[{"match":{"text":"config\n"},"start":33,"end":39}]}}

Output of running the command with delta:
20230806_14h35m27s_grim

Output of running the same command with the fix:
20230806_14h36m57s_grim

I ve just realised the test cases are not really testing what it should be testing i ll be fixing it right now, my bad on that !

@dandavison
Copy link
Owner

Can you give the input file and ripgrep command that you're using to generate the example? I think if you include that command as a comment by the test case that would be helpful.

If you want to use delta's source code as the example input file you could make it reproducible via a command that specifies a master commit like

git show 0608765f40a193d16651e1ac0c547bd2f9280bf9:src/cli.rs | rg --json config

@ippsav
Copy link
Contributor Author

ippsav commented Aug 6, 2023

here is a simpler example that relates to the example i mentioned above:
raw.txt:

    fn from_clap_and_git_config(config

There is a new line at the end make sure you press enter.

The output of running cat raw.txt | rg -U --json "config\n"

{"type":"begin","data":{"path":{"text":"<stdin>"}}}
{"type":"match","data":{"path":{"text":"<stdin>"},"lines":{"text":"    fn from_clap_and_git_config(config\n"},"line_number":1,"absolute_offset":0,"submatches":[{"match":{"text":"config\n"},"start":32,"end":39}]}}
{"type":"end","data":{"path":{"text":"<stdin>"},"binary_offset":null,"stats":{"elapsed":{"secs":0,"nanos":14720,"human":"0.000015s"},"searches":1,"searches_with_match":1,"bytes_searched":40,"bytes_printed":265,"matched_lines":1,"matches":1}}}
{"data":{"elapsed_total":{"human":"0.000646s","nanos":646020,"secs":0},"stats":{"bytes_printed":265,"bytes_searched":40,"elapsed":{"human":"0.000015s","nanos":14720,"secs":0},"matched_lines":1,"matches":1,"searches":1,"searches_with_match":1}},"type":"summary"}

@ippsav ippsav requested a review from dandavison August 6, 2023 14:09
// ripgrep --json emits a trailing newline for each match, but
// delta's code highlighting does not expect a trailing newline.
// Therefore, remove the trailing newline if it is not part of the
// match.
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If delta needs to strip the \n in the usual case, then that must be because some bad thing X happens if it does not do it. So why is it OK to not strip the \n when it is part of the match? Why does X not happen?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because once it will be printed, the \n will be stripped since it's included in the match, unless im missing something here

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm sorry, I'm not following. Would you mind explaining in more detail? I'm not understanding the relationship between whether or not something is in the match, and the stripping newlines for the highlighter. I think it's me that's missing something!

Copy link
Contributor Author

@ippsav ippsav Aug 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I need some time, I've been trying with the example you mentioned in the #1487 that's failing, i should look more into it, specially that now i am thinking about it i don't think i handled it well it's a bit confusing

@tbm
Copy link

tbm commented Nov 27, 2023

@ippsav do you have any time to update your PR with the comments from @dandavison ? I'm the guy who reported this issue and would like to see your PR in. Thank you for working on this!

@ippsav
Copy link
Contributor Author

ippsav commented Nov 27, 2023

@tbm there have been a lot of work lately, hence I was not very active, I ll take a look at it very soon this week, sorry for the inconvenience !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants