Multiline matches with CR (^M) characters #423

Open
wookayin opened this issue Dec 31, 2023 · 3 comments
Labels
bug Something isn't working

wookayin commented Dec 31, 2023

Describe the bug

I think only the title of a git commit message should be considered, but when a commit message contains mixed CR/LF line endings, extraction of the commit title breaks, and multi-line messages (with CR characters) are printed in the release notes.

To reproduce

git clone https://github.com/neovim/neovim
git cliff -c c.toml 72a6643b1~1..ca5de93

c.toml:

[git]
conventional_commits = true
filter_unconventional = false
filter_commits = false
commit_parsers = [
    { message = "^.*", group = "Others" },
]

[changelog]
body = """
{% for group, commits in commits | group_by(attribute="group") %}
    ### {{ group | upper_first }}
    {% for commit in commits%}\
       - <<< {{ commit.id }} >>> {{ commit.message | upper_first }}
    {% endfor %}\
{% endfor %}\n
"""

Output:


### Others
- <<< 72a6643b1380cdf6f1153d70eeaffb90bdca30d6 >>> Docs #24061

- nvim requires rpc responses in reverse order. https://github.com/neovim/neovim/issues/19932
- NVIM_APPNAME: UIs normally should NOT set this.^M
^M
ref #23520^M
fix #24050^M
fix #23660^M
fix #23353^M
fix #23337^M
fix #22213^M
fix #19161^M
fix #18088^M
fix #20693
- <<< ca5de9306c00d07cce1daef1f0038c937098bc66 >>> Inlay hints #23984

The commit 72a6643b1380cdf6f1153d70eeaffb90bdca30d6 is unusual: its commit message mixes CR and LF line endings.

Expected behavior

Only the first line should be considered. Maybe line endings (CR, CRLF) should be normalized to LF.

Screenshots / Logs

N/A

Software information

  • Operating system: macOS
  • Rust version: 1.70
  • Project version: 1.4.0

Additional context

This repro is a simplification of neovim/neovim#26818 where git cliff --config scripts/cliff.toml v0.9.0..HEAD produces some strange multi-line release note items.

orhun (Owner) commented Jan 3, 2024

Hello, thanks for reporting this!

This is because the commit in question is not conventional and it is not filtered out. That is why git-cliff uses that commit in the changelog as-is.

To skip those commits:

filter_unconventional = true

Or if you only want the first line to appear in the changelog:

- <<< {{ commit.id }} >>> {{ commit.message | split(pat="\n") | first | upper_first | trim }}

You can also use commit preprocessors/postprocessors to process the commit/changelog.

As for the actual question, I agree that we should normalize CR and LF into LF. That can also be done with preprocessors, though, so I'm not sure whether git-cliff should manipulate the commit message internally in this case.
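
For illustration, the normalization discussed here can be sketched in Rust using only the standard library. This is not git-cliff's code: `normalize_line_endings` and `commit_title` are hypothetical helper names, and the sample message is abbreviated from the output in the report. The sketch assumes that both CRLF and lone CR should become LF.

```rust
/// Hypothetical helper: convert CRLF and lone CR line endings to LF.
fn normalize_line_endings(message: &str) -> String {
    // Replace CRLF first so that it does not turn into two line feeds.
    message.replace("\r\n", "\n").replace('\r', "\n")
}

/// Hypothetical helper: return the first line of the normalized message,
/// which is what the report calls the commit title.
fn commit_title(message: &str) -> String {
    normalize_line_endings(message)
        .lines()
        .next()
        .unwrap_or("")
        .trim()
        .to_string()
}

fn main() {
    // Abbreviated sample with mixed LF and CRLF line endings.
    let message =
        "docs #24061\n\n- NVIM_APPNAME: UIs normally should NOT set this.\r\n\r\nref #23520\r\nfix #24050";
    let normalized = normalize_line_endings(message);
    assert!(!normalized.contains('\r'));
    println!("{}", commit_title(message));
}
```

Replacing CRLF before lone CR matters: doing it the other way around would turn every CRLF into two line feeds and introduce blank lines that were not in the original message.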

wookayin (Author) commented Jan 3, 2024

Q: Is the regex ^foo matched against each of the lines or against the very first few characters only? How is a multiline string (or '\n') handled in regex matching? I thought it would be the latter, but I still don't understand why this multiline string appears. Other commits also have a body that follows the title, but those bodies don't appear. What makes the difference?

orhun (Owner) commented Jan 5, 2024

Q: Is the regex ^foo matched against each of the lines or against the very first few characters only?

It is matched against the first line, since the regex is not configured as multi-line. I also think that it should support multi-line matching, but I'm not sure how to achieve that with serde_regex:

	/// Regex for matching the commit message.
	#[serde(with = "serde_regex", default)]
	pub message:       Option<Regex>,

Feel free to open a tracking issue about this!

Other commits also have a body that follow the title, but they won't appear. What makes the difference?

  • feat(lsp): inlay hints #23984 is a conventional commit, so only the "Inlay hints #23984" part makes it into the changelog. You can use commit.body to access the rest of the commit message.
  • docs #24061....... is not conventional so the entirety of the commit appears in the changelog.
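
The difference between the two cases above can be sketched in Rust as follows. This is a deliberately simplified, hypothetical model and not git-cliff's actual parser: `conventional_description` and `changelog_message` are made-up names, and the check only recognizes a first line of the form `type: description` or `type(scope): description`.

```rust
/// Hypothetical, simplified check: returns the description if the first line
/// looks like `type: description` or `type(scope): description`.
fn conventional_description(message: &str) -> Option<&str> {
    let first_line = message.lines().next()?;
    let (prefix, description) = first_line.split_once(": ")?;
    // Drop an optional breaking-change marker, then an optional `(scope)`.
    let prefix = prefix.strip_suffix('!').unwrap_or(prefix);
    let commit_type = match prefix.split_once('(') {
        Some((ty, scope)) if scope.ends_with(')') => ty,
        Some(_) => return None,
        None => prefix,
    };
    let valid_type = !commit_type.is_empty()
        && commit_type.chars().all(|c| c.is_ascii_alphabetic());
    if valid_type && !description.is_empty() {
        Some(description)
    } else {
        None
    }
}

/// In this sketch, a conventional commit contributes only its description,
/// while any other commit contributes its entire message, body included.
fn changelog_message(message: &str) -> &str {
    conventional_description(message).unwrap_or(message)
}

fn main() {
    let conventional = "feat(lsp): inlay hints #23984\n\nSome body text.";
    let unconventional =
        "docs #24061\n\n- NVIM_APPNAME: UIs normally should NOT set this.\r\n\r\nref #23520";
    println!("{:?}", changelog_message(conventional));
    println!("{:?}", changelog_message(unconventional));
}
```

Under these assumptions the unconventional message passes through untouched, CR characters included, which matches the multi-line entry shown in the report.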

I hope this answers your question 🐻

@orhun orhun changed the title multiline matches with CR (^M) characters Multiline matches with CR (^M) characters May 23, 2024