Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: make fixes and refs detection stricter #612

Closed
wants to merge 2 commits into from

Conversation

richardlau
Copy link
Member

@richardlau richardlau commented Mar 19, 2022

Only parse fixes and refs links at the beginning of lines.

Previously the parser was picking up the "fix" type Conventional
Commits titles from the changelog in body of the npm update pull
requests.

Refs: nodejs/node#42382 (comment)

cc @Trott


I have updated one of the existing test cases as otherwise the test was failing with this change. This test was from #117, for nodejs/node#17107. I ran node ../node-core-utils/bin/get-metadata.js 17107 with this diff:

diff --git a/lib/links.js b/lib/links.js
index 4d313a1..639ced5 100644
--- a/lib/links.js
+++ b/lib/links.js
@@ -14,6 +14,7 @@ export class LinkParser {
     this.owner = owner;
     this.repo = repo;
     this.$ = cheerio.load(html);
+    console.log(JSON.stringify(html));
   }
 
   getFixesUrlsFromArray(arr) {

and found the HTML differs from what is in the test fixture, so I updated it. The test passes was the updated HTML. I used the same diff to capture the test data for nodejs/node#42382.

Only parse `fixes` and `refs` links at the beginning of lines.

Previously the parser was picking up the "fix" type Conventional
Commits titles from the changelog in body of the npm update pull
requests.
@codecov
Copy link

codecov bot commented Mar 19, 2022

Codecov Report

Merging #612 (5da9c55) into main (5e85166) will not change coverage.
The diff coverage is 100.00%.

@@           Coverage Diff           @@
##             main     #612   +/-   ##
=======================================
  Coverage   84.10%   84.10%           
=======================================
  Files          37       37           
  Lines        4051     4051           
=======================================
  Hits         3407     3407           
  Misses        644      644           
Impacted Files Coverage Δ
lib/links.js 92.00% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 5e85166...5da9c55. Read the comment docs.

lib/links.js Outdated Show resolved Hide resolved
lib/links.js Outdated Show resolved Hide resolved
Co-authored-by: Tobias Nießen <tniessen@tnie.de>
@@ -1,8 +1,8 @@
import cheerio from 'cheerio';

const FIXES_RE = /(Close[ds]?|Fix(e[ds])?|Resolve[sd]?)\s*:\s*(\S+)/mgi;
const FIXES_RE = /^\s*(Close[ds]?|Fix(e[ds])?|Resolve[sd]?)\s*:\s*(\S+)/mgi;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By tolerating spaces at the start of the line, might this then match the "Fixes:" etc. in patches we float from V8 etc.? (Maybe that's not a problem?)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe for a separate PR, but I think we probably don't need to tolerate spaces before the colon:

Suggested change
const FIXES_RE = /^\s*(Close[ds]?|Fix(e[ds])?|Resolve[sd]?)\s*:\s*(\S+)/mgi;
const FIXES_RE = /^\s*(Close[ds]?|Fix(e[ds])?|Resolve[sd]?):\s*(\S+)/mgi;

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By tolerating spaces at the start of the line, might this then match the "Fixes:" etc. in patches we float from V8 etc.? (Maybe that's not a problem?)

Not sure. Maybe we need to find example PRs and run git node metadata on them to see what they generate? We can do that on both open and closed PRs -- there will be an error about the PR not being landable but the output will still include the generated metadata.

It's entirely possible we may need stricter checking of the right hand side of the colon, but it's hard to speculate on that without more concrete examples.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Found one. nodejs/node#37276 results in Fixes: https://github.com/v8:11389 (with and without this PR):

$ git node metadata 37276
✔  Done loading data for nodejs/node/pull/37276
----------------------------------- PR info ------------------------------------
Title      deps: V8: workaround for darwin arm64 code page decommit failures (#37276)
   ⚠  Could not retrieve the email or name of the PR author's from user's GitHub profile!
Branch     matinzd:patch-1 -> nodejs:master
Labels     v8 engine, backported-to-v14.x
Commits    2
 - deps: V8: Workaround MacOS 11.2 code page decommit failures
 - bump: v8: v8_embedder_string to 25
Committers 2
 - GitHub <noreply@github.com>
 - matinzd <matin.zadehdolatabad@rambody.me>
------------------------------ Generated metadata ------------------------------
PR-URL: https://github.com/nodejs/node/pull/37276
Fixes: https://github.com/v8:11389
Refs: https://bugs.chromium.org/p/v8/issues/detail?id=11389#c18
Reviewed-By: Michaël Zasso <targos@protonmail.com>
Reviewed-By: Ash Cripps <acripps@redhat.com>
--------------------------------------------------------------------------------

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've just noticed that for PR-URL we have a different parsing scheme that pulls the links out of the corresponding anchor tag (the parser is parsing HTML). Maybe we should be reusing that logic?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reusing the same logic that is used for refs and PR-URL seems a better solution. Or at least a more consistent one. PR: #614

const FIX_RE = /(Close[ds]?|Fix(e[ds])?|Resolve[sd]?)\s*:\s*(\S+)/i;
const REFS_RE = /Refs?\s*:\s*(\S+)/mgi;
const REFS_RE = /^\s*Refs?\s*:\s*(\S+)/mgi;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe for a separate PR, but I think we probably don't need to tolerate spaces before the colon:

Suggested change
const REFS_RE = /^\s*Refs?\s*:\s*(\S+)/mgi;
const REFS_RE = /^\s*Refs?:\s*(\S+)/mgi;

@richardlau
Copy link
Member Author

Closing in favour of #614

@richardlau richardlau closed this Mar 19, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants