-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HTML Search: Fix partial matches overwriting full matches #11958
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -477,14 +477,18 @@ const Search = { | |
// add support for partial matches | ||
if (word.length > 2) { | ||
const escapedWord = _escapeRegExp(word); | ||
Object.keys(terms).forEach((term) => { | ||
if (term.match(escapedWord) && !terms[word]) | ||
arr.push({ files: terms[term], score: Scorer.partialTerm }); | ||
}); | ||
Object.keys(titleTerms).forEach((term) => { | ||
if (term.match(escapedWord) && !titleTerms[word]) | ||
arr.push({ files: titleTerms[word], score: Scorer.partialTitle }); | ||
}); | ||
if (!terms.hasOwnProperty(word)) { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It is marginally more efficient to do this check outside the loop too. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Unless the performance improvement is significant, I think it'd be better to leave the conditional check after the (that's based on considering that if the first condition fails, then the clause after the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm going to push back on this a bit: I think this small improvement is in scope (since we're changing the logic anyway) and the diff here is relatively small and easy to understand. Moving the property check should indeed reduce the number of regex matches a fair bit. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Ok, no problem. I think we'll have to agree to disagree on that. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'd prefer if you could avoid the extraction of an outer There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes, I understood your point, I just don't agree with it in this case. I have more to say on this, but I think we've gone out of scope of a review. If you want to talk about it more, feel free to email me. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Thanks for explaining. From my perspective, bug report, code review and related discussion are often useful information when trying to understand a commit/change, so I'd prefer to keep the discussion public; in this case I do think that the size of the line diff is relevant to code review. Would you be OK with me drafting an alternative PR? I don't know if it'd be more likely to be accepted for merge, and it's OK if it isn't, but I'd like to present it as an alternative. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
I don't think I have more to say on the technical front. But just to state it one more time so it's clear: I think this (small) PR as it stands fixes the issue in a more fundamental way and improves performance so it's my strong recommendation that it be the final state of the code. I don't feel the benefit of breaking it up into two PRs is worth the hassle and churn (leaving aside that this discussion has now occupied more effort than that would entail...).
It's not clear to me when someone who can actually merge the code will look at these changes. I can't stop you from proposing an alternative, but I think the best thing is probably just to wait for a maintainer to break our impasse. I am willing to make the change you suggest if it will actually get this merged and my PR is open to updates from the owners of this repository. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Ok, after finding more time to understand the code, I concede that the performance improvements here are important. I've opened #12045 with a description of what I believe the problem is. Also, compared to my suggested alternative #12040, I think that the use of the JavaScript @wlach could you edit the description of this pull request (#11958) to add 'Resolves #12045'? (and thanks for your patience!) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I wouldn't bother much about this. Here, it's more important to use |
||
Object.keys(terms).forEach((term) => { | ||
if (term.match(escapedWord)) | ||
arr.push({ files: terms[term], score: Scorer.partialTerm }); | ||
}); | ||
} | ||
if (!titleTerms.hasOwnProperty(word)) { | ||
Object.keys(titleTerms).forEach((term) => { | ||
if (term.match(escapedWord)) | ||
arr.push({ files: titleTerms[word], score: Scorer.partialTitle }); | ||
This comment was marked as resolved.
Sorry, something went wrong. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
I think #11957 as described is broad enough to describe both. It's the same error: a partial match overwriting a full one. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Oh wait, I see what you mean now given https://github.com/sphinx-doc/sphinx/pull/11958/files#r1509999419. Yeah, this looks like a separate bug strictly speaking. This is such a rabbit hole. 😭 I'd be inclined to fix it in a new PR with its own test (maybe depending on this one). There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yep, no problem, there's a lot going on here :) As mentioned though, I feel good about the effect on quality that all this will have. Agreed; I'll create a separate bugreport for this, and then request review for #12037 after this one is merged. Could you clarify in the description of #11957 that it relates solely to terms that only occur in the zeroth document in the index? (if my understanding of that is true?) Edit: fix pr/issues references There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Arg, sorry - those linked issues were wrong; I meant to request a change for the description of #11957 - to make the nature of the bug more precise. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Reported separately as #12040.
This comment was marked as outdated.
Sorry, something went wrong. |
||
}); | ||
} | ||
} | ||
|
||
// no match but word was a required one | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -21,7 +21,7 @@ describe('Basic html theme search', function() { | |
"<no title>", | ||
"", | ||
null, | ||
2, | ||
5, | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. With this change, the score gets boosted to 5 (what it should be). |
||
"index.rst" | ||
]]; | ||
expect(Search.performTermsSearch(searchterms, excluded, terms, titleterms)).toEqual(hits); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
terms[word]
will be falsey (0
) for the first document.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@wlach based on index-format findings (and test coverage), should this be rephrased to:
(kinda wordy, but captures the fact that
[0, 1]
, for example, is not falsy)There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(edited with a correction and for conciseness)