fix(logical): fix scoring for logical OR #593

adlerfaulkner · 2021-12-14T08:55:12Z

Change match selection in logical OR to select the best
scoring (lowest value) match instead of the first match. Currently,
a logical OR which fuzzy searches across multiple fields may
score incorrectly because it returns the score for the first matching
field, not the best.

adlerfaulkner · 2021-12-14T09:44:08Z

Hello!
Loving Fuse's ease of use and maintainability, thanks @krisk!

I noticed this scoring problem when using the solution for searching for multiple tokens across multiple keys by @0xdevalias on #235.

When fuzzy searching for the query wood using a logical OR across title and author.lastName...

{
  $or: [
    { title: 'wood' },
    { 'author.lastName': 'wood' }
  ]
}

...in the data below, one would expect all three results to score equally, because each has Woodhouse in author.lastName. However, the current scoring for logical OR uses the score of the FIRST fuzzy match found among the statements in the OR query, not the BEST match. As a result, the second item below gets a worse score than the others: Wooster in its title is the first fuzzy match found, because title comes before author.lastName in the OR.

  {
     "title": "Right Ho Jeeves",
     "author": {
        "firstName": "P.D",
        "lastName": "Woodhouse"
     }
  },
  {
     "title": "The Code of the Wooster",
     "author": {
        "firstName": "P.D",
        "lastName": "Woodhouse"
     }
  },
  {
     "title": "Thank You Jeeves",
     "author": {
        "firstName": "P.D",
        "lastName": "Woodhouse"
     }
  }

I have fixed this by choosing the match with the best score from the statements of the logical OR.
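
To make the difference concrete, here is a minimal sketch of the two selection strategies for the second record. It is not Fuse's actual source: the helper names firstMatch and bestMatch are hypothetical, and the per-field scores are illustrative numbers, not real Fuse.js output (lower is better, 0 is a perfect match).

```javascript
// Hypothetical per-field fuzzy scores for the query "wood" against
// { title: "The Code of the Wooster", author.lastName: "Woodhouse" }.
// Illustrative values only; lower is better.
const fieldScores = [
  { key: 'title', score: 0.4 },
  { key: 'author.lastName', score: 0.1 }
]

// Behaviour described above: use whichever statement matched first.
const firstMatch = (matches) => matches[0]

// Behaviour proposed here: use the match with the lowest (best) score.
const bestMatch = (matches) =>
  matches.reduce((best, m) => (m.score < best.score ? m : best))

console.log(firstMatch(fieldScores).key) // title
console.log(bestMatch(fieldScores).key)  // author.lastName
```

With first-match selection the record is scored by its weaker title match, so it ranks below the other two records even though its author.lastName match is just as good as theirs.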

Edit:
Until @krisk reviews, my fork with this fix in master can be found here: https://github.com/comake/Fuse

krisk · 2021-12-22T05:08:54Z

test/logical-search.test.js

+
+  describe('When searching for the term "wood"', () => {
+    test('we get the top three results all with an exact match from their author.lastName', () => {
+      expect(idx(result.slice(0,3)).sort()).toMatchObject([3,4,5])


Shouldn't the order be 4,3,5? I am imagining it would have to match the following:

{ item: { title: 'The Code of the Wooster', author: { firstName: 'P.D', lastName: 'Woodhouse' } }, refIndex: 4 },
{ item: { title: 'Right Ho Jeeves', author: { firstName: 'P.D', lastName: 'Woodhouse' } }, refIndex: 3 },
{ item: { title: 'Thank You Jeeves', author: { firstName: 'P.D', lastName: 'Woodhouse' } }, refIndex: 5 }

However, I think the order is 3,4,5 because of line 185: res = [...result]. Did you mean res.push(...result)?

adlerfaulkner · 2021-12-22

@krisk I suppose that would be a design decision. Do we want the score to include all matches from every field in the OR that has a match, or do we want to score based only on the best match among the fields scanned by the OR?

Happy to switch to res.push(...result) as I feel that would be the better design decision (include scores from all matches in logical OR).

krisk · 2021-12-23

I've wavered on this for a while. I do think including scores for all matches is ideal.
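
The two statements discussed in this thread behave differently when they run once per statement of the OR. The sketch below assumes a simple loop over per-statement match lists; the loop, the function names, and the scores are hypothetical and only illustrate the general JavaScript behaviour, not the actual code around line 185.

```javascript
// Hypothetical match lists produced by each statement of an $or
// for a single record. Scores are illustrative; lower is better.
const perStatement = [
  [{ key: 'title', score: 0.4 }],
  [{ key: 'author.lastName', score: 0.1 }]
]

// res = [...result]: each iteration replaces what was collected before,
// so only the matches from the last statement survive.
function collectByAssignment(results) {
  let res = []
  for (const result of results) {
    res = [...result]
  }
  return res
}

// res.push(...result): matches from every statement are kept,
// so all of them can contribute to the score.
function collectByPush(results) {
  const res = []
  for (const result of results) {
    res.push(...result)
  }
  return res
}

console.log(collectByAssignment(perStatement).length) // 1
console.log(collectByPush(perStatement).length)       // 2
```

Accumulating with push is what "include scores from all matches in logical OR" amounts to in this simplified picture: nothing found by an earlier statement is discarded.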

adlerfaulkner · 2021-12-23T03:27:10Z

@krisk Updated to incorporate all matches in a logical OR into the score in 1e420fd

Fixes #593

adlerfaulkner added 2 commits December 14, 2021 00:49

chore(logical): add tests for logical OR fuzzy search scoring fix

d55bcfc

adlerfaulkner mentioned this pull request Dec 14, 2021

Match all tokens should look in different keys as well. #235

Closed

krisk reviewed Dec 22, 2021

fix(logical): include all matches in logical OR score

1e420fd

score results in a logical OR query by including all matches from all terms of the OR instead of the best match or first match

krisk added a commit that referenced this pull request Dec 23, 2021

fix(logical): scoring for logical OR

c5d663e

Fixes #593

krisk mentioned this pull request Dec 23, 2021

fix(logical): scoring for logical OR #604

Merged

krisk closed this in #604 Dec 23, 2021

krisk added a commit that referenced this pull request Dec 23, 2021

fix(logical): scoring for logical OR

6f6af51

Fixes #593

usernamerandom11 mentioned this pull request Sep 7, 2024

[Snyk] Upgrade: dotenv, farmhash, fs-extra, fuse.js, js2xmlparser, rate-limiter-flexible, request-ip, svcorelib, url-parse, xss usernamerandom11/JokeAPI#38

Open

usernamerandom11 mentioned this pull request Sep 21, 2024

[Snyk] Upgrade: dotenv, farmhash, fs-extra, fuse.js, js2xmlparser, rate-limiter-flexible, request-ip, svcorelib, url-parse, xss usernamerandom11/JokeAPI#39

Open
