Expression::lex parses negation inconsistently #263

Open
spud opened this issue Jan 31, 2022 · 0 comments

spud commented Jan 31, 2022

I've just tried out my first implementation of TNTSearch, so bear with me!

I'd been struggling with strange results from boolean searches using the "foo -bar" syntax, seeing results that were clearly inaccurate. Glancing at the source code, I noticed that the tilde (~) was also used for excluding words, so I tried the same query using "foo ~bar", expecting the same result set, but got totally different (and more accurate) results.

While debugging, I noticed that the output produced by Expression::lex was different in the two cases.

$ex = new Expression();
$tokens_1 = $ex->lex("foo -bar");
$tokens_2 = $ex->lex("foo ~bar");

The problem is that the two token arrays differ:
$tokens_1 != $tokens_2

That inconsistency is the bug I'm reporting. I'm aware of #246, but I can't say whether the fix proposed below would address any part of that issue. I do know that $tokens_1 produced wildly inaccurate results while $tokens_2 produced much better matches, so the difference clearly affects the search results.

A quick look at the code for lex suggests that the parsing inconsistency can be fixed by reordering the initial search-and-replace arrays:
$bad = [' or ', ' ', '-'];
$good = ['|', '&', '~'];

This ends up producing the same token array in both situations. I'm just not familiar enough with the implications of that change (it's consistent, but is it right?) to go straight to a pull request. (But happy to if this is confirmed.)
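To make the proposed ordering easier to check in isolation, here is a minimal sketch that runs only the replacement step on both queries. The normalizeQuery function is a hypothetical helper written for this illustration, not part of TNTSearch, and it assumes lex begins with a plain str_replace over these two arrays. It skips the tokenizing that lex does afterwards.

```php
<?php
// Hypothetical helper for illustration only; not TNTSearch code.
// It applies just the search-and-replace step with the ordering proposed above.
function normalizeQuery(string $query): string
{
    $bad  = [' or ', ' ', '-'];
    $good = ['|', '&', '~'];

    // With array arguments, str_replace processes the pairs in array order,
    // each one operating on the result of the previous replacement.
    return str_replace($bad, $good, $query);
}

$a = normalizeQuery('foo -bar'); // space becomes '&', then '-' becomes '~'
$b = normalizeQuery('foo ~bar'); // space becomes '&', '~' is left untouched

var_dump($a === $b); // both strings are "foo&~bar"
```

One thing to weigh before a pull request: with this ordering every hyphen becomes a negation marker, so a hyphenated term such as "e-mail" would come out of this step as "e~mail". Whether that is acceptable depends on how the rest of lex is expected to treat hyphens inside words.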
