Problem with white spaces #5

Open
hudaniel opened this issue Jun 1, 2016 · 9 comments

Comments

hudaniel commented Jun 1, 2016

I noticed that if "Game of Thrones" is one of the tagged strings in an index and I search for "Game " (with a trailing space after "Game"), the results don't include the original "Game of Thrones" entry. Searching for "Game" or "Thrones" works perfectly fine. Is this expected behavior?

mikecsh commented Oct 9, 2016

I'm seeing the same behaviour in my app - did you find an answer to this @computerion?

Author

hudaniel commented Oct 9, 2016

I worked around this problem by replacing whitespace with underscores.

mikecsh commented Oct 9, 2016

Thanks for the quick reply! Did you replace the whitespace with underscores in your search term, the indexed strings or both?

mikecsh commented Oct 9, 2016

I've implemented @computerion's workaround by replacing the whitespace with underscores in keywords and indexed strings and it works well as long as the search is identical to the underlying indexed string:

e.g.:

String: "Game of Thrones" is indexed as "Game_of_thrones"

Search: "Game of" actually searches "Game_of" and so a match is returned

This is a huge improvement on the default behaviour. However, searching for "thrones game" returns no results, even though most users would expect it to work. Additionally, this approach is likely to be more resource-hungry when building and storing the index, because every indexed string effectively becomes a unique keyword and the stop-word filtering is lost. Therefore I'll leave this issue open for now.
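
To make the limitation concrete, here is a minimal Python sketch of the underscore workaround. It is a toy model, not the library's API: `underscore_key` and `search_single_token` are hypothetical names, and substring matching is an assumption about how the single joined keyword gets matched.

```python
def underscore_key(text):
    """Lower-case the text and join its words with underscores."""
    return "_".join(text.lower().split())


def search_single_token(indexed_strings, query):
    """Treat the whole query as ONE token and look for it inside each key."""
    needle = underscore_key(query)
    if not needle:
        return []  # an empty query matches nothing
    return [s for s in indexed_strings if needle in underscore_key(s)]


titles = ["Game of Thrones"]
print(search_single_token(titles, "Game of"))       # ['Game of Thrones']
print(search_single_token(titles, "thrones game"))  # []
```

Because the query collapses into a single keyword ("thrones_game"), word order matters, which is why the reversed query finds nothing.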

Thanks a lot for your suggestion, @computerion. At least I have something reasonably functional now!

Author

hudaniel commented Oct 9, 2016

Yeah, it's kind of ugly, but I'm glad it helped!

matehat closed this as completed Oct 9, 2016
matehat reopened this Oct 9, 2016
Owner

matehat commented Oct 9, 2016

Hello, and sorry it took so long to reply.
Have you tried trimming the search string before performing the search?

The expected behaviour is that searching for "thrones game" works just as well as "game thrones".

mikecsh commented Oct 9, 2016

Hi @matehat, what do you mean by trimming the string before performing the search? If you mean removing any additional whitespace at the ends of the string, there isn't any in the search term; it's just two words separated by a space.

Owner

matehat commented Oct 9, 2016

I'm talking about NSString#stringByTrimmingCharactersInSet:

You mentioned that "Game " with a trailing space didn't work, so I asked whether trimming it (that is, removing the space) would work.
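
As a rough illustration of the trimming step, here is a small Python sketch in which `str.strip()` stands in for `stringByTrimmingCharactersInSet:` with a whitespace character set. The helper name `trim_query` is hypothetical and not part of the library.

```python
def trim_query(query):
    """Remove leading and trailing whitespace; interior spaces are untouched."""
    return query.strip()


print(repr(trim_query("Game ")))        # 'Game'
print(repr(trim_query("hello world")))  # 'hello world'
```

Trimming only affects the ends of the string, so it can help with a trailing space but leaves a multi-word query exactly as it was.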

mikecsh commented Oct 9, 2016

Oh sorry, that was a different commenter. Your suggestion doesn't make a difference in my case.
Indexed strings: "hello world", "hello dolly"

Search "hello": results "hello world", "hello dolly"
Search "hello dolly": no results
Search "hello world": no results
Search "world": results "hello world"
Search "world hello": no results

Per your suggestion:
Search "hello ": no results
Search "hello " trimmed first (so effectively "hello"): results "hello world", "hello dolly"

The issue is having the search query properly tokenised and having those tokens be individually taken into account during the search. At the moment it appears to behave as though only one token can be searched at a time. The workaround that @computerion suggested effectively makes every reasonably sized indexed string unique and turns multiple search keywords into one token that can match one of those unique strings if the exact phrase appears within it.
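
For comparison, the behaviour described above (tokenise the query, then require every token to match) could be sketched roughly as follows in Python. This is a toy in-memory model, not the library's implementation: `tokenize`, `build_index` and `search_all_tokens` are hypothetical names, and prefix matching on keywords is an assumption.

```python
def tokenize(text):
    """Split on whitespace and lower-case each word."""
    return text.lower().split()


def build_index(strings):
    """Map each keyword to the set of indexed strings that contain it."""
    index = {}
    for s in strings:
        for token in tokenize(s):
            index.setdefault(token, set()).add(s)
    return index


def search_all_tokens(index, query):
    """Return the strings that match EVERY token in the query, in any order."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    results = None
    for token in tokens:
        # Collect every string with a keyword that starts with this token.
        matches = set()
        for keyword, strings in index.items():
            if keyword.startswith(token):
                matches |= strings
        # Intersect with the matches of the previous tokens.
        results = matches if results is None else results & matches
    return results


index = build_index(["hello world", "hello dolly"])
print(sorted(search_all_tokens(index, "hello")))        # ['hello dolly', 'hello world']
print(sorted(search_all_tokens(index, "world hello")))  # ['hello world']
print(sorted(search_all_tokens(index, "hello dolly")))  # ['hello dolly']
```

Splitting on whitespace also makes a trailing space harmless, since "hello " and "hello" produce the same single token.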
