Question about user_weight #1244

Open
lesliev opened this issue Mar 17, 2023 · 1 comment

Comments

lesliev commented Mar 17, 2023

Our project does not set user_weight explicitly anywhere that I can find. Yet when I run this query, the weights are mostly 1s and 2s:
select *, weight() from article_core where match('beatl') OPTION ranker=expr('sum(user_weight)');

The same query against person_core returns 2s and 3s:
select *, weight() from person_core where match('beatl') OPTION ranker=expr('sum(user_weight)');

I think this explains a mysterious bias where person records are being ranked quite a lot higher than article records.

Does ThinkingSphinx set user_weight for each document somehow? I've searched the whole project for user_weight and I don't see it being set anywhere.

I assume the default ranker is being used to calculate the final weight:
SPH_RANK_PROXIMITY_BM25 = sum(lcs*user_weight)*1000+bm25
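
To make the formula above concrete, here is a minimal Python sketch of how it combines per-field values. It is only a model of the expression as written, not Sphinx's actual code; the function name `proximity_bm25` and the sample numbers are made up for illustration.

```python
def proximity_bm25(matched_fields, bm25):
    """Model of sum(lcs*user_weight)*1000+bm25.

    matched_fields: list of (lcs, user_weight) pairs, one per matched field
                    (hypothetical input shape, chosen for this sketch).
    bm25:           the document-level bm25 factor (arbitrary sample value here).
    """
    field_sum = sum(lcs * user_weight for lcs, user_weight in matched_fields)
    return field_sum * 1000 + bm25


# One matched field vs. two matched fields, all with user_weight 1 and lcs 1.
one_field = proximity_bm25([(1, 1)], bm25=500)
two_fields = proximity_bm25([(1, 1), (1, 1)], bm25=500)
print(one_field, two_fields)
```

In this model every extra field that contributes to the sum adds 1000 to the weight, which is much larger than any difference in the bm25 term used here.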

lesliev commented Mar 20, 2023

After reading more, I see that the Sphinx docs refer to user_weight as the "user field weight". Perhaps that means the field weights configured with set_property, as described here: https://freelancing-gods.com/thinking-sphinx/v5/searching.html#fieldweights

But I don't see this being set in any of my index files; the only property I set is set_property delta: true. So why do the user_weight values differ from document to document?

mysql> select weight() from article_core where match('beatl') OPTION ranker=expr('sum(user_weight)');
+----------+
| weight() |
+----------+
|        2 |
|        2 |
|        2 |
|        2 |
|        1 |
|        1 |
|        1 |
|        1 |
|        1 |
|        1 |
|        1 |
|        1 |
|        1 |
|        1 |
|        1 |
|        1 |
|        1 |
|        1 |
|        1 |
|        1 |
+----------+
20 rows in set (0.00 sec)

mysql> select weight() from person_core where match('beatl') OPTION ranker=expr('sum(user_weight)');
+----------+
| weight() |
+----------+
|        3 |
|        3 |
|        3 |
|        2 |
|        2 |
|        2 |
|        2 |
|        2 |
|        2 |
|        2 |
|        2 |
|        2 |
|        2 |
|        2 |
|        2 |
|        2 |
|        2 |
|        2 |
|        2 |
|        2 |
+----------+
20 rows in set (0.01 sec)

From: http://sphinxsearch.com/docs/current.html#sphinxql-select

user_weight (integer), the user specified per-field weight (refer to SetFieldWeights() in SphinxAPI and OPTION field_weights in SphinxQL respectively). The weights default to 1 if not specified explicitly.
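
One possible reading of the numbers above, sketched in Python: if sum() in a ranker expression adds up its argument over the fields that matched, and every field keeps the default weight of 1, then sum(user_weight) is simply the count of matched fields. That per-matched-field behaviour is my understanding of the Sphinx expression ranker and is an assumption here; the field names are hypothetical and do not come from the actual indices.

```python
DEFAULT_FIELD_WEIGHT = 1  # per the docs quoted above: weights default to 1


def sum_user_weight(matched_fields, field_weights=None):
    """Model of ranker=expr('sum(user_weight)') for a single document.

    matched_fields: names of the fields in which the query matched
                    (hypothetical names, for illustration only).
    field_weights:  optional explicit per-field weights, playing the role
                    of OPTION field_weights in this sketch.
    """
    field_weights = field_weights or {}
    return sum(field_weights.get(name, DEFAULT_FIELD_WEIGHT) for name in matched_fields)


# No explicit weights: the result equals the number of matched fields.
print(sum_user_weight(["title"]))
print(sum_user_weight(["title", "body"]))
print(sum_user_weight(["first_name", "last_name", "bio"]))

# An explicit weight changes the sum for documents matching that field.
print(sum_user_weight(["title", "body"], field_weights={"title": 10}))
```

Under that assumption, a weight of 3 in person_core would mean the term matched in three fields of that document, while a weight of 1 in article_core would mean it matched in a single field.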
