Use a custom comparator for ID numbers and DOBs #8

Open
mbauman opened this issue Dec 2, 2016 · 2 comments

Comments

@mbauman
Member

mbauman commented Dec 2, 2016

In dedupe's logs, it reports:

INFO:dedupe.index:Removing stop word 47
INFO:dedupe.index:Removing stop word 9-
INFO:dedupe.index:Removing stop word 25

We're using String comparisons for both SSN and DOB, so it shouldn't be removing any part of either as a stop word.

@potash
Contributor

potash commented Dec 2, 2016

I once asked about this: dedupeio/dedupe-examples#39

@mbauman
Member Author

mbauman commented Dec 2, 2016

Ah, interesting. Do you think this is beneficial? I've been trying to understand their comparator plugin architecture to set up custom comparisons, and then we could fine-tune this behavior for dates and SSNs… if it's useful at all.
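
For illustration only, here is a rough sketch of what field-specific comparison functions for SSNs and DOBs could look like. Nothing here comes from this project's code: the function names, the `NNN-NN-NNNN` and `YYYY-MM-DD` input formats, and the choice to treat malformed values as maximally distant are all assumptions. Each function returns a distance between 0.0 (identical) and 1.0 (entirely different).

```python
from datetime import date


def _digits(value):
    """Keep only the digit characters, so '123-45-6789' and '123456789' match."""
    return "".join(ch for ch in value if ch.isdigit())


def ssn_distance(a, b):
    """Fraction of the nine digit positions that differ (hypothetical helper).

    Assumption: a value that does not contain exactly nine digits is treated
    as maximally distant from everything.
    """
    da, db = _digits(a), _digits(b)
    if len(da) != 9 or len(db) != 9:
        return 1.0
    return sum(x != y for x, y in zip(da, db)) / 9


def dob_distance(a, b):
    """Fraction of (year, month, day) components that differ (hypothetical helper).

    Assumption: dates are ISO 'YYYY-MM-DD' strings; unparseable values are
    treated as maximally distant.
    """
    try:
        da, db = date.fromisoformat(a), date.fromisoformat(b)
    except ValueError:
        return 1.0
    parts_a = (da.year, da.month, da.day)
    parts_b = (db.year, db.month, db.day)
    return sum(x != y for x, y in zip(parts_a, parts_b)) / 3
```

If dedupe's custom field type accepts a plain two-argument function as its comparator (worth checking against the documentation for the installed version), functions like these could be plugged in for the SSN and DOB fields in place of the String comparison. Whether that also changes the stop-word removal reported in the logs is a separate question that this sketch does not answer.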

@mbauman changed the title from "Why is dedupe removing stop words from SSNs or dates?" to "Use a custom comparator for ID numbers and DOBs" on May 1, 2017