Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Alternate string metrics #131

Open
ParadaCarleton opened this issue Sep 23, 2022 · 0 comments
Open

Alternate string metrics #131

ParadaCarleton opened this issue Sep 23, 2022 · 0 comments

Comments

@ParadaCarleton
Copy link

I think a great feature would be to let the have an optional argument that changes the string metric. For example, keyboard-based distances can improve performance and accuracy. A standard QWERTY keyboard has layout:

` 1 2 3 4 5 6 7 8 9 0 - = 
  q w e r t y u i o p [ ] \
   a s d f g h j k l ; ' 
    z x c v b n m , . /

A substitution adds a distance of 1 for every step we have to take on a keyboard. Keys on a keyboard are typically spaced along a nearly-hexagonal grid with 6 roughly equidistant neighbors, e.g. "g" has:

 t y
f g h
 v b

All of these keys can be considered neighbors, and substituting g with one of these keys increases the edit distance by 1. For an edit distance of 2, we can consider keys that are neighbors of neighbors, and so on.

This both cuts back on the number of neighbors within a given edit distance, speeding up performance, and should improve accuracy when working with real-world data typed on a keyboard.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant