The second match (5, 'her') and the last one (14, 'she') do not align with a word boundary. How can I remove them?
Or could we force matching on whole words only?
import ahocorasick

A = ahocorasick.Automaton()
for idx, key in enumerate('he her hers she'.split()):
    A.add_word(key, key)
A.make_automaton()

needle = "he here her shes"
list(A.iter_long(needle))
# [(1, 'he'), (5, 'her'), (10, 'her'), (14, 'she')]
Are you saying that you only want whole words matched? If so, you do not want to add strings of characters as keys, but rather sequences of words converted to numbers. Otherwise the automaton works on characters and matches characters: it knows nothing about words.
Hi @pombredanne, just to make sure I understand: the idea is that each unique word in the needles would map to a distinct int, and we'd add these ints as keys and the words as the values?
Do you have a recommendation for this mapping? The haystack will also need to be mapped with the same mapping before iterating over it.
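One possible mapping, as a minimal sketch and not the library's recommended approach: give each distinct word the next free integer in a plain dict, then encode both the needles and the haystack with that same dict. The names `build_vocab`, `encode` and `find_whole_words` are hypothetical, splitting on whitespace is an assumption about tokenization, and the naive scan only stands in for the automaton. With pyahocorasick itself, the encoded tuples would presumably be used as keys of an automaton created with the `KEY_SEQUENCE` key type.

```python
def build_vocab(needles):
    """Map every distinct word in the needles to a small integer id."""
    vocab = {}
    for needle in needles:
        for word in needle.split():
            # Ids start at 1; 0 is reserved for words that are not in any needle.
            vocab.setdefault(word, len(vocab) + 1)
    return vocab


def encode(text, vocab, unknown=0):
    """Encode text as a tuple of word ids; unseen words all map to `unknown`."""
    return tuple(vocab.get(word, unknown) for word in text.split())


def find_whole_words(needles, haystack):
    """Naive stand-in for the automaton: return (word_index, needle) pairs."""
    vocab = build_vocab(needles)
    encoded_needles = {encode(n, vocab): n for n in needles}
    tokens = encode(haystack, vocab)
    hits = []
    for start in range(len(tokens)):
        for key, needle in encoded_needles.items():
            if tokens[start:start + len(key)] == key:
                hits.append((start, needle))
    return hits


matches = find_whole_words(["he", "her", "hers", "she"], "he here her shes")
print(matches)  # [(0, 'he'), (2, 'her')]
```

Note that the reported positions are word indices, not character offsets, so a separate list of word offsets would be needed to map hits back into the original string.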
Can we get more info on this, please?
I want exact (whole) word matches and I do not understand how to approach it.
Any insights would be greatly appreciated.
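A different route from the word-to-integer mapping suggested above (an alternative, not something proposed in this thread) is to keep the character-level automaton and discard any match that does not start and end on a word boundary. A minimal sketch, assuming words are delimited by non-alphanumeric characters and that matches arrive as `(end_index, key)` pairs where the key is the matched string. `whole_word_matches` is a hypothetical helper, and the match list is hard-coded from the output in the original post so the snippet runs without the library.

```python
def whole_word_matches(haystack, matches):
    """Keep only (end_index, key) matches that span a complete word."""
    kept = []
    for end, key in matches:
        start = end - len(key) + 1
        # A boundary is the edge of the string or a non-alphanumeric neighbour.
        left_ok = start == 0 or not haystack[start - 1].isalnum()
        right_ok = end == len(haystack) - 1 or not haystack[end + 1].isalnum()
        if left_ok and right_ok:
            kept.append((end, key))
    return kept


haystack = "he here her shes"
# Matches copied from the original post: list(A.iter_long(haystack))
raw = [(1, 'he'), (5, 'her'), (10, 'her'), (14, 'she')]
filtered = whole_word_matches(haystack, raw)
print(filtered)  # [(1, 'he'), (10, 'her')]
```

Filtering the output of `iter` rather than `iter_long` may be safer when keys can contain spaces, since a longest match could otherwise hide a shorter whole-word match.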