difference between 1 nearest neighbor and 1000? in use for anomaly detection #933

Answered by seanlaw
feigin asked this question in Q&A
@feigin Indeed, this is the expected behavior. Since each dataset/problem is different, it is up to you to decide how to interpret the results. In this case, you can leverage our recently added top-k feature and do:

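import stumpy

k = 3  # number of nearest neighbors to keep for each subsequence
mp = stumpy.stump(df['pattern_one_anomaly'].astype(float), m, k=k)
# With k > 1, the first k columns of mp hold the top-k distances
avg_mp = mp[:, 0:k].mean(axis=1)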

Here, setting k=3 will return the distances to the top-3 nearest neighbors for each subsequence. Then, we compute the average distance (or sum the distances). In this case, if the anomaly is only repeated once (i.e., there is a single pair), then only the first of the three distances will be close to zero, so the average distance will NOT be zero and will therefore stand out as an anomaly. However, it is up to you to decide what k shou…
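
To make the effect concrete, here is a minimal NumPy-only sketch of the same idea. It does not call STUMPY: `topk_distances` is a hypothetical brute-force helper, the sine-plus-ramp series is made-up toy data, and the exclusion zone of `ceil(m / 4)` is an assumption meant to mirror what I understand STUMPY's default to be.

```python
import numpy as np


def topk_distances(ts, m, k):
    """Brute-force distances from each length-m window to its k nearest neighbors.

    Hypothetical helper for illustration only (z-normalized Euclidean distance).
    """
    windows = np.lib.stride_tricks.sliding_window_view(ts, m)
    # z-normalize every window (assumes no window is constant)
    z = (windows - windows.mean(axis=1, keepdims=True)) / windows.std(axis=1, keepdims=True)
    excl = int(np.ceil(m / 4))  # assumed exclusion zone around each window
    n = len(z)
    out = np.empty((n, k))
    for i in range(n):
        d = np.linalg.norm(z - z[i], axis=1)
        d[max(0, i - excl): i + excl + 1] = np.inf  # ignore trivial self-matches
        out[i] = np.sort(d)[:k]  # k smallest distances, ascending
    return out


# Toy data: a clean sine wave with the same ramp-shaped anomaly inserted twice
m = 20
t = np.arange(400)
ts = np.sin(2 * np.pi * t / m)
ramp = np.linspace(2.0, -2.0, m)
ts[100:120] = ramp
ts[300:320] = ramp

nn1 = topk_distances(ts, m, k=1)[:, 0]          # classic 1-NN profile
avg3 = topk_distances(ts, m, k=3).mean(axis=1)  # average of the top-3 distances

print("1-NN distance at the anomaly:   ", nn1[100])
print("avg top-3 distance at anomaly:  ", avg3[100])
print("avg top-3 distance, normal data:", avg3[200])
```

With the single nearest neighbor, the repeated anomaly matches its twin and looks perfectly normal. Averaging over the top 3 pulls in two poor matches, so the anomaly stands out while ordinary sine windows (which have many good matches) stay near zero.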

Answer selected by feigin