Determining Repetition #108

Dec 21, 2022 · 1 comment · 2 replies

  1. For word repetition, remove stop words, then use its.lemma with as.freqTable to build a frequency table of lemmas. Shortlist the outliers, or the tokens whose frequency is above the 75th percentile, and highlight them.
  2. To make this more sophisticated, look for repetitions that occur within a short token distance, using the index() method to get each token's position.
  3. For sentences, use a similarity method to flag nearby sentences with very high similarity values.
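
The three steps above can be sketched in plain JavaScript. This is a rough illustration, not winkNLP's API: the function names, the `{ lemma, index }` item shape, and the default thresholds are all hypothetical. It assumes the lemmas have already been extracted with stop words removed (in winkNLP they would come from its.lemma, and the positions from index()). Jaccard overlap stands in for "a similarity method", since the answer does not say which one to use.

```javascript
// Assumption: `items` is an array of { lemma, index } objects in document order,
// with stop words already removed. `index` is the token's position in the text.

// Step 1: build a frequency table and flag lemmas above the given percentile.
function frequentLemmas(items, percentile = 0.75) {
  const counts = new Map();
  for (const { lemma } of items) counts.set(lemma, (counts.get(lemma) || 0) + 1);
  const sorted = [...counts.values()].sort((a, b) => a - b);
  if (sorted.length === 0) return [];
  // Nearest-rank percentile over the counts of the distinct lemmas.
  const rank = Math.max(1, Math.ceil(percentile * sorted.length));
  const cutoff = sorted[rank - 1];
  return [...counts].filter(([, c]) => c > cutoff).map(([lemma]) => lemma);
}

// Step 2: flag a lemma that reappears within `maxDistance` tokens of its last use.
function closeRepeats(items, maxDistance = 5) {
  const lastSeen = new Map();
  const hits = [];
  for (const { lemma, index } of items) {
    if (lastSeen.has(lemma) && index - lastSeen.get(lemma) <= maxDistance) {
      hits.push({ lemma, from: lastSeen.get(lemma), to: index });
    }
    lastSeen.set(lemma, index);
  }
  return hits;
}

// Step 3: flag adjacent sentences whose lemma sets overlap heavily.
// Each sentence is an array of lemmas; Jaccard is a stand-in similarity measure.
function similarNeighbours(sentences, threshold = 0.8) {
  const jaccard = (a, b) => {
    const A = new Set(a);
    const B = new Set(b);
    const shared = [...A].filter((x) => B.has(x)).length;
    const union = new Set([...A, ...B]).size;
    return union === 0 ? 0 : shared / union;
  };
  const flagged = [];
  for (let i = 1; i < sentences.length; i++) {
    const score = jaccard(sentences[i - 1], sentences[i]);
    if (score >= threshold) flagged.push({ pair: [i - 1, i], score });
  }
  return flagged;
}
```

The percentile cutoff and the distance and similarity thresholds will likely need tuning per document; short texts in particular have too few distinct lemmas for a percentile to mean much.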

2 replies
@Chaddeus
@sanjayaksaxena
Answer selected by sanjayaksaxena