Filtering and cleaning Google News "processed text" field #182

Answered by lalitpagaria
edumagol asked this question in Q&A

@edumagol Thank you for the kind words.
Yes, we have some basic cleaning capabilities. Currently they do not include removing hyperlinks, but adding a single cleaner function to obsei for this would not take long.
Please refer to the following links:

Tutorial to showcase use of cleaner - https://colab.research.google.com/github/obsei/obsei/blob/master/tutorials/04_GoogleNews_Cleaner_Splitter_Classification_Aggregator.ipynb

All currently supported cleaner functions: https://github.com/obsei/obsei/blob/master/obsei/preprocessor/text_cleaning_function.py
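
For illustration, a hyperlink-removing cleaner could look roughly like the sketch below. This is not obsei's API: `URL_PATTERN`, `remove_hyperlinks`, and `remove_hyperlink_tokens` are hypothetical names, and the code uses only the Python standard library. Wiring it into obsei would mean adapting it to the cleaner interface defined in the file linked above.

```python
import re
from typing import List

# Hypothetical pattern (an assumption, not taken from obsei):
# matches http(s):// URLs and bare "www." links up to the next whitespace.
URL_PATTERN = re.compile(r"(?:https?://|www\.)\S+", re.IGNORECASE)


def remove_hyperlinks(text: str) -> str:
    """Strip hyperlinks from raw text and collapse leftover whitespace."""
    without_links = URL_PATTERN.sub(" ", text)
    return " ".join(without_links.split())


def remove_hyperlink_tokens(tokens: List[str]) -> List[str]:
    """Drop tokens that consist entirely of a hyperlink."""
    return [token for token in tokens if not URL_PATTERN.fullmatch(token)]


if __name__ == "__main__":
    print(remove_hyperlinks("Read more at https://example.com/a?b=1 today"))
    print(remove_hyperlink_tokens(["see", "www.example.com", "now"]))
```

The pattern is deliberately simple: it treats everything up to the next whitespace as part of the link, so punctuation attached to the end of a URL is removed along with it.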

Answer selected by lalitpagaria