add sentence delimiter option to preserve original sentencization #68

Open
wants to merge 3 commits into
base: master
from


fsimonjetz

Hi there,
it would be nice to have an option to preserve the original sentence boundaries by overriding the internal sentence splitter. The README seems to imply that newlines are used exclusively as sentence boundaries, but that is not the case. I added the option for my own work; it might be useful for others as well, but it should be reviewed by someone who is more familiar with the code.

Thank you!

@fbarrios
Contributor

fbarrios commented May 4, 2019

Hi! Thank you for your contribution! We will review this soon.

@fbarrios
Contributor

Hi @fsimonjetz, looks good to me. I'm interested in knowing whether you have a simple sample of a text that you wanted to summarize, so that we could maybe add it as a test.

Also, maybe this is clearer for the documentation?

Note that line breaks in the input will be used as sentence separators (along with punctuation), so be sure to preprocess your text accordingly. You can also provide a custom sentence delimiter through the sentence_delimiter argument.
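
To illustrate the difference described in that note, here is a minimal sketch in plain Python. It is not the library's code or the code in this PR: `split_sentences` is a hypothetical helper, and the default punctuation rule is only an assumed stand-in for the internal sentence splitter. The parameter name `sentence_delimiter` is taken from the note above.

```python
import re


def split_sentences(text, sentence_delimiter=None):
    """Hypothetical helper, for illustration only (not the library's splitter)."""
    if sentence_delimiter is not None:
        # Custom delimiter: split on it alone, so punctuation is ignored
        # and the original sentence boundaries are preserved.
        parts = text.split(sentence_delimiter)
    else:
        # Assumed default: break on newlines and on whitespace that
        # follows '.', '!' or '?'.
        parts = re.split(r"\n+|(?<=[.!?])\s+", text)
    # Drop empty fragments and surrounding whitespace.
    return [p.strip() for p in parts if p.strip()]


text = "Dr. Smith arrived late.\nHe gave a talk."

default = split_sentences(text)
custom = split_sentences(text, sentence_delimiter="\n")

print(default)  # ['Dr.', 'Smith arrived late.', 'He gave a talk.']
print(custom)   # ['Dr. Smith arrived late.', 'He gave a talk.']
```

With punctuation-based splitting, the abbreviation "Dr." produces a spurious sentence; with an explicit delimiter, each input line stays a single sentence. A text like this could also serve as a small test case.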
