Sentence not split due to missing quote #38

Open
KevinDanikowski opened this issue Jun 14, 2023 · 2 comments

Comments

@KevinDanikowski

Describe the bug
Text with obvious sentence boundaries doesn't get split. This appears to be due to a missing opening double quote at the beginning.

Text

This is worrying because history shows that governments have often abused this power to silence minorities and dissidents" (Strossen, 2018). The central thesis of this paper is that, while freedom of expression is central to the health and progress of a democratic society, there are reasonable and necessary limits to this freedom. Specifically, direct incitement to violence is considered an exception to the norm of protecting free speech. During Donald Trump\'s presidency, public rhetoric took a noticeably more divisive turn. As the FBI has documented, hate crimes increased by nearly 20 per cent during his presidency, and hate-motivated murders, mostly committed by white supremacists, reached their highest level in 28 years. (Reodriguez,2021). While it is tempting to directly correlate this rise in hate crimes with former President Trump\'s rhetoric, it is important to remember that correlation does not imply causation. Trump has made comments that have been widely criticised as xenophobic and offensive. He has characterised Mexican immigrants as "rapists" and "drug dealers", called African countries "shithole countries", and called for a total ban on Muslims entering the United States. These statements can be interpreted as perpetuating harmful stereotypes and fuelling prejudice and animosity. However, there are no documented statements by Trump that directly incite violence against minorities. (Cineas, 2021) A leader\'s words have a significant impact on society and can influence the behaviour of his or her followers, but demonstrating a direct causality between Trump\'s rhetoric and the rise in hate crimes is challenging. For example, CSHE (Center for the Study of Hate and Extremism) at California State University has noted that hate crimes have been on the rise since 2014, before Trump took office. 
Furthermore, CSHE cautions against oversimplifying the relationship between political rhetoric and hate crimes, noting that there are many factors contributing to this phenomenon, including political polarisation, socio-economic conflicts, and the proliferation of social media platforms that enable the spread of hate speech. This paper does not seek to excuse Trump\'s rhetoric or minimise the harm that his words can cause. However, it is important to distinguish between unpleasant or even hateful rhetoric and direct incitement to violence. The former can be harmful and destructive but falls within the bounds of free speech. The latter, on the other hand, is a direct threat to people\'s safety and well-being, and is therefore not protected by the principle of freedom of expression. But rather than limiting free speech, Dworkin suggests that the appropriate response to hate speech is more speech, not less. That is, like Dworkin, Strossen proposes a strategy based on education, dialogue, and counter-argumentation. He argues that these means are more effective in combating prejudice and discrimination, and that, unlike censorship, they foster inclusion and respect for diversity. Specifically, Strossen argues that "exposing and refuting hate speech, not suppressing it, is the most effective way to counter its harmful potential" (Strossen, 2018).MILL PERSPECTIVE ON TRUMPS DISCOURSEMill objected to the informal sanctions that an individual experiences when expressing his views. Deliberate contempt was his central concern. Of course, he condemned state repression, but he understood that particular threat to liberty as secondary. The state was merely the agent of society and, in his world, was less of a threat than society itself. The society in which we find ourselves today in the year 2023 is not quite different from Mill\'s. According to Canovan, Populism, is used as a political speech weapon in most liberal democracies. 
("is a political term that can have different interpretations and applications depending on the context"), however, it generally refers to a type of politics which contrasts "the people" with "the elite", and which claims to represent and defend the rights and will of "the common people.

Actual Result

[
  {
    type: 'Sentence',
    raw: `This is worrying because history shows that governments have often abused this power to silence minorities and dissidents" (Strossen, 2018). The central thesis of this paper is that, while freedom of expression is central to the health and progress of a democratic society, there are reasonable and necessary limits to this freedom. Specifically, direct incitement to violence is considered an exception to the norm of protecting free speech. During Donald Trump's presidency, public rhetoric took a noticeably more divisive turn. As the FBI has documented, hate crimes increased by nearly 20 per cent during his presidency, and hate-motivated murders, mostly committed by white supremacists, reached their highest level in 28 years. (Reodriguez,2021). While it is tempting to directly correlate this rise in hate crimes with former President Trump's rhetoric, it is important to remember that correlation does not imply causation. Trump has made comments that have been widely criticised as xenophobic and offensive. He has characterised Mexican immigrants as "rapists" and "drug dealers", called African countries "shithole countries", and called for a total ban on Muslims entering the United States. These statements can be interpreted as perpetuating harmful stereotypes and fuelling prejudice and animosity. However, there are no documented statements by Trump that directly incite violence against minorities. (Cineas, 2021) A leader's words have a significant impact on society and can influence the behaviour of his or her followers, but demonstrating a direct causality between Trump's rhetoric and the rise in hate crimes is challenging. For example, CSHE (Center for the Study of Hate and Extremism) at California State University has noted that hate crimes have been on the rise since 2014, before Trump took office. 
Furthermore, CSHE cautions against oversimplifying the relationship between political rhetoric and hate crimes, noting that there are many factors contributing to this phenomenon, including political polarisation, socio-economic conflicts, and the proliferation of social media platforms that enable the spread of hate speech. This paper does not seek to excuse Trump's rhetoric or minimise the harm that his words can cause. However, it is important to distinguish between unpleasant or even hateful rhetoric and direct incitement to violence. The former can be harmful and destructive but falls within the bounds of free speech. The latter, on the other hand, is a direct threat to people's safety and well-being, and is therefore not protected by the principle of freedom of expression. But rather than limiting free speech, Dworkin suggests that the appropriate response to hate speech is more speech, not less. That is, like Dworkin, Strossen proposes a strategy based on education, dialogue, and counter-argumentation. He argues that these means are more effective in combating prejudice and discrimination, and that, unlike censorship, they foster inclusion and respect for diversity. Specifically, Strossen argues that "exposing and refuting hate speech, not suppressing it, is the most effective way to counter its harmful potential" (Strossen, 2018).MILL PERSPECTIVE ON TRUMPS DISCOURSEMill objected to the informal sanctions that an individual experiences when expressing his views. Deliberate contempt was his central concern. Of course, he condemned state repression, but he understood that particular threat to liberty as secondary. The state was merely the agent of society and, in his world, was less of a threat than society itself. The society in which we find ourselves today in the year 2023 is not quite different from Mill's. According to Canovan, Populism, is used as a political speech weapon in most liberal democracies. 
("is a political term that can have different interpretations and applications depending on the context"), however, it generally refers to a type of politics which contrasts "the people" with "the elite", and which claims to represent and defend the rights and will of "the common people.`,
    loc: { start: [Object], end: [Object] },
    range: [ 0, 4064 ],
    children: [ [Object], [Object] ]
  }
]

Code:

const { split } = require('sentence-splitter')

const text = `"This is worrying because history shows that governments have often abused this power to silence minorities and dissidents" (Strossen, 2018). The central thesis of this paper is that, while freedom of expression is central to the health and progress of a democratic society, there are reasonable and necessary limits to this freedom. Specifically, direct incitement to violence is considered an exception to the norm of protecting free speech. During Donald Trump\'s presidency, public rhetoric took a noticeably more divisive turn. As the FBI has documented, hate crimes increased by nearly 20 per cent during his presidency, and hate-motivated murders, mostly committed by white supremacists, reached their highest level in 28 years. (Reodriguez,2021). While it is tempting to directly correlate this rise in hate crimes with former President Trump\'s rhetoric, it is important to remember that correlation does not imply causation. Trump has made comments that have been widely criticised as xenophobic and offensive. He has characterised Mexican immigrants as "rapists" and "drug dealers", called African countries "shithole countries", and called for a total ban on Muslims entering the United States. These statements can be interpreted as perpetuating harmful stereotypes and fuelling prejudice and animosity. However, there are no documented statements by Trump that directly incite violence against minorities. (Cineas, 2021) A leader\'s words have a significant impact on society and can influence the behaviour of his or her followers, but demonstrating a direct causality between Trump\'s rhetoric and the rise in hate crimes is challenging. For example, CSHE (Center for the Study of Hate and Extremism) at California State University has noted that hate crimes have been on the rise since 2014, before Trump took office. 
Furthermore, CSHE cautions against oversimplifying the relationship between political rhetoric and hate crimes, noting that there are many factors contributing to this phenomenon, including political polarisation, socio-economic conflicts, and the proliferation of social media platforms that enable the spread of hate speech. This paper does not seek to excuse Trump\'s rhetoric or minimise the harm that his words can cause. However, it is important to distinguish between unpleasant or even hateful rhetoric and direct incitement to violence. The former can be harmful and destructive but falls within the bounds of free speech. The latter, on the other hand, is a direct threat to people\'s safety and well-being, and is therefore not protected by the principle of freedom of expression. But rather than limiting free speech, Dworkin suggests that the appropriate response to hate speech is more speech, not less. That is, like Dworkin, Strossen proposes a strategy based on education, dialogue, and counter-argumentation. He argues that these means are more effective in combating prejudice and discrimination, and that, unlike censorship, they foster inclusion and respect for diversity. Specifically, Strossen argues that "exposing and refuting hate speech, not suppressing it, is the most effective way to counter its harmful potential" (Strossen, 2018).MILL PERSPECTIVE ON TRUMPS DISCOURSEMill objected to the informal sanctions that an individual experiences when expressing his views. Deliberate contempt was his central concern. Of course, he condemned state repression, but he understood that particular threat to liberty as secondary. The state was merely the agent of society and, in his world, was less of a threat than society itself. The society in which we find ourselves today in the year 2023 is not quite different from Mill\'s. According to Canovan, Populism, is used as a political speech weapon in most liberal democracies. 
("is a political term that can have different interpretations and applications depending on the context"), however, it generally refers to a type of politics which contrasts "the people" with "the elite", and which claims to represent and defend the rights and will of "the common people.`
const getSplit = () => {
  const splits = split(text)
  console.log(splits)
}

getSplit()

Expected Result
(too long to share) - split into sentences.

Additional context
The text is missing a double quote at the beginning, but this shouldn't stop the sentences from being split.

KevinDanikowski changed the title from "Sentence not split" to "Sentence not split due to missing quote" on Jun 14, 2023
@azu
Member

azu commented Jun 15, 2023

This result is expected.

The first " is missing, so sentence-splitter cannot parse the text correctly.
Natural language has no parse errors, which makes it difficult to correct implicit errors like this one.

playground

It would be possible to issue a warning when one quote of a pair is missing, but the use case is difficult.
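
As a rough illustration of the kind of check such a warning could rely on, here is a minimal sketch that counts straight double quotes before splitting. `checkDoubleQuotes` is a hypothetical helper, not part of sentence-splitter, and it assumes only ASCII `"` characters matter (curly quotes are ignored).

```javascript
// Hypothetical helper (not a sentence-splitter API): counts straight double
// quotes and reports whether one of a pair appears to be missing.
function checkDoubleQuotes(text) {
  const count = (text.match(/"/g) || []).length;
  // An odd count means at least one quote has no partner.
  return { count, balanced: count % 2 === 0 };
}

// Shortened sample in the spirit of the reported text: the opening quote
// before the first sentence is missing.
const sample = 'dissidents" (Strossen, 2018). He called them "rapists" and left.';
const result = checkDoubleQuotes(sample);
if (!result.balanced) {
  console.warn(`Unbalanced double quotes: found ${result.count}`);
}
```

An odd count only signals that something is off. It cannot tell which quote is missing, and an even count does not guarantee that the quotes are paired as the writer intended, which is part of why the use case is difficult.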

@KevinDanikowski
Author

Hey @azu, do you know of a recommended way to fill in the missing quote in this scenario? Otherwise, I'm thinking of just removing the quotes if it appears the text was not split properly, and retrying.
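
The remove-and-retry idea could be sketched roughly as follows. This is only an illustration under assumptions: `splitWithQuoteFallback` and `countSentences` are hypothetical names, "not split properly" is approximated by an odd number of straight double quotes, and the splitter is passed in as `splitFn` and assumed to return nodes with a `type` field like the `'Sentence'` node in the output above.

```javascript
// Hypothetical fallback sketch, not part of sentence-splitter.
// `splitFn` is assumed to return an array of nodes that each have a `type`.
function countSentences(nodes) {
  return nodes.filter((node) => node.type === 'Sentence').length;
}

function splitWithQuoteFallback(text, splitFn) {
  const firstPass = splitFn(text);
  const quoteCount = (text.match(/"/g) || []).length;
  // Balanced quotes: assume the first pass is fine and keep it.
  if (quoteCount % 2 === 0) {
    return firstPass;
  }
  // Unbalanced quotes: strip them and try again.
  const retry = splitFn(text.replace(/"/g, ''));
  // Keep the retry only if it actually found more sentences.
  return countSentences(retry) > countSentences(firstPass) ? retry : firstPass;
}

// Intended usage (requires the sentence-splitter package):
//   const { split } = require('sentence-splitter')
//   const nodes = splitWithQuoteFallback(text, split)
```

One trade-off to keep in mind: the retry runs on text with the quotes removed, so the `raw` and `range` values of the returned nodes no longer line up with the original input.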
