Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

splitAST does not split sentences within link titles #25

Open
kimburgess opened this issue May 19, 2021 · 1 comment
Open

splitAST does not split sentences within link titles #25

kimburgess opened this issue May 19, 2021 · 1 comment
Labels
Status: Proposal Request for comments Type: Feature New Feature

Comments

@kimburgess
Copy link

Closely related to #24—possibly more controversial.

When provided with an input of:

{
  type: 'Paragraph',
  children: [
    {
      type: 'Link',
      title: null,
      url: 'https://example.com',
      children: [Array],
      loc: [Object],
      range: [Array],
      raw: '[This link has more than one sentence. It has two lines.](https://example.com)'
    }
  ],
  loc: { start: { line: 1, column: 0 }, end: { line: 1, column: 78 } },
  range: [ 0, 78 ],
  raw: '[This link has more than one sentence. It has two lines.](https://example.com)'
}

The following is returned:

{
  type: 'Paragraph',
  children: [
    {
      type: 'Sentence',
      raw: '[This link has more than one sentence. It has two lines.](https://example.com)',
      loc: [Object],
      range: [Array],
      children: [Array]
    }
  ],
  loc: { start: { line: 1, column: 0 }, end: { line: 1, column: 78 } },
  range: [ 0, 78 ],
  raw: '[This link has more than one sentence. It has two lines.](https://example.com)'
}

However these are equivalent representations:

[This link has more than one sentence. It has two lines.](https://example.com)
[This link has more than one sentence.
It has two lines.](https://example.com)
@azu azu added Status: Proposal Request for comments Type: Feature New Feature labels May 19, 2021
@azu
Copy link
Member

azu commented May 19, 2021

Yes. Currently, splitAST parse only Str node and it parse shallow.
https://github.com/azu/sentence-splitter/blob/1663ab26ba4ad29d9ab36616a572231a5e9d637c/src/sentence-splitter.ts#L182-L191
As a result, this library does not psers Str of Link or Strong.

<Paragraph>
  <Str>  ← will be parsed  (depth: 1)
  <Link>
     <Str> ← will not be parsed (depth: 2)
   </Link>
   <Str> ← will be parsed (depth: 1)
</Paragraph>

The reason of this behavior is similar reason to textlint-rule-helper's isPlainStrNode(node).

For example, the link to external blog that's Str(link title) is not under your controll.

However, it is reasonable that we can handle Link's children.

If we add recursive option, we can resolve these issue #24 #25.
Or, add handling to more types Link and Strong etc...
https://github.com/azu/sentence-splitter/blob/1663ab26ba4ad29d9ab36616a572231a5e9d637c/src/sentence-splitter.ts#L191

We need to add new behavior as option.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Status: Proposal Request for comments Type: Feature New Feature
Projects
None yet
Development

No branches or pull requests

2 participants