LinkRef can't be recognized when the reference name contains an underscore and there is a "*" in front #587

Open
1 task done
limingchina opened this issue Jul 31, 2023 · 1 comment

limingchina commented Jul 31, 2023

  • Parser

To Reproduce
Create a Java project with the flexmark dependency and run the following code.

package flexmark_linkref_bug;

import com.vladsch.flexmark.util.ast.Node;
import com.vladsch.flexmark.parser.Parser;
import com.vladsch.flexmark.test.util.AstCollectingVisitor;
import com.vladsch.flexmark.util.data.DataSet;

public class App {
    public static void main(String[] args) {
        Parser parser = Parser.builder(new DataSet()).build();
        Node document 
            = parser.parse("Note:a [INVALID_PARAMETER] error is generated.");
        System.out.println("====Good case if there is no '*'====");
        System.out.println(new AstCollectingVisitor().collectAndGetAstText(document));

        document 
            = parser.parse("Note*: [INVALID_PARAMETER] error is generated.");
        System.out.println("====Bad case: LinkRef is not recognized if there is a '*' in front====");
        System.out.println(new AstCollectingVisitor().collectAndGetAstText(document));

        document 
            = parser.parse("*Note*: [INVALID_PARAMETER] error is generated.");
        System.out.println("====Bad case: LinkRef is not recognized when using bold text which has a pair of '*'s====");
        System.out.println(new AstCollectingVisitor().collectAndGetAstText(document));

        document 
            = parser.parse("Note:* [INVALID] error is generated.");
        System.out.println("====Good case: LinkRef is recognized without underscore in the enum value====");
        System.out.println(new AstCollectingVisitor().collectAndGetAstText(document));

        document 
            = parser.parse("This is a test\n" + "\n" +"Note:* [Error.INVALID_PARAMETER] error is generated.");
        System.out.println("====Good case: LinkRef is recognized when there is sentence followed with an empty line====");
        System.out.println(new AstCollectingVisitor().collectAndGetAstText(document));

        document 
            = parser.parse("This is a tes\n" + "\n" +"Note:* [Error.INVALID_PARAMETER] error is generated.");
        System.out.println("====Bad case: LinkRef is not recognized when the first sentence is too short====");
        System.out.println(new AstCollectingVisitor().collectAndGetAstText(document));

        document 
            = parser.parse("This is a tes\n" + "\n" +"Note:* [INVALID_PARAMETER] error is generated.");
        System.out.println("====Good case: LinkRef is recognized after removing the 'Error.'====");
        System.out.println(new AstCollectingVisitor().collectAndGetAstText(document));
    }
}

Resulting Output:

====Good case if there is no '*'====
Document[0, 46]
  Paragraph[0, 46]
    Text[0, 7] chars:[0, 7, "Note:a "]
    LinkRef[7, 26] referenceOpen:[7, 8, "["] reference:[8, 25, "INVALID_PARAMETER"] referenceClose:[25, 26, "]"]
      Text[8, 25] chars:[8, 25, "INVAL … METER"]
    Text[26, 46] chars:[26, 46, " erro … ated."]

====Bad case: LinkRef is not recognized if there is a '*' in front====
Document[0, 46]
  Paragraph[0, 46]
    Text[0, 46] chars:[0, 46, "Note* … ated."]

====Bad case: LinkRef is not recognized when using bold text which has a pair of '*'s====
Document[0, 47]
  Paragraph[0, 47]
    Emphasis[0, 6] textOpen:[0, 1, "*"] text:[1, 5, "Note"] textClose:[5, 6, "*"]
      Text[1, 5] chars:[1, 5, "Note"]
    Text[6, 47] chars:[6, 47, ": [IN … ated."]

====Good case: LinkRef is recognized without underscore in the enum value====
Document[0, 36]
  Paragraph[0, 36]
    Text[0, 7] chars:[0, 7, "Note:* "]
    LinkRef[7, 16] referenceOpen:[7, 8, "["] reference:[8, 15, "INVALID"] referenceClose:[15, 16, "]"]
      Text[8, 15] chars:[8, 15, "INVALID"]
    Text[16, 36] chars:[16, 36, " erro … ated."]

====Good case: LinkRef is recognized when there is sentence followed with an empty line====
Document[0, 68]
  Paragraph[0, 15] isTrailingBlankLine
    Text[0, 14] chars:[0, 14, "This  …  test"]
  Paragraph[16, 68]
    Text[16, 23] chars:[16, 23, "Note:* "]
    LinkRef[23, 48] referenceOpen:[23, 24, "["] reference:[24, 47, "Error.INVALID_PARAMETER"] referenceClose:[47, 48, "]"]
      Text[24, 47] chars:[24, 47, "Error … METER"]
    Text[48, 68] chars:[48, 68, " erro … ated."]

====Bad case: LinkRef is not recognized when the first sentence is too short====
Document[0, 67]
  Paragraph[0, 14] isTrailingBlankLine
    Text[0, 13] chars:[0, 13, "This  … a tes"]
  Paragraph[15, 67]
    Text[15, 67] chars:[15, 67, "Note: … ated."]

====Good case: LinkRef is recognized after removing the 'Error.'====
Document[0, 61]
  Paragraph[0, 14] isTrailingBlankLine
    Text[0, 13] chars:[0, 13, "This  … a tes"]
  Paragraph[15, 61]
    Text[15, 22] chars:[15, 22, "Note:* "]
    LinkRef[22, 41] referenceOpen:[22, 23, "["] reference:[23, 40, "INVALID_PARAMETER"] referenceClose:[40, 41, "]"]
      Text[23, 40] chars:[23, 40, "INVAL … METER"]
    Text[41, 61] chars:[41, 61, " erro … ated."]

Additional context
The bug was initially reported in the heremaps/gluecodium project: heremaps/gluecodium#1542. Debugging showed that the issue is actually in the flexmark library.

limingchina changed the title on Jul 31, 2023:
  from: LinkRef can't be recognized when the value contains an underscore and there is a "*" in front
  to: LinkRef can't be recognized when the reference name contains an underscore and there is a "*" in front

limingchina (Author) commented

This diff seems to work. However, I suspect it is not the correct fix. The function isStraddling looks for a delimiter inside the brackets, but it's unclear to me why LinkRef detection is skipped later when that delimiter does not match. It's quite possible for the link reference text to contain underscores.

diff --git a/flexmark/src/main/java/com/vladsch/flexmark/parser/core/delimiter/Bracket.java b/flexmark/src/main/java/com/vladsch/flexmark/parser/core/delimiter/Bracket.java
index 89e5050c4..e172d8cb6 100644
--- a/flexmark/src/main/java/com/vladsch/flexmark/parser/core/delimiter/Bracket.java
+++ b/flexmark/src/main/java/com/vladsch/flexmark/parser/core/delimiter/Bracket.java
@@ -92,7 +92,15 @@ public class Bracket {
         // first see if we have any closers in our span
         int startOffset = nodeChars.getStartOffset();
         int endOffset = nodeChars.getEndOffset();
-        Delimiter inner = previousDelimiter == null ? null : previousDelimiter.getNext();
+
+        Delimiter inner = null;
+        if (previousDelimiter != null) {
+            inner = previousDelimiter.getNext();
+            if (inner != null && inner.getDelimiterChar() != previousDelimiter.getDelimiterChar()) {
+                // If the delimiter chars are not matching then we are not straddling
+                inner = null;
+            }
+        }
         while (inner != null) {
             int innerOffset = inner.getEndIndex();
             if (innerOffset >= endOffset) break;
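
To make the effect of the diff above easier to follow, here is a minimal, self-contained toy model of the straddling check. It is a sketch under assumptions, not flexmark's actual code: `StraddleSketch`, `Delim`, and both method names are hypothetical, and the loop body beyond the two lines visible in the diff (in particular the "unmatched delimiter inside the span" test) is assumed rather than taken from the source. The offsets are hand-computed for the second reproducer input, "Note*: [INVALID_PARAMETER] error is generated.".

```java
// Toy model only: Delim and both isStraddling* methods are hypothetical
// stand-ins, not flexmark's real Delimiter/Bracket classes.
final class StraddleSketch {

    /** Minimal stand-in for one entry in the inline parser's delimiter list. */
    static final class Delim {
        final char ch;          // delimiter character, e.g. '*' or '_'
        final int endIndex;     // index just past the delimiter in the input
        final boolean matched;  // whether the delimiter was paired up
        final Delim next;       // next delimiter in the list, or null

        Delim(char ch, int endIndex, boolean matched, Delim next) {
            this.ch = ch;
            this.endIndex = endIndex;
            this.matched = matched;
            this.next = next;
        }
    }

    /** Pre-patch selection: start scanning at whatever follows `previous`. */
    static boolean isStraddlingOriginal(Delim previous, int startOffset, int endOffset) {
        Delim inner = previous == null ? null : previous.next;
        return scan(inner, startOffset, endOffset);
    }

    /** Patched selection from the diff: ignore `inner` if its char differs. */
    static boolean isStraddlingPatched(Delim previous, int startOffset, int endOffset) {
        Delim inner = null;
        if (previous != null) {
            inner = previous.next;
            if (inner != null && inner.ch != previous.ch) {
                inner = null; // delimiter chars differ: treat as not straddling
            }
        }
        return scan(inner, startOffset, endOffset);
    }

    // ASSUMPTION: only the first two lines of this loop appear in the diff;
    // the "unmatched delimiter inside the span" test is a guess at the rest.
    private static boolean scan(Delim inner, int startOffset, int endOffset) {
        while (inner != null) {
            int innerOffset = inner.endIndex;
            if (innerOffset >= endOffset) break;
            if (innerOffset >= startOffset && !inner.matched) return true;
            inner = inner.next;
        }
        return false;
    }

    public static void main(String[] args) {
        // "Note*: [INVALID_PARAMETER] error is generated."
        // '*' is at index 4, '_' at index 15, and "[...]" spans [7, 26).
        Delim underscore = new Delim('_', 16, false, null);
        Delim star = new Delim('*', 5, false, underscore);

        System.out.println(isStraddlingOriginal(star, 7, 26)); // true
        System.out.println(isStraddlingPatched(star, 7, 26));  // false
        System.out.println(isStraddlingOriginal(null, 7, 26)); // false
    }
}
```

In this model the unmatched `_` inside the brackets is only inspected when some delimiter precedes the bracket, which matches the observation that the LinkRef is recognized when there is no `*` in front. The patched variant avoids the false positive by discarding the `_` because it differs from `*`; whether that is the right condition for the real parser is exactly the open question raised in the comment.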
