Commit
Remove The Unicode Escape \u000E In Scaladoc Code Comment Parsing
The regular expressions for `CodeBlockStartRegex` and `CodeBlockEndRegex` both contain two instances of the Unicode escape `\u000E`. This is the "Shift Out" character. I expect that it was inserted as part of a copy/paste error.

Unicode escapes in triple quote strings are deprecated as of 2.13.2 (scala/scala#8282). Further, this character actually makes the regular expression invalid if it is interpreted. This isn't a big deal right now, as it appears to be ignored on Scala 2.12.x, but on Scala 2.13.x it causes the regular expressions to fail to match Scaladoc that uses the `<pre>` tag.

For example,

```scala
import scala.util.matching._

object Main {
  val doc0: String =
    """
      | /** A foo is a bar, for example.
      |  *
      |  * {{{
      |  * val foo: String = "bar"
      |  * }}}
      |  *
      |  * <pre>
      |  * val bar: String = "baz
      |  * </pre>
      |  */""".stripMargin

  // Current pattern, with \u000E on both sides of the <pre> alternative.
  val CodeBlockStartRegex =
    new Regex("""(.*?)((?:\{\{\{)|(?:\u000E<pre(?: [^>]*)?>\u000E))(.*)""")

  // Same pattern with the \u000E escapes removed.
  val CodeBlockStartRegex0 =
    new Regex("""(.*?)((?:\{\{\{)|(?:<pre(?: [^>]*)?>))(.*)""")

  def matchInfo(regex: Regex, value: CharSequence): Unit = {
    println(s"\nTarget: ${value}")
    println(s"Regex: ${regex}")
    val matches: List[Regex.Match] = regex.findAllMatchIn(value).toList
    println(s"Match Count: ${matches.size}")
    println(s"Matches: ${matches}")
  }

  def main(args: Array[String]): Unit = {
    matchInfo(CodeBlockStartRegex, doc0)
    matchInfo(CodeBlockStartRegex0, doc0)
  }
}
```

When run with Scala 2.13.4, this yields the following result:

```shell
warning: 1 deprecation (since 2.13.2); re-run with -deprecation for details
1 warning
Picked up JAVA_TOOL_OPTIONS: -Dsbt.supershell=false

Target:
 /** A foo is a bar, for example.
  *
  * {{{
  * val foo: String = "bar"
  * }}}
  *
  * <pre>
  * val bar: String = "baz
  * </pre>
  */
Regex: (.*?)((?:\{\{\{)|(?:�<pre(?: [^>]*)?>�))(.*)
Match Count: 1
Matches: List( * {{{)

Target:
 /** A foo is a bar, for example.
  *
  * {{{
  * val foo: String = "bar"
  * }}}
  *
  * <pre>
  * val bar: String = "baz
  * </pre>
  */
Regex: (.*?)((?:\{\{\{)|(?:<pre(?: [^>]*)?>))(.*)
Match Count: 2
Matches: List( * {{{, * <pre>)
```

Note how the first regular expression found only one match, the `{{{` based one, while the second found both.

Finally, a small test was added to ensure that the change does not break comment parsing.
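As an illustration only, a check of the corrected start pattern might look like the sketch below. It is not the test added in this commit: the object name `CodeBlockStartCheck`, the helper `startDelimiters`, and the sample comment are all hypothetical, and only the regular expression is taken from the example above.

```scala
import scala.util.matching.Regex

// Hypothetical sketch; not the test added in the commit.
object CodeBlockStartCheck {
  // The start pattern with the \u000E escapes removed, as in the example above.
  val CodeBlockStartRegex: Regex =
    new Regex("""(.*?)((?:\{\{\{)|(?:<pre(?: [^>]*)?>))(.*)""")

  // Returns the opening delimiter (capture group 2) of every code block start in `doc`.
  def startDelimiters(doc: String): List[String] =
    CodeBlockStartRegex.findAllMatchIn(doc).map(_.group(2)).toList

  def main(args: Array[String]): Unit = {
    // Assumed sample input: one {{{ block and one <pre> block with an attribute.
    val doc =
      """/** Example.
        | *
        | * {{{
        | * val foo = 1
        | * }}}
        | *
        | * <pre class="x">
        | * val bar = 2
        | * </pre>
        | */""".stripMargin

    val found = startDelimiters(doc)
    assert(found == List("{{{", """<pre class="x">"""), s"unexpected delimiters: $found")
    println(found)
  }
}
```

Asserting on the captured delimiter rather than on the match count makes it clear which of the two alternatives matched, and including a `<pre>` tag with an attribute exercises the optional `(?: [^>]*)?` part of the pattern.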