Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Convert the RegexBuilder overloads to use variadic generics #726

Open
wants to merge 12 commits into
base: main
Choose a base branch
from

Conversation

natecook1000
Copy link
Member

@natecook1000 natecook1000 commented Mar 1, 2024

This change switches all the RegexBuilder overload-based "variadics" to use proper variadic generics. As far as our tests show, this is largely compatible with the existing API, though there is one exception, detailed below. That said, I'm a bit worried about changes in output type resolution for edge cases, particularly with capture groups nested into the various structures.


The incompatibility is for composed regexes with more than 10 capture groups, which is the current limit of our overloads in RegexBuilder. Right now, when a regex with a large number of capture groups is included in a builder-syntax closure, its captures are dropped from the output type of the resulting regex. Though this is definitely a bug in the current system, fixing it represents a source break that may or may not be visible to clients.

let regexWithTooManyCaptures = #/(a)(b)(c)(d)(e)(f)(g)(h)(i)(j)(k)(l)(m)/#
let dslWithTooManyCaptures = Regex {
  Capture(OneOrMore(.word))
  ":"
  regexWithTooManyCaptures
  ":"
  TryCapture<(Substring, Int)>(OneOrMore(.word)) { Int($0) }
  #/:(\d+):/#
}

The dslWithTooManyCaptures regex currently has an output type of (Substring, Substring, Int, Substring) because the captures in regexWithTooManyCaptures are left out of the resulting output type. With this change, the output type becomes (Substring, Substring, Substring, Substring, Substring, Substring, Substring, Substring, Substring, Substring, Substring, Substring, Substring, Substring, Substring, Int, Substring). If a client matches and only accesses output.3, the value of the captured value will silently change with this change in API. Trying to access a capture group with a different type (e.g, output.2 is currently Int, but would change to Substring) or using the output in a typed context (e.g passing the tuple as an initializer's parameters) would result in a compilation failure.

(Note that when converted to Regex<AnyRegexOutput>, the full set of captures is available in both the current and new versions of the API.)

This uses parameter packs to replace the existing multitude of buildPartialBlock
implementations in RegexBuilder.
This converts the RegexBuilder quantifiers to use variadic generics
instead of explicit overloads. This transition uses a particular
generic pattern so that a variadic number of output parameters are
captured correctly, without matching a "zero captures" component
as having `()` as its output type. To accomplish this, the generic
parameters for each quantification method are like:

    some RegexComponent<(W, Capture0, repeat each Capture)>

The presence of `Capture0` requires that at least one sub-capture
beyond the whole match is present. In addition to the variadic
versions, unconstrained versions of each initializer are included,
marked `@_disfavoredOverload`. These are only selected when the
component's output type is a single type (e.g. `Regex<Substring>`).
Not totally sure why the `Capture0` trick from the quantification
variadics isn't necessary here, so more source compatibility testing
should be done before landing.
Some trickiness involved in getting the optionality of captures to
line up, but I think I got this right. Will need some more source-
compatibility validation here, as well.
This one was a suspiciously simple transition, but all seems to be
working.
@natecook1000 natecook1000 marked this pull request as ready for review April 23, 2024 21:50
@natecook1000
Copy link
Member Author

@swift-ci Please test

@natecook1000
Copy link
Member Author

@swift-ci Please test

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

1 participant