New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Check exhaustivity of pattern matches with "if" guards and custom extractors (by default) and of unsealed types (by opt-in). #9140
Conversation
1256b1f
to
62e6cdc
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
I assume we'll explain all pattern matching changes in the 2.13.4 release notes. Nevertheless, it would be good to update the PR description here to summarize all changes that this brings.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
Thanks for picking this up, @dwijnand! |
This comment has been minimized.
This comment has been minimized.
I'm delighted, and impressed, to see the matcher get so much smarter 🧠💪 But I have misgivings about this PR in its current form. I suggest some further thought/discussion about what the defaults should be and what mechanisms should be provided for altering those defaults. In particular, I was surprised by the choice to make the optimistic rather than the pessimistic assumption about guards. Perhaps I just need to understand the reasoning behind that choice. But I had expected that exhaustivity should never be assumed if cannot actually be proven. And I think we want to move in the direction of exhaustive matches being considered the norm and requiring the user to be explicit about it when a match isn't exhaustive, for greater safety overall. Also (orthogonally), the new compiler flags aren't a good design IMO, because they're global. I think users will want to make this decision per-pattern-match, not per-project. How about an annotation? Might And I don't understand why one of the new flags an |
That would absolutely be my preference too. But this is far, far from the truth (sadly!!). My experience maintaining MiMa leads me now to be more cautious when dealing with false positives (which are, regrettably, easy to come by in MiMa 😞), preferring an approach where a user opts into stricter-but-possibly-noisy checks (e.g. That said, where on the strict/noisy <-> quiet/risky spectrum we should pick for the various scenarios we may want to pick better choices. I've been thinking about it and it's all about cases where the compiler warns but the user understands better the situation, and knows the warning is just wrong. But I see a split in those cases: (1) cases where the compiler doesn't know if the cases are exhaustive or not e.g.:
Here the knowledgeable user could be really annoyed by the noise and want everything to go back to not warning. (2) cases where the compiler does know the cases aren't exhaustive (even if they currently are for the user), e.g.:
The latter I think we should consider being stricter by default, in priority/risk order (start with the less false positive-y):
Somethings amiss: the flags are for more strictness, which I seems right to me: you dial up the strictness across the board (forSome board) and suppress on case-by-case bases (or, actually, across the board 😄).
Predates me and I didn't give it much thought. Happy to make |
I disagree. We write |
I strongly agree with Guillaume about complementary guards and custom extractors. The examples you give are just bad code. The only case I'm on the fence about whether to warn by default is unsealed types. Re: the globalness of the flags, I'm suggesting the flags should not exist at all — they should be the defaults, and if someone doesn't like that default, they can I'm curious what other repo watchers think. |
3c3d2c8
to
1df4b16
Compare
@dwijnand I'm sort of curious if your thinking changed as a result of Guillaume and my comments, or if you just decided to go with the flow. (Perhaps idle curiosity, or, idk, perhaps understanding where we arrived at on the design thinking here will help us improve the PR description.) I suggest we wait to merge (and to finalize the PR description) until @lrytz has weighed in, since he had approved the previous version. |
small thought: |
My heart/desire was (and still is) that exhaustivity were strict by default but I was on the fence. Yours and Guillaume's certainty was enough for me to go that far, so we can assess if we like the change. Thinking about it more today I suspect that where MiMa's were actually false positives (due to lack of scala pickle metadata), here the false positives due to complementary guards/extractors I believe are easier to grasp, and the lack of clear counter-examples are unfortunate but they aren't indicators of a false positive.
Yeah, I think it would look less silly if that just said "match may not be exhaustive". |
My thoughts after looking at the impact this has on tests, and after playing around for a while. I'm happy to be asked for an I'm also happy to accept that the compiler bails on guards, buddy is a compiler after all, not a theorem prover. It's my code, and it's easy to make the match exhaustive by leaving the last guard away. As Guillaume says, it's like However, I have an issue with spurious warnings for extractors. I think a main difference here is that extractors are typically not my code, or at least defined somewhere else than the match I'm writing. I fear that introducing spurious warnings will diminish the value of extractors for library designers. For example, the following warns after this PR scala> def m(s: Stream[Int]) = s match { case Stream.Empty => 0; case x #:: _ => x }
<console>:11: warning: match may not be exhaustive.
It would fail on the following input: Cons()
def m(s: Stream[Int]) = s match { case Stream.Empty => 0; case x #:: _ => x }
^ As a user, I'm not doing anything wrong here! Extractors enable a beautiful integration of libraries into the language. Pattern matching is a core language feature and the code looks really natural here. But the spurious warning really breaks the experience. It's not the same situation as with guards, where I can just leave the last one away. I need to call the extractor for it to do its job. |
Yeah we have that warning on matching on LazyList/Stream, but to me it indicates that LazyList is just not a good fit for pattern-matching. |
Wouldn't the fix be the same as with guards? s match { case x #:: _ => x; case _ => 0; } |
The Stream is really just an example, I believe that the spurious warnings will affect library designers in ways that we don't want or don't expect. Extractors are a really powerful feature, they allow hooking into the syntax of pattern matching which is core to the language. I think the warnings are problematic. |
If we go back, then I should remember to keep them strict behind Btw, just realised this contrast: Seth:
Lukas:
As implemented unsealed types require |
For example, let's say for some reason I want my library to be designed like this: sealed trait T
final class A(val x: Int) extends T
object A {
def unapply(a: A): Option[Int] = Option(a).map(_.x)
}
final class B(val x: Int, val y: Int) extends T
object B {
def unapply(b: B): Option[(Int, Int)] = Option(b).map(b => (b.x, b.y))
} Then for users scala> def f(t: T) = t match { case A(x) => x; case B(x, y) => x + y }
<console>:14: warning: match may not be exhaustive.
It would fail on the following inputs: A(), B()
def f(t: T) = t match { case A(x) => x; case B(x, y) => x + y }
^ |
In that example you can make your extractors return |
The pattern matcher converts code like: Box(Option("hello")) match { case Box(Some(s)) => case Box(None) => } into two invocations of Box.unapply, because it can't trust that Box.unapply is idempotent. For case classes (like Some) it can, and moreso it can just access the underlying values directly ("x1.value"). The exhaustivity/unreachability analysis builds on this (e.g "pointsToBound.exists(sym => t.exists(_.symbol == sym)"). With the previous commits making the exhaustivity/unreachability checker consider custom extractors this would result in multiple symbols for each extraction (instead of all being "x1.value", for example) which results in multiple SAT variables which results, basically, in warnings that the value returned from the first extraction could None and the second value could be Some - which is only true for a non-idempotent Box.unapply. The intent of the previous commits (and the whole PR) is to make the exhaustivity checker provide warnings that were previous missing, but warning for fear of non-idempotent custom extractors would produce more noise than signal. Now, within a minor release like 2.13.4 we can't go and change the code generation and thus the code semantics (Dotty can and apparently does), but we can do the analysis with the assumption of idempotency. That's what this does: in MatchApproximator's TreeMakersToProps use a cache for the binders so when we see two extractions that are equivalent we reuse the same binder, so it translates into the same SAT variable.
In 2.13.4, the pattern match exhaustivity analysis has become much stricter. It remains active in the presence of non-sealed classes, guards, and custom extractors. This change causes a lot of new warnings in our codebase. See scala/scala#9140 We address most warnings with `@unchecked`, but for some we throw more specialized exceptions instead.
For the record: When looking at this PR previously I missed that it turned on exhaustiveness checking for unsealed types by default (Dotty doesn't). I think this is a bigger and more controversial change than anything else in this PR (besides the new warnings, this could also have a performance impact since many more expressions will be checked by default). We discussed this at the Dotty meeting today (Dale was present) and Martin was pretty strongly against this being the default. We also touched on the fact that there's no way in source to determine which match expressions are checked or not which is unfortunate, and agreed this topic in general warranted further discussion. |
Flipping the default in #9299. |
In 2.13.4, the pattern match exhaustivity analysis has become stricter. It remains active in the presence of guards and custom extractors (including `unapplySeq`). This change causes some new warnings in our codebase, which this commit addresses. See scala/scala#9140, which was somewhat toned down by scala/scala#9299.
In 2.13.4, the pattern match exhaustivity analysis has become stricter. It remains active in the presence of guards and custom extractors (including `unapplySeq`). This change causes some new warnings in our codebase, which this commit addresses. See scala/scala#9140, which was somewhat toned down by scala/scala#9299.
### What changes were proposed in this pull request? This PR aims the followings. 1. Upgrade from Scala 2.13.3 to 2.13.4 for Apache Spark 3.1 2. Fix exhaustivity issues in both Scala 2.12/2.13 (Scala 2.13.4 requires this for compilation.) 3. Enforce the improved exhaustive check by using the existing Scala 2.13 GitHub Action compilation job. ### Why are the changes needed? Scala 2.13.4 is a maintenance release for 2.13 line and improves JDK 15 support. - https://github.com/scala/scala/releases/tag/v2.13.4 Also, it improves exhaustivity check. - scala/scala#9140 (Check exhaustivity of pattern matches with "if" guards and custom extractors) - scala/scala#9147 (Check all bindings exhaustively, e.g. tuples components) ### Does this PR introduce _any_ user-facing change? Yep. Although it's a maintenance version change, it's a Scala version change. ### How was this patch tested? Pass the CIs and do the manual testing. - Scala 2.12 CI jobs(GitHub Action/Jenkins UT/Jenkins K8s IT) to check the validity of code change. - Scala 2.13 Compilation job to check the compilation Closes #30455 from dongjoon-hyun/SCALA_3.13. Authored-by: Dongjoon Hyun <dongjoon@apache.org> Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
@@ -15,7 +15,7 @@ package scala | |||
/** An annotation to designate that the annotated entity | |||
* should not be considered for additional compiler checks. | |||
* Specific applications include annotating the subject of | |||
* a match expression to suppress exhaustiveness warnings, and | |||
* a match expression to suppress exhaustiveness and reachability warnings, and |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Old PR, but this doesn't seem to be the case in Scala 2? You also mention it in #9140 (comment).
The accuracy of exhaustivity checking for pattern matches is improved. These changes intend to bring to light pattern matching code that might throw runtime
MatchError
s for unhandled cases, without any indication at compile time.Most importantly the following no longer disable exhaustivity checking:
-Xlint:strict-unsealed-patmat
As before, it remains possible to disable checking of both exhaustivity and reachability by annotating an expression with
@unchecked
, such as(foo: @unchecked) match { ... }
.But you may also use
-Xnon-strict-patmat-analysis
to disable the "strict" pattern matching for guards (1) and custom extractors (2).Guards
Guards (
case .. if .. =>
) no longer disable the exhaustivity checker. Rather, the checker checks for exhaustivity using the "pessimistic" assumption that the guard may befalse
.Thus, given the following example:
The change decided is that, even if the user may know or be able to determine that one of
g1
,g2
andg3
will always be true (at least one), it's safer for the exhaustivity checker to not back off even if it means it will produce a warning here, requiring the user to suppress the warning by either:if g3
; orcase x => ...
orcase _ => ...
) with some body (perhaps throwing or logging)Custom Extractors
Refutable custom extractors no longer disable exhaustivity checking. (Irrefutable ones already didn't disable it). It is “pessimistically” assumed that the extractor might not match.
An “irrefutable” custom extractor is one that returns
Some
rather thanOption
, such as:So if you have custom extractors that are also irrefutable (that is they can't "not match") then you can resolve some exhaustivity warnings by making it return
Some[..]
rather thanOption[..]
. As an example:Unsealed types
Types that aren't sealed are also checked by the exhaustivity checker if the
-Xlint:strict-unsealed-patmat
option (or all of-Xlint
) is enabled. This makes pattern matching safer, as the compiler will now give a warning when it can't see that a pattern covers all the cases, rather than quietly disabling the check.Known Issues/Limitations
The counter-examples provided may not always be clear, particularly around
unapplySeq
. For example it might say something likeChain()
as an attempt to say that some values of typeChain
aren't covered by theChain.unapplySeq
usage.scala/bug#12186 In some patterns, like type patterns, aliases of
case object
s (and possiblecase class
es too) won't dealias and so won't register as exhausting the match, such ascase _: Nil.type =>
(withNil
being the val aliasNil
inPredef
rather than thecase object
inscala.collection.immutable
).Because these match cases are now considered the match analysis might now fail due to excessive recursion depth, for example:
In addition to attempting to bump the threshold, you may find it easier to resolve this by breaking the match up into smaller parts (e.g. #9277).
We hope to resolve these soon... (🤞)
Migration
The best-case scenario is that the matcher identifies holes in the match that were previously ignored, so these can be addressed by the programmer.
Where it is expected that some values will throw, you can get rid of the warning by explicitly defining a default case (
case x => ...
orcase _ => ...
), perhaps throwing a more specific exception with some more domain-specific information or recovering in some way.You might also choose the (sometimes) terser route of annotating some non-total pattern match by adding
@unchecked
as mentioned above.Provenance
This PR was originally:
Fixes scala/bug#5365
Fixes scala/bug#8178
Fixes scala/bug#9232