Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Check exhaustivity of pattern matches with "if" guards and custom extractors (by default) and of unsealed types (by opt-in). #9140

Merged
merged 13 commits into from Oct 23, 2020

Conversation

dwijnand
Copy link
Member

@dwijnand dwijnand commented Jul 26, 2020

The accuracy of exhaustivity checking for pattern matches is improved. These changes intend to bring to light pattern matching code that might throw runtime MatchErrors for unhandled cases, without any indication at compile time.

Most importantly the following no longer disable exhaustivity checking:

  1. guards (fixing Match Exhaustiveness Testing Ignores Guard Statements bug#5365)
  2. refutable custom extractors (irrefutable ones already didn't)
  3. unsealed types (sealed types already didn't) by -Xlint:strict-unsealed-patmat

As before, it remains possible to disable checking of both exhaustivity and reachability by annotating an expression with @unchecked, such as (foo: @unchecked) match { ... }.

But you may also use -Xnon-strict-patmat-analysis to disable the "strict" pattern matching for guards (1) and custom extractors (2).

Guards

Guards (case .. if .. =>) no longer disable the exhaustivity checker. Rather, the checker checks for exhaustivity using the "pessimistic" assumption that the guard may be false.

Thus, given the following example:

x match {
  case p if g1 => ...
  case p if g2 => ...
  case p if g3 => ...
}

The change decided is that, even if the user may know or be able to determine that one of g1, g2 and g3 will always be true (at least one), it's safer for the exhaustivity checker to not back off even if it means it will produce a warning here, requiring the user to suppress the warning by either:

  1. dropping the final if g3; or
  2. adding a default case (case x => ... or case _ => ...) with some body (perhaps throwing or logging)

Custom Extractors

Refutable custom extractors no longer disable exhaustivity checking. (Irrefutable ones already didn't disable it). It is “pessimistically” assumed that the extractor might not match.

An “irrefutable” custom extractor is one that returns Some rather than Option, such as:

object Length {
  def unapply(s: String): Some[Int] = Some(s.length)
}

So if you have custom extractors that are also irrefutable (that is they can't "not match") then you can resolve some exhaustivity warnings by making it return Some[..] rather than Option[..]. As an example:

scala> object Length { def unapply(s: String): Option[Int] = Some(s.length) }
object Length

scala> "abc" match { case Length(n) => println(s"The length was $n.") }
       ^
       warning: match may not be exhaustive.
The length was 3.

scala> object Length { def unapply(s: String): Some[Int] = Some(s.length) }
object Length

scala> "abc" match { case Length(n) => println(s"The length was $n.") }
The length was 3.

Unsealed types

Types that aren't sealed are also checked by the exhaustivity checker if the -Xlint:strict-unsealed-patmat option (or all of -Xlint) is enabled. This makes pattern matching safer, as the compiler will now give a warning when it can't see that a pattern covers all the cases, rather than quietly disabling the check.

Known Issues/Limitations

The counter-examples provided may not always be clear, particularly around unapplySeq. For example it might say something like Chain() as an attempt to say that some values of type Chain aren't covered by the Chain.unapplySeq usage.

scala/bug#12186 In some patterns, like type patterns, aliases of case objects (and possible case classes too) won't dealias and so won't register as exhausting the match, such as case _: Nil.type => (with Nil being the val alias Nil in Predef rather than the case object in scala.collection.immutable).

Because these match cases are now considered the match analysis might now fail due to excessive recursion depth, for example:

[error] /home/jenkins/workspace/scala-2.13.x-validate-main/src/compiler/scala/tools/nsc/typechecker/Typers.scala:5288:32: Exhaustivity analysis reached max recursion depth, not all missing cases are reported.
[error] (Please try with scalac -Ypatmat-exhaust-depth 160 or -Ypatmat-exhaust-depth off.)
[error]             val result = silent(_.makeAccessible(tree1, sym, pre, qual)) match {
[error]                                ^
[error] one error found

In addition to attempting to bump the threshold, you may find it easier to resolve this by breaking the match up into smaller parts (e.g. #9277).

We hope to resolve these soon... (🤞)

Migration

The best-case scenario is that the matcher identifies holes in the match that were previously ignored, so these can be addressed by the programmer.

Where it is expected that some values will throw, you can get rid of the warning by explicitly defining a default case (case x => ... or case _ => ...), perhaps throwing a more specific exception with some more domain-specific information or recovering in some way.

You might also choose the (sometimes) terser route of annotating some non-total pattern match by adding @unchecked as mentioned above.

Provenance

This PR was originally:

Fixes scala/bug#5365
Fixes scala/bug#8178
Fixes scala/bug#9232

@scala-jenkins scala-jenkins added this to the 2.12.13 milestone Jul 26, 2020
@dwijnand dwijnand force-pushed the exhaust branch 3 times, most recently from 1256b1f to 62e6cdc Compare August 2, 2020 20:38
Copy link
Member

@lrytz lrytz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

I assume we'll explain all pattern matching changes in the 2.13.4 release notes. Nevertheless, it would be good to update the PR description here to summarize all changes that this brings.

@lrytz

This comment has been minimized.

@lrytz lrytz added the release-notes worth highlighting in next release notes label Aug 18, 2020
@dwijnand

This comment has been minimized.

@sellout
Copy link
Contributor

sellout commented Aug 18, 2020

Thanks for picking this up, @dwijnand!

@SethTisue

This comment has been minimized.

@SethTisue
Copy link
Member

SethTisue commented Aug 18, 2020

I'm delighted, and impressed, to see the matcher get so much smarter 🧠💪

But I have misgivings about this PR in its current form. I suggest some further thought/discussion about what the defaults should be and what mechanisms should be provided for altering those defaults.

In particular, I was surprised by the choice to make the optimistic rather than the pessimistic assumption about guards. Perhaps I just need to understand the reasoning behind that choice. But I had expected that exhaustivity should never be assumed if cannot actually be proven. And I think we want to move in the direction of exhaustive matches being considered the norm and requiring the user to be explicit about it when a match isn't exhaustive, for greater safety overall.

Also (orthogonally), the new compiler flags aren't a good design IMO, because they're global. I think users will want to make this decision per-pattern-match, not per-project. How about an annotation? Might @nowarn be sufficient? And/or, are you aware of @unchecked's behavior (https://stackoverflow.com/questions/42526314/how-to-suppress-the-match-may-not-be-exhaustive-warning-in-scala)?

And I don't understand why one of the new flags an -Xlint:... and one isn't? (Even if it makes sense, the PR description should briefly justify it.)

@dwijnand
Copy link
Member Author

dwijnand commented Aug 18, 2020

In particular, I was surprised by the choice to make the optimistic rather than the pessimistic assumption about guards. Perhaps I just need to understand the reasoning behind that choice. But I had expected that exhaustivity should never be assumed if cannot actually be proven. And I think we want to move in the direction of exhaustive matches being considered the norm and requiring the user to be explicit about it when a match isn't exhaustive, for greater safety overall.

That would absolutely be my preference too. But this is far, far from the truth (sadly!!).

My experience maintaining MiMa leads me now to be more cautious when dealing with false positives (which are, regrettably, easy to come by in MiMa 😞), preferring an approach where a user opts into stricter-but-possibly-noisy checks (e.g. mimaReportSignatureProblems := true in MiMa) - mind you sbt-tpolecat et cetera kind of unintentionally undermines that, but 🤷‍♂️ ...

That said, where on the strict/noisy <-> quiet/risky spectrum we should pick for the various scenarios we may want to pick better choices. I've been thinking about it and it's all about cases where the compiler warns but the user understands better the situation, and knows the warning is just wrong. But I see a split in those cases:

(1) cases where the compiler doesn't know if the cases are exhaustive or not e.g.:

  • (1a) complementary guards (case Foo if n >= 0 => ... case Foo if n < 0 => ...)
  • (1b) complementary custom extractors (case NonNegative(n) => ... case Negative(n) => ...)

Here the knowledgeable user could be really annoyed by the noise and want everything to go back to not warning.

(2) cases where the compiler does know the cases aren't exhaustive (even if they currently are for the user), e.g.:

  • (2a) partial unappySeq extractors (case List(x) => ... case List(x, _*) => ...) "I know the list is always non-empty" (List might be a poor example because there's some fudging going on with List)
  • (2b) partial unsealed matches (case n: Int => ... case s: String => ...) "I know this Any is either ints or strings"

The latter I think we should consider being stricter by default, in priority/risk order (start with the less false positive-y):

  1. case class unapplySeqs (see my last commit)
  2. custom unapplySeqs
  3. unsealed types

Also (orthogonally), the new compiler flags aren't a good design IMO, because they're global. I think users will want to make this decision per-pattern-match, not per-project. How about an annotation? Might @nowarn be sufficient? And/or, are you aware of @unchecked's behavior (stackoverflow.com/questions/42526314/how-to-suppress-the-match-may-not-be-exhaustive-warning-in-scala)?

Somethings amiss: the flags are for more strictness, which I seems right to me: you dial up the strictness across the board (forSome board) and suppress on case-by-case bases (or, actually, across the board 😄).

And I don't understand why one of the new flags an -Xlint:... and one isn't? (Even if it makes sense, the PR description should briefly justify it.)

Predates me and I didn't give it much thought. Happy to make -Xstrict-patmat-analysis a lint if desired.

@smarter
Copy link
Member

smarter commented Aug 19, 2020

Here the knowledgeable user could be really annoyed by the noise and want everything to go back to not warning.

I disagree. We write val x = if (n >= 0) foo else bar and not val x = if (n >= 0) foo else if (n < 0) bar, and if someone writes the latter then we tell them they're doing it wrong if they complain that the result type is () because there's no else branch. By the same reasoning, the compiler is correct to warn about a missing case even when someone writes complimentary guards or extractors: it's not a big deal to replace the complement by something that always matches, and I'd even argue it's clearer. This is how dotty behaves and I think it'd be great if scalac aligned itself to that.

@SethTisue
Copy link
Member

SethTisue commented Aug 20, 2020

I strongly agree with Guillaume about complementary guards and custom extractors. The examples you give are just bad code.

The only case I'm on the fence about whether to warn by default is unsealed types.

Re: the globalness of the flags, I'm suggesting the flags should not exist at all — they should be the defaults, and if someone doesn't like that default, they can @nowarn their way out of it. They're just warnings, not errors.

I'm curious what other repo watchers think.

@SethTisue
Copy link
Member

SethTisue commented Sep 1, 2020

@dwijnand I'm sort of curious if your thinking changed as a result of Guillaume and my comments, or if you just decided to go with the flow. (Perhaps idle curiosity, or, idk, perhaps understanding where we arrived at on the design thinking here will help us improve the PR description.)

I suggest we wait to merge (and to finalize the PR description) until @lrytz has weighed in, since he had approved the previous version.

@SethTisue
Copy link
Member

SethTisue commented Sep 1, 2020

small thought: It would fail on the following input: _ seems like a common warning text. It feels... gnomic. Is there room for improvement there? (Should it say nothing in this case, since it isn't really adding information?)

@dwijnand
Copy link
Member Author

dwijnand commented Sep 1, 2020

@dwijnand I'm sort of curious if your thinking changed as a result of Guillaume and my comments, or if you just decided to go with the flow. (Perhaps idle curiosity, or, idk, perhaps understanding where we arrived at on the design thinking here will help us improve the PR description.)

My heart/desire was (and still is) that exhaustivity were strict by default but I was on the fence. Yours and Guillaume's certainty was enough for me to go that far, so we can assess if we like the change.

Thinking about it more today I suspect that where MiMa's were actually false positives (due to lack of scala pickle metadata), here the false positives due to complementary guards/extractors I believe are easier to grasp, and the lack of clear counter-examples are unfortunate but they aren't indicators of a false positive.

small thought: It would fail on the following input: _ seems like a common warning text. It feels... gnomic. Is there room for improvement there? (Should it say nothing in this case, since it isn't really adding information?)

t7623.scala:10: warning: match may not be exhaustive.
It would fail on the following input: _
  def f = C("") match { case C(s) => }
           ^

Yeah, I think it would look less silly if that just said "match may not be exhaustive".

@lrytz
Copy link
Member

lrytz commented Sep 7, 2020

My thoughts after looking at the impact this has on tests, and after playing around for a while.

I'm happy to be asked for an @unchecked or @nowarn when some assumption is not represented in the types, like Dale's example "I know this Any is either Int or String".

I'm also happy to accept that the compiler bails on guards, buddy is a compiler after all, not a theorem prover. It's my code, and it's easy to make the match exhaustive by leaving the last guard away. As Guillaume says, it's like if-else.

However, I have an issue with spurious warnings for extractors. I think a main difference here is that extractors are typically not my code, or at least defined somewhere else than the match I'm writing. I fear that introducing spurious warnings will diminish the value of extractors for library designers.

For example, the following warns after this PR

scala> def m(s: Stream[Int]) = s match { case Stream.Empty => 0; case x #:: _ => x }
<console>:11: warning: match may not be exhaustive.
It would fail on the following input: Cons()
       def m(s: Stream[Int]) = s match { case Stream.Empty => 0; case x #:: _ => x }
                               ^

As a user, I'm not doing anything wrong here! Extractors enable a beautiful integration of libraries into the language. Pattern matching is a core language feature and the code looks really natural here. But the spurious warning really breaks the experience. It's not the same situation as with guards, where I can just leave the last one away. I need to call the extractor for it to do its job.

@smarter
Copy link
Member

smarter commented Sep 7, 2020

Yeah we have that warning on matching on LazyList/Stream, but to me it indicates that LazyList is just not a good fit for pattern-matching.

@dwijnand
Copy link
Member Author

dwijnand commented Sep 7, 2020

For example, the following warns after this PR

scala> def m(s: Stream[Int]) = s match { case Stream.Empty => 0; case x #:: _ => x }
<console>:11: warning: match may not be exhaustive.
It would fail on the following input: Cons()
       def m(s: Stream[Int]) = s match { case Stream.Empty => 0; case x #:: _ => x }
                               ^

As a user, I'm not doing anything wrong here! Extractors enable a beautiful integration of libraries into the language. Pattern matching is a core language feature and the code looks really natural here. But the spurious warning really breaks the experience. It's not the same situation as with guards, where I can just leave the last one away. I need to call the extractor for it to do its job.

Wouldn't the fix be the same as with guards?

s match { case x #:: _ => x; case _ => 0; }

@lrytz
Copy link
Member

lrytz commented Sep 7, 2020

The Stream is really just an example, I believe that the spurious warnings will affect library designers in ways that we don't want or don't expect. Extractors are a really powerful feature, they allow hooking into the syntax of pattern matching which is core to the language. I think the warnings are problematic.

@dwijnand
Copy link
Member Author

dwijnand commented Sep 7, 2020

If we go back, then I should remember to keep them strict behind -Xstrict-patmat-analysis (or as a lint).

Btw, just realised this contrast:

Seth:

The only case I'm on the fence about whether to warn by default is unsealed types.

Lukas:

I'm happy to be asked for an @unchecked or @nowarn when some assumption is not represented in the types, like Dale's example "I know this Any is either Int or String".

As implemented unsealed types require -Xlint:strict-unsealed-patmat for the checks.

@lrytz
Copy link
Member

lrytz commented Sep 7, 2020

For example, let's say for some reason I want my library to be designed like this:

sealed trait T
final class A(val x: Int) extends T
object A {
  def unapply(a: A): Option[Int] = Option(a).map(_.x)
}
final class B(val x: Int, val y: Int) extends T
object B {
  def unapply(b: B): Option[(Int, Int)] = Option(b).map(b => (b.x, b.y))
}

Then for users

scala> def f(t: T) = t match { case A(x) => x; case B(x, y) => x + y }
<console>:14: warning: match may not be exhaustive.
It would fail on the following inputs: A(), B()
       def f(t: T) = t match { case A(x) => x; case B(x, y) => x + y }
                     ^

@smarter
Copy link
Member

smarter commented Sep 7, 2020

In that example you can make your extractors return Some[Int] instead of Option[Int] to let the compiler know that they always match and no warnings is needed (Dotty at least handles this correctly).

The pattern matcher converts code like:

    Box(Option("hello")) match {
      case Box(Some(s)) =>
      case Box(None)    =>
    }

into two invocations of Box.unapply, because it can't trust that
Box.unapply is idempotent.  For case classes (like Some) it can, and
moreso it can just access the underlying values directly ("x1.value").
The exhaustivity/unreachability analysis builds on this (e.g
"pointsToBound.exists(sym => t.exists(_.symbol == sym)").

With the previous commits making the exhaustivity/unreachability checker
consider custom extractors this would result in multiple symbols for
each extraction (instead of all being "x1.value", for example) which
results in multiple SAT variables which results, basically, in  warnings
that the value returned from the first extraction could None and the
second value could be Some - which is only true for a non-idempotent
Box.unapply.  The intent of the previous commits (and the whole PR) is
to make the exhaustivity checker provide warnings that were previous
missing, but warning for fear of non-idempotent custom extractors would
produce more noise than signal.

Now, within a minor release like 2.13.4 we can't go and change the code
generation and thus the code semantics (Dotty can and apparently does),
but we can do the analysis with the assumption of idempotency.  That's
what this does: in MatchApproximator's TreeMakersToProps use a cache for
the binders so when we see two extractions that are equivalent we reuse
the same binder, so it translates into the same SAT variable.
@dwijnand dwijnand merged commit 1ac95fd into scala:2.13.x Oct 23, 2020
@dwijnand dwijnand deleted the exhaust branch October 23, 2020 15:12
@SethTisue SethTisue removed the prio:blocker release blocker (used only by core team, only near release time) label Oct 23, 2020
sjrd added a commit to sjrd/scala-js that referenced this pull request Nov 2, 2020
In 2.13.4, the pattern match exhaustivity analysis has become much
stricter. It remains active in the presence of non-sealed classes,
guards, and custom extractors. This change causes a lot of new
warnings in our codebase.

See scala/scala#9140

We address most warnings with `@unchecked`, but for some we throw
more specialized exceptions instead.
@smarter
Copy link
Member

smarter commented Nov 2, 2020

For the record: When looking at this PR previously I missed that it turned on exhaustiveness checking for unsealed types by default (Dotty doesn't). I think this is a bigger and more controversial change than anything else in this PR (besides the new warnings, this could also have a performance impact since many more expressions will be checked by default). We discussed this at the Dotty meeting today (Dale was present) and Martin was pretty strongly against this being the default. We also touched on the fact that there's no way in source to determine which match expressions are checked or not which is unfortunate, and agreed this topic in general warranted further discussion.

@dwijnand dwijnand changed the title Exhaustivity of extractors/guards/unsealed but not try/catch Check exhaustivity of "if" guards and custom extractors (by default) and unsealed types (by opt-in). Nov 3, 2020
@dwijnand
Copy link
Member Author

dwijnand commented Nov 3, 2020

Flipping the default in #9299.

@dwijnand dwijnand changed the title Check exhaustivity of "if" guards and custom extractors (by default) and unsealed types (by opt-in). Check exhaustivity of pattern matches with "if" guards and custom extractors (by default) and of unsealed types (by opt-in). Nov 3, 2020
sjrd added a commit to sjrd/scala-js that referenced this pull request Nov 5, 2020
In 2.13.4, the pattern match exhaustivity analysis has become
stricter. It remains active in the presence of guards and custom
extractors (including `unapplySeq`). This change causes some new
warnings in our codebase, which this commit addresses.

See scala/scala#9140, which was somewhat
toned down by scala/scala#9299.
sjrd added a commit to sjrd/scala-js that referenced this pull request Nov 7, 2020
In 2.13.4, the pattern match exhaustivity analysis has become
stricter. It remains active in the presence of guards and custom
extractors (including `unapplySeq`). This change causes some new
warnings in our codebase, which this commit addresses.

See scala/scala#9140, which was somewhat
toned down by scala/scala#9299.
dongjoon-hyun added a commit to apache/spark that referenced this pull request Nov 24, 2020
### What changes were proposed in this pull request?

This PR aims the followings.
1. Upgrade from Scala 2.13.3 to 2.13.4 for Apache Spark 3.1
2. Fix exhaustivity issues in both Scala 2.12/2.13 (Scala 2.13.4 requires this for compilation.)
3. Enforce the improved exhaustive check by using the existing Scala 2.13 GitHub Action compilation job.

### Why are the changes needed?

Scala 2.13.4 is a maintenance release for 2.13 line and improves JDK 15 support.
- https://github.com/scala/scala/releases/tag/v2.13.4

Also, it improves exhaustivity check.
- scala/scala#9140 (Check exhaustivity of pattern matches with "if" guards and custom extractors)
- scala/scala#9147 (Check all bindings exhaustively, e.g. tuples components)

### Does this PR introduce _any_ user-facing change?

Yep. Although it's a maintenance version change, it's a Scala version change.

### How was this patch tested?

Pass the CIs and do the manual testing.
- Scala 2.12 CI jobs(GitHub Action/Jenkins UT/Jenkins K8s IT) to check the validity of code change.
- Scala 2.13 Compilation job to check the compilation

Closes #30455 from dongjoon-hyun/SCALA_3.13.

Authored-by: Dongjoon Hyun <dongjoon@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
xuwei-k added a commit to folone/poi.scala that referenced this pull request Nov 27, 2020
xuwei-k added a commit to unfiltered/unfiltered that referenced this pull request Nov 29, 2020
@@ -15,7 +15,7 @@ package scala
/** An annotation to designate that the annotated entity
* should not be considered for additional compiler checks.
* Specific applications include annotating the subject of
* a match expression to suppress exhaustiveness warnings, and
* a match expression to suppress exhaustiveness and reachability warnings, and
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Old PR, but this doesn't seem to be the case in Scala 2? You also mention it in #9140 (comment).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
release-notes worth highlighting in next release notes
Projects
None yet
8 participants