
[SPARK-33513][BUILD] Upgrade to Scala 2.13.4 to improve exhaustivity #30455

Closed
wants to merge 6 commits into from

Conversation

dongjoon-hyun
Member

@dongjoon-hyun dongjoon-hyun commented Nov 21, 2020

What changes were proposed in this pull request?

This PR aims to do the following.

  1. Upgrade from Scala 2.13.3 to 2.13.4 for Apache Spark 3.1
  2. Fix exhaustivity issues in both Scala 2.12/2.13 (Scala 2.13.4 requires this for compilation.)
  3. Enforce the improved exhaustive check by using the existing Scala 2.13 GitHub Action compilation job.
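To illustrate the kind of check Scala 2.13.4 strengthens (a sketch with hypothetical types, not code from this PR): a match over a sealed hierarchy that misses a case is now flagged as non-exhaustive in more situations than under 2.13.3.

```scala
// Hypothetical sealed hierarchy, loosely modeled on Spark's join sides.
sealed trait JoinSide
case object LeftSide extends JoinSide
case object RightSide extends JoinSide
case object FullSide extends JoinSide

def describe(side: JoinSide): String = side match {
  case LeftSide  => "left"
  case RightSide => "right"
  // Omitting this catch-all would make the match non-exhaustive;
  // Scala 2.13.4's improved checker reports such gaps more reliably.
  case _         => "other"
}
```

If compiler warnings are escalated to errors in a CI compilation job, a missing case like this fails the build, which is how a stricter check can be enforced.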

Why are the changes needed?

Scala 2.13.4 is a maintenance release for the 2.13 line that improves JDK 15 support.

It also improves the exhaustivity check.

Does this PR introduce any user-facing change?

Yes. Although it is only a maintenance version bump, it is still a Scala version change.

How was this patch tested?

Pass the CIs and do manual testing.

  • Scala 2.12 CI jobs (GitHub Action / Jenkins UT / Jenkins K8s IT) to check the validity of the code changes.
  • Scala 2.13 compilation job to check the compilation.

@github-actions github-actions bot added the BUILD label Nov 21, 2020
@dongjoon-hyun dongjoon-hyun changed the title [WIP][BUILD] Upgrade to Scala 2.13.4 [SPARK-33513][BUILD] Upgrade to Scala 2.13.4 Nov 21, 2020
@dongjoon-hyun dongjoon-hyun changed the title [SPARK-33513][BUILD] Upgrade to Scala 2.13.4 [SPARK-33513][BUILD] Upgrade to Scala 2.13.4 to improve exhaustivity Nov 23, 2020
@github-actions github-actions bot added the MESOS label Nov 23, 2020
@dongjoon-hyun dongjoon-hyun marked this pull request as ready for review November 23, 2020 20:52
@dongjoon-hyun
Member Author

This PR is finally ready for review.
Could you review this PR, @srowen , @HyukjinKwon , @maropu , @LuciferYang ?


@SparkQA

SparkQA commented Nov 23, 2020

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/36174/

@SparkQA

SparkQA commented Nov 23, 2020

Test build #131573 has finished for PR 30455 at commit 7214398.

  • This patch fails SparkR unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Member

@maropu maropu left a comment


I left minor comments (& question) and the other parts look fine.

case (a, b) => a != null && a.equals(b)
case (a: Double, b: Double) if a.isNaN && b.isNaN => true
case (a: Float, b: Float) if a.isNaN && b.isNaN => true
case (a, b) => a != null && a == b
Member

(Just a question) Are the changes above related to this PR?

Member Author

Yes. It's a compilation error. We cannot use .equals for Any type.

Member

Oh, I see.

@@ -362,7 +362,7 @@ case class Join(
left.constraints
case RightOuter =>
right.constraints
case FullOuter =>
case _ =>
Member

@maropu maropu Nov 23, 2020


nit: for readability, how about leaving a comment like this?

case _ => // FullOuter

Member Author

Ya. I thought like that, but actually, there are many missing patterns.

Member Author

It was difficult to track one-by-one. Logically, this is the rest of the previous patterns. So, I decided to skip that comment.

@@ -120,7 +120,7 @@ class GenericArrayData(val array: Array[Any]) extends ArrayData {
if (!o2.isInstanceOf[Double] || ! java.lang.Double.isNaN(o2.asInstanceOf[Double])) {
return false
}
case _ => if (!o1.equals(o2)) {
case _ => if (o1.getClass != o2.getClass || o1 != o2) {
Member

ditto: Is this change above related to this PR?

Member Author

Yes. We cannot use equals and we cannot use != here due to the following.

scala> Float.NaN == Float.NaN
val res2: Boolean = false

scala> Float.NaN.equals(Float.NaN)
val res3: Boolean = true
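The asymmetry between primitive `==` and boxed `equals` can be checked directly. A small sketch (not part of the patch):

```scala
// IEEE 754 semantics: a primitive NaN is not equal to itself.
assert(Float.NaN != Float.NaN)
// java.lang.Float.equals treats NaN as equal to NaN.
assert(Float.NaN.equals(Float.NaN))
// Once boxed to Any, == delegates to equals, so NaN compares equal to itself.
val x: Any = Float.NaN
val y: Any = Float.NaN
assert(x == y)
// A boxed Float NaN and a boxed Double NaN also differ in runtime class,
// which is why the patched code compares getClass first.
assert((Float.NaN: Any).getClass != (Double.NaN: Any).getClass)
```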

@@ -293,7 +293,7 @@ private[streaming] object FileBasedWriteAheadLog {
val startTime = startTimeStr.toLong
val stopTime = stopTimeStr.toLong
Some(LogInfo(startTime, stopTime, file.toString))
case None =>
case None | Some(_) =>
Member

Can't we write it as case _ => here?

Member Author

Yes, we can write it like that~

@@ -757,15 +757,15 @@ private[spark] object JsonProtocol {

def taskResourceRequestMapFromJson(json: JValue): Map[String, TaskResourceRequest] = {
val jsonFields = json.asInstanceOf[JObject].obj
jsonFields.map { case JField(k, v) =>
jsonFields.collect { case JField(k, v) =>
Member

Ah, I see. So this map call fails to compile in v2.13.4?

Member Author

Yes.
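To illustrate the fix pattern (with a hypothetical type, not json4s's JField): map takes a total function, so a partial case pattern can now trip the stricter non-exhaustive-match warning, while collect expects a PartialFunction and compiles cleanly.

```scala
// Hypothetical stand-in for a JSON field hierarchy.
sealed trait Token
final case class Num(n: Int) extends Token
final case class Word(s: String) extends Token

val tokens: List[Token] = List(Num(1), Word("a"), Num(2))

// tokens.map { case Num(n) => n }  // 2.13.4: non-exhaustive match warning
// collect takes a PartialFunction, so a partial pattern is intended here:
val nums: List[Int] = tokens.collect { case Num(n) => n }
```

Switching from map to collect keeps the behavior for matching elements while making the partiality of the pattern explicit to the compiler.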

@dongjoon-hyun
Member Author

The new pattern match check is strict, and changes on the master branch can easily break it. I'll merge this first with @maropu's approval and do the follow-up.

@dongjoon-hyun dongjoon-hyun deleted the SCALA_3.13 branch November 24, 2020 00:47
@dongjoon-hyun
Member Author

Thank you always, @maropu !

Member

@srowen srowen left a comment


Late LGTM

Member

@HyukjinKwon HyukjinKwon left a comment


LGTM

@dongjoon-hyun
Member Author

Thank you, @srowen and @HyukjinKwon .

@SparkQA

SparkQA commented Nov 24, 2020

Test build #131590 has finished for PR 30455 at commit 7214398.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@@ -298,7 +299,10 @@ private[ml] object RFormulaParser extends RegexParsers {
private val expr = (sum | term)

private val formula: Parser[ParsedRFormula] =
(label ~ "~" ~ expr) ^^ { case r ~ "~" ~ t => ParsedRFormula(r, t.asTerms.terms) }
(label ~ "~" ~ expr) ^^ {
case r ~ "~" ~ t => ParsedRFormula(r, t.asTerms.terms)

@retronym retronym Dec 3, 2020


For future reference, a slightly cleaner, warning-free way to express this is:

((label <~ "~") ~ expr ) ^^ {
  case r ~ t => ParsedRFormula(r, t.asTerms.terms)
}

It also avoids duplicating the "~" token in the parser and the action.

Member Author

Feel free to make a PR, @retronym . :)

MaxGekk pushed a commit that referenced this pull request Feb 6, 2022
### What changes were proposed in this pull request?
Migrate the following errors in QueryParsingErrors to use error classes:
1. joinCriteriaUnimplementedError => throw IllegalStateException instead, since it should never happen and is not visible to users; introduced by improving exhaustivity in [PR](#30455)
2. naturalCrossJoinUnsupportedError => UNSUPPORTED_FEATURE

### Why are the changes needed?
Porting join parsing errors to new error framework.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
UT added.

Closes #35405 from ivoson/SPARK-38105.

Authored-by: Tengfei Huang <tengfei.h@gmail.com>
Signed-off-by: Max Gekk <max.gekk@gmail.com>