[SPARK-35496][BUILD] Upgrade Scala to 2.13.7 #32648
Conversation
```diff
@@ -283,7 +283,7 @@ private[spark] class DiskBlockObjectWriter(
   }

   // For testing
-  private[spark] override def flush(): Unit = {
+  override def flush(): Unit = {
```
```
[ERROR] [Error] /spark/core/src/main/scala/org/apache/spark/storage/DiskBlockObjectWriter.scala:286: weaker access privileges in overriding
def flush(): Unit (defined in class OutputStream)
  override should be public
```
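For context, a minimal standalone sketch (hypothetical `myapp` package and class names, not the actual Spark code) of what Scala 2.13.6+ rejects here, and the shape of the fix in the hunk above:

```scala
package myapp

import java.io.OutputStream

class MyWriter extends OutputStream {
  override def write(b: Int): Unit = ()

  // Rejected by 2.13.6+: OutputStream.flush() is public, so an override
  // may not narrow its visibility.
  //   private[myapp] override def flush(): Unit = ()

  // The shape of the fix: keep the override public, matching the parent.
  override def flush(): Unit = ()
}
```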
```diff
@@ -352,7 +352,7 @@ class SparkContext(config: SparkConf) extends Logging {

   // Thread Local variable that can be used by users to pass information down the stack
   protected[spark] val localProperties = new InheritableThreadLocal[Properties] {
-    override protected def childValue(parent: Properties): Properties = {
+    override def childValue(parent: Properties): Properties = {
```
```
[ERROR] [Error] /spark/core/src/main/scala/org/apache/spark/SparkContext.scala:355: weaker access privileges in overriding
private[package lang] def childValue(x$1: java.util.Properties): java.util.Properties (defined in class ThreadLocal)
  override should at least be private[lang]
```
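A minimal sketch of the same fix outside Spark (the copying body is simplified, not SparkContext's actual logic): `ThreadLocal.childValue` is package-private in `java.lang`, and 2.13.6+ now treats a `protected` Scala override of it as weaker access, so dropping the modifier (making the override public) satisfies the check.

```scala
import java.util.Properties

object LocalProps {
  val localProps: InheritableThreadLocal[Properties] =
    new InheritableThreadLocal[Properties] {
      // `override protected def childValue` is now rejected as weaker access;
      // a public override is accepted.
      override def childValue(parent: Properties): Properties =
        // Hand the child thread a copy of the parent's properties.
        Option(parent).map(_.clone().asInstanceOf[Properties]).orNull
    }
}
```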
```diff
@@ -81,6 +81,7 @@ class Iso8601DateFormatter(
     try {
       formatter
     } catch checkLegacyFormatter(pattern, legacyFormatter.validatePatternString)
+    ()
```
```
[ERROR] [Error] /spark/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateFormatter.scala:83: type mismatch;
 found   : PartialFunction[Throwable,java.time.format.DateTimeFormatter]
 required: Throwable => Unit
[INFO] [Info] : PartialFunction[Throwable,java.time.format.DateTimeFormatter] <: Throwable => Unit?
[INFO] [Info] : false
```
This aligns with Scala 3. I guess the `Unit` here comes from the method result type. I'll ask whether that is desirable; arguably it could insert the parens automatically.
thx ~ @som-snytt
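A self-contained sketch of the same pattern (the hypothetical `recover` handler stands in for DateFormatter's `checkLegacyFormatter`), showing why the explicit `()` is needed:

```scala
import java.time.format.DateTimeFormatter

object TryCatchUnitDemo {
  // A handler *expression* (not literal case clauses): a PartialFunction
  // from Throwable to the try body's result type.
  val recover: PartialFunction[Throwable, DateTimeFormatter] = {
    case _: IllegalArgumentException => DateTimeFormatter.ISO_DATE
  }

  def validate(pattern: String): Unit = {
    try {
      DateTimeFormatter.ofPattern(pattern) // result type: DateTimeFormatter
    } catch recover
    () // Without this, the try is the method's result: 2.13.6 then expects
       // the handler to be a Throwable => Unit and reports a type mismatch.
  }

  def main(args: Array[String]): Unit = validate("yyyy-MM-dd")
}
```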
```diff
@@ -107,6 +107,7 @@ class Iso8601TimestampFormatter(
     try {
       formatter
     } catch checkLegacyFormatter(pattern, legacyFormatter.validatePatternString)
+    ()
```
```
[ERROR] [Error] /spark/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/TimestampFormatter.scala:109: type mismatch;
 found   : PartialFunction[Throwable,java.time.format.DateTimeFormatter]
 required: Throwable => Unit
[INFO] [Info] : PartialFunction[Throwable,java.time.format.DateTimeFormatter] <: Throwable => Unit?
[INFO] [Info] : false
```
```diff
@@ -28,7 +28,7 @@ object EpochTracker {
   // update the underlying AtomicLong as it finishes epochs. Other code should only read the value.
   private val currentEpoch: InheritableThreadLocal[AtomicLong] = {
     new InheritableThreadLocal[AtomicLong] {
-      override protected def childValue(parent: AtomicLong): AtomicLong = {
+      override def childValue(parent: AtomicLong): AtomicLong = {
```
```
[ERROR] [Error] /spark/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/continuous/EpochTracker.scala:31: weaker access privileges in overriding
private[package lang] def childValue(x$1: java.util.concurrent.atomic.AtomicLong): java.util.concurrent.atomic.AtomicLong (defined in class ThreadLocal)
  override should at least be private[lang]
```
It seems that this change is due to scala/scala#9542
```diff
@@ -222,13 +222,13 @@ class SQLQueryTestSuite extends QueryTest with SharedSparkSession with SQLHelper
       // Create a test case to ignore this case.
       ignore(testCase.name) { /* Do nothing */ }
     } else testCase match {
-      case udfTestCase: UDFTest
+      case udfTestCase: SQLQueryTestSuite#UDFTest
```
```
[ERROR] [Error] /spark/sql/core/src/test/scala/org/apache/spark/sql/SQLQueryTestSuite.scala:225: The outer reference in this type test cannot be checked at run time.
[ERROR] [Error] /spark/sql/core/src/test/scala/org/apache/spark/sql/SQLQueryTestSuite.scala:357: The outer reference in this type test cannot be checked at run time.
[ERROR] [Error] /spark/sql/core/src/test/scala/org/apache/spark/sql/SQLQueryTestSuite.scala:363: The outer reference in this type test cannot be checked at run time.
[ERROR] [Error] /spark/sql/core/src/test/scala/org/apache/spark/sql/SQLQueryTestSuite.scala:371: The outer reference in this type test cannot be checked at run time.
[ERROR] [Error] /spark/sql/core/src/test/scala/org/apache/spark/sql/SQLQueryTestSuite.scala:414: The outer reference in this type test cannot be checked at run time.
```
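A minimal sketch (hypothetical `Suite`/`dispatch` names) of what the check objects to: `UDFTest` is a trait nested in the class, so a bare type test implies a runtime check of the enclosing instance that cannot actually be performed; the type projection used in the hunk above sidesteps it.

```scala
class Suite {
  trait UDFTest { def udf: String }

  def dispatch(testCase: Any): String = testCase match {
    // case t: UDFTest => t.udf     // 2.13.6+: "The outer reference in this
    //                              // type test cannot be checked at run time."
    case t: Suite#UDFTest => t.udf  // projection: any Suite's UDFTest, no outer test
    case _                => "other"
  }
}
```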
@som-snytt It seems that this compilation fix is related to scala/scala#9504. So I have another question to ask for your help: `UDFTest` is an internal trait defined in `SQLQueryTestSuite`; must the full path be used for this case in future Scala versions?
I don't have a precise answer, but I would expect ongoing improvements and also some noise. You can use `-Wconf` to suppress (or escalate) the message. I will try to take another look tomorrow; I haven't followed the pattern matching changes yet.
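For what it's worth, a sketch of how that might look in sbt settings; the exact filter is an assumption (the message is emitted under 2.13's `unchecked` category, but the regex may need adjusting):

```scala
// build.sbt — silence ("s") or escalate to error ("e") the outer-reference message
scalacOptions += "-Wconf:cat=unchecked&msg=outer reference:s"
```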
Thanks, I tried to suppress this behavior in the past, but it seems that it is a compilation error now; I'll try it again.

I'm still self-testing this patch.
```diff
@@ -509,7 +509,7 @@ case class NoCloseColumnVector(wrapped: ColumnVector) extends ColumnVector(wrapp

   override def getBinary(rowId: Int): Array[Byte] = wrapped.getBinary(rowId)

-  override protected def getChild(ordinal: Int): ColumnVector = wrapped.getChild(ordinal)
+  override def getChild(ordinal: Int): ColumnVector = wrapped.getChild(ordinal)
```
```
/spark/sql/core/src/test/scala/org/apache/spark/sql/SparkSessionExtensionSuite.scala:512:26
weaker access privileges in overriding
def getChild(x$1: Int): org.apache.spark.sql.vectorized.ColumnVector (defined in class ColumnVector)
  override should be public
  override protected def getChild(ordinal: Int): ColumnVector = wrapped.getChild(ordinal)
```
```diff
@@ -107,10 +107,10 @@ class ThriftServerQueryTestSuite extends SQLQueryTestSuite with SharedThriftServ
     }

     testCase match {
-      case _: PgSQLTest =>
+      case _: SQLQueryTestSuite#PgSQLTest =>
```
```
/spark/sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/ThriftServerQueryTestSuite.scala:110:15
The outer reference in this type test cannot be checked at run time.
      case _: PgSQLTest =>
```
```diff
         statement.execute(s"SET ${SQLConf.ANSI_ENABLED.key} = true")
         statement.execute(s"SET ${SQLConf.LEGACY_INTERVAL_ENABLED.key} = true")
-      case _: AnsiTest =>
+      case _: SQLQueryTestSuite#AnsiTest =>
```
```
/spark/sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/ThriftServerQueryTestSuite.scala:113:15
The outer reference in this type test cannot be checked at run time.
      case _: AnsiTest =>
```
Kubernetes integration test starting
The Maven build can pass, but the sbt build failed.

It looks like we need to wait for genjavadoc to release a new version.
Kubernetes integration test status success

Test build #138862 has finished for PR 32648 at commit
Thank you, @LuciferYang .
BTW, could you summarize the required changes in the PR description?
It seems that those changes will affect end-users' Spark applications, too.
> It looks like we need to wait for genjavadoc to release a new version

Yes, `genjavadoc` is usually ready after a few days, asynchronously.
OK, I will do this later. It seems some UTs failed with Scala 2.13; I'll fix those first.
…to [D in MultilabelSummarizer.add
```diff
@@ -178,7 +178,11 @@ private[evaluation] class MultilabelSummarizer extends Serializable {
    * @return This MultilabelSummarizer object.
    */
   def add(predictions: Array[Double], labels: Array[Double]): this.type = {
-    val intersection = predictions.intersect(labels)
+    val intersection = if (predictions.isEmpty || labels.isEmpty) {
```
In Scala 2.13.5:

```
Welcome to Scala 2.13.5 (OpenJDK 64-Bit Server VM, Java 1.8.0_232).
Type in expressions for evaluation. Or try :help.

scala> Array.empty[Double].intersect(Array(0.0))
val res0: Array[Double] = Array()

scala> Array(0.0).intersect(Array.empty[Double])
val res1: Array[Double] = Array()
```

but in Scala 2.13.6:

```
Welcome to Scala 2.13.6 (OpenJDK 64-Bit Server VM, Java 1.8.0_232).
Type in expressions for evaluation. Or try :help.

scala> Array.empty[Double].intersect(Array(0.0))
java.lang.ClassCastException: [Ljava.lang.Object; cannot be cast to [D
  ... 32 elided

scala> Array(0.0).intersect(Array.empty[Double])
java.lang.ClassCastException: [Ljava.lang.Object; cannot be cast to [D
  ... 32 elided
```
The `Array()` result is replaced by a thrown `ClassCastException`; this change is uncomfortable ... @dongjoon-hyun @maropu do you think this is a bug in Scala 2.13.6?
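A standalone sketch (hypothetical `safeIntersect` name) of the workaround the diff above applies inside `MultilabelSummarizer.add`: guard the empty side before calling `intersect`, so the miscompiled 2.13.6 path is never reached.

```scala
object IntersectWorkaround {
  def safeIntersect(predictions: Array[Double], labels: Array[Double]): Array[Double] =
    if (predictions.isEmpty || labels.isEmpty) Array.empty[Double]
    else predictions.intersect(labels)

  def main(args: Array[String]): Unit = {
    // Returns Array() instead of throwing ClassCastException on 2.13.6.
    println(safeIntersect(Array.empty[Double], Array(0.0)).mkString("Array(", ", ", ")"))
  }
}
```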
I submitted an issue to consult the Scala community: scala/bug#12403
Thank you for filing the upstream issue. Ya, it looks like a breaking behavior change.
As @LuciferYang mentioned on the Scala issue, I also confirmed that only Scala 2.13.6 has this bug, unlike Scala 2.13.5/3.0.0.

```
scala3-3.0.0:$ bin/scala
scala> Array.empty[Double].intersect(Array(0.0))
val res0: Array[Double] = Array()
```

```
scala-2.13.6:$ bin/scala
Welcome to Scala 2.13.6 (OpenJDK 64-Bit Server VM, Java 1.8.0_292).
Type in expressions for evaluation. Or try :help.

scala> Array.empty[Double].intersect(Array(0.0))
java.lang.ClassCastException: [Ljava.lang.Object; cannot be cast to [D
  ... 32 elided
```
In this case, we had better skip Scala 2.13.6.
@dongjoon-hyun I agree with you. We'd better skip this version.
Kubernetes integration test starting

Kubernetes integration test status success

Test build #138902 has finished for PR 32648 at commit
Closing this PR and skipping Scala 2.13.6 because of scala/bug#12403. Thanks @maropu @dongjoon-hyun @som-snytt
Test build #144838 has finished for PR 32648 at commit

Kubernetes integration test status failure

Kubernetes integration test status failure

Test build #144843 has finished for PR 32648 at commit

Test build #144844 has finished for PR 32648 at commit
The full UT suite passed with Scala 2.13.7. Let me verify the UTs again to double-check; I'll give feedback later.
Thank you, @LuciferYang !
+1, LGTM.
@srowen @dongjoon-hyun I ran the UTs twice using the following command, and they passed.

But if we run

or

there are 3 tests that fail in

The error stack is

I don't think this is related to the current PR; the problem can be reproduced on master. I will create a new PR to investigate it.
Hm, that's weird. So this is a pre-existing problem that appears with 2.12, too? I wonder why we haven't seen it before, though.
Thank you for reporting, @LuciferYang . I also agree with @srowen 's guess.
@srowen @dongjoon-hyun Yes, the master branch has the same problem with both Scala 2.12 and Scala 2.13. Our Jenkins Maven job always executes

I think the key point here is

To prove this I did the following tests:

Then why does this case rely on

And I uploaded
2481e1d: merge master into this PR
Kubernetes integration test starting

Kubernetes integration test status failure

Test build #144916 has finished for PR 32648 at commit
@srowen @dongjoon-hyun The manual test of 2481e1d with Scala 2.13 passed.
Merged to master for Apache Spark 3.3.

Thank you, @LuciferYang , @srowen , @som-snytt , @sarutak .

thanks all
What changes were proposed in this pull request?

This PR aims to upgrade from Scala 2.13.5 to Scala 2.13.7 for Apache Spark 3.3. The main changes of this PR are as follows:

- To fix `weaker access privileges in overriding`, remove the access modifiers that differ from the parent-class declarations in `SparkContext`, `DiskBlockObjectWriter`, `EpochTracker`, and `SparkSessionExtensionSuite`.
- Add an explicit `()` to return the `Unit` type for `DateTimeFormatter.validatePatternString` and `TimestampFormatter.validatePatternString`, to avoid a `type mismatch` error.
- Add 3 new compile args to `pom.xml` and `SparkBuild.scala` to suppress some compilation warnings that are now converted to compilation errors.

Why are the changes needed?

Scala 2.13.7 is a maintenance release for the 2.13 line, and the release notes are as follows:

Does this PR introduce any user-facing change?

No

How was this patch tested?

All tests passed.