New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Upgrade gRPC to support Apple Silicon #13
Conversation
Hi @pan3793, thanks for the patch. According to our experience, some tests will fail on Apple Silicon AArch64 JDK. |
Run
|
Thank for your contribution, but upgrading gRPC is not very good, because client and server in product environment are not easy to be upgraded simultaneously. |
And can you run |
I don't see a causal relationship between them. If you want a seamless upgrade, you should build a compatibility test to make sure your client and server can work w/ each other in different versions, rather than pin a library version and never do upgrading. Actually, in my experience, the larger the upgrade version span, the greater the risk. The upgrading has a clear request to add native support for Apple Silicon, even w/o urgent requests, keeping the library up-to-date is a good practice.
In our testing, building Spark using Rosetta2 costs 2x time more than native. Native is always the first choice.
Yes, I'm trying. But I think it's not a blocker for merging this one. The CI shows it does not break the current functionalities. |
Constants.SHUFFLE_SERVER_VERSION should be updated to avoid version mismatch |
|
The gRPC library upgrading should not break the protocol, and checked https://github.com/grpc/grpc-java/releases, I don't see any breaking change. |
What about |
Running, will update once completed |
|
<guava.version>30.0-jre</guava.version> | ||
<grpc.version>1.47.0</grpc.version> | ||
<gson.version>2.9.0</gson.version> | ||
<guava.version>31.0.1-jre</guava.version> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why should we also bump guava and gson version?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
w/o upgrading gson and guava, it will violate rules defined by maven-enforcer-plugin
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
grpc and protobuf should work well w/ the previous version, the dependency upgrading is trivial
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would suggest to make the changes minimal to avoid unexpected issues.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then we should remove requireUpperBoundDeps
rule of maven-enforcer-plugin
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minimal change was tried in Tencent/Firestorm#185, but there were some issues.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Found and fixed some issues in Spark3 IT
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would suggest to make the changes minimal to avoid unexpected issues.
I think it's actually the minimal change to meet requireUpperBoundDeps
requirements.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Found and fixed some issues in Spark3 IT
Thanks for digging out this issue, I remember the same issue was encountered during previous attempts.
Codecov Report
@@ Coverage Diff @@
## master #13 +/- ##
=========================================
Coverage 56.83% 56.83%
Complexity 1204 1204
=========================================
Files 152 152
Lines 8401 8401
Branches 813 813
=========================================
Hits 4775 4775
Misses 3368 3368
Partials 258 258 Continue to review full report at Codecov.
|
@pan3793 Could you show the difference between and after this pr in description? |
updated |
Hm... without this pr, I test the following commands in my Apple Silicon, all commands seem to have succeeded
I use native jdk. Will this way be slower than current pr? Is this acceptable? |
I think it's a workaround, you are using Rosetta2-based |
You are still using Rosetta2-based |
@@ -60,8 +61,7 @@ | |||
<picocli.version>4.5.2</picocli.version> | |||
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding> | |||
<prometheus.simpleclient.version>0.9.0</prometheus.simpleclient.version> | |||
<protobuf.version>3.12.0</protobuf.version> | |||
<protoc.version>3.12.0</protoc.version> | |||
<protobuf.version>3.19.2</protobuf.version> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems native M1 support was introduced in v3.20.1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not exactly, it is from v3.17.3
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
3.17.3 is still Rosetta, there is only protoc-3.17.3-osx-x86_64.zip.
In 3.20.1, there is protoc-3.20.1-osx-aarch_64.zip.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should check the maven repo instead of GitHub release
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the link, seems they both requires Rosetta 2.
$ file protoc-3.17.3-osx-aarch_64.exe protoc-3.20.1-osx-aarch_64.exe
protoc-3.17.3-osx-aarch_64.exe: Mach-O 64-bit executable x86_64
protoc-3.20.1-osx-aarch_64.exe: Mach-O 64-bit executable x86_64
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's a funny thing, they just copy and rename the file... both file has the same md5
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The latest version is right
➜ apache-uniffle git:(master) file ~/Downloads/protoc-3.21.2-osx-aarch_64.exe
/Users/chengpan/Downloads/protoc-3.21.2-osx-aarch_64.exe: Mach-O 64-bit executable arm64
@pan3793 I think it's ok to upgrade the version of gRPC, and do you have check the shaded configuration for client-mr, client-spark2, client-spark3, is there additional package should be shaded with this PR? |
@frankliee I think it can be documented and suggest user to have the same version of client/server. |
All checks have passed, I prefer +1 for this PR. |
I think we should document this to avoid unexpected issues. |
Would you please share your testing code? I test the official hello-world demo, the high version client works well w/ low version server. |
According to gRPC Versioning Guide, there is only backward compatibility. |
I directly test uniffle client with uniffle server. See MRIntegrationTestBase |
This pr is fine to me and agree with @jerryshao if there is really a compatibility issue |
@frankliee |
Both MR and Spark leverage a unified gRPC client. You can also manually debug it by starting a uniffle server (v1.37) with uniffle client(v1.43). Testing hello world client-server examples cannot cover all APIs used by uniffle. |
Where is the "client jar" configuration in Spark IT like |
Spark IT does not have this. |
Then it's not easy for me, a newbie to the project, to build ITs that cover all gRPC APIs used by uniffle to verify the cross-version client/server compatibility and find the broken root cause. |
I try this way again, and the test passed. Please attach your failure stack traces. |
I know this is not easy because getting MR stack is difficult in local. The reproduction is complicated 1) compile this PR to get a new MR client jar, 2) save this jar to another place, 3) switch to the master branch and recompile, 4) replace the jar with the previous one. |
What I mean "not easy“ is to build a cross-version Spark IT that covers all gRPC APIs used by RSS.
It's clear to me when you said "See MRIntegrationTestBase
I would prefer to dig out the root cause before documenting the conclusion. My first round of cross-version testing failed but the following rounds were successful. And the first-round failure is because I put the new MR client jar to a temp folder, if there are some temp files deleted by other processes during testing, the test will fail w/
And if I put the client-mr jar build w/ new gRPC version to an immutable folder, then IT passed. I finally found and checked the MR tasks log, and do not see any related stacks. |
The document says:
I don't see uniffle use gRPC EXPERIMENTAL api.
Then I think there is no compatibility issue that will be introduced by upgrading gRPC deps. |
@colinmjj I check the |
The compatibility is ok. I am sorry, and I found the wrong jar is copied... |
@frankliee thanks for the confirmation. The latest thing is should we wait for gRPC 1.48 which bundles protoc 3.21.1, or get this in first? Because @kaijchen points that protoc-3.17.3 still requires Rosetta 2 (but does not bother developers). |
+1, thanks for the investigation. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM from my side.
@pan3793 thanks for the contribution |
Thanks everyone for reviews. |
What changes were proposed in this pull request?
Upgrade gRPC to support Apple Silicon.
Why are the changes needed?
To support build on Apple M1 chips.
Run
mvn clean install
on master branchDoes this PR introduce any user-facing change?
Yes, gRPC, protobuf related dependencies are updated, and developers who use Apple M1 chips can build now.
How was this patch tested?