Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ARROW-16328: [Java] Proposal to upgrade Arrow Java project to JPMS Java Platform Module System #13072

Closed
wants to merge 30 commits into from

Conversation

davisusanibar
Copy link
Contributor

@davisusanibar davisusanibar commented May 5, 2022

Proposal to upgrade Arrow Java to Module System.

Please note: This PR is not consider to be part of next release 10.0.0, the plan is to test this on 11.0.0 version.

Current status

  1. Apache Arrow Java is compiled and tested by JDK 8, 11, 17, 18 but packaged with source/target JDK8

Changes implemented by this PR

  1. Apache Arrow Java project are continuing compiled and tested by JDK 8, 11, 17, 18 but packaged with source/target JDK8
  2. If Apache Arrow Java project is building with JDK11+ validate the project and test it against the module-info.java definitions
  3. There are not CI implementation to deliver Jar library with JDK8 & JDK11+ (we need to define what will be our next step for JDK versions)
  4. We probably are breaking APIs because we are changing package name to ensure unique package names on the project required by module-info.java
  5. We are added a new project that contains Apache Arrow Netty custom implementation needed for
  6. There are remaining modules pending to implement modules with module-info.java files. This PR contains modules-info.java for: Format / Memory / Vector
  7. Is pending to review errors at Integration testing

Next steps

  1. Implement cross-compilation with JDK 11+ and packaging with release=8 (continue offering support for JDK8):
    That is not posible, if we compile with --release=8 we will have problems to access sun.misc.Unsafe that is not as exposed/public/documented API for N >= 9: -source N -target N --system . (https://openjdk.org/jeps/247)

  2. We have these options for our next steps:

  • Option 1.- Maintain all as it is now, continue packaging with source/target JDK8

  • Option 2.- Change/Update/Modify use of sun.misc.* classes dependencies to be able to support cross-compilation (--release 8) to packaging with JDK11+ and continue offering support for JDK8 also

jdeps --jdk-internals arrow-memory-unsafe-10.0.0-SNAPSHOT.jar
arrow-memory-unsafe-10.0.0-SNAPSHOT.jar -> jdk.unsupported
   org.apache.arrow.memory.unsafe.UnsafeAllocationManager -> sun.misc.Unsafe                                    JDK internal API (jdk.unsupported)

Warning: JDK internal APIs are unsupported and private to JDK implementation that are
subject to be removed or changed incompatibly and could break your application.
Please modify your code to eliminate dependence on any JDK internal APIs.
For the most recent update on JDK internal API replacements, please check:
https://wiki.openjdk.java.net/display/JDK8/Java+Dependency+Analysis+Tool

JDK Internal API                         Suggested Replacement
----------------                         ---------------------
sun.misc.Unsafe                          See http://openjdk.java.net/jeps/260

If you decided to push some of this options please use that ML to move on next steps.


Detailed steps for current implementation

JPMS Planning:

  • Migrate Arrow Java Format

  • Migrate Arrow Java Memory Core

  • Migrate Arrow Java Memory Netty
    Not possible to apply module system for the package io.netty.buffer, it was needed to create another module memory-netty-buffer-patch and patch this package with our arrow java io.netty.buffer code.

  • Migrate Arrow Java Memory Unsafe
    TODO: Need to refactor protected access methods from unit test.

  • Migrate Arrow Java Vector
    TODO: Need to refactor protected access methods from unit test.

  • Migrate Arrow Java Flight
    Blocker: Current grpc-api and grpc-context are not modularized and cannot be used as a module because both of them has the same package named io.grpc

  • JPMS Java Platform Module System

    Java modules enforced new restrictions on existing code:

    • Package need to be uniqueness

      • There are some modules with the same package name (i.e. org.apache.arrow.memory / io.netty.buffer) that is consumed by another module with the same package name to access protected methods.
        • Java Memory module problems:
          • Core / Netty / Unsafe has the same package name org.apache.arrow.memory and there are not problems to consume protected methods, if we change the name of the package we need to define a way about how Netty module could consume for example Core protected methods.
        • Java Netty extends io.netty.buffer capabilities:
          • Is needed to create a new module arrow-memory-netty-buffer-patch to add new Java functionalities added and this need to be consumed by Arrow Memory Netty.
    • Dependencies also need to be modularized:

      • Current jar dependencies need to be modularized to be consumed as a module if not there are no way to convert our project to module if any dependencies are not modularized before.
        • Arrow Format: [x]
        • Arrow Memory: [x]
        • Arrow Vector: [x]
        • Arrow Flight:
          • Blocker: Current grpc-api and grpc-context are not modularized and cannot be used as a module because both of them has the same package named io.grpc.

    Summarize changes and benefits to migrate Arrow Java with Module System:

    • JPMS offer Strong encapsulation, Well-defined interfaces, Explicit dependencies. (3)(4)
    • JPMS offer reliable configuration and security to hide platform internals.
    • JPMS offer a partial solution to solve problema about read (80%) /write (20%) code.
    • JPMS offer optimization for readability about read/write ratio (90/10) thru module-info.java.
    • Consistency logs, JPMS implement consistency logs to really use that to solve the current problem.
    • Be able to customize JRE needed with only modules needed (not java.desktop for example and others) thru JLink.
    • Modules is also been implemented by other language such as Javascript (ES2015), C++(C++20), Net (Nuget/NetCore) and this is some bus that at some time we could onboard.
    • Consider tho take a lokk at this discussion about pro/cons (5).

(3): https://nipafx.dev/java-modules-reflection-vs-encapsulation/
(4): https://github.com/nipafx/demo-jigsaw-reflection
(5): https://www.reddit.com/r/java/comments/okt3j3/do_you_use_jigsaw_modules_in_your_java_projects/

@github-actions
Copy link

github-actions bot commented May 5, 2022

@github-actions
Copy link

github-actions bot commented May 5, 2022

⚠️ Ticket has no components in JIRA, make sure you assign one.

@davisusanibar davisusanibar changed the title ARROW-16328: [Java] Initial draft to experiment with java module system ARROW-16328: [Java] Proposal to upgrade Arrow Java project to JEP 261 Module System May 10, 2022
@davisusanibar davisusanibar changed the title ARROW-16328: [Java] Proposal to upgrade Arrow Java project to JEP 261 Module System ARROW-16328: [Java] Proposal to upgrade Arrow Java project to JPMS Java Platform Module System May 11, 2022
<artifactId>maven-compiler-plugin</artifactId>
<configuration>
<compilerArgs combine.children="append">
<arg>--patch-module=io.netty.buffer=/Users/dsusanibar/.m2/repository/org/apache/arrow/memory-netty-buffer-patch/8.0.0-SNAPSHOT/memory-netty-buffer-patch-8.0.0-SNAPSHOT.jar</arg>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use ${settings.localRepository} instead of hardcoding the user

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, updated

java/pom.xml Outdated
<dep.hadoop.version>2.7.1</dep.hadoop.version>
<dep.fbs.version>1.12.0</dep.fbs.version>
<dep.avro.version>1.10.0</dep.avro.version>
<arrow.vector.classifier />
<forkCount>2</forkCount>
<forkCount>0</forkCount>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this related to JPMS ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be reviewed. Value more than "0" produce: NoClassDefFound Could not initialize class org.apache.arrow.memory.RootAllocator

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated to 2

@jduo jduo requested a review from lidavidm as a code owner November 22, 2023 23:10
jduo added a commit to Bit-Quill/arrow that referenced this pull request Nov 24, 2023
Based on apache#13072
- Avoid having multiple memory modules contribute to the same package
- Introduce memory-netty-buffer-patch module for patching classes
into Netty modules
- Patch classes into netty's io.netty.buffer package (specifically use
the JDK9 compiler to do this instead of inheriting JDK8).
- Avoid using BaseAllocator in tests and use RootAllocator instead
- Move TestBaseAllocator#testMemoryUsage() to a new test in
memory-netty TestNettyAllocator because TestBaseAllocator is now in
memory-core, but that specific test has Netty dependencies.
@jduo
Copy link
Member

jduo commented Nov 24, 2023

I found this PR too difficult to rebase (there have been several changes to Netty and Memory classes and tests in 2023) so I started a new one based on this work at #38876)

@jduo
Copy link
Member

jduo commented Nov 24, 2023

This work has been migrated to #38876.

@jduo jduo closed this Nov 24, 2023
jduo added a commit to Bit-Quill/arrow that referenced this pull request Nov 26, 2023
Based on apache#13072
- Avoid having multiple memory modules contribute to the same package
- Introduce memory-netty-buffer-patch module for patching classes
into Netty modules
- Patch classes into netty's io.netty.buffer package (specifically use
the JDK9 compiler to do this instead of inheriting JDK8).
- Avoid using BaseAllocator in tests and use RootAllocator instead
- Move TestBaseAllocator#testMemoryUsage() to a new test in
memory-netty TestNettyAllocator because TestBaseAllocator is now in
memory-core, but that specific test has Netty dependencies.
jduo added a commit to Bit-Quill/arrow that referenced this pull request Nov 28, 2023
Based on apache#13072
- Avoid having multiple memory modules contribute to the same package
- Introduce memory-netty-buffer-patch module for patching classes
into Netty modules
- Patch classes into netty's io.netty.buffer package (specifically use
the JDK9 compiler to do this instead of inheriting JDK8).
- Avoid using BaseAllocator in tests and use RootAllocator instead
- Move TestBaseAllocator#testMemoryUsage() to a new test in
memory-netty TestNettyAllocator because TestBaseAllocator is now in
memory-core, but that specific test has Netty dependencies.
<artifactId>maven-compiler-plugin</artifactId>
<configuration>
<compilerArgs combine.children="append">
<arg>--patch-module=io.netty.buffer=${project.basedir}/../memory-netty-buffer-patch/target/arrow-memory-netty-buffer-patch-${project.version}.jar</arg>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@davisusanibar , I'm having problems getting this step to run in the new PR and am not entirely sure if it is necessary.

The new PR just does a pure syntax compilation of module-info.java so it doesn't verify the dependencies listed and just uses the classpath to look for classes from arrow-memory-netty-buffer-patch.

Was this step added so that this module would correctly find classes in io.netty.buffer defined in arrow-memory-netty-buffer-patch at build-time only, or does this have an effect at run time? My reading online is that even if we used patch-module to compile, a user utilizing the module would still need to use patch-module to specify module extensions to netty.

If patch-module is only used here to help get this JAR to compile, it seems like it isn't necessary in the new PR since the plugin is compiling module-info.java without verifying its imports.

jduo added a commit to Bit-Quill/arrow that referenced this pull request Nov 28, 2023
Based on apache#13072
- Avoid having multiple memory modules contribute to the same package
- Introduce memory-netty-buffer-patch module for patching classes
into Netty modules
- Patch classes into netty's io.netty.buffer package (specifically use
the JDK9 compiler to do this instead of inheriting JDK8).
- Avoid using BaseAllocator in tests and use RootAllocator instead
- Move TestBaseAllocator#testMemoryUsage() to a new test in
memory-netty TestNettyAllocator because TestBaseAllocator is now in
memory-core, but that specific test has Netty dependencies.
jduo added a commit to Bit-Quill/arrow that referenced this pull request Nov 28, 2023
Based on apache#13072
- Avoid having multiple memory modules contribute to the same package
- Introduce memory-netty-buffer-patch module for patching classes
into Netty modules
- Patch classes into netty's io.netty.buffer package (specifically use
the JDK9 compiler to do this instead of inheriting JDK8).
- Avoid using BaseAllocator in tests and use RootAllocator instead
- Move TestBaseAllocator#testMemoryUsage() to a new test in
memory-netty TestNettyAllocator because TestBaseAllocator is now in
memory-core, but that specific test has Netty dependencies.
jduo added a commit to Bit-Quill/arrow that referenced this pull request Nov 28, 2023
Based on apache#13072
- Avoid having multiple memory modules contribute to the same package
- Introduce memory-netty-buffer-patch module for patching classes
into Netty modules
- Patch classes into netty's io.netty.buffer package (specifically use
the JDK9 compiler to do this instead of inheriting JDK8).
- Avoid using BaseAllocator in tests and use RootAllocator instead
- Move TestBaseAllocator#testMemoryUsage() to a new test in
memory-netty TestNettyAllocator because TestBaseAllocator is now in
memory-core, but that specific test has Netty dependencies.
jduo added a commit to Bit-Quill/arrow that referenced this pull request Nov 29, 2023
Based on apache#13072
- Avoid having multiple memory modules contribute to the same package
- Introduce memory-netty-buffer-patch module for patching classes
into Netty modules
- Patch classes into netty's io.netty.buffer package (specifically use
the JDK9 compiler to do this instead of inheriting JDK8).
- Avoid using BaseAllocator in tests and use RootAllocator instead
- Move TestBaseAllocator#testMemoryUsage() to a new test in
memory-netty TestNettyAllocator because TestBaseAllocator is now in
memory-core, but that specific test has Netty dependencies.
jduo added a commit to Bit-Quill/arrow that referenced this pull request Nov 30, 2023
Based on apache#13072
- Avoid having multiple memory modules contribute to the same package
- Introduce memory-netty-buffer-patch module for patching classes
into Netty modules
- Patch classes into netty's io.netty.buffer package (specifically use
the JDK9 compiler to do this instead of inheriting JDK8).
- Avoid using BaseAllocator in tests and use RootAllocator instead
- Move TestBaseAllocator#testMemoryUsage() to a new test in
memory-netty TestNettyAllocator because TestBaseAllocator is now in
memory-core, but that specific test has Netty dependencies.
jduo added a commit to Bit-Quill/arrow that referenced this pull request Nov 30, 2023
Based on apache#13072
- Avoid having multiple memory modules contribute to the same package
- Introduce memory-netty-buffer-patch module for patching classes
into Netty modules
- Patch classes into netty's io.netty.buffer package (specifically use
the JDK9 compiler to do this instead of inheriting JDK8).
- Avoid using BaseAllocator in tests and use RootAllocator instead
- Move TestBaseAllocator#testMemoryUsage() to a new test in
memory-netty TestNettyAllocator because TestBaseAllocator is now in
memory-core, but that specific test has Netty dependencies.
jduo added a commit to Bit-Quill/arrow that referenced this pull request Nov 30, 2023
Based on apache#13072
- Avoid having multiple memory modules contribute to the same package
- Introduce memory-netty-buffer-patch module for patching classes
into Netty modules
- Patch classes into netty's io.netty.buffer package (specifically use
the JDK9 compiler to do this instead of inheriting JDK8).
- Avoid using BaseAllocator in tests and use RootAllocator instead
- Move TestBaseAllocator#testMemoryUsage() to a new test in
memory-netty TestNettyAllocator because TestBaseAllocator is now in
memory-core, but that specific test has Netty dependencies.
jduo added a commit to Bit-Quill/arrow that referenced this pull request Nov 30, 2023
Based on apache#13072
- Avoid having multiple memory modules contribute to the same package
- Introduce memory-netty-buffer-patch module for patching classes
into Netty modules
- Avoid using BaseAllocator in tests and use RootAllocator instead
- Move TestBaseAllocator#testMemoryUsage() to a new test in
memory-netty TestNettyAllocator because TestBaseAllocator is now in
memory-core, but that specific test has Netty dependencies.
jduo added a commit to Bit-Quill/arrow that referenced this pull request Dec 1, 2023
Based on apache#13072
- Avoid having multiple memory modules contribute to the same package
- Introduce memory-netty-buffer-patch module for patching classes
into Netty modules
- Avoid using BaseAllocator in tests and use RootAllocator instead
- Move TestBaseAllocator#testMemoryUsage() to a new test in
memory-netty TestNettyAllocator because TestBaseAllocator is now in
memory-core, but that specific test has Netty dependencies.
jduo added a commit to Bit-Quill/arrow that referenced this pull request Dec 5, 2023
Based on apache#13072
- Avoid having multiple memory modules contribute to the same package
- Introduce memory-netty-buffer-patch module for patching classes
into Netty modules
- Avoid using BaseAllocator in tests and use RootAllocator instead
- Move TestBaseAllocator#testMemoryUsage() to a new test in
memory-netty TestNettyAllocator because TestBaseAllocator is now in
memory-core, but that specific test has Netty dependencies.
jduo added a commit to Bit-Quill/arrow that referenced this pull request Dec 7, 2023
Based on apache#13072
- Avoid having multiple memory modules contribute to the same package
- Introduce memory-netty-buffer-patch module for patching classes
into Netty modules
- Avoid using BaseAllocator in tests and use RootAllocator instead
- Move TestBaseAllocator#testMemoryUsage() to a new test in
memory-netty TestNettyAllocator because TestBaseAllocator is now in
memory-core, but that specific test has Netty dependencies.
jduo added a commit to Bit-Quill/arrow that referenced this pull request Dec 7, 2023
Based on apache#13072
- Avoid having multiple memory modules contribute to the same package
- Introduce memory-netty-buffer-patch module for patching classes
into Netty modules
- Avoid using BaseAllocator in tests and use RootAllocator instead
- Move TestBaseAllocator#testMemoryUsage() to a new test in
memory-netty TestNettyAllocator because TestBaseAllocator is now in
memory-core, but that specific test has Netty dependencies.
jduo added a commit to Bit-Quill/arrow that referenced this pull request Dec 11, 2023
Based on apache#13072
- Avoid having multiple memory modules contribute to the same package
- Introduce memory-netty-buffer-patch module for patching classes
into Netty modules
- Avoid using BaseAllocator in tests and use RootAllocator instead
- Move TestBaseAllocator#testMemoryUsage() to a new test in
memory-netty TestNettyAllocator because TestBaseAllocator is now in
memory-core, but that specific test has Netty dependencies.
jduo added a commit to Bit-Quill/arrow that referenced this pull request Dec 11, 2023
Based on apache#13072
- Avoid having multiple memory modules contribute to the same package
- Introduce memory-netty-buffer-patch module for patching classes
into Netty modules
- Avoid using BaseAllocator in tests and use RootAllocator instead
- Move TestBaseAllocator#testMemoryUsage() to a new test in
memory-netty TestNettyAllocator because TestBaseAllocator is now in
memory-core, but that specific test has Netty dependencies.
jduo added a commit to Bit-Quill/arrow that referenced this pull request Jan 4, 2024
Based on apache#13072
- Avoid having multiple memory modules contribute to the same package
- Introduce memory-netty-buffer-patch module for patching classes
into Netty modules
- Avoid using BaseAllocator in tests and use RootAllocator instead
- Move TestBaseAllocator#testMemoryUsage() to a new test in
memory-netty TestNettyAllocator because TestBaseAllocator is now in
memory-core, but that specific test has Netty dependencies.
jduo added a commit to Bit-Quill/arrow that referenced this pull request Jan 4, 2024
Based on apache#13072
- Avoid having multiple memory modules contribute to the same package
- Introduce memory-netty-buffer-patch module for patching classes
into Netty modules
- Avoid using BaseAllocator in tests and use RootAllocator instead
- Move TestBaseAllocator#testMemoryUsage() to a new test in
memory-netty TestNettyAllocator because TestBaseAllocator is now in
memory-core, but that specific test has Netty dependencies.
jduo added a commit to Bit-Quill/arrow that referenced this pull request Jan 5, 2024
Based on apache#13072
- Avoid having multiple memory modules contribute to the same package
- Introduce memory-netty-buffer-patch module for patching classes
into Netty modules
- Avoid using BaseAllocator in tests and use RootAllocator instead
- Move TestBaseAllocator#testMemoryUsage() to a new test in
memory-netty TestNettyAllocator because TestBaseAllocator is now in
memory-core, but that specific test has Netty dependencies.
jduo added a commit to Bit-Quill/arrow that referenced this pull request Jan 8, 2024
Based on apache#13072
- Avoid having multiple memory modules contribute to the same package
- Introduce memory-netty-buffer-patch module for patching classes
into Netty modules
- Avoid using BaseAllocator in tests and use RootAllocator instead
- Move TestBaseAllocator#testMemoryUsage() to a new test in
memory-netty TestNettyAllocator because TestBaseAllocator is now in
memory-core, but that specific test has Netty dependencies.
jduo added a commit to Bit-Quill/arrow that referenced this pull request Jan 8, 2024
Based on apache#13072
- Avoid having multiple memory modules contribute to the same package
- Introduce memory-netty-buffer-patch module for patching classes
into Netty modules
- Avoid using BaseAllocator in tests and use RootAllocator instead
- Move TestBaseAllocator#testMemoryUsage() to a new test in
memory-netty TestNettyAllocator because TestBaseAllocator is now in
memory-core, but that specific test has Netty dependencies.
jduo added a commit to Bit-Quill/arrow that referenced this pull request Jan 9, 2024
Based on apache#13072
- Avoid having multiple memory modules contribute to the same package
- Introduce memory-netty-buffer-patch module for patching classes
into Netty modules
- Avoid using BaseAllocator in tests and use RootAllocator instead
- Move TestBaseAllocator#testMemoryUsage() to a new test in
memory-netty TestNettyAllocator because TestBaseAllocator is now in
memory-core, but that specific test has Netty dependencies.
jduo added a commit to Bit-Quill/arrow that referenced this pull request Jan 10, 2024
Based on apache#13072
- Avoid having multiple memory modules contribute to the same package
- Introduce memory-netty-buffer-patch module for patching classes
into Netty modules
- Avoid using BaseAllocator in tests and use RootAllocator instead
- Move TestBaseAllocator#testMemoryUsage() to a new test in
memory-netty TestNettyAllocator because TestBaseAllocator is now in
memory-core, but that specific test has Netty dependencies.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

10 participants