Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MINOR: [Java] Reduce enum array allocation #41533

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

sullis
Copy link
Contributor

@sullis sullis commented May 3, 2024

Rationale for this change

reduce Java object allocation

context: https://www.gamlor.info/wordpress/2017/08/javas-enum-values-hidden-allocations/

What changes are included in this PR?

cache ArrowFlightConnectionProperty.values() in a static variable.

Are these changes tested?

ArrowFlightConnectionConfigImplTest unit test passes.

Are there any user-facing changes?

no

@@ -38,6 +38,9 @@
* A {@link ConnectionConfig} for the {@link ArrowFlightConnection}.
*/
public final class ArrowFlightConnectionConfigImpl extends ConnectionConfigImpl {
private static final ArrowFlightConnectionProperty[] ARROW_FLIGHT_CONNECTION_PROPERTY_VALUES
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(nit) Maybe we can use a truly immutable type like EnumSet + Collections#unmodifiableSet() instead of an array.

@github-actions github-actions bot added awaiting committer review Awaiting committer review and removed awaiting review Awaiting review labels May 3, 2024
@kou
Copy link
Member

kou commented May 3, 2024

Could you open a new issue instead of using MINOR:?
See also our MINOR definition: https://github.com/apache/arrow/blob/main/CONTRIBUTING.md#Minor-Fixes

@jduo
Copy link
Member

jduo commented May 3, 2024

I'm curious -- did we observe a large number of allocations here? It looks to me like this is only called once-per-JDBC-connection. Perhaps this is a connection pooling scenario?

We may want to optimize body of the noneMatch() call to use a case-insensitive TreeSet if this is called frequently (instead of searching linearly for each key as we're currently doing).

@laurentgo
Copy link
Contributor

We may want to optimize body of the noneMatch() call to use a case-insensitive TreeSet if this is called frequently (instead of searching linearly for each key as we're currently doing).

I agree this is likely to be of greater benefit. I would argue that instead of using TreeSet, the original enum values could be stored in upper case (which is already the case but better safe than sorry) and instead convert each header to uppercase once as well and check the set. This is likely to avoid extra comparisons/case conversion to be done when navigating the tree.
I also remember something about http/2 (used by gRPC) about headers possibly normalized as lower-case which (if true) could also be a better path for optimization

@lidavidm
Copy link
Member

lidavidm commented May 7, 2024

Yes, http/2 normalizes to lower-case headers so that would be best

https://datatracker.ietf.org/doc/html/rfc9113#section-8.2-2

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants