Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reuse column index mapping from PostgresqlColumnMetadata #641

Closed
wants to merge 1 commit into from

Conversation

cty123
Copy link
Contributor

@cty123 cty123 commented Mar 3, 2024

[resolves #636]

Make sure that:

  • [ x ] You have read the contribution guidelines.
  • [ x ] You have created a feature request first to discuss your contribution intent. Please reference the feature request ticket number in the pull request.
  • [ x ] You use the code formatters provided here and have them applied to your changes. Don't submit any formatting related changes.
  • [ x ] You submit test cases (unit or integration tests) that back your changes.

Issue description

Following up on the previous change #640 , the field name look up is still not optimal for usecases where the result contains a large chunk of data. Take r2dbc example usecase,

Mono.from(connectionFactory.create())
  .flatMapMany(connection -> connection
    .createStatement("SELECT firstname FROM PERSON WHERE age > $1")
    .bind("$1", 42)
    .execute())
  .flatMap(result -> result
    .map((row, rowMetadata) -> row.get("firstname", String.class)))
  .doOnNext(System.out::println)
  .subscribe();

imagine the result set returns 1 million rows of data, because currently the row level column name index map is computed on the fly per row, the same mapping would need to be computed 1 million times, which is pretty much futile work, as we only need to compute the mapping once. Meanwhile, looking at the implementation of pgjdbc, the mapping is computed per result set.

Another small change I made is to avoid putting caches on case matches, and defaults the match to lower case. Since metadata object now owns the column name lookup cache, and each row might be processed concurrently, the cache can have concurrency problem when multiple threads want to put cache simultaneously. So I am now making it read only.

New Public APIs

Additional context

@mp911de
Copy link
Collaborator

mp911de commented Mar 4, 2024

Thanks for the suggestion. It makes totally sense to add this optimization.

@mp911de mp911de added the type: enhancement A general enhancement label Mar 4, 2024
@mp911de mp911de added this to the 1.0.5.RELEASE milestone Mar 4, 2024
@mp911de mp911de changed the title perf: add column name index map cache in the PostgresqlColumnMetadata. Reuse column index mapping from PostgresqlColumnMetadata Mar 4, 2024
mp911de added a commit that referenced this pull request Mar 4, 2024
Simplify code. Use putIfAbsent instead of custom check.

[#636][closes #641]
@mp911de mp911de closed this in 057f3f1 Mar 4, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type: enhancement A general enhancement
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Performance issue with PostgresqlRow.getColumn(String name) when select many columns
2 participants