ESQL: Begin optimizing Block#lookup #108482

Merged 3 commits on May 10, 2024

@nik9000 (Member) commented May 9, 2024

This creates the infrastructure to allow optimizing the lookup method when applied to Vectors and then implements that optimization for constant vectors. Constant vectors now take one of six paths:

  1. An empty positions Block yields an empty result set.
  2. If positions is a Block rather than a Vector, perform the unoptimized lookup.
  3. If the min of the positions Vector is less than 0 then throw an exception.
  4. If the min of the positions Vector is greater than the number of positions in the lookup block then return a single ConstantNullBlock because you are looking up outside the range.
  5. If the max of the positions Vector is less than the number of positions in the lookup block then return a Constant$Type$Block with the same value as the lookup block. This is a lookup that's entirely within range.
  6. Otherwise return the unoptimized lookup.

This is fairly simple but demonstrates how we can plug in more complex optimizations later.
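
The six paths above can be sketched as a small decision function. This is only an illustration under stated assumptions, not the code in this PR: `ConstantLookupSketch`, `Path`, `choosePath`, and the exception message are hypothetical names, positions are modeled as a plain `int[]`, and the `isVector` flag stands in for whether the positions `Block` can be viewed as a `Vector`.

```java
import java.util.Arrays;

/** Hypothetical stand-in for the path selection described above; not the Elasticsearch classes. */
public class ConstantLookupSketch {
    public enum Path { EMPTY, UNOPTIMIZED_BLOCK, CONSTANT_NULL, CONSTANT_VALUE, UNOPTIMIZED_MIXED }

    /**
     * @param positionCount number of positions in the constant vector being looked up
     * @param positions     the requested positions
     * @param isVector      assumption: true when positions has no nulls or multivalues
     */
    public static Path choosePath(int positionCount, int[] positions, boolean isVector) {
        if (positions.length == 0) {
            return Path.EMPTY;                 // path 1: nothing requested
        }
        if (isVector == false) {
            return Path.UNOPTIMIZED_BLOCK;     // path 2: positions is a Block
        }
        int min = Arrays.stream(positions).min().getAsInt();
        int max = Arrays.stream(positions).max().getAsInt();
        if (min < 0) {
            // path 3: negative positions are invalid (message text is illustrative)
            throw new IllegalArgumentException("invalid position [" + min + "]");
        }
        if (min > positionCount) {
            return Path.CONSTANT_NULL;         // path 4: every position is out of range
        }
        if (max < positionCount) {
            return Path.CONSTANT_VALUE;        // path 5: every position is in range
        }
        return Path.UNOPTIMIZED_MIXED;         // path 6: mix of in-range and out-of-range
    }

    public static void main(String[] args) {
        System.out.println(choosePath(3, new int[] {0, 2, 1}, true)); // CONSTANT_VALUE
        System.out.println(choosePath(3, new int[] {5, 7}, true));    // CONSTANT_NULL
        System.out.println(choosePath(3, new int[] {1, 4}, true));    // UNOPTIMIZED_MIXED
    }
}
```

The sketch follows the comparisons exactly as the description words them, so a `min` equal to the position count falls through to the unoptimized lookup rather than taking path 4.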

@nik9000 requested a review from dnhatn May 9, 2024 18:48
@nik9000 requested a review from a team as a code owner May 9, 2024 18:48
@elasticsearchmachine added the Team:Analytics label (Meta label for analytical engine team (ESQL/Aggs/Geo)) May 9, 2024
@elasticsearchmachine (Collaborator) commented

Pinging @elastic/es-analytical-engine (Team:Analytics)

@rjernst (Member) left a comment

Core lib change looks good to me, one small suggestion.


    @Override
    public T next() {
        return null;

We could assert false here? next should never be called in this case

@nik9000 (Member, Author) replied:

Sounds good!
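
A minimal sketch of what the suggestion could look like, assuming a plain `java.util.Iterator` rather than the iterator type changed in this PR. `EmptyIteratorSketch` and `empty()` are hypothetical names used only for illustration.

```java
import java.util.Iterator;

/** Hypothetical empty iterator; not the class touched by this PR. */
public class EmptyIteratorSketch {
    public static <T> Iterator<T> empty() {
        return new Iterator<T>() {
            @Override
            public boolean hasNext() {
                return false; // there is never anything to return
            }

            @Override
            public T next() {
                // Callers must check hasNext() first, so reaching this is a bug.
                assert false : "next() called on an empty iterator";
                return null; // reached only when assertions are disabled
            }
        };
    }
}
```

With assertions enabled (`java -ea`), a stray call to `next()` fails loudly in tests. With assertions disabled it keeps the original behavior of returning `null`.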

@dnhatn (Member) left a comment

LGTM. Thanks Nik!

@nik9000 merged commit 04d3b99 into elastic:main May 10, 2024
15 checks passed
Labels: :Analytics/ES|QL (AKA ESQL), >non-issue, Team:Analytics (Meta label for analytical engine team (ESQL/Aggs/Geo)), v8.15.0
4 participants