Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inconsistent column references when EXPLAINing a fast path lookup #27154

Open
ggevay opened this issue May 17, 2024 · 2 comments
Open

Inconsistent column references when EXPLAINing a fast path lookup #27154

ggevay opened this issue May 17, 2024 · 2 comments
Labels
A-CLUSTER Topics related to the CLUSTER layer A-compute Area: compute A-optimization Area: query optimization and transformation C-bug Category: something is broken

Comments

@ggevay
Copy link
Contributor

ggevay commented May 17, 2024

What version of Materialize are you using?

be1a89b

What is the issue?

The following EXPLAINs should be printing the same thing, except for whether it's fast path, but they are showing different projections:

materialize=> 
create table t1(t text, y int);
insert into t1 values ('a',6), ('b',8), ('c',10), ('c',11);

create index i1 on t1(y+1);

explain with(humanized expressions)
select * from t1
where y+1 = 7;

explain with(humanized expressions, no fast path)
select * from t1
where y+1 = 7;

CREATE TABLE
INSERT 0 4
CREATE INDEX
                        Optimized Plan                        
--------------------------------------------------------------
 Explained Query (fast path):                                +
   Project (#1, #2)                                          +
     ReadIndex on=materialize.public.t1 i1=[lookup value=(7)]+
                                                             +
 Used Indexes:                                               +
   - materialize.public.i1 (lookup)                          +
                                                             +
 Target cluster: quickstart                                  +
 
(1 row)

                        Optimized Plan                        
--------------------------------------------------------------
 Explained Query:                                            +
   Project (#0, #1)                                          +
     ReadIndex on=materialize.public.t1 i1=[lookup value=(7)]+
                                                             +
 Used Indexes:                                               +
   - materialize.public.i1 (lookup)                          +
                                                             +
 Target cluster: quickstart                                  +
 
(1 row)

The inconsistency arises because the ReadIndex in the fast path case is actually a Join under the hood, which adds an extra column at the beginning, thus offsetting the column references.

Also happening with Maps:

materialize=> 
explain with(humanized expressions)
select y+3 from t1
where y+1 = 7;

explain with(humanized expressions, no fast path)
select y+3 from t1
where y+1 = 7;

                         Optimized Plan                         
----------------------------------------------------------------
 Explained Query (fast path):                                  +
   Project (#4)                                                +
     Map ((#2{y} + 3))                                         +
       ReadIndex on=materialize.public.t1 i1=[lookup value=(7)]+
                                                               +
 Used Indexes:                                                 +
   - materialize.public.i1 (lookup)                            +
                                                               +
 Target cluster: quickstart                                    +
 
(1 row)

                         Optimized Plan                         
----------------------------------------------------------------
 Explained Query:                                              +
   Project (#3)                                                +
     Map ((#1{y} + 3))                                         +
       ReadIndex on=materialize.public.t1 i1=[lookup value=(7)]+
                                                               +
 Used Indexes:                                                 +
   - materialize.public.i1 (lookup)                            +
                                                               +
 Target cluster: quickstart                                    +
 
(1 row)

I'm not sure what's the best way to fix this:

  1. Maybe we can undo the column offsetting for the whole plan at the beginning of EXPLAIN, but this sounds a bit hacky maybe.
  2. Or just invent a different string to print instead of ReadIndex, e.g., FastPathReadIndex, and then we can just say that FastPathReadIndex emits the columns with an offset for unspecified reasons.
@ggevay ggevay added C-bug Category: something is broken A-optimization Area: query optimization and transformation A-compute Area: compute labels May 17, 2024
@aalexandrov
Copy link
Contributor

AFAICT after fixing the issue that you point out here both queries return consistently #_{y} + 3 when printing output with humanized columns. The underlying mapping of column offsets to column names changes, but this reflects the differences in the physical layout of a columns in an arrangement, so the printed indexes are still correct. I'm not sure I would classify this as a bug.

@aalexandrov
Copy link
Contributor

Created a related issue while looking at this: #27167.

@ggevay ggevay added the A-CLUSTER Topics related to the CLUSTER layer label Jun 10, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-CLUSTER Topics related to the CLUSTER layer A-compute Area: compute A-optimization Area: query optimization and transformation C-bug Category: something is broken
Projects
None yet
Development

No branches or pull requests

2 participants