[Bug] data is lost when inserting into using nereids planner when type is inconsistent(DATETIME TO DATE) #34626

Open
2 of 3 tasks
yx-keith opened this issue May 10, 2024 · 0 comments

yx-keith commented May 10, 2024

Search before asking

  • I had searched in the issues and found no similar issues.

Version

2.0.3

What's Wrong?

When I insert data into sink_table from source_table, the START_CALL_TIME field is of DATE type in sink_table and of DATETIME type in source_table. The inconsistent types lead to data loss when inserting with the Nereids planner, but the insert is correct with the old planner.

This is my SQL:
INSERT INTO sink_table
SELECT *
FROM (
    SELECT DISTINCT acct_month, call_date, device_number, OPPOSE_NUMBER, org_trm_id,
        START_CALL_TIME, call_duration
    FROM source_table
    WHERE acct_month = SUBSTR("20240131", 1, 6) AND call_date = SUBSTR("20240131", -2)
    UNION
    SELECT DISTINCT acct_month, call_date, OPPOSE_NUMBER AS device_number, device_number AS OPPOSE_NUMBER, org_trm_id,
        START_CALL_TIME, call_duration
    FROM source_table
    WHERE acct_month = SUBSTR("20240131", 1, 6) AND call_date = SUBSTR("20240131", -2)
) alias_17125550933410;

The old planner inserts the correct result of 27837957 rows, but the Nereids planner inserts only 27531867 rows.
When START_CALL_TIME is changed to DATETIME type in sink_table, the Nereids planner also produces the correct result.

What You Expected?

The Nereids planner should insert the same number of rows as the old planner. Is there something wrong in the Nereids optimizer?

How to Reproduce?

Because this problem occurs in a production environment, we mock the tables to get a similar plan:

sink_table:
image

source_table:
image

data in test.csv:
10000,2017-10-01,北京,20,0,2017-10-01 06:00:00,20,10,10
10000,2017-10-01,北京,20,0,2017-10-01 07:00:00,15,2,2
10001,2017-10-01,北京,30,1,2017-10-01 17:05:45,2,22,22
10002,2017-10-02,上海,20,1,2017-10-02 12:59:12,200,5,5
10003,2017-10-02,广州,32,0,2017-10-02 11:20:00,30,11,11
10004,2017-10-01,深圳,35,0,2017-10-01 10:00:15,100,3,3
10004,2017-10-03,深圳,35,0,2017-10-03 10:20:22,11,6,6

Load test.csv into source_table:
curl --location-trusted -u root: -T test.csv -H "column_separator:," http://127.0.0.1:8030/api/demo/source_table/_stream_load

Insert into sink_table from source_table:
INSERT INTO sink_table
SELECT * FROM (
    SELECT DISTINCT user_id, date, city, age, sex, last_visit_date, cost, max_dwell_time, min_dwell_time
    FROM source_table WHERE city = '北京'
    UNION
    SELECT DISTINCT user_id, date, city, age, sex, last_visit_date, cost, max_dwell_time, min_dwell_time
    FROM source_table WHERE city = '深圳'
) t;
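
One possible explanation for the row-count gap, which this report does not confirm, is that the DATETIME to DATE conversion is applied before the DISTINCT/UNION deduplication instead of after it. The sketch below only illustrates that ordering effect in plain Python; it does not run Doris or reproduce either planner. It uses three columns (user_id, city, last_visit_date) projected from the test.csv rows above, and it assumes the matching sink column is a DATE. The function names are hypothetical.

```python
from datetime import datetime

# (user_id, city, last_visit_date) projected from the test.csv rows in this report.
rows = [
    (10000, "北京", "2017-10-01 06:00:00"),
    (10000, "北京", "2017-10-01 07:00:00"),
    (10001, "北京", "2017-10-01 17:05:45"),
    (10002, "上海", "2017-10-02 12:59:12"),
    (10003, "广州", "2017-10-02 11:20:00"),
    (10004, "深圳", "2017-10-01 10:00:15"),
    (10004, "深圳", "2017-10-03 10:20:22"),
]


def to_date(ts):
    """Truncate a 'YYYY-MM-DD HH:MM:SS' string to a date, like a DATETIME -> DATE cast."""
    return datetime.strptime(ts, "%Y-%m-%d %H:%M:%S").date()


def dedup_then_cast(rows):
    """Deduplicate on the full DATETIME value first, then truncate to DATE."""
    distinct = set(rows)
    return [(user_id, city, to_date(ts)) for user_id, city, ts in distinct]


def cast_then_dedup(rows):
    """Truncate to DATE first, then deduplicate (hypothesized problematic order)."""
    return list({(user_id, city, to_date(ts)) for user_id, city, ts in rows})


print(len(dedup_then_cast(rows)))  # 7: every row has a different timestamp
print(len(cast_then_dedup(rows)))  # 6: the two 10000 rows collapse onto 2017-10-01
```

If the two orderings give different counts on the production data as well, that would be consistent with the cast being moved below the deduplication, but the plans themselves are the only way to confirm it.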

Nereids plan:
image
image

old planner:
image
image
image
image
image

Anything Else?

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct

@yx-keith yx-keith changed the title [Bug] data is lost when inserting into using nereids optimizer [Bug] data is lost when inserting into using nereids planner May 10, 2024
@yx-keith yx-keith changed the title [Bug] data is lost when inserting into using nereids planner [Bug] data is lost when inserting into using nereids planner when type is inconsistent May 10, 2024
@yx-keith yx-keith changed the title [Bug] data is lost when inserting into using nereids planner when type is inconsistent [Bug] data is lost when inserting into using nereids planner when type is inconsistent(DATETIME TO DATE) May 10, 2024