[VL] Results mismatch vanilla Spark when from_unixtime is given an overflowing parameter and the query config sets a time zone #5701

Open
NEUpanning opened this issue May 11, 2024 · 0 comments
Labels: bug (Something isn't working), triage

Comments

@NEUpanning
Contributor

Backend

VL (Velox)

Bug description

SQL to reproduce:

DROP table if EXISTS tbl;
CREATE TABLE tbl(a STRING) USING parquet;
INSERT INTO tbl values('9223372036854775807');
select from_unixtime(a, "yyyy-MM-dd HH:mm:ss") from tbl;

Vanilla Spark result:

from_unixtime(CAST(a AS BIGINT), yyyy-MM-dd HH:mm:ss)
1970-01-01 07:59:59
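
One plausible reading of the vanilla result, sketched below in Python rather than taken from Spark's source: if the seconds value is converted to microseconds with a wrapping 64-bit multiplication, 9223372036854775807 * 1000000 wraps to -1000000 microseconds, i.e. one second before the epoch. Formatting that instant in a UTC+8 session time zone gives 1970-01-01 07:59:59. Both the wrapping conversion and the UTC+8 offset are assumptions (the issue does not state the configured time zone), and the names `wrap_int64` and `from_unixtime_wrapping` are hypothetical.

```python
from datetime import datetime, timedelta, timezone

MICROS_PER_SECOND = 1_000_000


def wrap_int64(value: int) -> int:
    """Reduce a Python int to a signed 64-bit value (two's complement wrap)."""
    return (value + 2**63) % 2**64 - 2**63


def from_unixtime_wrapping(seconds: int, utc_offset_hours: int) -> str:
    """Format `seconds` since the epoch, letting the microsecond conversion wrap.

    Assumption: the engine multiplies seconds by 1,000,000 in 64-bit
    arithmetic without an overflow check. `utc_offset_hours` stands in for
    the session time zone, which the issue does not specify.
    """
    micros = wrap_int64(seconds * MICROS_PER_SECOND)
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    moment = epoch + timedelta(microseconds=micros)
    local = moment.astimezone(timezone(timedelta(hours=utc_offset_hours)))
    return local.strftime("%Y-%m-%d %H:%M:%S")


if __name__ == "__main__":
    # Long.MAX_VALUE seconds, formatted at an assumed UTC+8 offset.
    print(from_unixtime_wrapping(9223372036854775807, 8))
```

Under these assumptions the overflow silently collapses to -1 second, which would explain why vanilla Spark returns a value near the epoch instead of failing.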

Gluten result:

Top-Level Context: Same as context.
Function: toMillis
File: ../.././velox/type/Timestamp.h
Line: 134
Stack trace:
# 0  _ZN8facebook5velox7process10StackTraceC1Ei
# 1  _ZN8facebook5velox14VeloxExceptionC1EPKcmS3_St17basic_string_viewIcSt11char_traitsIcEES7_S7_S7_bNS1_4TypeES7_
# 2  _ZN8facebook5velox6detail14veloxCheckFailINS0_14VeloxUserErrorERKSsEEvRKNS1_18VeloxCheckFailArgsET0_
# 3  _ZNK8facebook5velox9Timestamp11toTimePointEb
# 4  _ZN8facebook5velox9Timestamp10toTimezoneERKN4date9time_zoneE
# 5  _ZNK8facebook5velox9functions17DateTimeFormatter6formatERKNS0_9TimestampEPKN4date9time_zoneEjPcb
# 6  _ZNK8facebook5velox17SelectivityVector15applyToSelectedIZNS0_4exec7EvalCtx22applyToSelectedNoThrowIZNKS3_21SimpleFunctionAdapterINS0_4core9UDFHolderINS0_9functions8sparksql20FromUnixtimeFunctionINS3_10VectorExecEEESC_NS0_7VarcharEJlSE_EEEE8applyUdfIZNKSG_7iterateIJNS3_16FlatVectorReaderIlEENS3_20ConstantVectorReaderISE_EEEEEvRNSG_12ApplyContextEDpRT_EUlRT_T0_E1_EEvSO_SS_EUlSS_E_EEvRKS1_SS_EUlSS_E_EEvSS_
# 7  _ZNK8facebook5velox4exec21SimpleFunctionAdapterINS0_4core9UDFHolderINS0_9functions8sparksql20FromUnixtimeFunctionINS1_10VectorExecEEES8_NS0_7VarcharEJlSA_EEEE31unpackSpecializeForAllEncodingsILi0EJEEEvRNSC_12ApplyContextERKSt6vectorISt10shared_ptrINS0_10BaseVectorEESaISJ_EEDpRT0_
# 8  _ZNK8facebook5velox4exec21SimpleFunctionAdapterINS0_4core9UDFHolderINS0_9functions8sparksql20FromUnixtimeFunctionINS1_10VectorExecEEES8_NS0_7VarcharEJlSA_EEEE5applyERKNS0_17SelectivityVectorERSt6vectorISt10shared_ptrINS0_10BaseVectorEESaISJ_EERKSH_IKNS0_4TypeEERNS1_7EvalCtxERSJ_
# 9  _ZN8facebook5velox4exec4Expr13applyFunctionERKNS0_17SelectivityVectorERNS1_7EvalCtxERSt10shared_ptrINS0_10BaseVectorEE
# 10 _ZN8facebook5velox4exec4Expr11evalAllImplERKNS0_17SelectivityVectorERNS1_7EvalCtxERSt10shared_ptrINS0_10BaseVectorEE
# 11 _ZN8facebook5velox4exec4Expr4evalERKNS0_17SelectivityVectorERNS1_7EvalCtxERSt10shared_ptrINS0_10BaseVectorEEPKNS1_7ExprSetE
# 12 _ZN8facebook5velox4exec7ExprSet4evalEiibRKNS0_17SelectivityVectorERNS1_7EvalCtxERSt6vectorISt10shared_ptrINS0_10BaseVectorEESaISB_EE
# 13 _ZN8facebook5velox4exec13FilterProject7projectERKNS0_17SelectivityVectorERNS1_7EvalCtxE
# 14 _ZN8facebook5velox4exec13FilterProject9getOutputEv
# 15 _ZN8facebook5velox4exec6Driver11runInternalERSt10shared_ptrIS2_ERS3_INS1_13BlockingStateEERS3_INS0_9RowVectorEE
# 16 _ZN8facebook5velox4exec6Driver4nextERSt10shared_ptrINS1_13BlockingStateEE
# 17 _ZN8facebook5velox4exec4Task4nextEPN5folly10SemiFutureINS3_4UnitEEE
# 18 _ZN6gluten24WholeStageResultIterator4nextEv
# 19 Java_io_glutenproject_vectorized_ColumnarBatchOutIterator_nativeHasNext
# 20 0x00007f553dfa3a34

	at io.glutenproject.vectorized.ColumnarBatchOutIterator.nativeHasNext(Native Method)
	at io.glutenproject.vectorized.ColumnarBatchOutIterator.hasNextInternal(ColumnarBatchOutIterator.java:65)
	at io.glutenproject.vectorized.GeneralOutIterator.hasNext(GeneralOutIterator.java:37)
	... 27 more
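
The trace shows the failure surfacing as a user error in toMillis, reached from Timestamp::toTimezone. A minimal Python sketch of that kind of behavior, a checked seconds-to-milliseconds conversion that raises instead of wrapping, is given below. It is an illustration only and not Velox's actual code: the function name `to_millis_checked` and the error message are hypothetical.

```python
INT64_MIN = -(2**63)
INT64_MAX = 2**63 - 1


def to_millis_checked(seconds: int, nanos: int = 0) -> int:
    """Convert (seconds, nanos) to milliseconds, failing on 64-bit overflow.

    Hypothetical stand-in for a checked conversion: the result is computed
    exactly, then rejected if it does not fit in a signed 64-bit integer.
    """
    millis = seconds * 1000 + nanos // 1_000_000
    if not INT64_MIN <= millis <= INT64_MAX:
        raise OverflowError(
            f"Could not convert timestamp to milliseconds: seconds={seconds}"
        )
    return millis


if __name__ == "__main__":
    print(to_millis_checked(1))  # small values convert normally
    try:
        to_millis_checked(9223372036854775807)
    except OverflowError as err:
        print("error:", err)
```

With a check like this, the input from the reproduction raises an error where a wrapping conversion would return a result, which matches the mismatch reported here.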

I've found the root cause and reported an issue to Velox; see #9778.
I think we should add a unit test in Gluten for this case, and I would like to work on it.

Spark version

None

Spark configurations

No response

System information

No response

Relevant logs

No response
