Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TestFullParquetReader>AbstractTestParquetReader.testDecimalBackedByINT32 #22130

Closed
elharo opened this issue Mar 8, 2024 · 19 comments · Fixed by #22296 or #22719
Closed

TestFullParquetReader>AbstractTestParquetReader.testDecimalBackedByINT32 #22130

elharo opened this issue Mar 8, 2024 · 19 comments · Fixed by #22296 or #22719
Assignees

Comments

@elharo
Copy link
Contributor

elharo commented Mar 8, 2024

Haven't fully traced through the code yet, but this one might be caused by an unfortunate choice of random values in the test.

2024-03-06T21:03:24.6089444Z [ERROR] Tests run: 2934, Failures: 1, Errors: 0, Skipped: 90, Time elapsed: 2,531.069 s <<< FAILURE! - in TestSuite
2024-03-06T21:03:24.6091508Z [ERROR] com.facebook.presto.hive.parquet.TestFullParquetReader.testDecimalBackedByINT32  Time elapsed: 5.949 s  <<< FAILURE!
2024-03-06T21:03:24.6093559Z java.lang.IllegalArgumentException: maxCapacityHint can't be less than initialSlabSize 1024 100
2024-03-06T21:03:24.6094940Z 	at org.apache.parquet.Preconditions.checkArgument(Preconditions.java:97)
2024-03-06T21:03:24.6098576Z 	at org.apache.parquet.bytes.CapacityByteArrayOutputStream.<init>(CapacityByteArrayOutputStream.java:153)
2024-03-06T21:03:24.6100792Z 	at org.apache.parquet.column.values.delta.DeltaBinaryPackingValuesWriter.<init>(DeltaBinaryPackingValuesWriter.java:88)
2024-03-06T21:03:24.6103134Z 	at org.apache.parquet.column.values.delta.DeltaBinaryPackingValuesWriterForInteger.<init>(DeltaBinaryPackingValuesWriterForInteger.java:71)
2024-03-06T21:03:24.6105615Z 	at org.apache.parquet.column.values.delta.DeltaBinaryPackingValuesWriterForInteger.<init>(DeltaBinaryPackingValuesWriterForInteger.java:66)
2024-03-06T21:03:24.6108456Z 	at org.apache.parquet.column.values.factory.DefaultV2ValuesWriterFactory.getInt32ValuesWriter(DefaultV2ValuesWriterFactory.java:93)
2024-03-06T21:03:24.6110796Z 	at org.apache.parquet.column.values.factory.DefaultV2ValuesWriterFactory.newValuesWriter(DefaultV2ValuesWriterFactory.java:63)
2024-03-06T21:03:24.6113030Z 	at org.apache.parquet.column.values.factory.DefaultValuesWriterFactory.newValuesWriter(DefaultValuesWriterFactory.java:52)
2024-03-06T21:03:24.6114935Z 	at org.apache.parquet.column.ParquetProperties.newValuesWriter(ParquetProperties.java:167)
2024-03-06T21:03:24.6116395Z 	at org.apache.parquet.column.impl.ColumnWriterBase.<init>(ColumnWriterBase.java:84)
2024-03-06T21:03:24.6117710Z 	at org.apache.parquet.column.impl.ColumnWriterV2.<init>(ColumnWriterV2.java:65)
2024-03-06T21:03:24.6119207Z 	at org.apache.parquet.column.impl.ColumnWriteStoreV2.createColumnWriter(ColumnWriteStoreV2.java:44)
2024-03-06T21:03:24.6120805Z 	at org.apache.parquet.column.impl.ColumnWriteStoreBase.<init>(ColumnWriteStoreBase.java:125)
2024-03-06T21:03:24.6122263Z 	at org.apache.parquet.column.impl.ColumnWriteStoreV2.<init>(ColumnWriteStoreV2.java:38)
2024-03-06T21:03:24.6123782Z 	at org.apache.parquet.column.ParquetProperties.newColumnWriteStore(ParquetProperties.java:222)
2024-03-06T21:03:24.6125853Z 	at org.apache.parquet.hadoop.InternalParquetRecordWriter.initStore(InternalParquetRecordWriter.java:116)
2024-03-06T21:03:24.6127582Z 	at org.apache.parquet.hadoop.InternalParquetRecordWriter.<init>(InternalParquetRecordWriter.java:101)
2024-03-06T21:03:24.6129102Z 	at org.apache.parquet.hadoop.ParquetRecordWriter.<init>(ParquetRecordWriter.java:152)
2024-03-06T21:03:24.6130581Z 	at org.apache.parquet.hadoop.ParquetOutputFormat.getRecordWriter(ParquetOutputFormat.java:505)
2024-03-06T21:03:24.6132154Z 	at org.apache.parquet.hadoop.ParquetOutputFormat.getRecordWriter(ParquetOutputFormat.java:432)
2024-03-06T21:03:24.6133753Z 	at org.apache.parquet.hadoop.ParquetOutputFormat.getRecordWriter(ParquetOutputFormat.java:427)
2024-03-06T21:03:24.6135536Z 	at org.apache.hadoop.hive.ql.io.parquet.write.ParquetRecordWriterWrapper.<init>(ParquetRecordWriterWrapper.java:70)
2024-03-06T21:03:24.6137725Z 	at org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat.getParquerRecordWriterWrapper(MapredParquetOutputFormat.java:137)
2024-03-06T21:03:24.6140128Z 	at com.facebook.presto.hive.parquet.write.TestMapredParquetOutputFormat.getHiveRecordWriter(TestMapredParquetOutputFormat.java:61)
2024-03-06T21:03:24.6142089Z 	at com.facebook.presto.hive.parquet.ParquetTester.writeParquetColumn(ParquetTester.java:698)
2024-03-06T21:03:24.6143925Z 	at com.facebook.presto.hive.parquet.ParquetTester.assertRoundTrip(ParquetTester.java:363)
2024-03-06T21:03:24.6145444Z 	at com.facebook.presto.hive.parquet.ParquetTester.testRoundTrip(ParquetTester.java:307)
2024-03-06T21:03:24.6147095Z 	at com.facebook.presto.hive.parquet.ParquetTester.testRoundTrip(ParquetTester.java:245)
2024-03-06T21:03:24.6148945Z 	at com.facebook.presto.hive.parquet.AbstractTestParquetReader.testDecimalBackedByINT32(AbstractTestParquetReader.java:894)
2024-03-06T21:03:24.6150533Z 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
2024-03-06T21:03:24.6151688Z 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
2024-03-06T21:03:24.6153065Z 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
2024-03-06T21:03:24.6154185Z 	at java.lang.reflect.Method.invoke(Method.java:498)
2024-03-06T21:03:24.6155438Z 	at org.testng.internal.invokers.MethodInvocationHelper.invokeMethod(MethodInvocationHelper.java:135)
2024-03-06T21:03:24.6156944Z 	at org.testng.internal.invokers.TestInvoker.invokeMethod(TestInvoker.java:673)
2024-03-06T21:03:24.6158214Z 	at org.testng.internal.invokers.TestInvoker.invokeTestMethod(TestInvoker.java:220)
2024-03-06T21:03:24.6159559Z 	at org.testng.internal.invokers.MethodRunner.runInSequence(MethodRunner.java:50)
2024-03-06T21:03:24.6160997Z 	at org.testng.internal.invokers.TestInvoker$MethodInvocationAgent.invoke(TestInvoker.java:945)
2024-03-06T21:03:24.6162441Z 	at org.testng.internal.invokers.TestInvoker.invokeTestMethods(TestInvoker.java:193)
2024-03-06T21:03:24.6163831Z 	at org.testng.internal.invokers.TestMethodWorker.invokeTestMethods(TestMethodWorker.java:146)
2024-03-06T21:03:24.6165204Z 	at org.testng.internal.invokers.TestMethodWorker.run(TestMethodWorker.java:128)
2024-03-06T21:03:24.6166452Z 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
2024-03-06T21:03:24.6167717Z 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
2024-03-06T21:03:24.6168669Z 	at java.lang.Thread.run(Thread.java:750)
2024-03-06T21:03:24.6169063Z 
2024-03-06T21:03:25.0839602Z [INFO] 
2024-03-06T21:03:25.0840062Z [INFO] Results:
2024-03-06T21:03:25.0840482Z [INFO] 
2024-03-06T21:03:25.0842587Z [ERROR] Failures: 
2024-03-06T21:03:25.0874877Z [ERROR]   TestFullParquetReader>AbstractTestParquetReader.testDecimalBackedByINT32:894 » IllegalArgument maxCapacityHint can't be less than initialSlabSize 1024 100
@elharo elharo self-assigned this Mar 21, 2024
@elharo elharo mentioned this issue Mar 22, 2024
6 tasks
@ClarenceThreepwood
Copy link
Contributor

I encountered this again today (https://github.com/prestodb/presto/actions/runs/8485176805/job/23249502331?pr=22064)

Error: Tests run: 2956, Failures: 1, Errors: 0, Skipped: 90, Time elapsed: 2,788.605 s <<< FAILURE! - in TestSuite
Error: com.facebook.presto.hive.parquet.TestFullParquetReader.testDecimalBackedByFixedLenByteArray Time elapsed: 2.561 s <<< FAILURE!
java.lang.IllegalArgumentException: maxCapacityHint can't be less than initialSlabSize 1024 100
at org.apache.parquet.Preconditions.checkArgument(Preconditions.java:97)
at org.apache.parquet.bytes.CapacityByteArrayOutputStream.(CapacityByteArrayOutputStream.java:153)
at org.apache.parquet.column.values.plain.FixedLenByteArrayPlainValuesWriter.(FixedLenByteArrayPlainValuesWriter.java:49)
at org.apache.parquet.column.values.factory.DefaultV1ValuesWriterFactory.getFixedLenByteArrayValuesWriter(DefaultV1ValuesWriterFactory.java:80)
at org.apache.parquet.column.values.factory.DefaultV1ValuesWriterFactory.newValuesWriter(DefaultV1ValuesWriterFactory.java:55)
at org.apache.parquet.column.values.factory.DefaultValuesWriterFactory.newValuesWriter(DefaultValuesWriterFactory.java:52)
at org.apache.parquet.column.ParquetProperties.newValuesWriter(ParquetProperties.java:167)
at org.apache.parquet.column.impl.ColumnWriterBase.(ColumnWriterBase.java:84)
at org.apache.parquet.column.impl.ColumnWriterV1.(ColumnWriterV1.java:43)
at org.apache.parquet.column.impl.ColumnWriteStoreV1.createColumnWriter(ColumnWriteStoreV1.java:50)
at org.apache.parquet.column.impl.ColumnWriteStoreBase.(ColumnWriteStoreBase.java:125)
at org.apache.parquet.column.impl.ColumnWriteStoreV1.(ColumnWriteStoreV1.java:44)
at org.apache.parquet.column.ParquetProperties.newColumnWriteStore(ParquetProperties.java:220)
at org.apache.parquet.hadoop.InternalParquetRecordWriter.initStore(InternalParquetRecordWriter.java:116)
at org.apache.parquet.hadoop.InternalParquetRecordWriter.(InternalParquetRecordWriter.java:101)
at org.apache.parquet.hadoop.ParquetRecordWriter.(ParquetRecordWriter.java:152)
at org.apache.parquet.hadoop.ParquetOutputFormat.getRecordWriter(ParquetOutputFormat.java:505)
at org.apache.parquet.hadoop.ParquetOutputFormat.getRecordWriter(ParquetOutputFormat.java:432)
at org.apache.parquet.hadoop.ParquetOutputFormat.getRecordWriter(ParquetOutputFormat.java:427)

@elharo
Copy link
Contributor Author

elharo commented Mar 31, 2024

slightly longer stack trace:

2024-03-29T21:06:31.3170252Z [ERROR] com.facebook.presto.hive.parquet.TestFullParquetReader.testDecimalBackedByFixedLenByteArray  Time elapsed: 2.561 s  <<< FAILURE!
2024-03-29T21:06:31.3172579Z java.lang.IllegalArgumentException: maxCapacityHint can't be less than initialSlabSize 1024 100
2024-03-29T21:06:31.3174003Z 	at org.apache.parquet.Preconditions.checkArgument(Preconditions.java:97)
2024-03-29T21:06:31.3262556Z 	at org.apache.parquet.bytes.CapacityByteArrayOutputStream.<init>(CapacityByteArrayOutputStream.java:153)
2024-03-29T21:06:31.3264720Z 	at org.apache.parquet.column.values.plain.FixedLenByteArrayPlainValuesWriter.<init>(FixedLenByteArrayPlainValuesWriter.java:49)
2024-03-29T21:06:31.3267628Z 	at org.apache.parquet.column.values.factory.DefaultV1ValuesWriterFactory.getFixedLenByteArrayValuesWriter(DefaultV1ValuesWriterFactory.java:80)
2024-03-29T21:06:31.3270440Z 	at org.apache.parquet.column.values.factory.DefaultV1ValuesWriterFactory.newValuesWriter(DefaultV1ValuesWriterFactory.java:55)
2024-03-29T21:06:31.3272818Z 	at org.apache.parquet.column.values.factory.DefaultValuesWriterFactory.newValuesWriter(DefaultValuesWriterFactory.java:52)
2024-03-29T21:06:31.3274791Z 	at org.apache.parquet.column.ParquetProperties.newValuesWriter(ParquetProperties.java:167)
2024-03-29T21:06:31.3276301Z 	at org.apache.parquet.column.impl.ColumnWriterBase.<init>(ColumnWriterBase.java:84)
2024-03-29T21:06:31.3277709Z 	at org.apache.parquet.column.impl.ColumnWriterV1.<init>(ColumnWriterV1.java:43)
2024-03-29T21:06:31.3279267Z 	at org.apache.parquet.column.impl.ColumnWriteStoreV1.createColumnWriter(ColumnWriteStoreV1.java:50)
2024-03-29T21:06:31.3280977Z 	at org.apache.parquet.column.impl.ColumnWriteStoreBase.<init>(ColumnWriteStoreBase.java:125)
2024-03-29T21:06:31.3282524Z 	at org.apache.parquet.column.impl.ColumnWriteStoreV1.<init>(ColumnWriteStoreV1.java:44)
2024-03-29T21:06:31.3284122Z 	at org.apache.parquet.column.ParquetProperties.newColumnWriteStore(ParquetProperties.java:220)
2024-03-29T21:06:31.3285897Z 	at org.apache.parquet.hadoop.InternalParquetRecordWriter.initStore(InternalParquetRecordWriter.java:116)
2024-03-29T21:06:31.3287723Z 	at org.apache.parquet.hadoop.InternalParquetRecordWriter.<init>(InternalParquetRecordWriter.java:101)
2024-03-29T21:06:31.3289315Z 	at org.apache.parquet.hadoop.ParquetRecordWriter.<init>(ParquetRecordWriter.java:152)
2024-03-29T21:06:31.3290885Z 	at org.apache.parquet.hadoop.ParquetOutputFormat.getRecordWriter(ParquetOutputFormat.java:505)
2024-03-29T21:06:31.3292541Z 	at org.apache.parquet.hadoop.ParquetOutputFormat.getRecordWriter(ParquetOutputFormat.java:432)
2024-03-29T21:06:31.3294211Z 	at org.apache.parquet.hadoop.ParquetOutputFormat.getRecordWriter(ParquetOutputFormat.java:427)
2024-03-29T21:06:31.3296101Z 	at org.apache.hadoop.hive.ql.io.parquet.write.ParquetRecordWriterWrapper.<init>(ParquetRecordWriterWrapper.java:70)
2024-03-29T21:06:31.3298404Z 	at org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat.getParquerRecordWriterWrapper(MapredParquetOutputFormat.java:137)
2024-03-29T21:06:31.3300787Z 	at org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat.getHiveRecordWriter(MapredParquetOutputFormat.java:126)
2024-03-29T21:06:31.3303137Z 	at com.facebook.presto.hive.parquet.write.TestMapredParquetOutputFormat.getHiveRecordWriter(TestMapredParquetOutputFormat.java:63)
2024-03-29T21:06:31.3305169Z 	at com.facebook.presto.hive.parquet.ParquetTester.writeParquetColumn(ParquetTester.java:698)
2024-03-29T21:06:31.3306757Z 	at com.facebook.presto.hive.parquet.ParquetTester.assertRoundTrip(ParquetTester.java:363)
2024-03-29T21:06:31.3308439Z 	at com.facebook.presto.hive.parquet.ParquetTester.testRoundTripType(ParquetTester.java:330)
2024-03-29T21:06:31.3310638Z 	at com.facebook.presto.hive.parquet.ParquetTester.testRoundTrip(ParquetTester.java:223)
2024-03-29T21:06:31.3313016Z 	at 

@elharo
Copy link
Contributor Author

elharo commented Apr 1, 2024

Wait, this is not quite the same flake. This is in TestFullParquetReader.testDecimalBackedByFixedLenByteArray whereas the original was in testDecimalBackedByINT32

@elharo
Copy link
Contributor Author

elharo commented Apr 1, 2024

initialSlabSize and maxCapacityHint come from parquetProperties in DefaultV1ValuesWriterFactory:

  private ValuesWriter getFixedLenByteArrayValuesWriter(ColumnDescriptor path) {
    // dictionary encoding was not enabled in PARQUET 1.0
    return new FixedLenByteArrayPlainValuesWriter(path.getTypeLength(), parquetProperties.getInitialSlabSize(), parquetProperties.getPageSizeThreshold(), parquetProperties.getAllocator());
  }

initialSlabSize == parquetProperties.getInitialSlabSize()

maxCapacityHint == parquetProperties.getPageSizeThreshold()

These properties are set in ParquetOutputFormat.getRecordWriter with a ParquetProperties.builder

pageSizeThreshold is set by

    ParquetProperties.Builder propsBuilder = ParquetProperties.builder()
        .withPageSize(getPageSize(conf))

(Yes, it's pageSize in the builder but pageSizeThreshold later)

initialSlabSize is set in the constrcutor by

    this.initialSlabSize = CapacityByteArrayOutputStream
      .initialSlabSizeHeuristic(MIN_SLAB_SIZE, pageSizeThreshold, 10);

@elharo
Copy link
Contributor Author

elharo commented Apr 1, 2024

I still don't see how this becomes flaky, though perhaps there's a bug in initialSlabSizeHeuristic?

@elharo
Copy link
Contributor Author

elharo commented Apr 1, 2024

Is there any chance the configuration is shared between tests?

@elharo
Copy link
Contributor Author

elharo commented Apr 1, 2024

I don't immediately see how but I think ParquetTester might be sharing mutable state between test methods. Making the test single threaded and using a new ParquetTester for each method might fix this.

@elharo
Copy link
Contributor Author

elharo commented Apr 1, 2024

similar flake:

TestFullParquetReader>AbstractTestParquetReader.testArrayOfMapOfStruct:383 » IllegalArgument maxCapacityHint can't be less than initialSlabSize 1024 100

https://github.com/prestodb/presto/actions/runs/8485176805/job/23249502331?pr=22064

@elharo
Copy link
Contributor Author

elharo commented Apr 3, 2024

#22390 tried to fix this by avoiding shared state but failed. That doesn't seem to be the issue here. More work is needed.

@tdcmeehan
Copy link
Contributor

Encountered in #22436

@hantangwangd
Copy link
Member

hantangwangd commented May 2, 2024

Meet again in TestFullParquetReader>AbstractTestParquetReader.testDecimalBackedByINT64:937 (https://github.com/prestodb/presto/actions/runs/8926559837/job/24517874564?pr=22609)

Error:  com.facebook.presto.hive.parquet.TestFullParquetReader.testDecimalBackedByINT64  Time elapsed: 1.363 s  <<< FAILURE!
java.lang.IllegalArgumentException: maxCapacityHint can't be less than initialSlabSize 1024 100
	at org.apache.parquet.Preconditions.checkArgument(Preconditions.java:97)
	at org.apache.parquet.bytes.CapacityByteArrayOutputStream.<init>(CapacityByteArrayOutputStream.java:153)
	at org.apache.parquet.column.values.plain.PlainValuesWriter.<init>(PlainValuesWriter.java:47)
	at org.apache.parquet.column.values.factory.DefaultV1ValuesWriterFactory.getInt64ValuesWriter(DefaultV1ValuesWriterFactory.java:94)
	at org.apache.parquet.column.values.factory.DefaultV1ValuesWriterFactory.newValuesWriter(DefaultV1ValuesWriterFactory.java:61)
	at org.apache.parquet.column.values.factory.DefaultValuesWriterFactory.newValuesWriter(DefaultValuesWriterFactory.java:52)
	at org.apache.parquet.column.ParquetProperties.newValuesWriter(ParquetProperties.java:167)
	at org.apache.parquet.column.impl.ColumnWriterBase.<init>(ColumnWriterBase.java:84)
	at org.apache.parquet.column.impl.ColumnWriterV1.<init>(ColumnWriterV1.java:43)
	at org.apache.parquet.column.impl.ColumnWriteStoreV1.createColumnWriter(ColumnWriteStoreV1.java:50)
	at org.apache.parquet.column.impl.ColumnWriteStoreBase.<init>(ColumnWriteStoreBase.java:125)
	at org.apache.parquet.column.impl.ColumnWriteStoreV1.<init>(ColumnWriteStoreV1.java:44)
	at org.apache.parquet.column.ParquetProperties.newColumnWriteStore(ParquetProperties.java:220)
	at org.apache.parquet.hadoop.InternalParquetRecordWriter.initStore(InternalParquetRecordWriter.java:116)
	at org.apache.parquet.hadoop.InternalParquetRecordWriter.<init>(InternalParquetRecordWriter.java:101)
	at org.apache.parquet.hadoop.ParquetRecordWriter.<init>(ParquetRecordWriter.java:152)
	at org.apache.parquet.hadoop.ParquetOutputFormat.getRecordWriter(ParquetOutputFormat.java:505)
	at org.apache.parquet.hadoop.ParquetOutputFormat.getRecordWriter(ParquetOutputFormat.java:432)
	at org.apache.parquet.hadoop.ParquetOutputFormat.getRecordWriter(ParquetOutputFormat.java:427)
	at org.apache.hadoop.hive.ql.io.parquet.write.ParquetRecordWriterWrapper.<init>(ParquetRecordWriterWrapper.java:70)
	at org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat.getParquerRecordWriterWrapper(MapredParquetOutputFormat.java:137)
	at com.facebook.presto.hive.parquet.write.TestMapredParquetOutputFormat.getHiveRecordWriter(TestMapredParquetOutputFormat.java:61)
	at com.facebook.presto.hive.parquet.ParquetTester.writeParquetColumn(ParquetTester.java:698)
	at com.facebook.presto.hive.parquet.ParquetTester.assertRoundTrip(ParquetTester.java:363)
	at com.facebook.presto.hive.parquet.ParquetTester.testRoundTripType(ParquetTester.java:324)
	at com.facebook.presto.hive.parquet.ParquetTester.testRoundTrip(ParquetTester.java:304)
	at com.facebook.presto.hive.parquet.ParquetTester.testRoundTrip(ParquetTester.java:245)
	at com.facebook.presto.hive.parquet.AbstractTestParquetReader.testDecimalBackedByINT64(AbstractTestParquetReader.java:937)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.testng.internal.invokers.MethodInvocationHelper.invokeMethod(MethodInvocationHelper.java:135)
	at org.testng.internal.invokers.TestInvoker.invokeMethod(TestInvoker.java:673)
	at org.testng.internal.invokers.TestInvoker.invokeTestMethod(TestInvoker.java:220)
	at org.testng.internal.invokers.MethodRunner.runInSequence(MethodRunner.java:50)
	at org.testng.internal.invokers.TestInvoker$MethodInvocationAgent.invoke(TestInvoker.java:945)
	at org.testng.internal.invokers.TestInvoker.invokeTestMethods(TestInvoker.java:193)
	at org.testng.internal.invokers.TestMethodWorker.invokeTestMethods(TestMethodWorker.java:146)
	at org.testng.internal.invokers.TestMethodWorker.run(TestMethodWorker.java:128)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:750)

@hantangwangd
Copy link
Member

Meet a failure in another method TestParquetReader>AbstractTestParquetReader.testDoubleNaNInfinity:1470 with the same reason: IllegalArgument maxCapacityHint can't be less than initialSlabSize 1024 100. I suspect they may have the same root cause.

(https://github.com/prestodb/presto/actions/runs/9006379854/job/24743797907?pr=22080#step:7:11522)

@elharo
Copy link
Contributor Author

elharo commented May 9, 2024

Yes, I think once we understand why this one is flaking the fix will be obvious. So far no one's found the root cause. Given the broad range of places where this occurs I'm a little concerned that this is a real bug in the non test code somewhere.

@hantangwangd
Copy link
Member

Yes, agree. It seems very likely a real bug in parquet module.

@hantangwangd
Copy link
Member

hantangwangd commented May 9, 2024

@elharo, After check the code, I believe the root cause is (although I could not reproduce the test failure in my local, I wrote some code to simulate the situation, and finally caught the inconsistency), if we do not explicitly set a ValuesWriterFactory instance to a ParquetProperties on its creation, then all ParquetProperties instances will share the same ValuesWriterFactory: ParquetProperties.DEFAULT_VALUES_WRITER_FACTORY. So that each ParquetProperties instance on its creation will use itself to initialize the shared ValuesWriterFactory instance.

That's not thread safe, especially when setting different page sizes alternately, as when we invoke parquetProperties.newValuesWriter() to leverage valuesWriterFactory for creating a ValuesWriter, we might find that the factory's field parquetProperties has been reset by another ParquetProperties instance in another thread. Moreover this reset might occur precisely between the valuesWriteFactory reading value parquetProperties.getInitialSlabSize() and parquetProperties.getPageSizeThreshold, that is to say, using different parquetProperties instances to get initialSlabSize and pageSizeThreshold. That would bring in inconsistency and cause a verification error as this flaky test shows.

So, to fix the flaky test, I think set the max page size to 1024 rather than 100 when creating ParquetWriter in ParquetTester.writeParquetFileFromPresto() would work well. But to address this potential bug, we may need to consider creating separate factory instances for the non-default page size when building ParquetProperties.

Any misunderstanding please let me know, thanks!

cc: @tdcmeehan

@tdcmeehan
Copy link
Contributor

tdcmeehan commented May 10, 2024

@hantangwangd this seems pretty plausible. I also see that Trino has made their Parquet tests single threaded for this exact issue (see: trinodb/trino#17022).

@hantangwangd
Copy link
Member

@tdcmeehan I think this problem is not just a flaky test, that is, it's possible to happen in actual usage, although the probability is extremely low. Imagine two users, one of them has set the session property iceberg.parquet_writer_page_size to 100, while the other one uses the default value. While they continuously write data to Iceberg tables with parquet format, this issue may arise.

After studying the code of ParquetProperties, I think it encourages using separate valuesWriterFactory when necessary. And the implementations of interface ValuesWriterFactory are very lightweight. So I think we could set a new separate ValuesWriterFactory to ParquetProperties when using a not-default pageSize in ParquetWriter's constructor method. That might potentially completely resolve this issue, not just fix the flaky test. What do you think?

@elharo
Copy link
Contributor Author

elharo commented May 10, 2024

Is this a bug in parquet then? I don't immediately see where we could fix this in presto code, though we can certainly try to fix it in parquet.

@hantangwangd
Copy link
Member

Is this a bug in parquet then? I don't immediately see where we could fix this in presto code, though we can certainly try to fix it in parquet.

I believe it's a problem caused by improperly usage of ParquetProperties when creating ParquetWriter in our presto-parquet module. I will rise a PR to fix it, so that ideally it shouldn't appear again even in multi-thread test environment.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment