New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Suppress BigQuery read stream splitAtFraction when API busy call or timeout #31125
Conversation
…imeout * Reject split attempt early when there is an ongoing hasNext call in work item thread * TImeout hasNext call in splitAtFraction in alignment with client splitReadStreamSettings
try { | ||
// The intended wait time is in sync with splitReadStreamSettings.setRetrySettings in | ||
// StorageClientImpl. | ||
future.get(30, TimeUnit.SECONDS); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is correspond to the config here
Lines 1735 to 1742 in 673da54
splitReadStreamSettings.setRetrySettings( | |
splitReadStreamSettings | |
.getRetrySettings() | |
.toBuilder() | |
.setInitialRpcTimeout(org.threeten.bp.Duration.ofSeconds(30)) | |
.setMaxRpcTimeout(org.threeten.bp.Duration.ofSeconds(30)) | |
.setTotalTimeout(org.threeten.bp.Duration.ofSeconds(30)) | |
.build()); |
It intended to set the API call timeout during split request to 30s (and otherwise reject split). However, a bug fix #21739 added this call here, which has internal retry logic, but this added call is not covered by the split-stream-retry config above. When there is quota issue, this is causing pipeline stuck
latch.await(); | ||
|
||
// Beam does not split storage read api v2 stream | ||
assertNull(reader.splitAtFraction(0.5)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
without the changes in main scope code, the test fails with
java.lang.NullPointerException
at org.apache.beam.sdk.io.gcp.bigquery.BigQueryStorageStreamSource$BigQueryStorageStreamReader.splitAtFraction(BigQueryStorageStreamSource.java:372)
because it passed "splitAllowed" check
Checks are failing. Will not request review until checks are succeeding. If you'd like to override that behavior, comment |
b3dd647
to
6169dd1
Compare
@@ -388,8 +398,22 @@ public synchronized BigQueryStorageStreamSource<T> getCurrentSource() { | |||
// The following line is required to trigger the `FailedPreconditionException` on which | |||
// the SplitReadStream validation logic depends. Removing it will cause incorrect | |||
// split operations to succeed. | |||
newResponseIterator.hasNext(); | |||
storageClient.reportPendingMetrics(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed reportPendingMetrics call here otherwise same warning as #31096 seen, because splitAtFraction call thread isn't a work item thread
Assigning reviewers. If you would like to opt out of this review, comment R: @damondouglas for label java. Available commands:
The PR bot will only process comments in the main thread (not review comments). |
R: @ahmedabu98 (since you reviewed #31096) |
Stopping reviewer notifications for this pull request: review requested by someone other than the bot, ceding control |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, just some cleanup
...ud-platform/src/test/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIOStorageReadTest.java
Outdated
Show resolved
Hide resolved
...-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryStorageStreamSource.java
Outdated
Show resolved
Hide resolved
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #31125 +/- ##
=============================================
- Coverage 71.44% 60.41% -11.04%
- Complexity 1474 2981 +1507
=============================================
Files 906 659 -247
Lines 113271 66095 -47176
Branches 1076 3233 +2157
=============================================
- Hits 80931 39930 -41001
+ Misses 30327 23066 -7261
- Partials 2013 3099 +1086
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
Resolve
#30646 (comment)Reject split attempt early when there is an ongoing hasNext call in work item thread
TImeout hasNext call in splitAtFraction in alignment with client splitReadStreamSettings
The issue exist on both released Beam and after #31096
Tested on read stream API v1 (default) and Dataflow legacy runner in a quota restricted project:
Without the change, the pipeline appears to stuck indefinitely. the onRetryAttempt callback get called periodically.
Please add a meaningful description for your change here
Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:
addresses #123
), if applicable. This will automatically add a link to the pull request in the issue. If you would like the issue to automatically close on merging the pull request, commentfixes #<ISSUE NUMBER>
instead.CHANGES.md
with noteworthy changes.See the Contributor Guide for more tips on how to make review process smoother.
To check the build health, please visit https://github.com/apache/beam/blob/master/.test-infra/BUILD_STATUS.md
GitHub Actions Tests Status (on master branch)
See CI.md for more information about GitHub Actions CI or the workflows README to see a list of phrases to trigger workflows.