Track number of failure in batching when Batcher#close is called #800
Conversation
Codecov Report
@@ Coverage Diff @@
## master #800 +/- ##
============================================
+ Coverage 78.41% 78.76% +0.35%
- Complexity 1105 1126 +21
============================================
Files 198 200 +2
Lines 4897 4983 +86
Branches 385 398 +13
============================================
+ Hits 3840 3925 +85
- Misses 887 888 +1
Partials 170 170
Continue to review full report at Codecov.
|
gax/src/main/java/com/google/api/gax/batching/BatchingException.java
Outdated
Show resolved
Hide resolved
gax/src/main/java/com/google/api/gax/batching/BatchingException.java
Outdated
Show resolved
Hide resolved
2219101
to
30f23f5
Compare
Adding test case for BatchingException.
Now BatchStats will keep the counter for each type of exception happened at RPC as well as ElementT/entry object level. Also refactored exception message to be more detailed.
30f23f5
to
ecfe7f0
Compare
@igorbernstein2 Thanks again for explaining this to me.
With the latest commit, Now we are logging partial failures only if the RPC is successful(If not then we are providing only batch failure cause & count). Please have a look and let me know your thoughts. |
I think we can simplify this a bit more:
|
Addressed feedback comments to simplify BatcherStats
@@ -176,6 +177,7 @@ public void onSuccess(ResponseT response) { | |||
@Override | |||
public void onFailure(Throwable throwable) { | |||
try { | |||
batcherStats.recordBatchFailure(throwable); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we push this down into Batch to make it symmetrical with onBatchSuccess?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I actually tried with the same approach, but If some exception occurs while post-processing in splitResponse()
then we would end up with 1 batch and n partial failures even though some elements in the batch could be successful.
gax-java/gax/src/main/java/com/google/api/gax/batching/BatcherImpl.java
Lines 251 to 267 in 39fb471
void onBatchSuccess(ResponseT response) { | |
try { | |
descriptor.splitResponse(response, results); | |
} catch (Exception ex) { | |
onBatchFailure(ex); | |
} | |
} | |
void onBatchFailure(Throwable throwable) { | |
try { | |
descriptor.splitException(throwable, results); | |
} catch (Exception ex) { | |
for (SettableApiFuture<ElementResultT> result : results) { | |
result.setException(ex); | |
} | |
} | |
} |
I thought that would be a misleading figure, that why I kept the BatcherStats#recordBatchFailure
here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be better to keep the code symmetrical and then have a follow up PR that handles failures from splitResponse
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the suggestion, Updated this change with symmetrical code and added a TODO
as well for splitResponse()
.
gax/src/main/java/com/google/api/gax/batching/BatcherStats.java
Outdated
Show resolved
Hide resolved
gax/src/main/java/com/google/api/gax/batching/BatcherStats.java
Outdated
Show resolved
Hide resolved
gax/src/main/java/com/google/api/gax/batching/BatcherStats.java
Outdated
Show resolved
Hide resolved
final AtomicBoolean markBatchFailure = new AtomicBoolean(); | ||
|
||
for (ApiFuture<T> elementResult : batchElementResultFutures) { | ||
ApiFutures.addCallback( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think try/catch would be easier to read:
for (ApiFuture<T> elementFuture : batchElementResultFutures) {
try {
elementFuture.get();
} catch (Throwable t) {
....
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also remove the atomic above
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understood correctly, you meant to block and check if any entry is being failed after splitResponse()
is executed. I believe going with this would create a lot of blocking time until each entry in the batch finishes applying the mutation. If possible I would like to avoid that.
I hope after the last commit, this method looks better, Can you please take a look and let me know your thoughts.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All of the element results should be resolved at this point, so nothing should block. If an entry is not resolved at this point then it never will be. This would be due to a bug in the splitResponse. We can address this issue in a follow up PR that would iterate over all of the element results and sets an IllegalStateException for unresolved entries
gax/src/main/java/com/google/api/gax/batching/BatcherStats.java
Outdated
Show resolved
Hide resolved
gax/src/main/java/com/google/api/gax/batching/BatcherStats.java
Outdated
Show resolved
Hide resolved
gax/src/main/java/com/google/api/gax/batching/BatcherStats.java
Outdated
Show resolved
Hide resolved
gax/src/main/java/com/google/api/gax/batching/BatcherStats.java
Outdated
Show resolved
Hide resolved
gax/src/main/java/com/google/api/gax/batching/BatcherStats.java
Outdated
Show resolved
Hide resolved
gax/src/main/java/com/google/api/gax/batching/BatchingException.java
Outdated
Show resolved
Hide resolved
gax/src/test/java/com/google/api/gax/batching/BatcherStatsTest.java
Outdated
Show resolved
Hide resolved
gax/src/test/java/com/google/api/gax/batching/BatcherStatsTest.java
Outdated
Show resolved
Hide resolved
@@ -258,6 +268,7 @@ void onBatchSuccess(ResponseT response) { | |||
|
|||
void onBatchFailure(Throwable throwable) { | |||
try { | |||
batcherStats.recordBatchFailure(throwable); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should come after splitException
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shifted batcherStats.recordBatchFailure(throwable);
outside try/catch as we would be record failure in either of the case. I think Batcher#recordBatchFailure
is just keeping a track of element so it is no needed to the inside try block.
gax/src/main/java/com/google/api/gax/batching/BatcherStats.java
Outdated
Show resolved
Hide resolved
gax/src/main/java/com/google/api/gax/batching/BatcherStats.java
Outdated
Show resolved
Hide resolved
gax/src/main/java/com/google/api/gax/batching/BatcherStats.java
Outdated
Show resolved
Hide resolved
gax/src/main/java/com/google/api/gax/batching/BatcherStats.java
Outdated
Show resolved
Hide resolved
gax/src/main/java/com/google/api/gax/batching/BatcherStats.java
Outdated
Show resolved
Hide resolved
gax/src/main/java/com/google/api/gax/batching/BatcherStats.java
Outdated
Show resolved
Hide resolved
- Refactored record methods in BatcherImpl.Batch. - Removed callback and implemented try/catch for entry failures. - Made all three default method synchronized and removed lock as all the content of method needed to be inside lock. - Fixed test case and added a todo for followUp PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, except for a minor nit
@vam-google or @kolea2 can you take a pass?
Code code = ((ApiException) actualCause).getStatusCode().getCode(); | ||
exceptionClass = ApiException.class; | ||
|
||
int oldExceptionCount = MoreObjects.firstNonNull(entryStatusCounts.get(code), 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should this be oldStatusCount
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot for this, I have updated it now.
|
||
if (ApiException.class.equals(request.getKey())) { | ||
keyValue.append("("); | ||
Iterator<Map.Entry<Code, Integer>> statusItrator = statusCounts.entrySet().iterator(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: statusIterator
*/ | ||
private String printKeyValue( | ||
Map<Class, Integer> exceptionCounts, Map<Code, Integer> statusCounts) { | ||
StringBuilder keyValue = new StringBuilder(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Naming is hard, but keyValue
doesn't really describe this variable. Maybe messageBuilder
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for pointing this out, Have updated the same.
This change will enable
Batcher
to throwBatchingException
when the user will close the batching connection.BatcherImpl
maintains total numOfFailure, count of ExceptionTypes and count of StatusCode in caseApiException
has occurred./ cc: @igorbernstein2 , @kolea2