When a client adds entries synchronously to an opened ledger and a bookie crashes, the client may get stuck. #4261

Open
M1eyu2018 opened this issue Apr 2, 2024 · 5 comments

@M1eyu2018
M1eyu2018 commented Apr 2, 2024

BUG REPORT
When a client adds entries synchronously to an opened ledger and a bookie crashes, the client may get stuck.

Release
I am running BookKeeper 4.14.1; the bug also reproduces on release 4.16.4.

Describe the bug

When a client adds entries synchronously to an open ledger and a bookie crashes, the ensemble change for the crashed bookie may be triggered twice.
The first ensemble change is triggered when the third response to each write fails with 'Bookie handle was not available'.
A moment later, the second ensemble change is triggered when the third response fails with 'Bookie operation timeout'.
Because the same crashed bookie is handled twice, the second ensemble change replaces no bookie, so unsetSuccessAndSendWriteRequest is not called. The success callback for the entry currently being added is therefore never sent, and the client gets stuck.

Example
In this example, a client adds 81920 entries to a ledger of 10M with a 3-3-2 policy (ensemble size 3, write quorum 3, ack quorum 2), and the ensemble is (A,B,C).
1. At the beginning, entries #0-#6773 are written normally.
2. While entry #6774 is being added, bookie A crashes for some reason, such as a power outage or running 'kill -9' on the bookie A process id.
3. However, two successful responses are still received for each entry, so the client can continue adding entries #6774-#11604.
4. Before entry #11605 is added, the third responses for entries #6774-#11604 come back one after another. Because the failed response is 'Bookie handle was not available', the failed bookie A is put into delayedWriteFailedBookies.
5. When entry #11605 is added, maybeHandleDelayedWriteBookieFailure is called. Because delayedWriteFailedBookies is not empty, an ensemble change begins.
6. After two successful responses for entry #11605 are received, sendAddSuccessCallbacks is called. However, pendingAddOp.submitCallback is not called until the ensemble change finishes.
7. When the ensemble change finishes, bookie A has been replaced by bookie D. The success callback for entry #11605 is then sent, and adding entries continues.

So far, the logic is correct. The problem appears in the steps below.

8. Entries #11606-#42623 are written normally to (D,B,C) after the ensemble change.
9. Before entry #42624 is added, the third responses for entries #6774-#11604 that had not yet come back arrive one after another. This time the failed response is 'Bookie operation timeout', and the failed bookie A is put into delayedWriteFailedBookies again.
10. When entry #42624 is added, maybeHandleDelayedWriteBookieFailure is called. Because delayedWriteFailedBookies is not empty, an ensemble change begins again.
11. After three successful responses for entry #42624 are received from (D,B,C), sendAddSuccessCallbacks is called. However, pendingAddOp.submitCallback is not called until the ensemble change finishes.
12. This time, failed bookie A should be replaced again, but the ensemble is already (D,B,C), so no bookie is replaced. The success callback for entry #42624 is never sent because unsetSuccessAndSendWriteRequest is not called.
13. Because entries are added synchronously, the client gets stuck.
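
The sequence above can be replayed in a small toy model. This is only a sketch of the behavior as described in this report, not the BookKeeper source: the classes `ToyLedgerHandle` and `StuckDemo` are hypothetical, the method names are borrowed from the steps above for readability, and the assumption that callbacks are flushed only when a bookie was actually replaced is taken from step 12.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of the steps above. Hypothetical; NOT the BookKeeper implementation. */
class ToyLedgerHandle {
    final List<String> ensemble = new ArrayList<>(Arrays.asList("A", "B", "C"));
    // ensemble index -> bookie whose write failed after the ack quorum was already met
    final Map<Integer, String> delayedWriteFailedBookies = new HashMap<>();
    final List<Long> completedEntries = new ArrayList<>(); // success callback was sent
    final List<Long> ackedButHeld = new ArrayList<>();     // acked, callback held back
    boolean changingEnsemble = false;

    /** A late failed third response ("handle not available" or "timeout"). */
    void onLateWriteFailure(int index, String bookie) {
        delayedWriteFailedBookies.put(index, bookie);
    }

    /** Runs when a new add starts (steps 5 and 10). */
    void maybeHandleDelayedWriteBookieFailure() {
        if (!delayedWriteFailedBookies.isEmpty()) {
            changingEnsemble = true;
        }
    }

    /** Ack quorum reached for an entry (steps 6 and 11). */
    void onAckQuorum(long entryId) {
        ackedButHeld.add(entryId);
        sendAddSuccessCallbacks();
    }

    void sendAddSuccessCallbacks() {
        if (changingEnsemble) {
            return; // callbacks are held while an ensemble change is running
        }
        completedEntries.addAll(ackedButHeld);
        ackedButHeld.clear();
    }

    /** Ensemble change completes (steps 7 and 12). */
    void finishEnsembleChange(String replacement) {
        boolean replaced = false;
        for (Map.Entry<Integer, String> e : delayedWriteFailedBookies.entrySet()) {
            if (ensemble.get(e.getKey()).equals(e.getValue())) {
                ensemble.set(e.getKey(), replacement);
                replaced = true;
            }
        }
        delayedWriteFailedBookies.clear();
        changingEnsemble = false;
        if (replaced) {
            unsetSuccessAndSendWriteRequest();
        }
        // Modeled bug (assumption from step 12): nothing replaced, so nothing flushes.
    }

    void unsetSuccessAndSendWriteRequest() {
        sendAddSuccessCallbacks();
    }
}

class StuckDemo {
    static ToyLedgerHandle run() {
        ToyLedgerHandle lh = new ToyLedgerHandle();
        // Steps 4-7: first ensemble change, A is replaced by D.
        lh.onLateWriteFailure(0, "A");
        lh.maybeHandleDelayedWriteBookieFailure();
        lh.onAckQuorum(11605L);
        lh.finishEnsembleChange("D");
        // Steps 9-12: stale failure for A arrives again, but A is no longer in the ensemble.
        lh.onLateWriteFailure(0, "A");
        lh.maybeHandleDelayedWriteBookieFailure();
        lh.onAckQuorum(42624L);
        lh.finishEnsembleChange("E");
        return lh;
    }

    public static void main(String[] args) {
        ToyLedgerHandle lh = run();
        // prints: ensemble=[D, B, C] completed=[11605] held=[42624]
        System.out.println("ensemble=" + lh.ensemble
                + " completed=" + lh.completedEntries
                + " held=" + lh.ackedButHeld);
    }
}
```

In the model, entry 42624 stays in `ackedButHeld` forever: a synchronous client is blocked on that entry and never starts another add that could flush it.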

To Reproduce

1. Create a BookKeeper client.
2. Open a ledger.
3. Add entries synchronously.
4. Run 'kill -9' on one bookie process while entries are being added.
5. The client may get stuck forever.
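
Because the hang shows up as an add that never returns, a reproduction run is easier to observe if each add is awaited with a deadline instead of blocking indefinitely. The sketch below is a hypothetical helper (`AddWatchdog.awaitAdd` is not a BookKeeper API); it assumes the pending add is exposed as a `CompletableFuture<Long>` that completes with the entry id, and it uses a plain stand-in future rather than a real ledger.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical test helper; the future stands in for a pending add. */
class AddWatchdog {
    /**
     * Waits up to timeoutMs for the add to complete.
     * Returns the entry id, or -1 if the success callback did not fire in time.
     */
    static long awaitAdd(CompletableFuture<Long> pendingAdd, long timeoutMs) {
        try {
            return pendingAdd.get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            return -1L; // the add is still pending: a candidate for the hang
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for add", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("add failed", e.getCause());
        }
    }

    public static void main(String[] args) {
        CompletableFuture<Long> completed = CompletableFuture.completedFuture(11605L);
        CompletableFuture<Long> stuck = new CompletableFuture<>(); // never completed
        System.out.println(awaitAdd(completed, 100)); // prints 11605
        System.out.println(awaitAdd(stuck, 100));     // prints -1
    }
}
```

A timeout here only shows that the callback has not arrived yet; the deadline should be chosen well above the client's own write timeout so that a slow add is not mistaken for a stuck one.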

How to fix
In my opinion, there are two possible solutions:
1. Always call sendAddSuccessCallbacks after each ensemble change. This ensures that the success callback for the entry being added, which was held back while the ensemble change was running, is sent once the ensemble change finishes.
2. Before an ensemble change begins, check whether the failed bookie is still in the current ensemble. If it is not, skip the ensemble change so that the success callback for the entry being added is sent normally in writeComplete.
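
The check in the second solution could look roughly like the sketch below. It is a minimal illustration under stated assumptions, not a patch: `StaleFailureFilter` and its methods are hypothetical names, bookies are represented as plain strings, and failures are assumed to be recorded as a map from ensemble index to bookie, as in the steps of the example.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical guard for solution 2; not taken from the BookKeeper source. */
class StaleFailureFilter {
    /** Keeps only the failures whose bookie still occupies the recorded ensemble slot. */
    static Map<Integer, String> stillInEnsemble(List<String> ensemble,
                                                Map<Integer, String> failedBookies) {
        Map<Integer, String> live = new HashMap<>();
        for (Map.Entry<Integer, String> e : failedBookies.entrySet()) {
            int index = e.getKey();
            boolean inRange = index >= 0 && index < ensemble.size();
            if (inRange && ensemble.get(index).equals(e.getValue())) {
                live.put(index, e.getValue());
            }
        }
        return live;
    }

    /** An ensemble change is only worth starting if at least one failure is not stale. */
    static boolean needsEnsembleChange(List<String> ensemble,
                                       Map<Integer, String> failedBookies) {
        return !stillInEnsemble(ensemble, failedBookies).isEmpty();
    }

    public static void main(String[] args) {
        Map<Integer, String> failed = new HashMap<>();
        failed.put(0, "A");
        // Second round of the example: A was already replaced by D, so skip the change.
        System.out.println(needsEnsembleChange(Arrays.asList("D", "B", "C"), failed)); // false
        // First round of the example: A is still in slot 0, so the change is needed.
        System.out.println(needsEnsembleChange(Arrays.asList("A", "B", "C"), failed)); // true
    }
}
```

Compared with the first solution, this guard avoids starting a no-op ensemble change at all, whereas the first solution tolerates the no-op change and flushes the held callbacks afterwards.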

@thetumbled
Contributor

PTAL, thanks. @hangc0276 @ivankelly @horizonzy @shoothzj @wenbingshen

@horizonzy
Member

Thanks for the report, I will check it.

@horizonzy
Member

Nice catch!

@wenbingshen
Member

Nice Catch!

@lhotari
Member

lhotari commented Apr 16, 2024

Is this similar or related to #4097?
