Recovery log is missing #4105

Open
Meikelrizkyhartawan opened this issue Oct 16, 2023 · 1 comment

@Meikelrizkyhartawan

I'm running three BookKeeper nodes, and suddenly a "Recovery log is missing" error started occurring. How can I trace the cause, and how do I resolve it?

java.io.IOException: Recovery log 1693588482171 is missing
at org.apache.bookkeeper.bookie.Bookie.replay(Bookie.java:982) ~[org.apache.bookkeeper-bookkeeper-server-4.14.3.jar:4.14.3]
at org.apache.bookkeeper.bookie.Bookie.readJournal(Bookie.java:961) ~[org.apache.bookkeeper-bookkeeper-server-4.14.3.jar:4.14.3]
at org.apache.bookkeeper.bookie.Bookie.start(Bookie.java:1015) ~[org.apache.bookkeeper-bookkeeper-server-4.14.3.jar:4.14.3]
at org.apache.bookkeeper.proto.BookieServer.start(BookieServer.java:156) ~[org.apache.bookkeeper-bookkeeper-server-4.14.3.jar:4.14.3]
at org.apache.bookkeeper.server.service.BookieService.doStart(BookieService.java:68) ~[org.apache.bookkeeper-bookkeeper-server-4.14.3.jar:4.14.3]
at org.apache.bookkeeper.common.component.AbstractLifecycleComponent.start(AbstractLifecycleComponent.java:83) ~[org.apache.bookkeeper-bookkeeper-common-4.14.3.jar:4.14.3]
at org.apache.bookkeeper.common.component.LifecycleComponentStack.lambda$start$4(LifecycleComponentStack.java:144) ~[org.apache.bookkeeper-bookkeeper-common-4.14.3.jar:4.14.3]
11:43:35.444 [main-SendThread(milvus-ground-pulsar-zookeeper:2181)] WARN org.apache.zookeeper.ClientCnxn - An exception was thrown while closing send thread for session 0x2001f9bccf10005.
org.apache.zookeeper.ClientCnxn$EndOfStreamException: Unable to read additional data from server sessionid 0x2001f9bccf10005, likely server has closed socket
at org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:77) ~[org.apache.zookeeper-zookeeper-3.6.3.jar:3.6.3]
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:350) ~[org.apache.zookeeper-zookeeper-3.6.3.jar:3.6.3]
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1290) [org.apache.zookeeper-zookeeper-3.6.3.jar:3.6.3]

11:15:29.609 [main-SendThread(milvus-ground-pulsar-zookeeper:2181)] INFO org.apache.zookeeper.ClientCnxn - SASL config status: Will not attempt to authenticate using SASL (unknown error)
11:15:59.638 [main-SendThread(milvus-ground-pulsar-zookeeper:2181)] WARN org.apache.zookeeper.ClientCnxn - Client session timed out, have not heard from server in 31244ms for session id 0x200a10385d50003
11:15:59.638 [main-SendThread(milvus-ground-pulsar-zookeeper:2181)] WARN org.apache.zookeeper.ClientCnxn - Session 0x200a10385d50003 for sever milvus-ground-pulsar-zookeeper/10.244.11.21:2181, Closing socket connection. Attempting reconnect except it is a SessionExpiredException.
org.apache.zookeeper.ClientCnxn$SessionTimeoutException: Client session timed out, have not heard from server in 31244ms for session id 0x200a10385d50003
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1258) [org.apache.zookeeper-zookeeper-3.6.3.jar:3.6.3]
at com.google.common.collect.ImmutableList.forEach(ImmutableList.java:406) [com.google.guava-guava-30.1-jre.jar:?]
at org.apache.bookkeeper.common.component.LifecycleComponentStack.start(LifecycleComponentStack.java:144) [org.apache.bookkeeper-bookkeeper-common-4.14.3.jar:4.14.3]
at org.apache.bookkeeper.common.component.ComponentStarter.startComponent(ComponentStarter.java:85) [org.apache.bookkeeper-bookkeeper-common-4.14.3.jar:4.14.3]
at org.apache.bookkeeper.server.Main.doMain(Main.java:234) [org.apache.bookkeeper-bookkeeper-server-4.14.3.jar:4.14.3]
at org.apache.bookkeeper.server.Main.main(Main.java:208) [org.apache.bookkeeper-bookkeeper-server-4.14.3.jar:4.14.3]

11:16:01.383 [main-SendThread(milvus-ground-pulsar-zookeeper:2181)] INFO org.apache.zookeeper.ClientCnxn - Opening socket connection to server milvus-ground-pulsar-zookeeper/10.244.10.102:2181.
11:16:01.383 [main-SendThread(milvus-ground-pulsar-zookeeper:2181)] INFO org.apache.zookeeper.ClientCnxn - SASL config status: Will not attempt to authenticate using SASL (unknown error)
11:16:01.384 [main-SendThread(milvus-ground-pulsar-zookeeper:2181)] INFO org.apache.zookeeper.ClientCnxn - Socket connection established, initiating session, client: /10.244.10.47:39714, server: milvus-ground-pulsar-zookeeper/10.244.10.102:2181
11:16:01.386 [main-EventThread] ERROR org.apache.bookkeeper.zookeeper.ZooKeeperWatcherBase - ZooKeeper client connection to the ZooKeeper server has expired!
11:16:01.386 [main-SendThread(milvus-ground-pulsar-zookeeper:2181)] WARN org.apache.zookeeper.ClientCnxn - Unable to reconnect to ZooKeeper service, session 0x200a10385d50003 has expired
11:16:01.386 [main-EventThread] INFO org.apache.bookkeeper.zookeeper.ZooKeeperClient - ZooKeeper session 200a10385d50003 is expired from milvus-ground-pulsar-zookeeper:2181.
11:16:01.387 [main-SendThread(milvus-ground-pulsar-zookeeper:2181)] WARN org.apache.zookeeper.ClientCnxn - Session 0x200a10385d50003 for sever milvus-ground-pulsar-zookeeper/10.244.10.102:2181, Closing socket connection. Attempting reconnect except it is a SessionExpiredException.
org.apache.zookeeper.ClientCnxn$SessionExpiredException: Unable to reconnect to ZooKeeper service, session 0x200a10385d50003 has expired
at org.apache.zookeeper.ClientCnxn$SendThread.onConnected(ClientCnxn.java:1434) ~[org.apache.zookeeper-zookeeper-3.6.3.jar:3.6.3]
at org.apache.zookeeper.ClientCnxnSocket.readConnectResult(ClientCnxnSocket.java:154) ~[org.apache.zookeeper-zookeeper-3.6.3.jar:3.6.3]
at org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:86) ~[org.apache.zookeeper-zookeeper-3.6.3.jar:3.6.3]
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:350) ~[org.apache.zookeeper-zookeeper-3.6.3.jar:3.6.3]

@hangc0276
Contributor

The reason is that the journal file is missing. You can do the following to replay all the journal files instead of replaying from a specific position:

  • Rename all the lastMark files under the {ledger_dir}/ledgers/current directory
  • Restart the bookie
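
The rename step above could be scripted roughly as follows. This is only a sketch under assumptions: it assumes the file is literally named `lastMark`, that you pass in the `current` directories yourself, and that a `.bak` suffix is an acceptable backup name. The function name `rename_last_mark_files` and the suffix are hypothetical and not part of BookKeeper. It renames rather than deletes, so the original file can be restored if needed.

```python
from pathlib import Path
import tempfile


def rename_last_mark_files(current_dirs, suffix=".bak"):
    """Rename the `lastMark` file found directly in each given directory.

    `current_dirs` and `suffix` are illustrative parameters (assumptions),
    not BookKeeper settings. Returns a list of (old_path, new_path) pairs.
    """
    renamed = []
    for directory in current_dirs:
        mark = Path(directory) / "lastMark"
        if not mark.is_file():
            continue  # no lastMark in this directory, nothing to rename
        backup = mark.with_name(mark.name + suffix)
        if backup.exists():
            # Never overwrite an earlier backup silently.
            raise FileExistsError(f"refusing to overwrite {backup}")
        mark.rename(backup)
        renamed.append((mark, backup))
    return renamed


if __name__ == "__main__":
    # Demonstration against a throwaway directory, not a real bookie.
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / "lastMark").write_text("placeholder")
        for old, new in rename_last_mark_files([tmp]):
            print(f"renamed {old.name} -> {new.name}")
```

On a real bookie you would pass the actual `current` directory of each ledger directory, and then restart the bookie as described in the second step.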
