-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Disable log4j2 shutdown hook [HZ-3543] #26006
Conversation
A common issue, especially when Hazelcast runs in Kubernetes pods with log4j2 logging, is that shutdown logs are missing. The reason is that log4j2 registers its own shutdown hook, alongside Hazelcast's own shutdown hook. Since the JVM does not guarantee any ordering of shutdown hook execution, it can be the case that the log4j shutdown hook shuts down loggers while Hazelcast is still performing shut down activities. This results in missing important logs from the shutdown of Hazelcast members that can be useful for troubleshooting. This change introduces explicit shutdown of the log4j2 LogManager after Hazelcast Node has been shut down: this does no harm when the log4j shutdown hook is enabled (as far as I navigated the log4j code & tested the changes locally). When the shutdown hook is disabled, it allows for clean logging of shutdown information. Example of Hazelcast shutdown logging with GRACEFUL shutdown hook policy: - without this change: ^C2023-11-20 16:26:02,988 [ INFO] [hz.ShutdownThread] [c.h.i.i.Node]: [127.0.0.1]:5701 [dev] [5.4.0-SNAPSHOT] Running shutdown hook... Current node state: ACTIVE - with this change: ^C2023-11-20 16:26:31,183 [ INFO] [hz.ShutdownThread] [c.h.i.i.Node]: [127.0.0.1]:5701 [dev] [5.4.0-SNAPSHOT] Running shutdown hook... Current node state: ACTIVE 2023-11-20 16:26:31,185 [ INFO] [hz.ShutdownThread] [c.h.c.LifecycleService]: [127.0.0.1]:5701 [dev] [5.4.0-SNAPSHOT] [127.0.0.1]:5701 is SHUTTING_DOWN 2023-11-20 16:26:31,191 [ INFO] [hz.ShutdownThread] [c.h.i.i.Node]: [127.0.0.1]:5701 [dev] [5.4.0-SNAPSHOT] Shutting down connection manager... 2023-11-20 16:26:31,195 [ INFO] [hz.ShutdownThread] [c.h.i.i.Node]: [127.0.0.1]:5701 [dev] [5.4.0-SNAPSHOT] Shutting down node engine... 2023-11-20 16:26:31,205 [ INFO] [hz.ShutdownThread] [c.h.i.i.NodeExtension]: [127.0.0.1]:5701 [dev] [5.4.0-SNAPSHOT] Destroying node NodeExtension. 2023-11-20 16:26:31,206 [ INFO] [hz.ShutdownThread] [c.h.i.i.Node]: [127.0.0.1]:5701 [dev] [5.4.0-SNAPSHOT] Hazelcast Shutdown is completed in 18 ms.
default void shutdown() { | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think other logging frameworks suffer same/similar issues. Should we investigate and resolve all? Especially logback and JUL.
@@ -570,6 +570,7 @@ public void shutdown(final boolean terminate) { | |||
if (state != NodeState.SHUT_DOWN) { | |||
shuttingDown.compareAndSet(true, false); | |||
} | |||
loggingService.shutdown(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we really want to call this in shutdown? Won't this shutdown the logging framework for the whole JVM? what if something wants to log afterwards?
Two cases I am thinking of:
- our tests where we shutdown nodes
- environment where the instance is managed by another framework, e.g. Spring
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also wonder what shuting down log4j means. Will it shut down logger for whole jvm, including other libraries that uses it. I suppose it shut downs the only Hazelcast's logger context.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suppose it shut downs the only Hazelcast's logger context.
That was my thought too. the LogManager#shutdown
javadoc reads:
Shutdown using the LoggerContext appropriate for the caller of this method.
Digging to figure out what is a "LoggerContext appropriate for the caller of this method", the only concrete information I could find was that a LoggerContext can be associated with a specific webapp's ServletContext
(when running log4j under a webapp server), but no idea how LoggerContext
s are created in a normal Java app.
I did a bit of research into spring-boot and elasticsearch. They both disable the log4j shutdown hook and shut down log4j explicitly (spring-boot, elasticsearch ) , however it is rather safe to assume spring-boot/elasticsearch is probably the only app in the JVM.
I think we should control explicit shutdown of log4j (or any logging framework) by a HazelcastProperty
and ensure it is enabled from our scripts (assuming Hazelcast is the only JVM app) but have it disabled by default. In this way, when using the client or embedded member, the logging framework explicit shutdown won't be executed by default since it can have unexpected side-effects. wdyt @frant-hartm @srknzl ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
makes sense imo
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Implemented in 5c20529
Default is false, so Hazelcast will not explicitly shut down the logging framework during graceful shutdown. When true, Hazelcast will shutdown the backing logging framework as part of its graceful shutdown procedure.
Closed in favour of https://github.com/hazelcast/hazelcast-mono/pull/236 - let's continue the discussion there |
A common issue, especially when Hazelcast runs in Kubernetes pods with log4j2 logging, is that shutdown logs are missing. The reason is that log4j2 registers its own shutdown hook, alongside Hazelcast's own shutdown hook. Since the JVM does not guarantee any ordering of shutdown hook execution, it can be the case that the log4j shutdown hook shuts down loggers while Hazelcast is still performing shut down activities. This results in missing important logs from the shutdown of Hazelcast members that can be useful for troubleshooting. This change introduces explicit shutdown of the log4j2 LogManager after Hazelcast Node has been shut down, controlled by a hazelcast property: this does no harm when the log4j shutdown hook is enabled. When the shutdown hook is disabled, it allows for clean logging of shutdown information. Supersedes #26006 GitOrigin-RevId: 5ca7e476db0f9f9461f20424b482c80084e11514
A common issue, especially when Hazelcast runs in Kubernetes pods with log4j2 logging,
is that shutdown logs are missing. The reason is that log4j2 registers its own shutdown
hook, alongside Hazelcast's own shutdown hook. Since the JVM does not guarantee
any ordering of shutdown hook execution, it can be the case that the log4j shutdown
hook shuts down loggers while Hazelcast is still performing shut down activities.
This results in missing important logs from the shutdown of Hazelcast members that
can be useful for troubleshooting.
This change introduces explicit shutdown of the log4j2 LogManager after Hazelcast Node
has been shut down: this does no harm when the log4j shutdown hook is enabled (as far as
I navigated the log4j code & tested the changes locally). When the shutdown hook is disabled,
it allows for clean logging of shutdown information.
Example of Hazelcast shutdown logging with
GRACEFUL
shutdown hook policy: