Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RedisSessionExpirationPolicy.cleanExpiredSessions failed with RedisCommandTimeoutException #2496

Closed
huaxne opened this issue Aug 28, 2023 · 5 comments

Comments

@huaxne
Copy link

huaxne commented Aug 28, 2023

I use Spring Session with Redis in my work.
By default, SpringSession executes scheduled tasks (RedisSessionExpirationPolicy.cleanExpiredSessions) every 60 seconds, and occasionally RedisCommandTimeoutException occurs.
It seems like connection problem, and it continued about 15 minutes.
I tried to add cron task to validate connection, and it run well. But the exception still occurs.
What should I do to fix this connection problem.

cron

@scheduled(fixedRate = 5000)
public void lettuceConnCheck() {
try {
lettuceConnectionFactory.validateConnection();
} catch (Exception e) {
log.error("lettuceConnCheck failed with exception[{}]. cause:[{}], message:[{}]",
e.getClass(), e.getCause(), e.getMessage());
log.warn("lettuceConnCheck failed with exception:", e);
}
}

lettuce config:

    final ClusterTopologyRefreshOptions clusterTopologyRefreshOptions = ClusterTopologyRefreshOptions.builder()
            .enablePeriodicRefresh(Duration.ofSeconds(5L))
            .enableAllAdaptiveRefreshTriggers()
            .build();

    final ClusterClientOptions clusterClientOptions = ClusterClientOptions.builder()
            .topologyRefreshOptions(clusterTopologyRefreshOptions)
            .validateClusterNodeMembership(false)
            .maxRedirects(cluster.getMaxRedirects())
            .build();

    final Duration commandTimeout = properties.getTimeout();
    final LettuceClientConfiguration clientConfig = LettuceClientConfiguration.builder()
            .commandTimeout(commandTimeout)
            .clientOptions(clusterClientOptions)
            .build();

logs:
[tag:t2-sec-switch-web001] 2023-08-26 12:21:05.000 [ERROR] [pool-3-thread-1] [org.springframework.scheduling.support.TaskUtils$LoggingErrorHandler:95] Unexpected error occurred in scheduled task
org.springframework.dao.QueryTimeoutException: Redis command timed out; nested exception is io.lettuce.core.RedisCommandTimeoutException: Command timed out after 5 second(s)
at org.springframework.data.redis.connection.lettuce.LettuceExceptionConverter.convert(LettuceExceptionConverter.java:70)
at org.springframework.data.redis.connection.lettuce.LettuceExceptionConverter.convert(LettuceExceptionConverter.java:41)
at org.springframework.data.redis.PassThroughExceptionTranslationStrategy.translate(PassThroughExceptionTranslationStrategy.java:44)
at org.springframework.data.redis.FallbackExceptionTranslationStrategy.translate(FallbackExceptionTranslationStrategy.java:42)
at org.springframework.data.redis.connection.lettuce.LettuceConnection.convertLettuceAccessException(LettuceConnection.java:277)
at org.springframework.data.redis.connection.lettuce.LettuceConnection.await(LettuceConnection.java:1085)
at org.springframework.data.redis.connection.lettuce.LettuceConnection.lambda$doInvoke$4(LettuceConnection.java:938)
at org.springframework.data.redis.connection.lettuce.LettuceInvoker$Synchronizer.invoke(LettuceInvoker.java:665)
at org.springframework.data.redis.connection.lettuce.LettuceInvoker.just(LettuceInvoker.java:94)
at org.springframework.data.redis.connection.lettuce.LettuceSetCommands.sMembers(LettuceSetCommands.java:163)
at org.springframework.data.redis.connection.DefaultedRedisConnection.sMembers(DefaultedRedisConnection.java:866)
at org.springframework.data.redis.core.DefaultSetOperations.lambda$members$11(DefaultSetOperations.java:244)
at org.springframework.data.redis.core.RedisTemplate.execute(RedisTemplate.java:224)
at org.springframework.data.redis.core.RedisTemplate.execute(RedisTemplate.java:191)
at org.springframework.data.redis.core.AbstractOperations.execute(AbstractOperations.java:97)
at org.springframework.data.redis.core.DefaultSetOperations.members(DefaultSetOperations.java:244)
at org.springframework.data.redis.core.DefaultBoundSetOperations.members(DefaultBoundSetOperations.java:162)
at org.springframework.session.data.redis.RedisSessionExpirationPolicy.cleanExpiredSessions(RedisSessionExpirationPolicy.java:132)
at org.springframework.session.data.redis.RedisIndexedSessionRepository.cleanupExpiredSessions(RedisIndexedSessionRepository.java:424)
at org.springframework.scheduling.support.DelegatingErrorHandlingRunnable.run(DelegatingErrorHandlingRunnable.java:54)
at org.springframework.scheduling.concurrent.ReschedulingRunnable.run(ReschedulingRunnable.java:95)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: io.lettuce.core.RedisCommandTimeoutException: Command timed out after 5 second(s)
at io.lettuce.core.internal.ExceptionFactory.createTimeoutException(ExceptionFactory.java:59)
at io.lettuce.core.internal.Futures.awaitOrCancel(Futures.java:246)
at io.lettuce.core.LettuceFutures.awaitOrCancel(LettuceFutures.java:74)
at org.springframework.data.redis.connection.lettuce.LettuceConnection.await(LettuceConnection.java:1083)
... 22 common frames omitted
........
........
[tag:t2-sec-switch-web001] 2023-08-26 12:35:28.244 [INFO] [lettuce-eventExecutorLoop-1-3] [io.lettuce.core.protocol.ConnectionWatchdog:171] Reconnecting, last destination was xx.xx.xx.xx/xx.xxx.xx.xx:8001
[tag:t2-sec-switch-web001] 2023-08-26 12:35:28.247 [INFO] [lettuce-epollEventLoop-6-2] [io.lettuce.core.protocol.ReconnectionHandler:174] Reconnected to xx.xx.xx.xx:xx

Environment

  • JDK: 1.8.0_271
  • Lettuce-core version(s): 6.1.10
  • Redis version: 6.0.6
  • Spring boot : 2.7.8
@mp911de
Copy link
Collaborator

mp911de commented Aug 28, 2023

You need to find out the reason for the timeout. If you encounter a timeout period, make sure that all Redis nodes are up and reachable. Timeouts are a symptom of infrastructure issues.

@huaxne
Copy link
Author

huaxne commented Aug 29, 2023

Thank you for your support.

What's strange to me is that my scheduled check task executes LettuceConnectionFactory. validateConnection() every 5 seconds; And no log of validate failure was found. This indicates that the current connection is okay.

But why does a timeout error occur when using it in SpringSession, and it lasts for 15 minutes (during which LettuceConnectionFactory. validateConnection() runs normally).

I have checked the source code of validateConnection and found it executes the ping() function. Can't this function guarantee that the current connection is valid?

@huaxne
Copy link
Author

huaxne commented Aug 29, 2023

You need to find out the reason for the timeout. If you encounter a timeout period, make sure that all Redis nodes are up and reachable. Timeouts are a symptom of infrastructure issues.

I think there will always be occasional issues with the server or network. What I need is to promptly sense that the current connection is not available and quickly create new TCP connections. The log above shows that this problem occurs approximately 1-2 times a day (15 minutes each time). But LettuceConnectionFactory ValidateConnection() did not recognize that this connection is not available.

@mp911de
Copy link
Collaborator

mp911de commented Sep 1, 2023

Cluster connections cannot be reliably verified unless issuing a PING to every cluster node as the subsequent operation can target any node within the cluster. I think it is reasonable to move that requirement into Spring Data Redis.

@huaxne
Copy link
Author

huaxne commented Sep 7, 2023

Cluster connections cannot be reliably verified unless issuing a PING to every cluster node as the subsequent operation can target any node within the cluster. I think it is reasonable to move that requirement into Spring Data Redis.

Yes, I think you're right(it is reasonable to move that requirement into Spring Data Redis).
I solved this problem through the keepalive setting.
1428
Thank you very much for your help.

@huaxne huaxne closed this as completed Sep 7, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants