[QUESTION] The memory and CPU of the master node are full, replica does not switch to master #13185

Open
polaris-alioth opened this issue Apr 2, 2024 · 2 comments

Comments

@polaris-alioth
Contributor

version:
6.2.7
Deployment:
1 master, 2 slaves, 3 sentinels

question
The memory and CPU of the master node are full and clients cannot connect, but no replica is promoted to master.

sentinel 1 log
833:X 30 Mar 2024 12:23:31.774 # +sdown sentinel b37b32a5973d9cd20799c042a64ce5ac2df50bfc 172.16.0.4 26379 @ redis-cluster 172.16.0.3 16379
833:X 30 Mar 2024 12:23:47.774 # -sdown sentinel b37b32a5973d9cd20799c042a64ce5ac2df50bfc 172.16.0.4 26379 @ redis-cluster 172.16.0.3 16379
833:X 30 Mar 2024 12:45:27.868 # +sdown sentinel b37b32a5973d9cd20799c042a64ce5ac2df50bfc 172.16.0.4 26379 @ redis-cluster 172.16.0.3 16379
833:X 30 Mar 2024 12:46:37.648 * client 7fc0e66b5000 auth in sentinel request
833:X 30 Mar 2024 12:46:37.649 * The unknown#user 172.16.0.4:46191 ID = 1532791 db = 0 operation:client  SETNAME sentinel-620f46ae-cmd
833:X 30 Mar 2024 12:46:37.660 # -sdown sentinel b37b32a5973d9cd20799c042a64ce5ac2df50bfc 172.16.0.4 26379 @ redis-cluster 172.16.0.3 16379
833:X 30 Mar 2024 12:46:39.515 * +sentinel-invalid-addr sentinel b37b32a5973d9cd20799c042a64ce5ac2df50bfc 172.16.0.4 26379 @ redis-cluster 172.16.0.3 16379
833:X 30 Mar 2024 12:46:39.515 * +sentinel sentinel 620f46ae4bfd59be25a15a37cfa759762df24af1 172.16.0.4 26379 @ redis-cluster 172.16.0.3 16379
833:X 30 Mar 2024 12:47:08.863 # +sdown sentinel b37b32a5973d9cd20799c042a64ce5ac2df50bfc 172.16.0.4 0 @ redis-cluster 172.16.0.3 16379
833:X 30 Mar 2024 12:48:58.427 # +reset-master master redis-cluster 172.16.0.3 16379
833:X 30 Mar 2024 12:49:00.032 * +sentinel sentinel 620f46ae4bfd59be25a15a37cfa759762df24af1 172.16.0.4 26379 @ redis-cluster 172.16.0.3 16379
833:X 30 Mar 2024 12:49:03.740 * +sentinel sentinel f100f703af108600de88d5fb47f7ad8ad21c3dbe 172.16.0.36 26379 @ redis-cluster 172.16.0.3 16379
833:X 30 Mar 2024 12:49:03.757 * client 7fc0e64e9000 auth in sentinel request
833:X 30 Mar 2024 12:49:03.757 * The unknown#user 172.16.0.36:44691 ID = 1532858 db = 0 operation:client  SETNAME sentinel-f100f703-cmd
833:X 30 Mar 2024 12:49:08.318 * +slave slave 172.16.0.37:16379 172.16.0.37 16379 @ redis-cluster 172.16.0.3 16379
833:X 30 Mar 2024 12:49:08.322 * +slave slave 172.16.0.21:16379 172.16.0.21 16379 @ redis-cluster 172.16.0.3 16379
833:X 30 Mar 2024 12:52:01.881 # +sdown master redis-cluster 172.16.0.3 16379
833:X 30 Mar 2024 12:52:01.969 # +new-epoch 2
833:X 30 Mar 2024 12:52:01.972 # +vote-for-leader f100f703af108600de88d5fb47f7ad8ad21c3dbe 2
833:X 30 Mar 2024 12:52:02.992 # +odown master redis-cluster 172.16.0.3 16379 #quorum 2/2
833:X 30 Mar 2024 12:52:02.992 # Next failover delay: I will not start a failover before Sat Mar 30 12:58:02 2024
833:X 30 Mar 2024 12:52:03.067 # +config-update-from sentinel f100f703af108600de88d5fb47f7ad8ad21c3dbe 172.16.0.36 26379 @ redis-cluster 172.16.0.3 16379
833:X 30 Mar 2024 12:52:03.067 # +switch-master redis-cluster 172.16.0.3 16379 172.16.0.21 16379
833:X 30 Mar 2024 12:52:03.067 * +slave slave 172.16.0.37:16379 172.16.0.37 16379 @ redis-cluster 172.16.0.21 16379
833:X 30 Mar 2024 12:52:03.067 * +slave slave 172.16.0.3:16379 172.16.0.3 16379 @ redis-cluster 172.16.0.21 16379
833:X 30 Mar 2024 12:52:33.095 # +sdown slave 172.16.0.3:16379 172.16.0.3 16379 @ redis-cluster 172.16.0.21 16379
833:X 30 Mar 2024 12:56:37.131 # +sdown sentinel 620f46ae4bfd59be25a15a37cfa759762df24af1 172.16.0.4 26379 @ redis-cluster 172.16.0.21 16379
833:X 30 Mar 2024 12:59:57.360 # -sdown slave 172.16.0.3:16379 172.16.0.3 16379 @ redis-cluster 172.16.0.21 16379

sentinel 2 log

1301:X 30 Mar 2024 12:23:31.820 # +sdown sentinel b37b32a5973d9cd20799c042a64ce5ac2df50bfc 172.16.0.4 26379 @ redis-cluster 172.16.0.3 16379
1301:X 30 Mar 2024 12:23:47.724 # -sdown sentinel b37b32a5973d9cd20799c042a64ce5ac2df50bfc 172.16.0.4 26379 @ redis-cluster 172.16.0.3 16379
1301:X 30 Mar 2024 12:42:57.812 # +sdown sentinel b37b32a5973d9cd20799c042a64ce5ac2df50bfc 172.16.0.4 26379 @ redis-cluster 172.16.0.3 16379
1301:X 30 Mar 2024 12:42:59.777 # -sdown sentinel b37b32a5973d9cd20799c042a64ce5ac2df50bfc 172.16.0.4 26379 @ redis-cluster 172.16.0.3 16379
1301:X 30 Mar 2024 12:45:27.916 # +sdown sentinel b37b32a5973d9cd20799c042a64ce5ac2df50bfc 172.16.0.4 26379 @ redis-cluster 172.16.0.3 16379
1301:X 30 Mar 2024 12:46:38.900 # -sdown sentinel b37b32a5973d9cd20799c042a64ce5ac2df50bfc 172.16.0.4 26379 @ redis-cluster 172.16.0.3 16379
1301:X 30 Mar 2024 12:46:38.901 * client 7fd8a5590000 auth in sentinel request
1301:X 30 Mar 2024 12:46:38.901 * The unknown#user 172.16.0.4:59129 ID = 3070852 db = 0 operation:client  SETNAME sentinel-620f46ae-cmd
1301:X 30 Mar 2024 12:46:39.515 * +sentinel-invalid-addr sentinel b37b32a5973d9cd20799c042a64ce5ac2df50bfc 172.16.0.4 26379 @ redis-cluster 172.16.0.3 16379
1301:X 30 Mar 2024 12:46:39.515 * +sentinel sentinel 620f46ae4bfd59be25a15a37cfa759762df24af1 172.16.0.4 26379 @ redis-cluster 172.16.0.3 16379
1301:X 30 Mar 2024 12:47:08.876 # +sdown sentinel b37b32a5973d9cd20799c042a64ce5ac2df50bfc 172.16.0.4 0 @ redis-cluster 172.16.0.3 16379
1301:X 30 Mar 2024 12:48:58.451 # +reset-master master redis-cluster 172.16.0.3 16379
1301:X 30 Mar 2024 12:49:00.032 * +sentinel sentinel 620f46ae4bfd59be25a15a37cfa759762df24af1 172.16.0.4 26379 @ redis-cluster 172.16.0.3 16379
1301:X 30 Mar 2024 12:49:03.740 * +sentinel sentinel a41bb11bb4dca81e126e2063a0bbfc0b3749f50e 172.16.0.20 26379 @ redis-cluster 172.16.0.3 16379
1301:X 30 Mar 2024 12:49:03.772 * client 7fd8a559f000 auth in sentinel request
1301:X 30 Mar 2024 12:49:03.772 * The unknown#user 172.16.0.20:45693 ID = 3070919 db = 0 operation:client  SETNAME sentinel-a41bb11b-cmd
1301:X 30 Mar 2024 12:49:08.390 * +slave slave 172.16.0.37:16379 172.16.0.37 16379 @ redis-cluster 172.16.0.3 16379
1301:X 30 Mar 2024 12:49:08.396 * +slave slave 172.16.0.21:16379 172.16.0.21 16379 @ redis-cluster 172.16.0.3 16379
1301:X 30 Mar 2024 12:52:01.882 # +sdown master redis-cluster 172.16.0.3 16379
1301:X 30 Mar 2024 12:52:01.959 # +odown master redis-cluster 172.16.0.3 16379 #quorum 2/2

sentinel 3 log

1340:X 30 Mar 2024 12:17:07.901 # +tilt #tilt mode entered
1340:X 30 Mar 2024 12:17:36.793 # -tilt #tilt mode exited
1340:X 30 Mar 2024 12:18:19.780 # +tilt #tilt mode entered
1340:X 30 Mar 2024 12:18:27.728 # +tilt #tilt mode entered
1340:X 30 Mar 2024 12:18:57.749 # -tilt #tilt mode exited
1340:X 30 Mar 2024 12:21:23.735 # +tilt #tilt mode entered
1340:X 30 Mar 2024 12:21:53.799 # -tilt #tilt mode exited
1340:X 30 Mar 2024 12:22:40.976 # +tilt #tilt mode entered
1340:X 30 Mar 2024 12:23:43.977 # -tilt #tilt mode exited
1340:X 30 Mar 2024 12:23:47.206 # +tilt #tilt mode entered
1340:X 30 Mar 2024 12:23:47.714 * client 7fb5f5e08000 auth in sentinel request
1340:X 30 Mar 2024 12:23:47.716 * The unknown#user 172.16.0.20:53649 ID = 3070558 db = 0 operation:client  SETNAME sentinel-a41bb11b-cmd
1340:X 30 Mar 2024 12:23:47.718 * client 7fb5f5e0d000 auth in sentinel request
1340:X 30 Mar 2024 12:23:47.718 * The unknown#user 172.16.0.36:50691 ID = 3070559 db = 0 operation:client  SETNAME sentinel-f100f703-cmd
1340:X 30 Mar 2024 12:23:47.720 * client 7fb5f5fb4000 auth in sentinel request
1340:X 30 Mar 2024 12:23:47.720 * The unknown#user 172.16.0.20:57405 ID = 3070586 db = 0 operation:client  SETNAME sentinel-a41bb11b-cmd
1340:X 30 Mar 2024 12:23:47.721 * client 7fb5f5fb9000 auth in sentinel request
1340:X 30 Mar 2024 12:23:47.721 * The unknown#user 172.16.0.36:47329 ID = 3070587 db = 0 operation:client  SETNAME sentinel-f100f703-cmd
1340:X 30 Mar 2024 12:23:47.723 * client 7fb5f5bd6000 auth in sentinel request
1340:X 30 Mar 2024 12:23:47.723 * The unknown#user 172.16.0.20:42215 ID = 3070612 db = 0 operation:client  SETNAME sentinel-a41bb11b-cmd
1340:X 30 Mar 2024 12:23:47.723 * client 7fb5f5bdb000 auth in sentinel request
1340:X 30 Mar 2024 12:23:47.723 * The unknown#user 172.16.0.36:38197 ID = 3070613 db = 0 operation:client  SETNAME sentinel-f100f703-cmd
1340:X 30 Mar 2024 12:24:00.164 # +tilt #tilt mode entered
1340:X 30 Mar 2024 12:24:22.903 # +tilt #tilt mode entered
1340:X 30 Mar 2024 12:24:26.128 * client 7fb5f5e3f000 auth in sentinel request
1340:X 30 Mar 2024 12:24:26.130 * The unknown#user 172.16.0.36:45843 ID = 3070653 db = 0 operation:client  SETNAME sentinel-f100f703-cmd
1340:X 30 Mar 2024 12:24:26.130 * client 7fb5f5e44000 auth in sentinel request
1340:X 30 Mar 2024 12:24:26.130 * The unknown#user 172.16.0.20:60009 ID = 3070654 db = 0 operation:client  SETNAME sentinel-a41bb11b-cmd
1340:X 30 Mar 2024 12:24:26.131 # +tilt #tilt mode entered
1340:X 30 Mar 2024 12:24:56.147 # -tilt #tilt mode exited
1340:X 30 Mar 2024 12:25:30.894 # +tilt #tilt mode entered
1340:X 30 Mar 2024 12:26:00.952 # -tilt #tilt mode exited
1340:X 30 Mar 2024 12:27:11.029 # +tilt #tilt mode entered
1340:X 30 Mar 2024 12:27:23.973 # +tilt #tilt mode entered
1340:X 30 Mar 2024 12:27:54.058 # -tilt #tilt mode exited
1340:X 30 Mar 2024 12:27:57.965 # +tilt #tilt mode entered
1340:X 30 Mar 2024 12:28:04.159 # +tilt #tilt mode entered
1340:X 30 Mar 2024 12:28:34.225 # -tilt #tilt mode exited
1340:X 30 Mar 2024 12:30:34.041 # +tilt #tilt mode entered
1340:X 30 Mar 2024 12:30:40.886 # +tilt #tilt mode entered
1340:X 30 Mar 2024 12:30:54.813 # +tilt #tilt mode entered
1340:X 30 Mar 2024 12:31:24.849 # -tilt #tilt mode exited
1340:X 30 Mar 2024 12:32:51.449 # +tilt #tilt mode entered
1340:X 30 Mar 2024 12:33:15.241 # +tilt #tilt mode entered
1340:X 30 Mar 2024 12:33:45.258 # -tilt #tilt mode exited
1340:X 30 Mar 2024 12:34:09.541 # +tilt #tilt mode entered
1340:X 30 Mar 2024 12:34:39.933 # -tilt #tilt mode exited
1340:X 30 Mar 2024 12:35:43.750 # +tilt #tilt mode entered
1340:X 30 Mar 2024 12:36:13.817 # -tilt #tilt mode exited
1340:X 30 Mar 2024 12:38:50.105 # +tilt #tilt mode entered
1340:X 30 Mar 2024 12:39:08.949 # +tilt #tilt mode entered
1340:X 30 Mar 2024 12:39:38.967 # -tilt #tilt mode exited
1340:X 30 Mar 2024 12:40:18.716 # +tilt #tilt mode entered
1340:X 30 Mar 2024 12:40:46.749 # -tilt #tilt mode exited
1340:X 30 Mar 2024 12:41:30.678 # +tilt #tilt mode entered
1340:X 30 Mar 2024 12:41:41.797 # +tilt #tilt mode entered
1340:X 30 Mar 2024 12:41:47.699 # +tilt #tilt mode entered
1340:X 30 Mar 2024 12:42:16.912 # +tilt #tilt mode entered
1340:X 30 Mar 2024 12:42:21.743 # +tilt #tilt mode entered
1340:X 30 Mar 2024 12:42:29.721 # +tilt #tilt mode entered
1340:X 30 Mar 2024 12:42:40.674 # +tilt #tilt mode entered
1340:X 30 Mar 2024 12:42:52.735 # +tilt #tilt mode entered
1340:X 30 Mar 2024 12:42:59.728 * client 7fb5f6295000 auth in sentinel request
1340:X 30 Mar 2024 12:42:59.732 * The unknown#user 172.16.0.36:60843 ID = 3071298 db = 0 operation:client  SETNAME sentinel-f100f703-cmd
1340:X 30 Mar 2024 12:42:59.755 * client 7fb5f5e4e000 auth in sentinel request
1340:X 30 Mar 2024 12:42:59.756 * The unknown#user 172.16.0.36:48729 ID = 3071317 db = 0 operation:client  SETNAME sentinel-f100f703-cmd
1340:X 30 Mar 2024 12:42:59.756 * client 7fb5f62d1000 auth in sentinel request
1340:X 30 Mar 2024 12:42:59.756 * The unknown#user 172.16.0.20:55755 ID = 3071318 db = 0 operation:client  SETNAME sentinel-a41bb11b-cmd
1340:X 30 Mar 2024 12:43:22.997 # +tilt #tilt mode entered
1340:X 30 Mar 2024 12:43:44.972 # +tilt #tilt mode entered
1340:X 30 Mar 2024 12:44:16.784 # +tilt #tilt mode entered
1340:X 30 Mar 2024 12:44:21.706 # +tilt #tilt mode entered
1340:X 30 Mar 2024 12:44:34.694 # +tilt #tilt mode entered
1340:X 30 Mar 2024 12:44:43.068 # +tilt #tilt mode entered
1340:X 30 Mar 2024 12:44:43.071 * client 7fb5f5cdd000 auth in sentinel request
1340:X 30 Mar 2024 12:44:43.074 * The unknown#user 172.16.0.36:58955 ID = 3071421 db = 0 operation:client  SETNAME sentinel-f100f703-cmd
1340:X 30 Mar 2024 12:44:43.074 * client 7fb5f5ce2000 auth in sentinel request
1340:X 30 Mar 2024 12:44:43.074 * The unknown#user 172.16.0.20:51851 ID = 3071422 db = 0 operation:client  SETNAME sentinel-a41bb11b-cmd
1340:X 30 Mar 2024 12:44:54.699 # +tilt #tilt mode entered
831:X 30 Mar 2024 12:46:37.508 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
831:X 30 Mar 2024 12:46:37.509 # Redis version=6.2.7, bits=64, commit=0b65d7f4, modified=1, pid=831, just started
831:X 30 Mar 2024 12:46:37.509 # Configuration loaded
831:X 30 Mar 2024 12:46:37.516 * monotonic clock: POSIX clock_gettime
831:X 30 Mar 2024 12:46:37.517 # A key '__redis__compare_helper' was added to Lua globals which is not on the globals allow list nor listed on the deny list.
@tianbin1001

tianbin1001 commented Apr 2, 2024

The issue appears to have been triggered by Sentinel entering TILT mode (sentinel.tilt), which then prevented the failover from being performed correctly. Sentinel enters TILT mode when the interval between two consecutive runs of its timer is negative or much longer than expected, which typically happens when the process is stalled by heavy load or the system clock jumps. While in TILT mode, Sentinel keeps monitoring but stops acting on what it sees, so it does not take part in failure detection or failover. From the logs, it seems that the node was oscillating between the subjectively down (sdown) and objectively down (odown) states.

@tianbin1001

tianbin1001 commented Apr 2, 2024

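Code (simplified from the TILT handling in Redis 6.2 sentinel.c; in the source, the exit check lives in sentinelHandleRedisInstance()):

void sentinelCheckTiltCondition(void) {
    mstime_t now = mstime();
    mstime_t delta = now - sentinel.previous_time;

    /* Check if we need to enter the TILT mode: the timer ran late or the
     * clock moved backwards. Each new trigger restarts the TILT period. */
    if (delta < 0 || delta > SENTINEL_TILT_TRIGGER) {
        sentinel.tilt = 1;
        sentinel.tilt_start_time = mstime();
        sentinelEvent(LL_WARNING,"+tilt",NULL,"#tilt mode entered");
    }
    sentinel.previous_time = mstime();

    /* Check if we need to exit the TILT mode: a full SENTINEL_TILT_PERIOD
     * has passed since the most recent trigger. */
    if (sentinel.tilt &&
        mstime() - sentinel.tilt_start_time >= SENTINEL_TILT_PERIOD) {
        sentinel.tilt = 0;
        sentinelEvent(LL_WARNING,"-tilt",NULL,"#tilt mode exited");
    }
}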
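
To relate the TILT logic to the Sentinel 3 log, where several "+tilt" lines often appear before a single "-tilt", here is a minimal standalone sketch of the state machine. It is not Redis code: tilt_state, tilt_tick and tilt_demo are hypothetical names, and the constants assume the Redis 6.2 defaults of a 2000 ms trigger and a 30000 ms period.

```c
#include <assert.h>
#include <stdio.h>

/* Assumed to match SENTINEL_TILT_TRIGGER and SENTINEL_TILT_PERIOD in 6.2. */
#define TILT_TRIGGER_MS 2000LL
#define TILT_PERIOD_MS  30000LL

/* Hypothetical stand-in for the tilt-related fields of the sentinel state. */
typedef struct {
    int tilt;                /* 1 while in TILT mode */
    long long tilt_start_ms; /* time of the most recent +tilt */
    long long previous_ms;   /* time of the previous timer call */
} tilt_state;

/* One timer call at time now_ms.
 * Returns 1 on "+tilt" (entered or re-armed), -1 on "-tilt", 0 otherwise. */
int tilt_tick(tilt_state *s, long long now_ms) {
    assert(s != NULL);
    long long delta = now_ms - s->previous_ms;
    s->previous_ms = now_ms;

    /* A late timer call or a backwards clock jump (re)starts TILT mode. */
    if (delta < 0 || delta > TILT_TRIGGER_MS) {
        s->tilt = 1;
        s->tilt_start_ms = now_ms;
        return 1;
    }
    /* TILT ends only after a full quiet period since the last trigger. */
    if (s->tilt && now_ms - s->tilt_start_ms >= TILT_PERIOD_MS) {
        s->tilt = 0;
        return -1;
    }
    return 0;
}

/* Runs the timer every 100 ms for 60 s with two simulated 5 s stalls.
 * Returns the time of the "-tilt" event, or -1 if TILT was never exited. */
long long tilt_demo(void) {
    tilt_state s = {0, 0, 0};
    long long exit_ms = -1;
    long long now = 100;
    while (now <= 60000) {
        int ev = tilt_tick(&s, now);
        if (ev == 1) printf("%lld ms: +tilt\n", now);
        if (ev == -1) {
            printf("%lld ms: -tilt\n", now);
            exit_ms = now;
        }
        /* The process "freezes" for 5 s right after these two ticks. */
        now += (now == 10000 || now == 18000) ? 5000 : 100;
    }
    return exit_ms;
}
```

Calling tilt_demo() from a main function logs two "+tilt" events followed by one "-tilt" 30 seconds after the second trigger. This is the same shape as 12:18:19 / 12:18:27 / 12:18:57 in the Sentinel 3 log. Under these assumptions, a Sentinel whose host stalls more often than once per 30 seconds would stay in TILT mode indefinitely.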
