Last status of misc check sometimes not updated in keepalived_check.data #2412
Comments
If the script returns a non-zero exit code (and the checker does not have misc_dynamic enabled), the last_status is only updated if the previous exit code was 0 (i.e. success). In other words, for a non-dynamic checker the "Last status" is only updated when the checker changes status from success to failure or vice versa. The reason for this is that the exit code changing from 1 to 2, or 2 to 1, or 100 to 1, etc., does not change the status of the checker, i.e. it remains in the failed state. The detail is in the misc_check_child_thread() function in the keepalived/check/check_misc.c source file. keepalived is functioning correctly, other than the If it is important for you that the
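The update rule described in this comment can be sketched as follows. This is a minimal model, not the code in check_misc.c: the struct `misc_checker_model` and the function `record_exit_code` are hypothetical names invented for illustration, and the model assumes a non-dynamic checker whose only state is up/down plus the stored exit code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a non-dynamic misc checker; not keepalived's real struct. */
struct misc_checker_model {
    bool up;          /* checker state: true = success, false = failed */
    int  last_status; /* exit code reported as "Last status" */
};

/* Record a script exit code as the comment describes:
 * last_status changes only when the up/down state changes. */
static void record_exit_code(struct misc_checker_model *c, int exit_code)
{
    assert(c != NULL);
    bool now_up = (exit_code == 0);

    if (now_up != c->up) {
        c->up = now_up;
        c->last_status = exit_code;
    }
    /* Otherwise the checker is still up or still failed,
     * so last_status keeps its previous value. */
}
```

With this model, a checker that starts up and sees exit codes 2 and then 1 ends with `last_status` still at 2, which matches the behavior reported in this issue.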
Thanks for explaining the behavior. What I'm trying to do is monitor the state of keepalived's checks so that I know not only whether but also why a given real server is down. The misc check is not my only check; I also use a BFD check. I therefore prefer to get the status of all checks from one place, namely keepalived. Moreover, this way I get the real state of the checks, not the state of diagnostic files. I agree that the check status (up/down) is more important, but the exit code provides some extra troubleshooting information, at least in my case. Last status is present in the state file anyway, so having it updated like "Last ran" would be great. By the way, have you considered including the checks' statuses (up/down at least) in the SNMP MIB? Dumping and parsing state files doesn't seem to be the best method for such a use case, and I guess it is not intended for that purpose.
@pwloc I hadn't considered adding checker information to the SNMP MIB (I find adding extra SNMP code very tedious!). I wonder if it would be more appropriate to use traps to notify status changes. I would appreciate your thoughts on this. If you or someone else would like to modify the code to add the extra SNMP functionality and submit a pull request, it would be very helpful.
I understand. SNMP is not an easy protocol to work with. As for SNMP traps, they don't fit my architecture. I need a place to query for the checks' status, so I would have to add a trap listener to hold the state. I prefer to stick with the data file. As for implementing SNMP features, I'm afraid you overestimate my C skills. I'll see if I can find someone else, but I find the chances slim.
Commit a4258a6 now updates the exit code if it changes, even if the checker is already down.
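As a rough illustration of the behavior this comment describes, the sketch below stores every exit code while still reporting up/down transitions separately. It is not taken from the commit itself: `misc_checker_sketch` and `store_exit_code` are hypothetical names, and the assumption is simply that the stored exit code and the up/down state are tracked independently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical checker model; names are illustrative, not keepalived's. */
struct misc_checker_sketch {
    bool up;          /* true = success, false = failed */
    int  last_status; /* exit code reported as "Last status" */
};

/* Store the exit code whenever it differs from the stored one,
 * even if the checker stays down. Returns true only when the
 * up/down state itself changed. */
static bool store_exit_code(struct misc_checker_sketch *c, int exit_code)
{
    assert(c != NULL);
    bool now_up = (exit_code == 0);
    bool state_changed = (now_up != c->up);

    c->up = now_up;
    if (exit_code != c->last_status)
        c->last_status = exit_code;

    return state_changed;
}
```

Under these assumptions, a failed checker whose script exit code goes from 2 to 1 stays down, reports no state change, and still records 1 as its last status.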
Describe the bug
I've got a virtual server configured, and its real_server has the following misc check configured. The check returns 0 or 1, as read from a file, or 2 if the file is not present.
Whenever the exit code changes between 2 and 1 (or a greater value), the change does not appear in keepalived_check.data. The data file is of course dumped again, and "Last ran" gets updated, but "Last status" does not.
When the exit code changes from 0 to 1 or 2 (or vice versa), there is no problem.
To Reproduce
Expected behavior
keepalived_check.data is updated every time the check's exit code changes.
Keepalived version
Distro (please complete the following information):
Details of any containerisation or hosted service (e.g. AWS)
Alpine running in a Docker container on Ubuntu 22.04.2