New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Timeout Event Callback Triggered Later Than Expected Even With EVENT_BASE_FLAG_PRECISE_TIMER Flag Set #1484
Comments
Looks like everything should work. How do you calculate the timeout? Maybe you are affected by this bug #1474? Also can you try upstream libevent? |
Hi @azat ,
I have written a test file in C++ explaining what I have done with libevent, and the problem of unexpected delay of timeout event callback seems to be reproducible at least in my Intel NUC. Could you please have a try and see if the problem is reproducible in your computer? The miss request should be 0 if the timeout event of libevent performs as unexpected. Thank you in advance! //
// Created by tonyyxliu on 7/11/23.
//
// This is a test for the problem related to the unexpected delay of libevent
// timeout event callback trigger
//
// Compile this program in release mode and link it with two libraries: 'event' and 'glog'
//
#include <iostream>
#include <event2/buffer.h>
#include <event2/event.h>
#include <event2/util.h>
#include <glog/logging.h>
unsigned int num_issue_attempts = 0;
unsigned int num_missed_requests = 0;
double request_interval =
0.0001; // in seconds (=100 microseconds, equivalent to QPS=10k)
double test_duration = 2; // in seconds
event_base *m_base = nullptr;
event *m_issue_event = nullptr;
event *m_wrapup_event = nullptr;
timeval next_issue_time{};
timeval next_wrapup_time{};
/**
* @brief Transfer from timeval struct to double (in seconds)
* @param tv
* @return
*/
double timeval_to_double(timeval *tv)
{
return tv->tv_sec + static_cast<double>(tv->tv_usec) / 1e+06;
}
/**
* @brief Transfer from double (in seconds) to timeval struct
* @param time_val
* @param tv
* @return
*/
void double_to_timeval(double time_val, timeval *tv)
{
auto secs = static_cast<long long>(time_val);
tv->tv_sec = secs;
tv->tv_usec = static_cast<int>((time_val - secs) * 1e+06);
}
/**
* Add a event for listening and update the next time to tick this event
* @param timeout_event
* @param next_tick_time
* @param tick_interval
* @return true on success, false on error (timeout later than expected)
*/
void add_timeout_event(event *const timeout_event, timeval &next_tick_time,
const double next_tick_interval)
{
timeval now{}, time_before_next_tick{};
gettimeofday(&now, nullptr);
// Trigger the timeout event immediately if the tick interval is 0
if (next_tick_interval == 0)
{
event_active(timeout_event, 0, 0);
return;
}
// Update the next time of tick
double_to_timeval(timeval_to_double(&next_tick_time) + next_tick_interval,
&next_tick_time);
evutil_timersub(&next_tick_time, &now, &time_before_next_tick);
#ifdef DEBUG
LOG(INFO) << "tick_interval is " << next_tick_interval << " seconds.";
LOG(INFO) << "Time before next tick is: " << time_before_next_tick.tv_sec
<< "s and " << time_before_next_tick.tv_usec << "usec.";
#endif
// It may happen that the current time is later than time next_tick_time,
// resulting in a request miss
if (time_before_next_tick.tv_sec < 0)
{
num_missed_requests++;
add_timeout_event(timeout_event, next_tick_time, next_tick_interval);
}
else
{
// No need to delete the event since new timeout will replace the old
// one
evtimer_add(timeout_event, &time_before_next_tick);
}
}
/**
* Callback function for tick event.
* It sends the requests with the help of protocol and trigger the tick()
* function to set another tick timeout event.
* @param fd: socket
* @param events: events
* @param arg: can be transformed to the pointer of Connection class
*/
void timeout_callback(evutil_socket_t fd, short events, void *arg)
{
num_issue_attempts++;
#ifdef DEBUG
LOG(INFO) << "Issue_callback function triggered.";
#endif
// Set timeout event again for the next issue
add_timeout_event(m_issue_event, next_issue_time, request_interval);
// Send requests here
}
/**
* Callback function for the wrapup event. The timeout is set as the duration
* of the connection / load generation. When this event is triggered, the
* connection should be closed and load generation finished.
* @param fd: socket
* @param events: events
* @param arg: can be transformed to the pointer of Connection class
*/
void wrapup_callback(evutil_socket_t fd, short events, void *arg)
{
#ifdef DEBUG
LOG(INFO) << "Test Wrapup";
#endif
// Delete timeout events
evtimer_del(m_issue_event);
evtimer_del(m_wrapup_event);
// Stop listening for events
event_base_loopbreak(m_base);
// Resource destruction
event_free(m_issue_event);
event_free(m_wrapup_event);
event_base_free(m_base);
// Statistics logging
LOG(INFO) << "Number of issue attempts: " << num_issue_attempts;
LOG(INFO) << "Number of missed requests: " << num_missed_requests;
LOG(INFO) << "Miss request: "
<< 100.0 * num_missed_requests /
(num_missed_requests + num_issue_attempts)
<< "%";
// Print the version of libevent
LOG(INFO) << "libevent linked version: " << event_get_version() << "\n";
LOG(INFO) << "libevent headers version: " << LIBEVENT_VERSION << "\n";
LOG(INFO) << "libevent version number: " << std::hex
<< LIBEVENT_VERSION_NUMBER << "\n";
}
int main()
{
event_config *m_event_config = event_config_new();
event_config_set_flag(m_event_config, EVENT_BASE_FLAG_PRECISE_TIMER);
m_base = event_base_new_with_config(m_event_config);
m_issue_event = evtimer_new(m_base, timeout_callback, nullptr);
m_wrapup_event = evtimer_new(m_base, wrapup_callback, nullptr);
// Initialize the next time to tick timeout events as the current time,
// which will be updated in each iteration
gettimeofday(&next_issue_time, nullptr);
gettimeofday(&next_wrapup_time, nullptr);
// Activate the timeout events
add_timeout_event(m_wrapup_event, next_wrapup_time,
test_duration); // test lasts for 5 seconds
event_active(m_issue_event, 0, 0);
// Start listening for events
event_base_dispatch(m_base);
}
This is the result that I got in my Intel NUC. (BTW, the version of linking and headers are different for libevent-2.1.12-stable)
Sorry that I am not sure what this issue (lazy gettimeofday) is really about. It seems that I used the native
The problem still exists after upgrade libevent to the latest version (2.2.1-alpha). |
FWIW I've looked into this problem, and it looks like the problem is that epoll sometimes takes too much time, look: azat:.../tmp/1484 [1]$ sudo perf trace --duration 0.1 env LD_LIBRARY_PATH=$LD_LIBRARY_PATH ./test_azat
0.308 ( 0.291 ms): test_azat/21738 execve(filename: 0xffffe66b, argv: 0x7fffffffe3f8, envp: 0x555555560520) = 0
WARNING: Logging before InitGoogleLogging() is written to STDERR
I20230716 21:55:53.198009 21738 test_azat.cpp:88] tick_interval is 3000000 us.
I20230716 21:55:53.198060 21738 test_azat.cpp:89] Time before next tick is: 1s and 1999947usec.
2.553 I20230716 21:55:53.198233 21738 test_azat.cpp:111] Issue_callback function triggered.
( 0.155 ms): test_azat/21738 epoll_pwait2(epfd: 3<anon_inode:[eventpoll]>, events: 0x55555556e5a0, maxevents: 32, timeout: 0x7fffffffe1d8, sigsetsize: 8) = 0
I20230716 21:55:53.198246 21738 test_azat.cpp:88] tick_interval is 100 us.
I20230716 21:55:53.198256 21738 test_azat.cpp:89] Time before next tick is: -1s and 999809usec.
test_azat: test_azat.cpp:93: void add_timeout_event(event *const, timeval &, const long, bool): Assertion `time_before_next_tick.tv_sec >= 0' failed. P.S. sorry for the long delay, I thought that I will take a look at why epoll is takes that much time, but it is already 2 weeks left and I did not did this yet... Here is the full sources - https://gist.github.com/azat/f60871c83a32b7e31d8dfecb8f10f4a9 |
Dear libevent developers,
I am using libevent-2.1.12-stable in my project to do latency measurement for latency-critical applications like Memcached, and I wish that the callback function of the timeout event could be triggered exactly as I expected even when the timeout is with a precision of microseconds. However, I found that
The callback function of timeout event is sometimes triggered much later than expected
The following screenshot shows what happened, and the logging messages are printed by google-log.
Since I am sending requests to Memcached at a certain frequency (for example, 500 microseconds as shown in the screenshot), I compute the time that I have to wait before sending the next request (466 microseconds in this case), setting it as the timeout of the event by the following command, and finally send the request and prepare for the next tick event when the callback function of the timeout event is triggered.
However, as the screenshots shows, the timeout event callback is triggered after around 10000 microseconds, which is much later than expected (466 microseconds). As all the operations above are handled by the client, there is no server-side issue. Besides, the event_base is created with EVENT_BASE_FLAG_PRECISE_TIMER flag in the following way, which seems fine as well.
event_config *m_event_config = event_config_new(); event_config_set_flag(m_event_config, EVENT_BASE_FLAG_PRECISE_TIMER); event_base *m_base = event_base_new_with_config(m_event_config);
I have tried some other EVENT_BASE_FLAGs such as NO_CACHE_TIME but none of them solved my problem.
I am sorry that the reproducible scripts cannot be provided since only a very small number of timeout events behave unexpected, while others are completely fine. It makes me confused why the timeout event suddenly failed.
Does anyone have any idea about what caused this problem? Thank you very much for your kindness!
Yuxuan
The text was updated successfully, but these errors were encountered: