Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

deadlock on event_del_nolock_() #1654

Open
legale opened this issue May 13, 2024 · 1 comment
Open

deadlock on event_del_nolock_() #1654

legale opened this issue May 13, 2024 · 1 comment

Comments

@legale
Copy link

legale commented May 13, 2024

aarch64 system kernel 5.4.164
libevent2 master fc9bfd2

I got flickering bug with deadlock on event_del_nolock_()

bt:

image

stuck on the libmosquito loop event EV_PERSIST callback.

This is callback

void mqtt_loop_callback(int fd, short event, void *arg) {
  // syslog2(LOG_INFO, "%s\n", __func__);
  int ret;

  struct timespec ts;
  ts.tv_sec = 0;
  ts.tv_nsec = 100000000; // 100ms

  int ret_misc = 0, ret_read = 0, ret_write = 0;

  fd_set read_fds, write_fds;

  FD_ZERO(&read_fds);
  FD_ZERO(&write_fds);

  FD_SET(fd, &read_fds);

  ret = locksafe_ms(mqtt_lock, 2000);
  if (ret){
    // syslog2(LOG_INFO, "%s END\n", __func__);
    return;
  }

  if (mosquitto_want_write(mosq)) {
    FD_SET(fd, &write_fds);
  }
  unlocksafe(mqtt_lock);

  ret = pselect(fd + 1, &read_fds, &write_fds, NULL, &ts, NULL);
  if (ret == -1) {
    syslog2(LOG_ERR, "pselect failed: returned: %d\n", ret);
    return;
  }

  ret = locksafe_ms(mqtt_lock, 1000);
  if (ret){
    // syslog2(LOG_INFO, "%s END\n", __func__);
    return;
  }

  if (FD_ISSET(fd, &read_fds)) {
    ret_read = mosquitto_loop_read(mosq, 1);
  }

  if (FD_ISSET(fd, &write_fds)) {
    ret_write = mosquitto_loop_write(mosq, 10);
  }

  mosquitto_loop_misc(mosq);
  unlocksafe(mqtt_lock);
  // syslog2(LOG_INFO, "%s END\n", __func__);
}

This is init function

static int setup_mqtt_loop_event(agent_config_t *ac) {
  syslog2(LOG_INFO, "%s\n", __func__);
  if (locksafe_ms(ac_events_lock, 1000)) return 1;
  if (locksafe_ms(mqtt_lock, 1000)) {
    unlocksafe(ac_lock);
    return 1;
  }

  ac->events.mqtt_loop.fd = mosquitto_socket(mosq);
  if (ac->events.mqtt_loop.fd == -1) {
    syslog2(LOG_ERR, "get mqtt fd failed\n");
    unlocksafe(mqtt_lock);
    unlocksafe(ac_events_lock);
    return 1;
  }

  if (event_initialized(&ac->events.mqtt_loop.ev)) {
    if (event_del(&ac->events.mqtt_loop.ev))
      syslog2(LOG_ERR, "unable to delete prev. mqtt_loop event\n");
  }

  event_assign(&ac->events.mqtt_loop.ev, ac->events.base, ac->events.mqtt_loop.fd, EV_READ | EV_WRITE | EV_PERSIST, mqtt_loop_callback, (void *)ac);
  event_add(&ac->events.mqtt_loop.ev, NULL);
  unlocksafe(mqtt_lock);
  unlocksafe(ac_events_lock);
  return 0;
}
@azat
Copy link
Member

azat commented May 18, 2024

Looks like lines does not match.

Anyway it is hard to say something based on provided samples. But you can try to compile with TSan at least, and see what it will report about deadlocks. Also if you have threads, please make that you enabled pthread support (evthread_use_pthreads())

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

No branches or pull requests

2 participants