Can't disable st2timersengine #6039

Open
DesireWithin opened this issue Oct 20, 2023 · 3 comments

@DesireWithin

SUMMARY

I followed the documentation (https://docs.stackstorm.com/reference/ha.html#blueprint-box) to install a highly available st2,
but I can't disable st2timersengine after adding the following to /etc/st2/st2.conf:

[timer]
enable = False

STACKSTORM VERSION

st2 3.8.0, on Python 3.6.9

OS, environment, install method

Ubuntu 18.04.6, installed via apt.

Steps to reproduce the problem

Add the configuration, then restart st2:

root@prod-stackstorm-03:/etc/apt/sources.list.d# tail -n 10 /etc/st2/st2.conf
...
db_name = st2
username = stackstorm
password = XXXX
compressors = zstd

[coordination]
url = redis://:Redis_XXXXX@10.XX.XX.XXX:6379

[timer]
enable = False

root@prod-stackstorm-03:/etc/st2# st2ctl restart
Failed to stop st2chatops.service: Unit st2chatops.service not loaded.
Failed to start st2chatops.service: Unit st2chatops.service not found.
##### st2 components status #####
st2actionrunner PID: 102513
st2actionrunner PID: 102515
st2actionrunner PID: 102517
st2actionrunner PID: 102519
st2actionrunner PID: 102521
st2actionrunner PID: 102523
st2actionrunner PID: 102525
st2actionrunner PID: 102527
st2actionrunner PID: 102529
st2actionrunner PID: 102531
st2api PID: 102539
st2stream PID: 102549
st2auth PID: 102559
st2garbagecollector PID: 102562
st2notifier PID: 102565
st2rulesengine PID: 102569
st2sensorcontainer PID: 102572
st2chatops is not running.
st2timersengine PID: 102577
st2workflowengine PID: 102580
st2scheduler PID: 102583

Expected Results

I expect st2timersengine not to be running.

Actual Results

st2timersengine is still running (PID 102577 in the output above), and I now have duplicate rule evaluations.
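
To confirm on each node whether a timers engine process is reported, the status output can be filtered. This is a minimal sketch that assumes the `st2timersengine PID: N` line format shown in the output above; the function name `count_timersengine` is hypothetical and not part of st2.

```shell
#!/bin/bash
# Count st2timersengine processes in `st2ctl status`-style text read from stdin.
# Assumption: each running process appears as "st2timersengine PID: <number>".
count_timersengine() {
  # grep -c prints the number of matching lines; it exits non-zero when the
  # count is 0, so "|| true" keeps the function usable under "set -e".
  grep -c '^st2timersengine PID: [0-9][0-9]*$' || true
}
```

On a node, this would be used as `st2ctl status | count_timersengine`. In an HA setup, the sum across all nodes should be 1 if only a single timers engine is meant to run.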

@nzlosh
Contributor

nzlosh commented Oct 20, 2023

What coordination backend are you using? I've observed this behaviour with an HA setup that uses redis cluster as the coordination backend. Until a fix is found and released, I'm using a workaround: a simple lock in the workflow, backed by the st2 kv store. (This could be adapted into an action that any workflow can call.)

version: 1.0

vars:
  - check_lock_delay: 2

tasks:
  write_execution_id:
    action: st2.kv.set
    input:
      key: <% ctx(st2).action %>_exec_id
      value: <% ctx(st2).action_execution_id %>
    next:
      - when: <% succeeded() %>
        do: wait_to_check_lock

  # Delay to allow all nodes to write to the kv store. Increase it if heavily loaded nodes take longer than the delay to write.
  wait_to_check_lock:
    action: core.local
    input:
      cmd: sleep <% ctx(check_lock_delay) %>
    next: 
      - when: <% succeeded() %>
        do: read_execution_id

  read_execution_id:
    action: st2.kv.get
    input:
      key: <% ctx(st2).action %>_exec_id
    next: 
      - when: <% succeeded() and result().result = ctx().st2.action_execution_id %>
        do: proceed

  proceed:
    action: core.local
    input: 
      cmd: echo "ONLY A SINGLE WORKFLOW SHOULD REACH HERE"
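
The lock in the workflow above is last-writer-wins: every duplicate execution writes its own id to the same key, waits, then reads the key back, and only the execution whose id survived continues. The same logic can be sketched outside StackStorm. This is an illustration only: a temporary file (`KV_FILE`) stands in for the st2 kv store, and the helper names `kv_set`, `kv_get` and `try_lock` are hypothetical.

```shell
#!/bin/bash
# Stand-in for the st2 datastore key; purely illustrative.
KV_FILE="${KV_FILE:-$(mktemp)}"

kv_set() { printf '%s' "$1" > "$KV_FILE"; }
kv_get() { cat "$KV_FILE"; }

# try_lock EXEC_ID [DELAY]
# Writes EXEC_ID, waits DELAY seconds (default 2, like check_lock_delay),
# then prints "proceed" if EXEC_ID is still the stored value, else "skip".
try_lock() {
  local exec_id="$1"
  local delay="${2:-2}"
  kv_set "$exec_id"
  sleep "$delay"
  if [ "$(kv_get)" = "$exec_id" ]; then
    echo proceed
  else
    echo skip
  fi
}
```

As in the workflow, the delay has to be longer than the spread between the duplicate writes; if a late writer arrives after an earlier execution has already read the key back, both will proceed.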

@DesireWithin
Author

Yes, I'm using redis as the coordination backend. I am looking for a solution that uses haproxy to monitor the st2timersengine process.

@DesireWithin
Author

I use keepalived to make sure only one st2timersengine is running.

MASTER config:

global_defs {
    # notification_email {
    #     your_email@example.com
    # }
    # notification_email_from keepalived@your_server.com
    # smtp_server localhost
    # smtp_connect_timeout 30
    router_id LVS_DEVEL
}

vrrp_script chk_program {
    script "/etc/keepalived/check_program.sh"
    interval 2
    weight -2
    fall 2
    rise 2
}

vrrp_instance VI_1 {
    state MASTER
    interface ens4
    virtual_router_id 51
    priority 101
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass Sts_platform
    }
    track_script {
        chk_program
    }
    notify_master "/etc/keepalived/start_program.sh"
    notify_backup "/etc/keepalived/stop_program.sh"
}

BACKUP config:

global_defs {
    # notification_email {
    #     your_email@example.com
    # }
    # notification_email_from keepalived@your_server.com
    # smtp_server localhost
    # smtp_connect_timeout 30
    router_id LVS_DEVEL
}

vrrp_script chk_program {
    script "/etc/keepalived/check_program.sh"
    interval 2
    weight -2
    fall 2
    rise 2
}

vrrp_instance VI_1 {
    state BACKUP
    interface ens4
    virtual_router_id 51
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass Sts_platform
    }
    track_script {
        chk_program
    }
    notify_master "/etc/keepalived/start_program.sh"
    notify_backup "/etc/keepalived/stop_program.sh"
}

scripts:
check_program.sh

#!/bin/bash

# is-active exits 0 only while the unit is active; --quiet suppresses its output.
if systemctl is-active --quiet st2timersengine.service; then
  echo "st2timersengine.service is running normally."
  exit 0
else
  echo "Error: st2timersengine.service is not running normally."
  exit 1
fi

start_program.sh

#!/bin/bash
systemctl restart st2timersengine.service

stop_program.sh

#!/bin/bash
systemctl stop st2timersengine.service
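
The start and stop scripts could also be folded into one script wired in through keepalived's generic `notify` option, which passes the instance type, the instance name and the new state as arguments. This is a sketch under assumptions, not a tested setup: the file name `/etc/keepalived/notify_timersengine.sh`, the function names and the `DRY_RUN` switch are hypothetical, and stopping the service on FAULT is a choice the configs above do not make.

```shell
#!/bin/bash
# Hypothetical /etc/keepalived/notify_timersengine.sh, enabled with:
#   notify "/etc/keepalived/notify_timersengine.sh"
# keepalived calls it as: <script> TYPE NAME STATE [PRIORITY]
SERVICE="st2timersengine.service"

# Map a VRRP state to the systemctl verb to apply to the service.
action_for_state() {
  case "$1" in
    MASTER)       echo restart ;;  # same verb as start_program.sh
    BACKUP|FAULT) echo stop ;;     # FAULT handling is an assumption
    *)            echo none ;;
  esac
}

handle_notify() {
  local state="$3"
  local action
  action=$(action_for_state "$state")
  if [ "$action" = "none" ]; then
    return 0
  fi
  # DRY_RUN=1 prints the command instead of running it.
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "systemctl $action $SERVICE"
  else
    systemctl "$action" "$SERVICE"
  fi
}

# Act only when called with keepalived's arguments.
if [ "$#" -ge 3 ]; then
  handle_notify "$@"
fi
```

Running `DRY_RUN=1 /etc/keepalived/notify_timersengine.sh INSTANCE VI_1 BACKUP 100` by hand shows which command a transition would trigger without touching the service.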
