Activities get stuck on "Created" if WORKER_GROUP doesn't exist / not running #3706

johnkm516 · 2024-05-13T04:10:53Z

Describe the issue

Activities that use WORKER_GROUP in Kestra Enterprise stay in the "Created" status indefinitely if the worker group is not running or doesn't exist. The activity ignores all timeouts, and the flow stays in the "Running" status until the user kills it.

Example:

id: worker_group_test
namespace: dev

labels:
  env: dev

tasks:

  - id: wait
    type: io.kestra.plugin.scripts.shell.Commands
    commands:
      - sleep 10
    docker: {}
    runner: PROCESS
    timeout: 1
    workerGroup:
      key: NONEXISTANT_WORKER_GROUP
  - id: print_status
    type: io.kestra.core.tasks.log.Log
    message: hello

Expected Behavior:

The activity should automatically fail if it stays in "Created" for a set amount of time, or it should respect the activity's own timeout.

Environment

Kestra Version: 0.16.6
Operating System (OS/Docker/Kubernetes): Docker

loicmathieu · 2024-05-13T08:26:44Z

The timeout is handled by the Worker. Since the worker group doesn't exist, no worker will ever pick up the task, so the timeout can never be hit.

We have an open issue in our internal repository about this, but I'll keep this public one open so you can get feedback.
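
To illustrate why a worker-enforced timeout cannot fire here, the following is a minimal sketch, not Kestra's actual code. The names `QueuedTask`, `GroupWorker`, and `dispatch` are hypothetical, and the model assumes each task is queued per worker group and that the timeout clock only starts once a worker of that group picks the task up.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class QueuedTask:
    # Hypothetical stand-in for a task run; not a Kestra class.
    task_id: str
    worker_group: str
    timeout_seconds: float
    state: str = "CREATED"


class GroupWorker:
    # Hypothetical worker that only consumes tasks queued for its own group.
    def __init__(self, group: str):
        self.group = group

    def poll(self, queues: dict) -> None:
        queue = queues.get(self.group, deque())
        while queue:
            task = queue.popleft()
            # In this model the timeout would only start being enforced here,
            # after a worker has claimed the task.
            task.state = "RUNNING"
            task.state = "SUCCESS"


def dispatch(tasks, workers):
    # Route each task to the queue of its worker group, then let workers poll.
    queues = {}
    for task in tasks:
        queues.setdefault(task.worker_group, deque()).append(task)
    for worker in workers:
        worker.poll(queues)
    return {task.task_id: task.state for task in tasks}
```

With a single worker in a `default` group, a task routed to `NONEXISTANT_WORKER_GROUP` is never polled, so in this model it stays in `CREATED` regardless of its `timeout_seconds`.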

johnkm516 · 2024-05-13T10:30:58Z

Hi @loicmathieu ,
Thank you for your response.

As worker groups can be on different server racks, I think there should be some sort of timeout at the executor, outside of the task execution, so that the flow fails if the task cannot be executed on the worker group. If the flow doesn't fail and instead continues indefinitely, it will be difficult to monitor and to know whether a flow is failing because of issues on a different VM or server rack where the worker group is located.
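
The executor-side timeout suggested above could look roughly like the sketch below. This is only an illustration under stated assumptions: `StuckCandidate`, `fail_unclaimed_tasks`, and `max_pending_seconds` are hypothetical names, not part of Kestra, and the sketch assumes the executor records when each task was created and can mark it as failed.

```python
from dataclasses import dataclass


@dataclass
class StuckCandidate:
    # Hypothetical record the executor would keep for each submitted task.
    task_id: str
    worker_group: str
    created_at: float  # seconds, on the same clock as `now` below
    state: str = "CREATED"


def fail_unclaimed_tasks(tasks, now, max_pending_seconds):
    """Fail every task that has waited in CREATED longer than the limit.

    Returns the ids of the tasks that were failed on this pass.
    """
    failed = []
    for task in tasks:
        waited = now - task.created_at
        if task.state == "CREATED" and waited > max_pending_seconds:
            # No worker claimed the task in time, so fail it from the
            # executor side instead of waiting for a worker-side timeout.
            task.state = "FAILED"
            failed.append(task.task_id)
    return failed
```

Running such a check periodically would let the flow fail, and be alerted on, even when no worker in the target group is alive to enforce the task's own timeout.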

johnkm516 added the bug Something isn't working label May 13, 2024

loicmathieu added this to the v0.18.0 milestone May 13, 2024

anna-geller assigned loicmathieu May 13, 2024
