Replies: 2 comments
-
Same issue here. Is there any solution?
-
When a job is killed abruptly, Spring Batch does not get a chance to update its status in the job repository, so the status stays stuck at its last persisted value (STARTED, in the error quoted in this thread). I also wrote about this case, with an example, in this blog post: https://spring.io/blog/2021/01/27/spring-batch-on-kubernetes-efficient-batch-processing-at-scale#4-gracefulabrupt-shutdown-implication. Let me know if this helps.
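As a rough illustration of the manual clean-up this implies, here is a minimal sketch that marks one stuck execution as FAILED directly in the job repository tables so that the job can be launched again. It is not an official Spring Batch tool. It uses an in-memory SQLite database with a simplified subset of the BATCH_JOB_EXECUTION and BATCH_STEP_EXECUTION columns. The helper names `create_demo_schema` and `mark_execution_failed` are hypothetical, as are the choices to set EXIT_CODE to FAILED and to treat STARTING, STARTED and STOPPING steps as unfinished. Against a real repository you would run the equivalent UPDATE statements with your own database client, after confirming the process is really dead.

```python
import sqlite3
from datetime import datetime


def create_demo_schema(conn):
    """Create a simplified subset of the Spring Batch metadata tables (demo only)."""
    conn.executescript(
        """
        CREATE TABLE BATCH_JOB_EXECUTION (
            JOB_EXECUTION_ID INTEGER PRIMARY KEY,
            STATUS TEXT, EXIT_CODE TEXT, END_TIME TEXT, LAST_UPDATED TEXT
        );
        CREATE TABLE BATCH_STEP_EXECUTION (
            STEP_EXECUTION_ID INTEGER PRIMARY KEY,
            JOB_EXECUTION_ID INTEGER,
            STATUS TEXT, EXIT_CODE TEXT, END_TIME TEXT, LAST_UPDATED TEXT
        );
        """
    )


def mark_execution_failed(conn, job_execution_id):
    """Mark a stuck job execution and its unfinished steps as FAILED.

    Returns the number of job execution rows that were changed (0 or 1).
    """
    now = datetime.now().strftime("%Y-%m-%d %H:%M:%S.%f")[:-3]
    with conn:  # one transaction: both updates commit or neither does
        conn.execute(
            "UPDATE BATCH_STEP_EXECUTION "
            "SET STATUS = 'FAILED', EXIT_CODE = 'FAILED', END_TIME = ?, LAST_UPDATED = ? "
            "WHERE JOB_EXECUTION_ID = ? "
            "AND STATUS IN ('STARTING', 'STARTED', 'STOPPING')",  # assumed 'unfinished' set
            (now, now, job_execution_id),
        )
        cur = conn.execute(
            "UPDATE BATCH_JOB_EXECUTION "
            "SET STATUS = 'FAILED', EXIT_CODE = 'FAILED', END_TIME = ?, LAST_UPDATED = ? "
            "WHERE JOB_EXECUTION_ID = ? AND END_TIME IS NULL",  # skip finished executions
            (now, now, job_execution_id),
        )
    return cur.rowcount


# Demo: reproduce the stuck state from the error message, then repair it.
conn = sqlite3.connect(":memory:")
create_demo_schema(conn)
conn.execute(
    "INSERT INTO BATCH_JOB_EXECUTION VALUES (9211, 'STARTED', 'UNKNOWN', NULL, NULL)"
)
conn.execute(
    "INSERT INTO BATCH_STEP_EXECUTION VALUES (1, 9211, 'STARTED', 'EXECUTING', NULL, NULL)"
)
changed = mark_execution_failed(conn, 9211)
print("job executions repaired:", changed)
```

The sketch sets both STATUS and END_TIME so that the execution no longer looks like it is still running, and it leaves already finished executions untouched. Choosing FAILED rather than ABANDONED is what keeps the job instance restartable.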
-
Our project ran into an issue. The context is as follows:
Our project has some data synchronization jobs, deployed in a Kubernetes (k8s) cluster built by another team. Recently, the container running a job has repeatedly disappeared in the middle of execution, so the status of the running job never gets updated (the execution's exit status stays UNKNOWN and the step execution stays in an executing state). The next launch then fails with an error like the one below:
A job execution for this job is already running: JobExecution: id=9211, version=1, startTime=2024-01-09 02:03:04.677, endTime=null, lastUpdated=2024-01-09 02:03:04.678, status=STARTED, exitStatus=exitCode=UNKNOWN;
After investigation, we found that the job was terminated because the node hosting the container was terminated, and it will take the other team a long time to fix that.
So, from our side, is there an elegant way to reset the status of jobs that were running normally but terminated abnormally because the container was killed, so that they can run again?