Create incidents to avoid blacklisted instances #5123
Labels
kind/toil
Categorizes an issue or PR as general maintenance, i.e. cleanup, refactoring, etc.
scope/broker
Marks an issue or PR to appear in the broker section of the changelog
Description
If a workflow instance is blacklisted, it can no longer be used. To avoid that, we should try to create an incident before it happens. We already do this, for example, before job activation: if we realize that we cannot activate the job because its variables are too large, we create an incident instead (#4420).
There are several other places where this can happen, in different ways. For example, in one of our last game days we kept accumulating variables, and at some point the workflow instance got stuck. It was blacklisted because we were not able to write the next variable record. See the game day summary https://confluence.camunda.com/display/ZEEBE/Game+Day+05.08.2020.
If we wrote an incident beforehand, the user could react and resolve the issue, and would not lose the workflow instance and its related data.
Another example is multi-instance: we still have the issue that we are not able to create large multi-instances (#2890). We blacklist the instance quite easily, which means the user is not able to resolve the problem. If we created an incident beforehand, the user could adjust the collection that is used to create the multi-instance. This would make the system more usable.
Blacklisting should only be used to prevent bugs from causing more issues. If we know the limitations of the system, we should handle them properly.
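As a rough illustration of the idea, the sketch below checks a known limit before writing a variable record and returns an incident instead of letting the write fail. It is not taken from the Zeebe code base: `IncidentGuard`, `Incident`, `checkVariableWrite`, and the `maxBytes` parameter are hypothetical names chosen for this example, and the actual size limit and incident API in the broker may look quite different.

```java
import java.nio.charset.StandardCharsets;
import java.util.Optional;

// Hypothetical sketch; none of these names are real Zeebe APIs.
final class IncidentGuard {

  /** Minimal stand-in for an incident that the user can inspect and resolve. */
  static final class Incident {
    final long workflowInstanceKey;
    final String message;

    Incident(long workflowInstanceKey, String message) {
      this.workflowInstanceKey = workflowInstanceKey;
      this.message = message;
    }
  }

  /**
   * Checks a known system limit before the variable record is written.
   * Returns an incident if the record would exceed maxBytes (an assumed,
   * caller-supplied limit), otherwise an empty Optional.
   */
  static Optional<Incident> checkVariableWrite(
      long workflowInstanceKey, String name, byte[] value, int maxBytes) {
    // Approximate record size: UTF-8 encoded name plus the raw value bytes.
    long size = (long) name.getBytes(StandardCharsets.UTF_8).length + value.length;
    if (size > maxBytes) {
      return Optional.of(
          new Incident(
              workflowInstanceKey,
              "Variable '" + name + "' needs " + size
                  + " bytes, which exceeds the limit of " + maxBytes + " bytes"));
    }
    return Optional.empty();
  }
}
```

A caller would run the check first and only write the record when the result is empty. If an incident is returned, the instance stays intact and the user can shrink the variable (or, in the multi-instance case, the input collection) and then resolve the incident.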