Proposal: Alternate backends for Dapr Workflow #7127

Closed · cgillum opened this issue Oct 31, 2023 · 13 comments · Fixed by #7283

@cgillum
Contributor

cgillum commented Oct 31, 2023

In what area(s)?

/area runtime

Describe the proposal

This is a proposal for allowing users to configure alternate backend implementations for Dapr Workflow, beyond the default Actors backend.

Background

Dapr Workflow relies on the Durable Task Framework for Go (a.k.a. durabletask-go) as the core engine for executing workflows. This engine is designed to support multiple backend implementations. For example, the durabletask-go repo includes a SQLite implementation and the Dapr repo includes an Actors implementation. For Dapr Workflow users today, the Actors implementation is the only option.

The backend implementation is largely decoupled from the core engine and from the programming model that developers see. Rather, the backend primarily determines how workflow state is stored and how workflow execution is coordinated across replicas. In that sense, it is similar to Dapr's state store abstraction, except that it is designed specifically for workflows. All APIs and programming model features are the same regardless of which backend is used.
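
To make this separation concrete, here is a minimal sketch of the shape such a backend abstraction takes. This is a hypothetical illustration, not the actual durabletask-go interface (the real one, in that repo's backend package, is larger and uses different method names):

// Hypothetical sketch only: the real durabletask-go backend interface is
// larger and has different method names. The point is that state storage
// and cross-replica coordination sit behind an interface, while the engine
// and the programming model stay the same for every backend.
package backend

import "context"

type WorkflowBackend interface {
    // Lifecycle of the backing store (e.g., open a SQLite database,
    // or attach to the Dapr actors runtime).
    Start(ctx context.Context) error
    Stop(ctx context.Context) error

    // State storage: persist and load a workflow instance's history.
    SaveHistory(ctx context.Context, instanceID string, history []byte) error
    LoadHistory(ctx context.Context, instanceID string) ([]byte, error)

    // Coordination: hand out the next unit of work so that only one
    // replica executes a given workflow instance at a time.
    NextWorkItem(ctx context.Context) (instanceID string, err error)
}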

Proposal

This proposal is about opening up the possibility of using alternate durabletask-go backend implementations for Dapr Workflow. For example, the existing (POC) SQLite implementation could be used for local development and testing, or a custom, vendor-specific implementation could be used for specialized scenarios. Dapr Workflow would continue to use the Actors implementation by default, and there is currently no desire to change this.

To be clear, this is not a proposal to make the Dapr Workflow backend a full-fledged building block. Rather, support would be limited to backends compiled into either the durabletask-go project or the Dapr runtime, as the Actors backend is today. This scoping decision keeps the proposal simple and focused, but it could be revisited if there's sufficient interest in community-contributed workflow backends.

Also, this is not a proposal to make the APIs for Dapr Workflow pluggable, as we did at the start of the project (e.g., with a Temporal API implementation). The workflow management APIs would continue to target only the built-in Dapr Workflow implementation (and would not need to be updated to support alternate backends).

Lastly, this proposal doesn't recommend any specific alternate implementations. SQLite is mentioned as a practical example that could be made available to all Dapr users, since a (POC) implementation already exists. But there are other possibilities, such as a Postgres implementation, or even a gRPC-based implementation targeting a highly customized remote workflow backend.

Configuration options

There are currently two options we're considering for configuring the backend implementation. We're seeking community feedback on whether one option is preferred over the other, or whether there are other options we should consider. (A sketch of how either option could resolve to a backend implementation follows the list below.)

  1. Use a Dapr Workflow configuration setting to specify the backend implementation and its configuration. This option would build upon the proposed work to introduce workflow configuration settings in [Workflow] Make the Dapr workflow engine configurable #7089. For example, add backendType and backendConfiguration properties for selecting the alternate backend type and configuring it with backend-specific settings, as shown in the following example for SQLite:

    apiVersion: dapr.io/v1alpha1
    kind: Configuration
    metadata:
      name: daprConfig
    spec:
      workflow:
        # ...
        backendType: "sqlite"
        backendConfiguration:
          connectionString: "file::memory:" # use in-memory SQLite

    These would be optional settings, with the default being to use the Actors backend as it exists today.

  2. Use component config YAML files to specify the backend implementation and configuration settings. For example, a user could place a sqlite.yaml file in the components directory with the following contents:

    apiVersion: dapr.io/v1alpha1
    kind: Component
    metadata:
      name: sqlite
    spec:
      type: workflow.dapr.sqlite
      metadata:
      - name: connectionString
        value: "file::memory:" # use in-memory SQLite

    This option would be similar to how Dapr building blocks work today. The type property would select the backend implementation, and the metadata property would configure it. These settings would likewise be optional, with the default being to use the Actors backend as it exists today.

    One potential point of confusion with this option is that users might mistakenly think it can be used to swap out the workflow engine itself, for example replacing durabletask-go with a different workflow engine. That is not the case: only the backend implementation is configurable in this proposal, not the engine.
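
Under either option, the runtime ultimately receives a backend type string plus a set of backend-specific settings and must construct the matching backend, falling back to Actors by default. The sketch below shows what that dispatch could look like; the function, type, and constructor names are hypothetical, not taken from the Dapr codebase:

// Hypothetical sketch of backend selection for either configuration option.
// Names are illustrative; stubs stand in for the real implementations.
package workflow

import (
    "context"
    "fmt"
)

// Backend is trimmed to a single method so the sketch stays self-contained.
type Backend interface {
    Start(ctx context.Context) error
}

type actorsBackend struct{}

func (actorsBackend) Start(context.Context) error { return nil } // stub

type sqliteBackend struct{ connectionString string }

func (sqliteBackend) Start(context.Context) error { return nil } // stub

// NewBackend maps a backend type ("" means default) and its settings,
// gathered from either the Configuration or the Component YAML, to an
// instance. Unknown types fail fast at startup.
func NewBackend(backendType string, settings map[string]string) (Backend, error) {
    switch backendType {
    case "", "actors":
        return actorsBackend{}, nil // today's default, unchanged
    case "sqlite":
        return sqliteBackend{connectionString: settings["connectionString"]}, nil
    default:
        return nil, fmt.Errorf("unknown workflow backend type: %q", backendType)
    }
}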

@ItalyPaleAle
Contributor

I think supporting multiple backends via DTF-go is good.

I personally like the Component proposal more. Perhaps this could even be simplified to a component of type "workflow.dapr", such as:

apiVersion: dapr.io/v1alpha1
kind: Component
metadata:
  name: workflow
spec:
  type: workflow.dapr
  metadata:
  - name: backend
    value: sqlite
  - name: connectionString
    value: "file::memory:"

There can only be one component of type "workflow.dapr" in each app. If none is set, a default is created with Actors as the backend.
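
One possible reading of that rule, as a hypothetical sketch (invented helper, not actual Dapr validation code): scan the declared components, reject duplicates, and synthesize the Actors default when none is present.

// Hypothetical sketch of the "at most one workflow.dapr component" rule,
// with Actors synthesized as the default when none is declared.
package workflow

import "errors"

type Component struct {
    Name string
    Type string
}

func pickWorkflowComponent(all []Component) (Component, error) {
    var found []Component
    for _, c := range all {
        if c.Type == "workflow.dapr" {
            found = append(found, c)
        }
    }
    switch len(found) {
    case 0:
        // No declaration: fall back to the Actors backend.
        return Component{Name: "workflow", Type: "workflow.dapr"}, nil
    case 1:
        return found[0], nil
    default:
        return Component{}, errors.New("only one workflow.dapr component is allowed per app")
    }
}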

@yaron2
Member

yaron2 commented Oct 31, 2023

Overall it looks feasible. I just want to confirm one critical thing: users will not need to configure two separate databases in the form of an actor state store AND also a component / configuration for yet another database, correct?

For example, the existing (POC) SQLite implementation could be used for local development and testing

Users can configure SQLite today for Dapr Workflows for dev/test, and that is their only point of interaction with a component, so maybe I'm misunderstanding why this is needed for local development and testing.

@olitomlinson

Of the two solutions, I prefer the component-based one.

I also like this idea (particularly in-mem SQLite) as it will remove the dependency on the dapr placement service for users who just want to get up and running with Workflows quickly in their local env.

@ItalyPaleAle
Contributor

@yaron2

users will not need to configure two separate databases in the form of an actor state store AND also a component / configuration for yet another database, correct?

If they are not using actors as the backend, then with this proposal they don't need to create an actor state store to use workflows (of course, if they want to use actors, they still need one).

Users can configure SQLite today for Dapr Workflows for dev/test, and that is their only point of interaction with a component so maybe I'm misunderstanding why this is needed for local development and testing.

In this case, "dev/test" was meant for the DTF-go library itself. DTF-go currently supports Dapr Actors as well as SQLite as backends, and the tests in that repo use SQLite.

@cgillum
Contributor Author

cgillum commented Oct 31, 2023

If they are not using actors as backend, with this proposal they don't need to create an actor state store to use workflow (of course, if they want to use actors, they still need to)

Correct. It wouldn't make sense for users to have to configure more stores. If anything, this proposal should give users the possibility of configuring less infrastructure.

Users can configure SQLite today for Dapr Workflows for dev/test, and that is their only point of interaction with a component so maybe I'm misunderstanding why this is needed for local development and testing.

@yaron2 The difference in this case would be that you don't actually need the actors control plane either (for pure workflow use cases), reducing the overall footprint of a local dev/test setup. To be clear, this scenario isn't necessarily the focus of the proposal, but the proposal does create a few possibilities that we can subsequently explore, such as lighter-weight local dev setups, or even low-footprint, use-case-optimized production deployments, where it makes sense.

@yaron2
Member

yaron2 commented Oct 31, 2023

To be clear, this scenario isn't necessarily the focus of the proposal

That's a good clarification, because reducing the footprint of local dev/test isn't a strong argument here with placement being installed by default with our tooling.

ItalyPaleAle added this to the v1.13 milestone Nov 7, 2023
@artursouza
Member

I like @cgillum's proposal to use a component (option 2), where the workflow component can be configured just like any other component. We could also accept a dapr.workflow.default type. If a user declares more than one workflow component, then one of them must be picked as the default; this could be an annotation in the CRD, or a metadata entry such as defaultWorkflowEngine (see the sketch below). Declaring more than one workflow component is OK, as long as one is the default, and workflows can specify which engine to use by name, the same way we address any other component in Dapr.

The variation proposed by @ItalyPaleAle would lock us into a single workflow declaration forever and would be unique compared to other components today. Allowing extensibility can enable future scenarios we might not think of today, like having different workflows use different backends.
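
A hypothetical sketch of the default-selection rule described above; the "default" metadata key and the helper are invented for illustration, since the exact mechanism (CRD annotation vs. metadata entry) is still open:

// Hypothetical sketch of default selection when several workflow
// components are declared. The "default" metadata key is invented here;
// the actual mechanism (annotation vs. metadata) is still open.
package workflow

import (
    "errors"
    "strings"
)

type Component struct {
    Name     string
    Type     string            // e.g. "workflow.dapr.sqlite"
    Metadata map[string]string // e.g. {"default": "true"}
}

// defaultWorkflow returns the component used when a workflow does not name
// a backend explicitly: the sole declaration, or the one flagged default.
func defaultWorkflow(all []Component) (Component, error) {
    var wf []Component
    for _, c := range all {
        if strings.HasPrefix(c.Type, "workflow.") {
            wf = append(wf, c)
        }
    }
    if len(wf) == 1 {
        return wf[0], nil
    }
    for _, c := range wf {
        if c.Metadata["default"] == "true" {
            return c, nil
        }
    }
    return Component{}, errors.New("declare exactly one workflow component or mark one as default")
}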

@ASHIQUEMD
Contributor

/assign

@DeepanshuA
Contributor

I like the 2nd option of using a Component-based CRD.
But should we use workflow.backend.sqlite as the type, so that users know explicitly that it is specifically for the workflow backend?

@ASHIQUEMD
Contributor

ASHIQUEMD commented Dec 4, 2023

Based on the comments, I am going ahead with the component-based (option 2) implementation.

We could also accept a dapr.workflow.default type. If a user declares more than one workflow component, then one of them must be picked as the default; this could be an annotation in the CRD, or a metadata entry such as defaultWorkflowEngine. Declaring more than one workflow component is OK, as long as one is the default, and workflows can specify which engine to use by name, the same way we address any other component in Dapr.

@artursouza Is your suggestion to make changes in workflow engine or workflow backend?

@artursouza
Member

My suggestion is to make it a different backend, but the terminology does not need to include "backend" or "engine" in the component type. They can be instantiated as different components even though the entire engine is reused, the same way Kafka code is reused across multiple components.

@ItalyPaleAle
Contributor

Can we initialize multiple DTF-go instances today? I know they register a new gRPC service with the Dapr gRPC server. @cgillum to chime in here.

@ASHIQUEMD
Contributor

@artursouza Thank you for your input, and I understand your perspective. After careful consideration, it seems that combining the workflow engine and the workflow backend into the same component may not be feasible. The workflow engine component has its own set of APIs, including Start, Terminate, and RaiseEvent, while the backend component only requires the init API for initialization.

Given these differing requirements, I believe it's more practical to keep the workflow engine and workflow backend as separate components. This approach allows us to maintain the distinct functionalities of each component without compromising on their specific APIs.
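
To illustrate the distinction being drawn, here is a hypothetical sketch of the two API surfaces as Go interfaces. The method names echo the comment above; they are not the actual Dapr interfaces:

// Hypothetical sketch of the separation described above: the engine
// component carries the workflow management APIs, while a backend
// component only needs initialization.
package workflow

import "context"

type Engine interface {
    Start(ctx context.Context, instanceID string, input []byte) error
    Terminate(ctx context.Context, instanceID string) error
    RaiseEvent(ctx context.Context, instanceID, eventName string, payload []byte) error
}

type WorkflowBackend interface {
    Init(ctx context.Context, metadata map[string]string) error
}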
