New `CLAIMED` state? #188

jkeifer · 2022-11-02T22:15:36Z

Processing a payload currently takes three steps: first "claiming" the execution by setting the state to PROCESSING, then actually starting the step function execution, which returns an execution ARN, and finally calling set_processing() with that ARN to set the state to PROCESSING again, this time with the execution ARN attached.

It seems like this makes PROCESSING ambiguous, in the sense that it can represent both "gonna try to start processing" and "processing has started". In this sense I would propose we consider adding a CLAIMED state (or whatever better name makes sense here) to disambiguate between these two states.
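To make the proposal concrete, here is a minimal sketch of the claim-then-start flow with a separate CLAIMED state. It uses an in-memory dictionary in place of the real state database, and the names `StateStore`, `claim`, `process`, and `start_execution` are hypothetical, chosen for illustration only; only `set_processing` and the PROCESSING state come from the existing code.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Dict, Optional


class State(Enum):
    CLAIMED = "CLAIMED"        # proposed: claim taken, no execution started yet
    PROCESSING = "PROCESSING"  # an execution has started and its ARN is recorded


@dataclass
class Record:
    state: State
    execution_arn: Optional[str] = None


class StateStore:
    """In-memory stand-in for the state database (hypothetical API)."""

    def __init__(self) -> None:
        self.records: Dict[str, Record] = {}

    def claim(self, payload_id: str) -> bool:
        # Refuse the claim if the payload is already claimed or processing.
        if payload_id in self.records:
            return False
        self.records[payload_id] = Record(State.CLAIMED)
        return True

    def set_processing(self, payload_id: str, execution_arn: str) -> None:
        # Only a claimed payload may move on to PROCESSING.
        record = self.records[payload_id]
        if record.state is not State.CLAIMED:
            raise ValueError(f"{payload_id} is not claimed")
        self.records[payload_id] = Record(State.PROCESSING, execution_arn)


def process(
    store: StateStore,
    payload_id: str,
    start_execution: Callable[[str], str],
) -> Optional[str]:
    """Claim the payload, start an execution, then record its ARN."""
    if not store.claim(payload_id):
        return None  # someone else holds the claim
    arn = start_execution(payload_id)  # stand-in for the step function start
    store.set_processing(payload_id, arn)
    return arn
```

Note that if `start_execution` raises in this sketch, the record stays in CLAIMED with no execution ARN, which is exactly the stuck-lock case that would still need handling.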

One possible issue here is conflating payload state with execution state. Maybe this distinction is not relevant and/or useful, but CLAIMED is a state disconnected from an execution, whereas the rest of the states are derived specifically from an execution. In this sense, what the process is trying to do by claiming the execution is to set a lock.

In practice I am not sure this distinction is relevant, as we still need to know if locks are stuck for things that are actually not processing or otherwise in an unexpected state. But I think recognizing the difference is potentially useful in thinking about how to best handle this situation.

Is simply adding CLAIMED the best solution? I once proposed using a dynamo stream to trigger a lambda to do the actual step function start; maybe that fits here? Or is the problem really at the database layer, in that dynamo won't allow us to structure queries dynamically enough? SQL in an RDBMS would trivially allow separating payloads from their individual executions, with a way to query the payload's latest state via a join, and would allow using triggers to ensure a "lock" gets cleared by an execution starting.
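As a rough illustration of the RDBMS idea, the sketch below uses Python's built-in `sqlite3` module. The table names, column names, and the `clear_claim` trigger are all assumptions made up for this example, not an existing schema. It shows payloads stored separately from executions, a join that returns a payload's latest execution state, and a trigger that clears the claim when an execution row is inserted.

```python
import sqlite3

SCHEMA = """
CREATE TABLE payloads (
    payload_id TEXT PRIMARY KEY,
    claimed_at TEXT            -- non-NULL while a claim "lock" is held
);
CREATE TABLE executions (
    execution_arn TEXT PRIMARY KEY,
    payload_id    TEXT NOT NULL REFERENCES payloads(payload_id),
    state         TEXT NOT NULL,
    started_at    TEXT NOT NULL
);
-- Starting an execution clears the claim on its payload.
CREATE TRIGGER clear_claim AFTER INSERT ON executions
BEGIN
    UPDATE payloads SET claimed_at = NULL WHERE payload_id = NEW.payload_id;
END;
"""

# Join each payload to its most recent execution, if it has one.
LATEST_STATE = """
SELECT p.payload_id, p.claimed_at, e.state
FROM payloads AS p
LEFT JOIN executions AS e
  ON e.payload_id = p.payload_id
 AND e.started_at = (
     SELECT MAX(started_at) FROM executions WHERE payload_id = p.payload_id
 )
WHERE p.payload_id = ?
"""


def latest_state(conn: sqlite3.Connection, payload_id: str):
    """Return (payload_id, claimed_at, latest execution state or None)."""
    return conn.execute(LATEST_STATE, (payload_id,)).fetchone()


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# Claim the payload: a lock with no execution behind it yet.
conn.execute("INSERT INTO payloads VALUES ('p1', '2022-11-02T22:15:36Z')")
before = latest_state(conn, "p1")

# Recording the started execution fires the trigger and clears the claim.
conn.execute(
    "INSERT INTO executions VALUES "
    "('arn:example:1', 'p1', 'PROCESSING', '2022-11-02T22:15:40Z')"
)
after = latest_state(conn, "p1")
```

Here the claim lives on the payload row and the states live on execution rows, so the two are never conflated.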

jkeifer · 2023-08-02T22:10:06Z

I think this remark is pertinent:

> One possible issue here is conflating payload state with execution state. Maybe this distinction is not relevant and/or useful, but CLAIMED is a state disconnected from an execution, whereas the rest of the states are derived specifically from an execution. In this sense, what the process is trying to do by claiming the execution is to set a lock.

Perhaps what we really want here is a timestamp on the dynamo record that represents this "claim lock". If we try to process the payload again before the claim lock time has expired, then the request will fail, similar to how we currently check the state.

The advantage of using this mechanism rather than the CLAIMED state is that we get automatic expiration: records will not get stuck in this state, and thus it behaves like the lock we want.
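A minimal sketch of the expiring claim lock, assuming an in-memory dictionary in place of the dynamo record. The class name `ClaimLocks`, the method `try_claim`, and the `ttl_seconds` parameter are hypothetical. In DynamoDB itself the check and the write would presumably need to be one conditional write, so that two processes cannot both succeed.

```python
import time
from typing import Dict, Optional


class ClaimLocks:
    """In-memory stand-in for a per-record claim-lock timestamp."""

    def __init__(self, ttl_seconds: float) -> None:
        self.ttl_seconds = ttl_seconds
        # payload_id -> time at which the current claim expires
        self.expires_at: Dict[str, float] = {}

    def try_claim(self, payload_id: str, now: Optional[float] = None) -> bool:
        """Take the claim unless an unexpired claim already exists."""
        now = time.time() if now is None else now
        current = self.expires_at.get(payload_id)
        if current is not None and current > now:
            return False  # claim still held: the request fails
        # No claim, or the old one expired: take it with a fresh expiry.
        self.expires_at[payload_id] = now + self.ttl_seconds
        return True
```

For example, with `ttl_seconds=60`, a claim taken at time 0 blocks a second attempt at time 30 but not one at time 60, with no cleanup step needed in between.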

jkeifer mentioned this issue Nov 2, 2022

ProcessPayload() has a race condition due to setting failed on any exceptions #189

Open

jkeifer mentioned this issue Aug 4, 2023

Benchmark/analyze process timing and optimize timeout settings #167

Open
