New CLAIMED state? #188

Open
jkeifer opened this issue Nov 2, 2022 · 1 comment

Comments

Collaborator

jkeifer commented Nov 2, 2022

Processing a payload requires first "claiming" the execution by setting the state to PROCESSING, then actually starting the step function execution, which returns an execution ARN, and finally calling set_processing() with that ARN to set the state to PROCESSING again, this time with the execution ARN attached.
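To make the three steps concrete, here is a minimal in-memory sketch of the flow as described. Only `set_processing()` and the PROCESSING state come from this issue; `StateDB`, `Record`, `claim_processing()`, and `process()` are hypothetical names used for illustration and are not the project's actual API.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class Record:
    state: str
    execution_arn: Optional[str] = None


class StateDB:
    """Hypothetical in-memory stand-in for the state database."""

    def __init__(self) -> None:
        self.records: Dict[str, Record] = {}

    def claim_processing(self, payload_id: str) -> None:
        # Step 1: "claim" the payload. No execution exists yet.
        current = self.records.get(payload_id)
        if current is not None and current.state == "PROCESSING":
            raise RuntimeError(f"{payload_id} is already PROCESSING")
        self.records[payload_id] = Record(state="PROCESSING")

    def set_processing(self, payload_id: str, execution_arn: str) -> None:
        # Step 3: same state again, now with the execution ARN attached.
        self.records[payload_id] = Record("PROCESSING", execution_arn)


def process(
    db: StateDB, payload_id: str, start_execution: Callable[[str], str]
) -> str:
    db.claim_processing(payload_id)
    # Step 2: start the execution. If this raises, the record stays
    # PROCESSING with no ARN, which is the ambiguity discussed here.
    arn = start_execution(payload_id)
    db.set_processing(payload_id, arn)
    return arn
```

If `start_execution` fails between steps 1 and 3, the record is left in PROCESSING without an ARN and nothing in the state value itself says so.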

It seems like this makes PROCESSING ambiguous, in the sense that it can represent both "gonna try to start processing" and "processing has started". In this sense I would propose we consider adding a CLAIMED state (or whatever better name makes sense here) to disambiguate between these two states.
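As a rough illustration of what the extra state would buy, the sketch below models only the two states in question and checks the invariant that separates them. The `State` enum and `validate()` helper are hypothetical, and the other existing states are left out on purpose.

```python
from enum import Enum
from typing import Optional


class State(str, Enum):
    CLAIMED = "CLAIMED"        # proposed: "gonna try to start processing"
    PROCESSING = "PROCESSING"  # "processing has started"


def validate(state: State, execution_arn: Optional[str]) -> None:
    """Raise ValueError if the state and ARN contradict each other."""
    if state is State.CLAIMED and execution_arn is not None:
        raise ValueError("CLAIMED records must not have an execution ARN")
    if state is State.PROCESSING and execution_arn is None:
        raise ValueError("PROCESSING records must have an execution ARN")
```

With this split, a PROCESSING record without an ARN becomes an invalid combination rather than a legitimate intermediate state.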

One possible issue here is conflating payload state with execution state. Maybe this distinction is not relevant and/or useful, but CLAIMED is a state disconnected from an execution, whereas the rest of the states are derived specifically from an execution. In this sense, what process is trying to do with claiming the execution is to set a lock.

In practice I am not sure this distinction is relevant, as we still need to know if locks are stuck for things that are actually not processing or otherwise in an unexpected state. But I think recognizing the difference is potentially useful in thinking about how to best handle this situation.

Is simply adding CLAIMED the best solution? I once proposed using a dynamo stream to trigger a lambda to do the actual step function start; maybe that fits here? Or is the problem really at the database layer, in that dynamo won't allow us to structure queries dynamically enough? SQL in an RDBMS would trivially allow separating payloads from their individual executions, with a join to query the payload's latest state, and would allow using triggers to ensure a "lock" gets cleared when an execution starts.
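For comparison, here is a small sketch of the relational alternative using Python's built-in `sqlite3`. The table names, column names, and trigger are all assumptions made up for this example; it only aims to show payloads separated from executions, a join for the latest state, and a trigger that clears the claim once an execution row appears.

```python
import sqlite3
from typing import Optional, Tuple

SCHEMA = """
CREATE TABLE payloads (
    payload_id TEXT PRIMARY KEY,
    claimed_at TEXT            -- non-NULL means a claim "lock" is held
);
CREATE TABLE executions (
    execution_arn TEXT PRIMARY KEY,
    payload_id    TEXT NOT NULL REFERENCES payloads(payload_id),
    state         TEXT NOT NULL,
    started_at    TEXT NOT NULL  -- ISO 8601, so string order is time order
);
-- Starting an execution clears the claim on its payload.
CREATE TRIGGER clear_claim AFTER INSERT ON executions
BEGIN
    UPDATE payloads SET claimed_at = NULL
    WHERE payload_id = NEW.payload_id;
END;
"""

LATEST_STATE = """
SELECT p.claimed_at, e.state
FROM payloads AS p
LEFT JOIN executions AS e
  ON e.payload_id = p.payload_id
 AND e.started_at = (
       SELECT MAX(started_at) FROM executions
       WHERE payload_id = p.payload_id
     )
WHERE p.payload_id = ?
"""


def make_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def latest_state(
    conn: sqlite3.Connection, payload_id: str
) -> Optional[Tuple[Optional[str], Optional[str]]]:
    """Return (claimed_at, latest execution state), or None if unknown."""
    return conn.execute(LATEST_STATE, (payload_id,)).fetchone()
```

In this layout the claim lives on the payload row and the states live on execution rows, so the two concepts never share a column.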

Collaborator Author

jkeifer commented Aug 2, 2023

I think this remark from the original description is pertinent:

One possible issue here is conflating payload state with execution state. Maybe this distinction is not relevant and/or useful, but CLAIMED is a state disconnected from an execution, whereas the rest of the states are derived specifically from an execution. In this sense, what process is trying to do with claiming the execution is to set a lock.

Perhaps what we really want here is a timestamp on the dynamo record that represents this "claim lock". If we try to process the payload again before the claim lock has expired, the request will fail, similar to how we currently check the state.

The advantage of using this mechanism rather than the CLAIMED state is that we get automatic expiration: records will not get stuck in this state, and thus it behaves like the lock we want.
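A minimal sketch of such an expiring claim lock, using a plain dict in place of the database. The `claim_expires` attribute name, the 60-second TTL, and the `claim()` and `ClaimError` names are assumptions for illustration. Against DynamoDB the check and the write would presumably need to happen in one conditional write, with a condition along the lines of `attribute_not_exists(claim_expires) OR claim_expires < :now`, rather than the separate read and write shown here.

```python
import time
from typing import Dict, Optional

CLAIM_TTL_SECONDS = 60.0  # assumed value, not taken from the project


class ClaimError(Exception):
    """Raised when a payload is already claimed and the claim is live."""


def claim(
    records: Dict[str, dict],
    payload_id: str,
    now: Optional[float] = None,
    ttl: float = CLAIM_TTL_SECONDS,
) -> float:
    """Claim a payload and return the time at which the claim expires."""
    now = time.time() if now is None else now
    record = records.setdefault(payload_id, {})
    expires = record.get("claim_expires")
    # An unexpired claim blocks the request, like the current state check.
    if expires is not None and expires > now:
        raise ClaimError(f"{payload_id} is claimed until {expires}")
    # No claim, or the old one expired on its own: take the lock.
    record["claim_expires"] = now + ttl
    return record["claim_expires"]
```

Because expiry is only a comparison against the current time, a claim whose execution never started stops blocking without any cleanup job.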
