Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Java SDK seems to not play well with Replay tests. Setting child workflow ID using Temporal library seems to be throwing a non-deterministic error #2057

Open
gauravojha opened this issue May 8, 2024 · 5 comments

Comments

@gauravojha
Copy link

gauravojha commented May 8, 2024

Details attached on the temporal forum support request - https://community.temporal.io/t/workflowreplayer-replayworkflowexecution-throwing-nondeterministicexception-when-reading-workflow-id-from-workflow/12040/6

Copied from the post above

========

Here is a sample repo which replicates the error. The repo also has replay tests and sample workflow history if required under the src/test folder
https://github.com/gauravojha/learningTemporal/tree/master

So what we are trying to do is

(Also mentioned here, hope this helps - learningTemporal/src/main/java/dev/ojha/learningtemporal/workflows/HelloWorldWorkflowImpl.java at master · gauravojha/learningTemporal · GitHub)

  • Have a parent workflow trigger a child workflow
  • What we are trying to do is, reusing the parent workflows ID (this we get by using the provided Temporal utility method - Workflow.getInfo().getWorkflowId(), and then using this is a prefix and just append "-child" to it. So for instance if the parent workflow ID is "helloWorld", we are assigning our child workflow ID as "helloWorld-child", as this simplifies some usecases of tracing and audit for our usecases
    • When we do this, and replay the payload through the replay tests, we get an undeterministic error, I have attached the trace for this specific case at the end
    • Failure handling event 5 of type 'EVENT_TYPE_START_CHILD_WORKFLOW_EXECUTION_INITIATED' during replay. [TMPRL1100] Command COMMAND_TYPE_START_CHILD_WORKFLOW_EXECUTION doesn't match event EVENT_TYPE_START_CHILD_WORKFLOW_EXECUTION_INITIATED with EventId=5 on check workflowId with an expected value 'workflow_id_in_replay-child' and an actual value 'HelloWorldWorkflow-child'
  • However, if we don't reuse the parent's workflow ID, and let temporal use a random ID as the workflow ID of the child, then the replay tests pass.
@Quinn-With-Two-Ns
Copy link
Contributor

How are you replaying history? If you are using WorkflowReplayer.replayWorkflowExecution then when you construct WorkflowExecutionHistory you need to specify the workflow ID

@gauravojha
Copy link
Author

gauravojha commented May 9, 2024

How are you replaying history? If you are using WorkflowReplayer.replayWorkflowExecution then when you construct WorkflowExecutionHistory you need to specify the workflow ID

@Quinn-With-Two-Ns sorry, i am not sure i understand.. I am not doing anything complicated in my code if you see it. my workflow implementation is pretty simple, it only triggers a child workflow. I am not doing any special replaying in there. Any replay is being done in the replay test which is written using instructions at https://docs.temporal.io/dev-guide/java/durable-execution#add-replay-test . I don't see any specific instructions on providing a workflow ID here if i am providing a history extracted from temporal

This is the workflow implementation - https://github.com/gauravojha/learningTemporal/blob/652d2ba76e80b7d1a96946fae42b3d1dec0b3b3f/src/main/java/dev/ojha/learningtemporal/workflows/HelloWorldWorkflowImpl.java#L35-L40

This is the only place where any replay happens (in the WorkflowReplayer in a unit test) - https://github.com/gauravojha/learningTemporal/blob/652d2ba76e80b7d1a96946fae42b3d1dec0b3b3f/src/test/java/dev/ojha/learningtemporal/TemporalTesting.java#L45

From the forum question where this is originally posted, it seems to be a bug in the SDK? The repository I have attached is a very small repository which can be used to reproduce the error. The entire history is downloaded from temporal UI after a sample workflow got over from the above repro repo, so I think I am not constructing the history, it is a history which temporal gives me if I am not mistaken?

@Quinn-With-Two-Ns
Copy link
Contributor

Where you are calling the replayer you are not specifying the workflow ID. You need to specificy a workflow ID when using the replayer since it is not always in history.

        assertDoesNotThrow(() -> WorkflowReplayer.replayWorkflowExecution(eventHistoryFile,
                HelloWorldWorkflowImpl.class));

needs to be something like

        assertDoesNotThrow(() -> WorkflowReplayer.replayWorkflowExecution(WorkflowExecutionHistory.WorkflowExecutionHistory(WorkflowHistoryLoader.readHistory(eventHistoryFile), "YOUR WORKFLOW ID"),
                HelloWorldWorkflowImpl.class));

@gauravojha
Copy link
Author

gauravojha commented May 10, 2024

Got it.. Let me try @Quinn-With-Two-Ns, and thank you for pointing in the right direction, that worked on the sample repo I shared above :) .. i couldnt find this recommendation in the starting replay test documentation. If this is a critical statement, I feel this should be mentioned in the intial stages in the document, since as a new user I may end up only looking at the first recommendation from the doc?

Just adding some more color to this. So we wrote tests for 10 different workflows, and only one of them failed. Rest 9 passed, and we didn't have to pass in a workflow ID to 9 which passed, as we didn't get to the point that we may have to specify the workflow ID. If the following recommendation by you would have been mentioned in the initial stags of the replay test document, it would have helped us keep an eye out for it. Hope this helps?

You need to specificy a workflow ID when using the replayer since it is not always in history

@Quinn-With-Two-Ns
Copy link
Contributor

Yes I agree we should improve the documentation here. Passing the workflow ID is only important if the workflow uses the workflow ID as part of its logic which most don't.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants