Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support large input collections for Multi-instance #12692

Merged
merged 5 commits into from May 11, 2023

Conversation

remcowesterhoud
Copy link
Contributor

@remcowesterhoud remcowesterhoud commented May 8, 2023

Description

This PR makes it so the child elements of a multi-instance are activated in batches. This will prevent the activation from exceeding the batch size.

The way it works is by writing ProcessInstanceBatchRecord.ACTIVATE command. This command contains an index, which is the total amount of child elements that need to be activated.
During the processing of this command we will check if another ProcessInstance.ACTIVATE command will still fit in the batch. If it does we write and continue with the next one.
If the command doesn't fit we will write a followup ProcessInstanceBatchRecord.ACTIVATE command instead. In this command the index will be the total amount of children - the already activated amount of children.
This keeps repeating until the index reaches 0. At this point we'll have written an ACTIVATE command for all element of the input collection.

Only back porting to 8.1 and 8.2, as these records are not available on 8.0

Related issues

closes #2890

Definition of Done

Not all items need to be done depending on the issue and the pull request.

Code changes:

  • The changes are backwards compatibility with previous versions
  • If it fixes a bug then PRs are created to backport the fix to the last two minor versions. You can trigger a backport by assigning labels (e.g. backport stable/1.3) to the PR, in case that fails you need to create backports manually.

Testing:

  • There are unit/integration tests that verify all acceptance criterias of the issue
  • New tests are written to ensure backwards compatibility with further versions
  • The behavior is tested manually
  • The change has been verified by a QA run
  • The impact of the changes is verified by a benchmark

Documentation:

  • The documentation is updated (e.g. BPMN reference, configuration, examples, get-started guides, etc.)
  • If the PR changes how BPMN processes are validated (e.g. support new BPMN element) then the Camunda modeling team should be informed to adjust the BPMN linting.

Other teams:
If the change impacts another team an issue has been created for this team, explaining what they need to do to support this change.

Please refer to our review guidelines.

Verifies that a multi instance with a large input collection can activate all it's
 children without running out of batch size.
 The test also verifies that batching is used for the activating.
Improves the assertion message by explaining what the next steps are if it fails.
Copy link
Member

@koevskinikola koevskinikola left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Awesome reuse of the ProcessInstanceBatch record @remcowesterhoud.

The changes look good. I just left some optional feedback that shouldn't block you from merging this PR.

Comment on lines +83 to +104
private void writeFollowupBatchCommand(
final ProcessInstanceBatchRecord recordValue, final long index) {
final var nextBatchRecord =
new ProcessInstanceBatchRecord()
.setProcessInstanceKey(recordValue.getProcessInstanceKey())
.setBatchElementInstanceKey(recordValue.getBatchElementInstanceKey())
.setIndex(index);
final long key = keyGenerator.nextKey();
commandWriter.appendFollowUpCommand(key, ProcessInstanceBatchIntent.ACTIVATE, nextBatchRecord);
}

private boolean canWriteCommands(
final TypedRecord<ProcessInstanceBatchRecord> record,
final ProcessInstanceRecord childInstanceRecord) {
// We must have space in the batch to write both the ACTIVATE command as the potential
// follow-up batch command. An excessive 8Kb is added to account for metadata. This is way
// more than will be necessary.
final var expectedCommandLength =
record.getLength() + childInstanceRecord.getLength() + (1024 * 8);
return commandWriter.canWriteCommandOfLength(expectedCommandLength);
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💭 I wonder if it makes sense to extract these two methods in a *Behavior class since there are two somewhat similar methods in the TerminateProcessInstanceBatchProcessor, and I assume we might have further use of the ProcessInstanceBatch record in the future, so we might duplicate these methods even more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking about this as well. Thing is, in the end it's only 2 lines of code. One of which is used to determine the expectedCommandLength. This is different for both of the processors we have now. Moving this to a behavior class would mean that we require a parameter getting the expectedCommandLength. So in the end all the behavior method will do is make a call to the command writer.

💭 I guess we could argue that the + 8KB is duplicated and could be extracted from the calculation. To be honest, at this time I don't really care. Let's see if we start using this command more often and how similar the other checks will become.

Adds the ActivateProcessInstanceBatchProcessor and registers it with the Engine.

This processor is responsible to handle the ProcessInstanceBatch.ACTIVATE commands.
It is used for activating child element instance of a multi instance element in a batch-like manner.
The index in the command is used to indicate how much children need to be activated.
If the amount of children will exceed the max message size a followup batch command is written.
The index of the followup command is decremented for each of the activate commands that have been send.
A batch ACTIVATE operation works different from a TERMINATE batch operation.
During a TERMINATE the index is the key of a child element instance. For ACTIVATE it is a simple counter,
 keeping track of how many ACTIVATE commands we still need to write.
Instead of writing all the ACTIVATE commands for all children at once, we now use the
ProcessInstanceBatch.ACTIVATE command instead.

This command takes an index, which in this case is the amount of children that an ACTIVATE command will be written for.
@remcowesterhoud
Copy link
Contributor Author

bors merge

@zeebe-bors-camunda
Copy link
Contributor

Build succeeded:

@zeebe-bors-camunda zeebe-bors-camunda bot merged commit 70db923 into main May 11, 2023
32 checks passed
@zeebe-bors-camunda zeebe-bors-camunda bot deleted the 2890_large_input_collection branch May 11, 2023 18:01
@backport-action
Copy link
Collaborator

Successfully created backport PR for stable/8.1:

@backport-action
Copy link
Collaborator

Successfully created backport PR for stable/8.2:

zeebe-bors-camunda bot added a commit that referenced this pull request May 11, 2023
12742: [Backport stable/8.1] Support large input collections for Multi-instance r=remcowesterhoud a=backport-action

# Description
Backport of #12692 to `stable/8.1`.

relates to #2890

Co-authored-by: Remco Westerhoud <remco@westerhoud.nl>
zeebe-bors-camunda bot added a commit that referenced this pull request May 11, 2023
12743: [Backport stable/8.2] Support large input collections for Multi-instance r=remcowesterhoud a=backport-action

# Description
Backport of #12692 to `stable/8.2`.

relates to #2890

Co-authored-by: Remco Westerhoud <remco@westerhoud.nl>
@oleschoenburg oleschoenburg added the version:8.2.5 Marks an issue as being completely or in parts released in 8.2.5 label May 16, 2023
zeebe-bors-camunda bot added a commit that referenced this pull request May 23, 2023
12742: [Backport stable/8.1] Support large input collections for Multi-instance r=remcowesterhoud a=backport-action

# Description
Backport of #12692 to `stable/8.1`.

relates to #2890

Co-authored-by: Remco Westerhoud <remco@westerhoud.nl>
zeebe-bors-camunda bot added a commit that referenced this pull request Jun 1, 2023
12742: [Backport stable/8.1] Support large input collections for Multi-instance r=remcowesterhoud a=backport-action

# Description
Backport of #12692 to `stable/8.1`.

relates to #2890

Co-authored-by: Remco Westerhoud <remco@westerhoud.nl>
@oleschoenburg oleschoenburg added the version:8.1.13 Marks an issue as being completely or in parts released in 8.1.13 label Jun 7, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport stable/8.2 Backport a pull request to 8.2.x version:8.1.13 Marks an issue as being completely or in parts released in 8.1.13 version:8.2.5 Marks an issue as being completely or in parts released in 8.2.5
Projects
None yet
Development

Successfully merging this pull request may close these issues.

I can spawn inner instances for a large input collection
4 participants