SQL Support for Backlog Counter #5915

Shivs11 · 2024-05-13T21:23:47Z

What changed?

The backlog counter support added in #5593 makes Cassandra persist the backlog count when tasks are created. SQL databases do not do the same, because we don't persist task-queue-related information while creating tasks. As a result, if we fetch task queue information immediately after creating tasks, the backlog counter for SQL databases reads 0 even though there are tasks in the backlog.

To fix this, a Count(*) query is now issued whenever we fetch task queues.
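
A minimal in-memory sketch of the problem and the fix, not the actual persistence code. The names sqlStore, createTasks, countTasks and getTaskQueueBacklog are hypothetical and exist only for this illustration; countTasks stands in for the Count(*) query.

```go
package main

import "fmt"

// task is a hypothetical stand-in for a row in the SQL tasks table.
type task struct {
	queueID string
	taskID  int64
}

// sqlStore is a toy model of a SQL-backed task store in which queue
// metadata and task rows are kept separately.
type sqlStore struct {
	persistedBacklog map[string]int64 // queue metadata; not touched by createTasks
	tasks            []task           // task rows
}

func newSQLStore() *sqlStore {
	return &sqlStore{persistedBacklog: map[string]int64{}}
}

// createTasks only inserts task rows, so the persisted counter goes stale.
func (s *sqlStore) createTasks(queueID string, n int) {
	for i := 0; i < n; i++ {
		s.tasks = append(s.tasks, task{queueID: queueID, taskID: int64(len(s.tasks) + 1)})
	}
}

// countTasks plays the role of SELECT COUNT(*) over one queue's task rows.
func (s *sqlStore) countTasks(queueID string) int64 {
	var n int64
	for _, t := range s.tasks {
		if t.queueID == queueID {
			n++
		}
	}
	return n
}

// getTaskQueueBacklog returns the backlog seen when fetching the queue.
// With withCount=false it returns only the persisted (stale) value.
func (s *sqlStore) getTaskQueueBacklog(queueID string, withCount bool) int64 {
	if !withCount {
		return s.persistedBacklog[queueID]
	}
	return s.countTasks(queueID)
}

func main() {
	s := newSQLStore()
	s.createTasks("queue-a", 3)
	fmt.Println(s.getTaskQueueBacklog("queue-a", false)) // 0: stale persisted value
	fmt.Println(s.getTaskQueueBacklog("queue-a", true))  // 3: counted from task rows
}
```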

Why?

This is required for the backlog counter to be accurate for both Postgres and other SQL databases.

How did you test it?

Integration tests covering this functionality have been added. These also verify that things work as expected on the Postgres end.
The existing suite of test cases.

Potential risks

None, since the backlog counter work has not been released yet.

Is hotfix candidate?

No

Shivs11 · 2024-05-13T22:00:55Z

common/persistence/task_manager.go

+	if err != nil {
+		return nil, err
+	}
+	taskQueueInfo.ApproximateBacklogCount = max(taskQueueInfo.ApproximateBacklogCount, int64(numTasks))


For Cassandra: numTasks is 0 and the actual backlog value is in taskQueueInfo.ApproximateBacklogCount.

For SQL: vice versa. taskQueueInfo.ApproximateBacklogCount is 0 and numTasks holds the actual backlog value.
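
To make the two cases concrete, here is a small sketch of the same max-based merge. mergeBacklogCount is a hypothetical helper written for this illustration; the diff above uses max(...) inline.

```go
package main

import "fmt"

// mergeBacklogCount mirrors the max(...) expression in the diff above: it
// keeps whichever of the two sources actually carries the backlog value.
func mergeBacklogCount(persisted, counted int64) int64 {
	if counted > persisted {
		return counted
	}
	return persisted
}

func main() {
	// Cassandra-like case: the counter is persisted and the row count is 0.
	fmt.Println(mergeBacklogCount(12, 0)) // 12
	// SQL-like case: the persisted counter is 0 and the row count carries the value.
	fmt.Println(mergeBacklogCount(0, 12)) // 12
}
```

Since one of the two inputs is expected to be 0 in each case, taking the maximum selects the populated one without branching on the database type.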

Shivs11 · 2024-05-15T02:31:26Z

service/history/workflow/cache/cache.go

@@ -475,4 +475,4 @@ func makeCacheKey(
 		WorkflowKey: definition.NewWorkflowKey(namespaceID.String(), execution.GetWorkflowId(), execution.GetRunId()),
 		ShardUUID:   shardContext.GetOwner(),
 	}
-}


Running make goimports; make lint complained and required me to include this change.

Shivs11 · 2024-05-16T14:00:38Z

Note: Discussions with @dnr and @ShahabT reached a consensus that a config should be added so that customers can choose whether Count(*) is called. This will be added after this branch merges into shivam/backlog-count-updated, when the feature is about to be merged into main. The current branch is not up to date with main, which has recently undergone config changes, so adding the config at that point avoids repetitive work.
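
A rough sketch of what such a toggle could look like, purely as an assumption about the eventual shape. The type backlogReader and the fields useExactCount, approximate and countExact are hypothetical names; this PR does not define the config.

```go
package main

import "fmt"

// backlogReader is a hypothetical wrapper that picks between the persisted
// approximate counter and an exact row count, based on a config flag.
type backlogReader struct {
	useExactCount func() bool           // hypothetical config, read on every call
	approximate   int64                 // persisted approximate backlog counter
	countExact    func() (int64, error) // stands in for the Count(*) query
}

// backlogCount only pays for the exact count when the flag is enabled.
func (r backlogReader) backlogCount() (int64, error) {
	if !r.useExactCount() {
		return r.approximate, nil
	}
	return r.countExact()
}

func main() {
	enabled := false
	r := backlogReader{
		useExactCount: func() bool { return enabled },
		approximate:   0,
		countExact:    func() (int64, error) { return 7, nil },
	}

	n, _ := r.backlogCount()
	fmt.Println(n) // 0: flag off, approximate counter only

	enabled = true
	n, _ = r.backlogCount()
	fmt.Println(n) // 7: flag on, exact count used
}
```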

ShahabT · 2024-05-16T23:45:11Z

service/matching/backlog_manager.go

+	return int64(tasksPresent), nil
+}
+
+func (c *backlogManagerImpl) getApproximateBacklogCount(ctx context.Context) (int64, error) {


I think above this method we should describe the situations in which we may undercount or overcount (when CountTasksExact is not used). These situations come to mind:

1. Overcount: the backlog manager is killed without having the chance to persist the latest ack level.

2. Undercount: ownership transfers and the old owner writes more tasks to the backlog between the GetTaskQueue and UpdateTaskQueue calls of the new owner's lease takeover.

3. Undercount: the scenario which moving ackLevel to the db would have solved.

Shivs11 · 2024-05-17

I think for point (1), you mean the situation in which Cassandra randomly TTLs out tasks, which in turn leads to the counter being an overestimate, right?
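
For reference, a toy model of the overcount case raised in the review comment above (the backlog manager being killed before it persists the latest ack level). It assumes the counter is decremented in memory as tasks are acked and only reaches the database on an explicit persist; all type and method names here are hypothetical and are not taken from backlog_manager.go.

```go
package main

import "fmt"

// persistedQueue is a hypothetical model of the task queue row in the database.
type persistedQueue struct {
	ackLevel     int64
	backlogCount int64
}

// manager is a hypothetical in-memory owner of the queue.
type manager struct {
	db           *persistedQueue
	ackLevel     int64
	backlogCount int64
}

// load starts a manager from whatever was last persisted.
func load(db *persistedQueue) *manager {
	return &manager{db: db, ackLevel: db.ackLevel, backlogCount: db.backlogCount}
}

// ack advances the ack level and shrinks the counter in memory only.
func (m *manager) ack(n int64) {
	m.ackLevel += n
	m.backlogCount -= n
}

// persist writes the in-memory state back to the database.
func (m *manager) persist() {
	m.db.ackLevel = m.ackLevel
	m.db.backlogCount = m.backlogCount
}

func main() {
	db := &persistedQueue{ackLevel: 0, backlogCount: 5}

	old := load(db)
	old.ack(3) // 2 tasks really remain, but nothing has been persisted
	// The old manager is killed here, before persist() runs.

	next := load(db)
	fmt.Println(next.backlogCount) // 5: overcounts the real backlog of 2

	next.ack(3)
	next.persist()
	fmt.Println(load(db).backlogCount) // 2: correct once the ack level is persisted
}
```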

Shivs11 added 3 commits May 13, 2024 16:30

SQL with Count(*) implemented

f4cf61d

Lint errors

8de1b72

Updated the mock file

a8a2f91

Shivs11 commented May 13, 2024

fixed typo for postgresql by replacing ? with

9d1a811

Shivs11 marked this pull request as ready for review May 13, 2024 22:56

Shivs11 requested a review from a team as a code owner May 13, 2024 22:56

Shivs11 requested review from ShahabT, dnr and carlydf May 13, 2024 22:56

ShahabT reviewed May 14, 2024

common/persistence/persistence_interface.go (outdated, resolved)

common/persistence/task_manager.go (outdated, resolved)

Shivs11 added 4 commits May 14, 2024 11:18

Only querying Count(*) when user requests backlogInfo

a3ccd5d

Lint fixes

e90d4c1

Re-generated mock files to fix unit test

fe530fd

more lint fixes

511b017

Shivs11 force-pushed the shivam_SQL_Count_Support branch from 12883be to 511b017 Compare May 14, 2024 20:17

Shivs11 added 3 commits May 14, 2024 16:29

tried fixing goimports error

15f2416

Addressed comments, improved function names

7fd9523

Updated metric names

8f95ccb

Shivs11 commented May 15, 2024

Shivs11 added 3 commits May 14, 2024 22:37

Fixed broken unit test

030de7b

Returning an unimplemented error for non-SQL databases + updated unit test to verify the workings of this

a5bad76

Fixed lint errors

40e0ee2

ShahabT reviewed May 16, 2024

Shivs11 added 2 commits May 17, 2024 10:52

Addressed comments + better comments

3bb2c74

Fixing goimports issue

6a280f3

Shivs11 requested a review from ShahabT May 17, 2024 15:31

ShahabT approved these changes May 17, 2024

Shivs11 merged commit 4c03b85 into shivam/backlog-count-updated May 20, 2024
41 checks passed

Shivs11 deleted the shivam_SQL_Count_Support branch May 20, 2024 15:03
