Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prevent double flush due to race in SingleDirectoryDbLedgerStorage #4305

Merged
merged 1 commit into from
May 31, 2024

Conversation

michaeljmarshall
Copy link
Member

Disclaimer: I am not very familiar with this code, so please review carefully.

Motivation

We use the isFlushOngoing and the hasFlushBeenTriggered atomic booleans to prevent concurrent flushes. See:

// Write cache is full, we need to trigger a flush so that it gets rotated
// If the flush has already been triggered or flush has already switched the
// cache, we don't need to trigger another flush
if (!isFlushOngoing.get() && hasFlushBeenTriggered.compareAndSet(false, true)) {
// Trigger an early flush in background
log.info("Write cache is full, triggering flush");
executor.execute(() -> {
long startTime = System.nanoTime();
try {
flush();
} catch (IOException e) {
log.error("Error during flush", e);
} finally {
flushExecutorTime.addLatency(MathUtils.elapsedNanos(startTime), TimeUnit.NANOSECONDS);
}
});
}

By my reading of the code, it seems there is a brief period of time where the calling code will think it is valid to flush, even though there is a flush in progress.

Changes

In the SingleDirectoryDbLedgerStorage#swapWriteCache method modified by this PR, I propose a change to set isFlushOngoing to true before setting hasFlushBeenTriggered to false.

Master Issue: none

Note: I have not observed this race in an application,
but only noticed it when reading through the code.
Copy link
Member

@shoothzj shoothzj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@dlg99
Copy link
Contributor

dlg99 commented May 29, 2024

closing to re-trigger CI

@dlg99 dlg99 closed this May 29, 2024
@dlg99 dlg99 reopened this May 29, 2024
@dlg99 dlg99 merged commit 3c5123d into apache:master May 31, 2024
42 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants