Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BufferTimeout with fair backpressure rework #3634

Draft
wants to merge 8 commits into
base: 3.5.x
Choose a base branch
from

Conversation

chemicL
Copy link
Member

@chemicL chemicL commented Nov 7, 2023

The fair backpressure variant of the bufferTimeout operator has been reworked to use a state machine with a minimum number of volatile variables eliminating potential data races, such as skipping the delivery when onNext and timeout happen concurrently or cancellation happens while onNext is delivered, etc.

Resolves #3531

@chemicL chemicL added the type/bug A general bug label Nov 7, 2023
@chemicL chemicL added this to the 3.5.12 milestone Nov 7, 2023
@chemicL chemicL self-assigned this Nov 7, 2023
@chemicL chemicL requested a review from a team as a code owner November 7, 2023 12:57
Copy link
Contributor

@OlegDokuka OlegDokuka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The issue sits deeper and as I suggested first requires complete rework.

To reproduce the issue try to modify discard setup as follows:

@Tag("slow")
public class OnDiscardShouldNotLeakTest {

	private static final int NB_ITERATIONS = 100_000;
	// add DiscardScenarios here to test more operators
	private static final DiscardScenario[] SCENARIOS = new DiscardScenario[] {
			DiscardScenario.fluxSource("bufferTimeout", f -> f.bufferTimeout(2, Duration.ofNanos(1), true).flatMapIterable(Function.identity())),
	};
       ...

the problem appears when multiple threads comes into play. If you have race between scheduled timeout task trying to flash the window and cancellation ,then some elements might be undischarged

@OlegDokuka
Copy link
Contributor

Although the above could be false negative, since the async task can be unawaited properly. Can you please doublecheck @chemicL

@OlegDokuka
Copy link
Contributor

I just remembered that it was one of the reasons why all the discard tests are in stress tests, therefore it could make sense to port part of them for bufferTimeout

@violetagg violetagg modified the milestones: 3.5.12, 3.5.13 Nov 14, 2023
@chemicL
Copy link
Member Author

chemicL commented Nov 22, 2023

@OlegDokuka the commit 0334959 adds significant improvements to the test suite that helps catch the racy situations. In the next commit I will add temporary fixes for the identified issues, but afterwards will follow up with a state machine implementation to eliminate the last one with timeout racing with draining.

@chemicL chemicL marked this pull request as draft November 22, 2023 13:07
@chemicL chemicL modified the milestones: 3.5.16, 3.5.17 Apr 9, 2024
@chemicL chemicL changed the title Handling late arriving drain operations in bufferTimeout BufferTimeout with fair backpressure rework Apr 29, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type/bug A general bug
Projects
None yet
3 participants