ChannelSendOperator.WriteBarrier race condition in request(long) method leads to response being dropped #31865
Labels
in: web
status: backported
type: bug
Affects: spring-web 6.1.2
Context
I am using spring-boot-starter-undertow and WebFlux.
Description
When I have a WebFlux controller @RequestMapping handler method that returns a Publisher<T> of at least 2 elements, and publishOn is applied to move processing to a Scheduler other than Undertow's XNIO threads, I observe a race condition in ChannelSendOperator.WriteBarrier#request(long n) between my thread and the XNIO thread that is processing onWritePossible from the channel selector. This sounds similar to the closed issue #21098.
I can reproduce this race condition simply by setting a thread-suspending breakpoint on L292 and running e.g. this quick-and-dirty test case that I threw together (continue from the breakpoint after the XNIO thread parks on the object monitor indicated).
The race happens when emitCachedSignals has already run, writing the first element to the client and allowing the selector to propagate the "WritePossible" event on the XNIO thread, but my thread has not yet passed this.state = State.READY_TO_WRITE. The XNIO thread therefore fails the State.READY_TO_WRITE check, where it would normally have requested more.
In this race condition, s.request(n) is no longer called, meaning that the response never finishes sending.
Thread Dump
The two relevant threads:
Speculation
In the other methods of WriteBarrier that happen to synchronize on this (onNext, onError, onComplete), I noticed that there is double-checked locking in play for this.state == State.READY_TO_WRITE. Would it be correct to add this to the request method, so that we make a request upstream when we encounter this race condition?
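To make the question concrete, here is a rough, JDK-only model of the race and of the proposed re-check. It is not the Spring source: the Barrier class, the EMITTING state, the upstreamDemand counter and the "n - 1" accounting for the cached item are all assumptions made for illustration. The spin on Thread.State.BLOCKED stands in for the thread-suspending breakpoint, so the second caller is guaranteed to be parked on the monitor before the first caller sets READY_TO_WRITE.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical, simplified model; names and accounting are assumptions.
final class WriteBarrierRaceSketch {

    enum State { NEW, EMITTING, READY_TO_WRITE }

    static final class Barrier {
        final AtomicLong upstreamDemand = new AtomicLong(); // stands in for s.request(n)
        private final boolean recheckUnderLock;
        private final Runnable emitCachedSignals;
        private volatile State state = State.NEW;

        Barrier(boolean recheckUnderLock, Runnable emitCachedSignals) {
            this.recheckUnderLock = recheckUnderLock;
            this.emitCachedSignals = emitCachedSignals;
        }

        void request(long n) {
            if (state == State.READY_TO_WRITE) {          // unsynchronized fast path
                upstreamDemand.addAndGet(n);
                return;
            }
            synchronized (this) {
                // Proposed second check, repeated once the monitor is held.
                if (recheckUnderLock && state == State.READY_TO_WRITE) {
                    upstreamDemand.addAndGet(n);
                    return;
                }
                try {
                    state = State.EMITTING;
                    emitCachedSignals.run();              // writes the cached first item
                    n = n - 1;                            // assumed: cached item used one unit
                    if (n == 0) {
                        return;
                    }
                } finally {
                    state = State.READY_TO_WRITE;
                }
            }
            upstreamDemand.addAndGet(n);
        }
    }

    /** Runs the interleaving once and returns the demand that reached upstream. */
    static long run(boolean recheckUnderLock) {
        AtomicBoolean firstEmit = new AtomicBoolean(true);
        AtomicBoolean selectorCalling = new AtomicBoolean(false);
        Thread[] selector = new Thread[1];

        Barrier barrier = new Barrier(recheckUnderLock, () -> {
            if (!firstEmit.getAndSet(false)) {
                return;                                   // nothing cached the second time
            }
            // Writing the first item triggers "write possible" on the selector thread.
            selector[0].start();
            // Emulate the breakpoint: hold the monitor until the selector is parked on it.
            while (!selectorCalling.get()
                    || selector[0].getState() != Thread.State.BLOCKED) {
                Thread.onSpinWait();
            }
        });
        selector[0] = new Thread(() -> {
            selectorCalling.set(true);
            barrier.request(1);                           // demand for the second element
        }, "selector");

        barrier.request(1);                               // application thread, first demand
        try {
            selector[0].join();
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(ex);
        }
        return barrier.upstreamDemand.get();
    }

    public static void main(String[] args) {
        System.out.println("without re-check: " + run(false));
        System.out.println("with re-check:    " + run(true));
    }
}
```

In this model the selector's demand is swallowed without the re-check (0 reaches upstream) and forwarded with it (1 reaches upstream). Whether the real request method should be changed in exactly this way is the open question above; the sketch only shows that a second check under the monitor would close this particular interleaving.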