Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add timeout enforcement to ProtocolClient #276

Merged
merged 4 commits into from
May 29, 2024
Merged

Conversation

jhump
Copy link
Member

@jhump jhump commented May 17, 2024

This moves the timeout enforcement out of the HTTPClientInterface and into the ProtocolClientInterface. This is more appropriate so that the protocol implementations can be aware of the timeout and add a timeout header, to propagate the deadline to the server.

I've broken it up into a sequence of commits that hopefully makes it easier to review.

I tried looking into making the timeout scheduler coroutine-aware. But this caused issues. Even though I could propagate the CoroutineScope from the caller, when invoked from a unary suspend function or a stream function, the issue is how CoroutineScope.launch impacts the scope lifetime. In particular, a scope can't complete until all of its children complete, so I was running into issues where code was incorrectly blocking on that child coroutine (which was just doing delay for the full timeout period) to end.

I didn't debug it enough to get to bedrock regarding where the blocking was happening or exactly why (naively, I would have expected the timeout coroutine to just be a part of the main coroutine scope create in the initial runBlocking in the conformance client, which shouldn't cause the blocking issues I observed 🤷).

So using a simpler Timer seemed like the next best thing. And an alternative like this was needed anyway to be called from non-suspend functions like the callback and blocking unary signatures (for which there is no parent coroutine scope in which to launch a new coroutine).

Resolves #249.

//
// The default oracle, if not configured, returns a 10 second timeout for
// all operations.
val timeoutOracle: TimeoutOracle = { 10.toDuration(DurationUnit.SECONDS) },
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added this to account for removal of the timeouts in the configuration helper below. So if users use ConnectOkHttpClient.configureClient and forget to set timeouts in this config, they get the same default behavior as the OkHttpClient was providing (except that this default applies to bidirectional streams, whereas the OkHttpClient timeouts do not).

@jhump jhump requested a review from pkwarren May 17, 2024 17:05
@jhump
Copy link
Member Author

jhump commented May 21, 2024

@pkwarren, ping.

val DEFAULT_SCHEDULER = object : Scheduler {
override fun scheduleTimeout(delay: Duration, action: Cancelable): Timeout {
val timeout = Timeout(action)
val task = timerTask { timeout.trigger() }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This works but we might consider tradeoffs vs. ScheduledThreadPoolExecutor. Probably not a big deal for timeout scheduling.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can change it. When I was searching, trying to figure out the idiomatic way to do this in Kotlin and Android apps, this was cited more than use of a ScheduledExecutorService. There were also a couple of Android-specific ways to do it, and then there was the coroutine way (that I couldn't get to work correctly -- to just create a coroutine and then delay(...) in the coroutine before executing the action).

For this, there's not really much of a tradeoff -- both solutions are roughly equivalent. Both approaches require creation of a heavyweight thread. Both implementations are similar, using a thread that polls a priority queue and then executes the tasks. The only potentially meaningful difference is that ScheduledExecutorService impls use a DelayQueue and the stuff in java.util.concurrent whereas the Timer just uses intrinsic locks and Object.wait and Object.notify.

But it's super easy to switch if you think that's more appropriate.

unaryClient: OkHttpClient = OkHttpClient(),
streamClient: OkHttpClient = unaryClient,
private val unaryClient: OkHttpClient = OkHttpClient(),
// TODO: remove this; a separate client is only useful for configuring separate timeouts,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not just for timeouts IIRC - you may also want to configure pings on clients for long lived streams but not on ones for unary calls.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's some background on the two clients (aside from timeouts which will be much better behaved with this PR): #13

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's unclear why you'd want PINGs on one client and not the other. If you have created a connection that uses PINGs, just use that same client for both unary and stream traffic. I don't see any downside to that. It wouldn't increase the PING activity since you're not creating two clients that both use PING -- but instead using the PINGing client for all RPCs.

As far as the read timeout being different between unary and streams, that is now even more configurable (based on MethodSpec) via the timeout oracle. So you can set the read timeout to zero (or use ConnectOkHttpClient.configureClient) and it should be appropriate for both unary and stream traffic.

@kohenkatz, please take a look at this thread and PR and let me know what you think. Note that this PR is not removing the second client -- not yet. But I don't think it will be needed anymore and would appreciate your input, like if you see other reasons to keep the second client in future versions.

@jhump jhump merged commit 76b89e7 into main May 29, 2024
9 checks passed
@jhump jhump deleted the jh/timeouts-in-client branch May 29, 2024 23:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Timeout and deadline propagation issues
2 participants