
ISSUE-159 VegasLimit for batch processing #161

Open
ericzhifengchen wants to merge 1 commit into main
Conversation

@ericzhifengchen commented Aug 4, 2020

Background
At Uber, we use the Kafka producer to publish messages to brokers, and we have seen two problems that can overload the producer while running the system:

  1. Surges in message rate.
  2. Broker degradation.

When the Kafka producer is overloaded, messages accumulate in its buffer, which causes high message delivery latency, garbage-collector saturation, and a drop in throughput.
We think load shedding is the right way to deal with overload; however, the existing concurrency-limit algorithms are not a perfect fit for a batching system such as the Kafka producer.

Abstraction
As a batching system, the Kafka producer comes with a sending buffer. When sending messages, clients put them into this buffer,
and one or more sender threads keep fetching messages from the buffer and sending them to the broker. Latency is measured after the client receives the ack from the broker.
So in this system, latency has two components:

  • message buffer time
  • request roundtrip time

Concretely, we have the following equation:
(1) observed-latency = buffer-time + roundtrip-latency
In practice, when the system is not overloaded, the following relationship holds:
(2) 0 < buffer-time < BF * roundtrip-latency (BF stands for bufferFactor = 1 / num-inflight-requests)
Combining equation (1) with relationship (2) gives:
(3) roundtrip-latency < observed-latency < (1 + BF) * roundtrip-latency
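
To make relationship (3) concrete, here is a small numeric sketch; the latency figure and request count are illustrative only, not measurements from our system.

    // Illustrative numbers only: 4 in-flight requests, 10 ms broker roundtrip.
    public class LatencyBoundsExample {
        public static void main(String[] args) {
            double roundtripLatencyMs = 10.0;
            int numInflightRequests = 4;
            double bufferFactor = 1.0 / numInflightRequests; // BF = 1 / num-inflight-requests

            double lowerBoundMs = roundtripLatencyMs;                      // no buffer time
            double upperBoundMs = (1 + bufferFactor) * roundtripLatencyMs; // 12.5 ms

            // Relationship (3): when not overloaded, observed-latency should stay inside these bounds.
            System.out.printf("expected observed latency in (%.1f, %.1f) ms%n", lowerBoundMs, upperBoundMs);
        }
    }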

The problem with VegasLimit is that it uses rtt_noload as the boundary for detecting saturation.
For a batching system, this can easily cause over-shedding, so we are proposing a small enhancement to make VegasLimit a good fit for batching systems.
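
For comparison, here is a rough paraphrase of the queue estimate the stock VegasLimit performs (it treats everything above rtt_noload as queueing); the wrapper class and the numbers are illustrative only:

    public class CurrentVegasEstimateSketch {
        // Paraphrase of the stock estimate: any latency above rtt_noload is counted as queueing.
        static int currentQueueEstimate(double estimatedLimit, long rtt_noload, long rtt) {
            return (int) Math.ceil(estimatedLimit * (1 - (double) rtt_noload / rtt));
        }

        public static void main(String[] args) {
            // A healthy batching producer: rtt_noload = 8 ms, observed rtt = 10 ms,
            // where the extra 2 ms is expected buffer time rather than overload.
            System.out.println(currentQueueEstimate(100, 8_000_000L, 10_000_000L)); // prints 20
            // A "queue" of 20 out of a limit of 100 is inferred, which drives over-shedding.
        }
    }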

Solution
Here is the high-level solution:

  1. Introduce a bufferFactor to represent the portion of buffer-time in observed-latency.
  2. Take the bufferFactor into account when calculating the queue size, to avoid over-shedding.
  3. Take the bufferFactor into account when resetting the probe, to avoid limit inflation.

With bufferFactor representing the portion of buffer-time in observed-latency, we propose a new equation to estimate the queue size:

    queueSize = (int) Math.ceil(estimatedLimit * (1 - (double)rtt_noload * (bufferFactor + 1) / rtt));

It suggests no queue when 0 < rtt < rtt_noload * (bufferFactor + 1).
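
A minimal sketch of the proposed estimate, using the same healthy-producer scenario as above; the helper method is illustrative and may not match the patch's exact structure:

    public class BufferAwareEstimateSketch {
        static int queueSize(double estimatedLimit, long rtt_noload, long rtt, double bufferFactor) {
            return (int) Math.ceil(estimatedLimit * (1 - (double) rtt_noload * (bufferFactor + 1) / rtt));
        }

        public static void main(String[] args) {
            double bufferFactor = 0.25;
            // Healthy producer: rtt = 10 ms is entirely explained by buffer time, so no queue is inferred.
            System.out.println(queueSize(100, 8_000_000L, 10_000_000L, bufferFactor)); // prints 0
            // Genuine overload: rtt = 16 ms exceeds rtt_noload * (bufferFactor + 1) = 10 ms.
            System.out.println(queueSize(100, 8_000_000L, 16_000_000L, bufferFactor)); // prints 38
        }
    }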

In practice, we found that this change prevented over-shedding, but it also introduced another problem: limit inflation.
Limit inflation means the concurrency limit gradually increases as time elapses.
The reason is that on probe we set

        rtt_noload = rtt

but when the system is under load, rtt actually falls in the range [rtt_noload, (bufferFactor + 1) * rtt_noload].
As a result, if rtt happens to be at the higher end when we probe, both rtt_noload and estimatedLimit inflate.
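
A toy simulation of that ratchet, modelling only the probe resetting rtt_noload to whatever rtt it happens to observe at the high end; the numbers are made up:

    public class LimitInflationSketch {
        public static void main(String[] args) {
            double bufferFactor = 0.25;
            double rttNoload = 10.0; // ms, the true no-load roundtrip at the start

            for (int probe = 1; probe <= 5; probe++) {
                // Worst case: the probe observes rtt at the top of [rtt_noload, (1 + BF) * rtt_noload].
                double observedRtt = (1 + bufferFactor) * rttNoload;
                rttNoload = observedRtt; // probe resets rtt_noload = rtt
                System.out.printf("after probe %d: rtt_noload = %.2f ms%n", probe, rttNoload);
            }
            // rtt_noload drifts from 10 ms to about 30.5 ms, and estimatedLimit inflates with it.
        }
    }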

We solved the problem in two steps:

  1. On probe, update estimatedLimit with the following equation:

          estimatedLimit = Math.max(initialLimit, Math.ceil(estimatedLimit / (bufferFactor + 1)));
    

This change reduces estimatedLimit to its bare size (no buffer), so if the system is under load, it starts rejecting requests and concurrency starts dropping.
Eventually, when concurrency has dropped to estimatedLimit, the observed rtt should equal rtt_noload, and the limit inflation is resolved.
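
A sketch of this reset, following the equation above; the method and its placement are illustrative, not the actual patch:

    public class ProbeResetSketch {
        static double resetOnProbe(double estimatedLimit, int initialLimit, double bufferFactor) {
            // Shrink the limit back to its bare (no-buffer) size so that, under load,
            // requests start getting rejected and concurrency can fall to the new limit.
            return Math.max(initialLimit, Math.ceil(estimatedLimit / (bufferFactor + 1)));
        }

        public static void main(String[] args) {
            // With bufferFactor = 0.25, an inflated limit of 125 is pulled back to 100.
            System.out.println(resetOnProbe(125, 20, 0.25)); // prints 100.0
        }
    }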

  2. Pause updates of estimatedLimit for an extra amount of time, to allow rtt to drop back to rtt_noload:

          pauseUpdateTime = rtt * (1 + bufferFactor / (1 + bufferFactor))
    

It takes time for concurrency to drop to the updated estimatedLimit, and during that time, updates of estimatedLimit should be paused.
Otherwise estimatedLimit could increase in the meantime, and we would miss the opportunity to observe rtt_noload.
The time to wait can be estimated with the equation above.
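
A sketch of the pause window, following the equation above; the class and method names are illustrative only:

    public class ProbePauseSketch {
        static double pauseUpdateTime(double rtt, double bufferFactor) {
            // Enough time for in-flight work to drain so the probe can observe
            // rtt close to rtt_noload before limit updates resume.
            return rtt * (1 + bufferFactor / (1 + bufferFactor));
        }

        public static void main(String[] args) {
            // With rtt = 12.5 ms and bufferFactor = 0.25, updates pause for about 15 ms.
            System.out.println(pauseUpdateTime(12.5, 0.25));
        }
    }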

@ericzhifengchen (Author)

CI passes locally. How do I rerun it?
