-
Notifications
You must be signed in to change notification settings - Fork 38.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
BoundedFrequencyRunner Loop maybe into an infinite loop #103286
Comments
/sig network |
@danwinship @thockin you are the ones that knows better the |
Where is 1000800s coming from? |
I'm having a little trouble following the repro here. It's clear there's some bound-checking to be done to prevent negative values, but I am not clear what's causing it to go negative or what your repro is trying to prove. Help me understand? |
Once bfr.limiter.TryAccept() is false and nextScheduled is negative value, bfr.limiter.TryAccept() will always be false which lead to bfr.fn() whill never be invoked any more and BoundedFrequencyRunner Loop into an infinite loop which lead to cpu 100%. I'm having this problem in a production environment. The same programs running in different clusters filled the CPU after 11 days and 14 hours and pprof indicates that the problem is with Loop(). |
I think if bfr.limiter.TryAccept() is false and nextScheduled is negative value, should reset tokens to zero and nextScheduled to minInterval or maxInterval |
TryAccept is supposed to occasionally be false (when you are calling too
fast). Why do you say it stays false forever?
…On Thu, Jul 8, 2021, 7:15 PM hk ***@***.***> wrote:
I think if bfr.limiter.TryAccept() is false and nextScheduled is negative
value, should reset tokens to zero and nextScheduled to minInterval or
maxInterval
—
You are receiving this because you were assigned.
Reply to this email directly, view it on GitHub
<#103286 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/ABKWAVHPLM3EFFRTENG2JK3TWZLUXANCNFSM47PG2UCA>
.
|
Because tokens is a cumulative value,once the deviation is generated, it cannot be corrected automatically in the case of periodic execution, waitDuration will always more than zero. |
I'm having this problem in a production environment, and the log information proves this fact |
The time to cause the problem depends on the minInterval. |
hi, I'm feeling rather dumb. Which I'm not disputing that there's something amiss, but frankly I don't understand the repro. Your changes break the functionality. |
Run this and wait 11 days and 14 hours(i think i maybe need to find a more min minInterval, which can lead to this problem) |
LOL. Help me find a repro that doesn't need 2 weeks ? Where is that
11 days and 14 hours magic number coming from? Is the clock changing
such that the tokens calculation goes haywire?
…On Fri, Jul 9, 2021 at 12:07 AM hk ***@***.***> wrote:
Run this and wait 11 days and 14 hours(i think i maybe need to find a more min minInterval, which can lead to this problem)
func main() {
runner := NewBoundedFrequencyRunner("syncFunc", syncFunc, 3time.Second, 10time.Second, 1)
go runner.Loop(wait.NeverStop)
select {}
}
func syncFunc() {
fmt.Println("do syncFunc")
}
—
You are receiving this because you were assigned.
Reply to this email directly, view it on GitHub, or unsubscribe.
|
Haha, 11 days and 14 hours magic number decide by minInterval and maxInterval. MinInterval decides cycle value(100080).You can change minInterval to 4s, and repro will need 100080*4s. |
Cycle:8 |
I see it. Changing all the libs to float64 makes THIS repro go away, but I'll dig into it so that I can plausibly explain it. |
Here's a simpler repro, cutting BFR out of the loop:
|
Fix proposed in https://golang.org/cl/325289 |
OK, thanks |
/triage accepted |
The upstream fix merged. We need to update our dep on golang.org/x/time/rate to pick it up. |
#104014 offered |
What happened:
bfr.limiter.TryAccept() will always false in tryRun() and runner.Loop will into an infinite loop after 11 days and 14 hours ( 1000800s=1000800s/10s/cycle= 100080cycle)
What you expected to happen:
No matter how long runner.Loop runs, syncFunc() can be called every 10 seconds
How to reproduce it (as minimally and precisely as possible):
1、modify tryRun()
2、limiter.advance
if TryAccept() is false, and elapsed = bfr.maxInterval, nextScheduled=nextPossible < 0, and runner.Loop into an infinite loop(TryAccept() will always false, and syncFunc() can not be invoked )
Anything else we need to know?:
if miniInterval is not 2^*, TokenBucketRateLimiter.limit will loss of accuracy, and tokens have a deviation value of delta-1 every maxInterval duration。
if nextScheduled less than zero should set tokens to zero as a fix
Environment:
kubectl version
):cat /etc/os-release
):uname -a
):The text was updated successfully, but these errors were encountered: