-
Notifications
You must be signed in to change notification settings - Fork 1.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fan out events in async mode for async recordings. #4813
Conversation
@russjones this is a forward port of 4.4 issue |
888afe8
to
2341d64
Compare
} | ||
elapsedTime := time.Since(start) | ||
log.Debugf("Emitted all events in %v.", elapsedTime) | ||
require.True(t, elapsedTime < time.Second) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This might be problematic in CI context and super sensitive to workloads existing in parallel. I'd leave it as a warning and a log entry if that's longer than the specified threshold but not enforce it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we will miss the warning if we don't enforce the threshold. You are right that this is not the best practice. I will think about improveing it.
Here are details and a goroutine dump from a different machine (running a 5.0.0 beta) which I just discovered also has the hanging problem - just in case there's anything different for the forward port:
|
To me it looks like the auditwriter's
which blocks further processing of events. The only way it could get there is by ignoring to
|
@a-palchikov you have a great eye for edge cases :) The PR has this edge case covered, so this hang should disappear. |
2341d64
to
7b9a4c7
Compare
This commit fixes #4695. Teleport in async recording mode sends all events to disk, and uploads them to the server later. It uploads some events synchronously to the audit log so they show up in the global event log right away. However if the auth server is slow, the fanout blocks the session. This commit makes the fanout of some events to be fast, but nonblocking and never fail so sessions will not hang unless the disk writes hang. It adds a backoff period and timeout after which some events will be lost, but session will continue without locking.
7b9a4c7
to
c336798
Compare
This commit fixes #4695.
Teleport in async recording mode sends all events to disk,
and uploads them to the server later.
It uploads some events synchronously to the audit log so
they show up in the global event log right away.
However if the auth server is slow, the fanout blocks the session.
This commit makes the fanout of some events to be fast,
but nonblocking and never fail so sessions will not hang
unless the disk writes hang.
It adds a backoff period and timeout after which some
events will be lost, but session will continue without locking.