New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
operator: always report events #9427
operator: always report events #9427
Conversation
@@ -51,7 +51,7 @@ func ReportReconcileResult(logger *capnslog.PackageLogger, recorder *k8sutil.Eve | |||
logger.Errorf("failed to reconcile %s %q. %v", kind, nsName, err) | |||
|
|||
// 2. event | |||
recorder.ReportIfNotPresent(obj, corev1.EventTypeWarning, string(cephv1.ReconcileFailed), err.Error()) | |||
recorder.Event(obj, corev1.EventTypeWarning, string(cephv1.ReconcileFailed), err.Error()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if this is happening on every retry, maybe we should keep the ReportIfNotPresent()
call?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If there are multiple events, they will be collated together like so, with something similar to (x4 over 8m46s)
indicating that they have happened multiple times and without adding loads of clutter:
Warning ReconcileFailed 8m11s rook-ceph-object-controller failed to create object store deployments: fail
ed to start rgw health checker for CephObjectStore "rook-ceph/my-store", will re-reconcile: failed to create bucket checker for
CephObjectStore "rook-ceph/my-store": failed to create or retrieve rgw admin ops user: failed to create object user "rgw-admin
-ops-user". error code 1 for object store "my-store": failed to create s3 user. . : signal: interrupt
Normal ReconcileSucceeded 9s (x4 over 8m46s) rook-ceph-object-controller successfully configured CephObjectStore "rook-c
eph/my-store"
IMO, that is an okay trade-off for getting more insight into when things have happened most recently.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok .so this is happening regardless of whether we use ReportIfNotPresent
or Event
?
I guess we can just rely on this mechanism instead of adding another one.
The original PR which added event reporting unnecessarily "optimized" to prevent spamming the API controller[1]. It is sometimes important to get events as they happen and not hide new events behind preexisting older events. For example, in integration tests, we may often want to wait for a controller to finish processing an update, and the best way to do that is to wait for the "ReconcileSucceeded" event. In order for this to be useful, the events must be reported each time. If we begin having problems with events being reported too often, then we should fix the underlying issue of reconciles happening too often instead of relying on a time-based "optimization" that hides recent event reports that may be useful. [1]: rook#7222 Signed-off-by: Blaine Gardner <blaine.gardner@redhat.com>
59259d3
to
da61ac1
Compare
operator: always report events (backport #9427)
The original PR which added event reporting unnecessarily "optimized" to
prevent spamming the API controller1.
It is sometimes important to get events as they happen and not hide new
events behind preexisting older events. For example, in integration
tests, we may often want to wait for a controller to finish processing
an update, and the best way to do that is to wait for the
"ReconcileSucceeded" event. In order for this to be useful, the events
must be reported each time.
If we begin having problems with events being reported too often, then
we should fix the underlying issue of reconciles happening too often
instead of relying on a time-based "optimization" that hides recent
event reports that may be useful.
Signed-off-by: Blaine Gardner blaine.gardner@redhat.com
Description of your changes:
Which issue is resolved by this Pull Request:
Resolves #
Checklist:
make codegen
) has been run to update object specifications, if necessary.