Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(tests): fixed flaky Readiness condition with never ready route test in e2e/common/traits/health_test.go #5503

Merged
merged 1 commit into from
May 17, 2024

Conversation

valdar
Copy link
Member

@valdar valdar commented May 14, 2024

Due to how camel works stopping a route using the message bus inside the route can not assure that the route is reported always as stopped, there might be a small amount of time in which it can be reported as running.

Since the test is testing the behavior of camel k in the case that an integration is always reported not ready I took another approach by registering a custom healthcheck that force the reported condition to be never ready.

Release Note

NONE

Copy link
Contributor

@squakez squakez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I wonder if the work in #5450 was not already good enough though. According to the Camel Jira issues, that was the way it had to be fixed.

@valdar
Copy link
Member Author

valdar commented May 14, 2024

LGTM. I wonder if the work in #5450 was not already good enough though. According to the Camel Jira issues, that was the way it had to be fixed.

according to this test run https://github.com/apache/camel-k/actions/runs/9065576954/job/24906613292?pr=5119 in particular:

  Copy catalog camel-catalog-3.8.1 from namespace default
  Copy integration kit kit-cp136jalfugs739ljsu0 from namespace default
  Copy integration kit kit-cp138l2lfugs739ljsug from namespace default
  Setting build timeout to 10m
  OLM is not available in the cluster. Fallback to regular installation.
  Camel K installed in namespace test-5bad4aa0-0ccf-48e5-8576-d86e24c00b9a 
      health_test.go:374: 
          Failed after 2.522s.
          Expected
              <v1.ConditionStatus>: True
          to equal
              <v1.ConditionStatus>: False

something was still missing.

@lburgazzoli
Copy link
Contributor

Just for completeness, the parameters are controlling the behavior of the route controller in the case the a route would fail to start. In this case however, the route starts without errors and ad some point in the future it gets stopped which is outside of the control of the route controller (as the route was successfully started) hence, the route may result up if the health check is triggered before the control bus had a chance to actually stop the route.

@valdar
Copy link
Member Author

valdar commented May 15, 2024

I simplified the route and test a little bit more to be sure that intent is clearer and the test dose not mislead the next poor soul who read it.

In so doing I have also found that there were some minor issues (as conditions never actually evaluated) in the startup probe tests and I have tried to fix those as well.

e2e/common/traits/health_test.go Outdated Show resolved Hide resolved
e2e/support/test_support.go Outdated Show resolved Hide resolved
// TODO remove these workaround properties when https://issues.apache.org/jira/browse/CAMEL-20244 is fixed
"-p", "camel.route-controller.unhealthyOnRestarting=true",
"-p", "camel.route-controller.unhealthyOnExhausted=true",
"-p", "camel.health.routesEnabled=false",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is defeating the goal of the test. The idea is that the health is enabled on Camel side and we use those endpoints to query the route about it's health condition. I am not sure if I am understanding why this option is needed.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is there to avoid any possible (present or future) interaction and side effect between the custom check we define in files/NeverReady.java and the default checks. At the moment the test makes sense and would work both with and without that property, there are no routes declared in files/NeverReady.java anyway...

e2e/common/traits/health_test.go Outdated Show resolved Hide resolved
@@ -439,7 +437,7 @@ func TestHealthTrait(t *testing.T) {
var r *v1.HealthCheckResponse

for h := range c.Pods[0].Health {
if c.Pods[0].Health[h].Name == "camel-routes" && c.Pods[0].Health[h].Status == "DOWN" {
if c.Pods[0].Health[h].Name == "never-ready" && c.Pods[0].Health[h].Status == "DOWN" {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one look weird. If it is expected the Integration name, then, a randomized name variable name should be provided instead. Are we sure this condition is really existing?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are sure because is the condition that we explicitly define in files/NeverReady.java

@valdar valdar merged commit 5ceed9d into apache:main May 17, 2024
13 checks passed
@valdar valdar deleted the fixNeverReadyHealth branch May 17, 2024 09:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants