Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

net-istio flaky conformance tests #938

Open
AngeloDanducci opened this issue Jun 7, 2022 · 7 comments
Open

net-istio flaky conformance tests #938

AngeloDanducci opened this issue Jun 7, 2022 · 7 comments
Labels
lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness.

Comments

@AngeloDanducci
Copy link
Contributor

Failing tests can be found here

TestIngressConformance/Update/# seems to be the primary culprit, with the issue
Reason:"NewObservedGenFailure", Message:"unsuccessfully observed a new generation"

Will dig into this a bit more and update the issue with any further findings.

@AngeloDanducci
Copy link
Contributor Author

/assign @AngeloDanducci

@AngeloDanducci
Copy link
Contributor Author

Looks like it may be related to this as the observed generation isn't incrementing.
istio/istio#31931

@AngeloDanducci
Copy link
Contributor Author

/lifecycle frozen

@knative-prow knative-prow bot added the lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. label Jun 22, 2022
@dprotaso
Copy link
Contributor

dprotaso commented Oct 31, 2022

NewObservedGenFailure is actually an error that the knative controllers set

https://github.com/knative/pkg/blob/d1d5c849073b4c59fabeef1aaaf5eece021d9ee2/reconciler/reconcile_common.go#L68-L72

net-istio should rebuild the status conditions - but I wonder if something is causing them to short circuit reconciliation and not do that

@dprotaso
Copy link
Contributor

dprotaso commented Oct 31, 2022

We don't use Istio's Virtual Service Status as source of information since it was always broken - https://github.com/knative-sandbox/net-istio/blob/abad502be7e14ef7281de0d77dcf5525d1e7e40b/config/400-config-istio.yaml#L67-L70

So I think the flake is specific to our controllers not re-creating the KIngress Status block in some situation.

It still remains flakey in testgrid

@dprotaso
Copy link
Contributor

Whoops forgot to update my findings

When the endpoint for a new service becomes ready two things happen in parallel

  1. the conformance test programs the KIngress to point to the new backend then net-istio programs the VirtualService
  2. the endpoint is being processed by Istio

What we're observing in the update tests is that 1. can happen much faster than 2.

A terrible workaround would be to introduce a delay in the test. A proper fix might be to do something similar to net-contour where we ensure the endpoint is known to istio prior to update the main Virtual Service.

dprotaso added a commit to dprotaso/net-istio that referenced this issue May 23, 2023
@dprotaso
Copy link
Contributor

/unassign @AngeloDanducci

knative-prow bot pushed a commit that referenced this issue May 24, 2023
knative-prow-robot pushed a commit to knative-prow-robot/net-istio that referenced this issue Jun 1, 2023
knative-prow-robot pushed a commit to knative-prow-robot/net-istio that referenced this issue Jun 1, 2023
knative-prow bot pushed a commit that referenced this issue Jun 1, 2023
See: #938

Co-authored-by: dprotaso <dprotaso@gmail.com>
knative-prow bot pushed a commit that referenced this issue Jun 1, 2023
See: #938

Co-authored-by: dprotaso <dprotaso@gmail.com>
openshift-merge-robot pushed a commit to openshift-knative/net-istio that referenced this issue Jun 9, 2023
* drop safe-to-evict annotations (knative-extensions#1119)

Co-authored-by: dprotaso <dprotaso@gmail.com>

* Disable update test (knative-extensions#1121)

See: knative-extensions#938

Co-authored-by: dprotaso <dprotaso@gmail.com>

* Sync upstream release

---------

Co-authored-by: Knative Prow Robot <knative-prow-robot@google.com>
Co-authored-by: dprotaso <dprotaso@gmail.com>
Co-authored-by: John Doe <johndoe@localhost>
Co-authored-by: nak3 <nak3@users.noreply.github.com>
openshift-merge-robot pushed a commit to openshift-knative/net-istio that referenced this issue Jun 9, 2023
* upgrade to latest dependencies (knative-extensions#1066)

bumping knative.dev/hack c7cfcb0...199139d:
  > 199139d Find checksums file works with ARTIFACTS_TO_PUBLISH variable (# 276)
  > 1384ebd [release-1.9] 🐛 Location-agnostic sign release (# 271)
bumping golang.org/x/net 1e63c2f...8e2b117:
  > 8e2b117 http2/hpack: avoid quadratic complexity in hpack decoding
  > 547e7ed http2: avoid referencing ResponseWrite.Write parameter after returning
  > 39940ad html: parse comments per HTML spec
  > 87ce33e go.mod: update golang.org/x dependencies
  > 415cb6d all: fix some comments
  > 7e3c19c all: correct typos in comments
  > 296f09a http2: case insensitive handling for 100-continue
  > f8411da nettest: fix tests on dragonfly and js/wasm
  > 8e0e7d8 go.mod: update golang.org/x dependencies
  > 7805fdc http2: rewrite inbound flow control tracking
  > 2aa8215 nettest: use RoutedInterface for probing network stack capability
  > ad92d3d websocket: don't recommend Gorilla
  > e1ec361 http2: fix race in TestCanonicalHeaderCacheGrowth
bumping golang.org/x/sys 3ca3b18...90c8f94:
  > 90c8f94 unix: avoid converting non-pointers to unsafe.Pointer in PtraceIO
  > 4e121b1 unix: add missing address operator in initxattrdest
  > 68f9dcb windows/debug/svc: buffer channel passed to signal.Notify
  > 0e1262c unix: add ptrace(PT_DENY_ATTACH) wrapper for darwin
  > 6938dae unix: add missing constants used with struct Timex on Linux
  > 01b330b unix: improve flaky solaris test logging
  > e7d7f63 all: fix some comments
  > 7a75290 unix/linux: update to glibc 2.36
  > 4112509 windows/mkwinsyscall: write source to temp file if formatting fails
  > 71da690 windows/mkwinsyscall: support "." and "-" in DLL name
  > b829a39 unix/linux: update to gcc 13.0.0, qemu 7.1.0 for loong64
  > c3037ed unix: add support for clock_adjtime on Linux
  > 13fe000 cpu: add IsBigEndian
  > 17fce3a unix: avoid false positive in vet shift check
  > a6f4650 windows: use UTF16FromString and UTF16ToString from syscall
  > 6e4d1c5 unix/linux: update to Linux kernel 6.1 and Go 1.20-rc2
  > b8be2fd cpu: add //go:build line to cpu_gccgo_x86.c
  > 1e9f341 unix: add //go:build line to gccgo_c.c
  > b60007c unix: add Uvmexp and SysctlUvmexp for NetBSD
  > b751db5 unix: gofmt hurd files after CL 459895
  > b360406 unix: support TIOCGETA on GNU/Hurd
  > 3086868 unix: regen on OpenBSD 7.2
  > 2b11e6b unix: remove Mclpool from openbsd types
  > 7c6badc unix: convert openbsd/mips64 to direct libc calls
  > 3b1fc93 unix: avoid allocations for common uses of Readv, Writev, etc.
  > 2204b66 cpu: parse /proc/cpuinfo on linux/arm64 on old kernels when needed
  > 72f772c unix: offs2lohi should shift by bits, not bytes
  > cffae8e unix: add ClockGettime on *bsd and solaris
  > 96e75de unix: improve Sendmsg and Recvmsg documentation
  > 127c0dd unix/linux: use Go 1.20rc1 to generate files
bumping knative.dev/networking db2bcbe...5e096d6:
  > 5e096d6 upgrade to latest dependencies (# 774)
  > 6689f05 upgrade to latest dependencies (# 769)
bumping knative.dev/pkg 247510c...75da922:
  > 75da922 upgrade to latest dependencies (# 2696)
  > cea413f [release-1.9] bump net and text packages (# 2694)
  > 8efb348 fix: `reconcilerImpl.updateStatus` calculates state difference in debug mode only (# 2686)
bumping golang.org/x/term 97ca0e3...d974fe8:
  > d974fe8 go.mod: update golang.org/x dependencies
  > 1efcd90 go.mod: update golang.org/x dependencies
bumping golang.org/x/text c8236a6...71a9c9a:
  > 71a9c9a all: fix some comments
  > ec5565b README.md: update documentation of module versioning

Signed-off-by: Knative Automation <automation@knative.team>

* bump go.mod (knative-extensions#1072)

* bump istio to v1.16.3 (knative-extensions#1074)

* Disable update test (knative-extensions#1122)

See: knative-extensions#938

Co-authored-by: dprotaso <dprotaso@gmail.com>

* drop safe-to-evict annotations (knative-extensions#1120)

Co-authored-by: dprotaso <dprotaso@gmail.com>

* Sync upstream release

---------

Signed-off-by: Knative Automation <automation@knative.team>
Co-authored-by: knative-automation <automation@knative.team>
Co-authored-by: Dave Protasowski <dprotaso@gmail.com>
Co-authored-by: Knative Prow Robot <knative-prow-robot@google.com>
openshift-merge-robot pushed a commit to openshift-knative/net-istio that referenced this issue Oct 2, 2023
* drop safe-to-evict annotations (knative-extensions#1119)

Co-authored-by: dprotaso <dprotaso@gmail.com>

* Disable update test (knative-extensions#1121)

See: knative-extensions#938

Co-authored-by: dprotaso <dprotaso@gmail.com>

---------

Co-authored-by: Knative Prow Robot <knative-prow-robot@google.com>
Co-authored-by: dprotaso <dprotaso@gmail.com>
Co-authored-by: John Doe <johndoe@localhost>
openshift-ci bot pushed a commit to openshift-knative/net-istio that referenced this issue Oct 12, 2023
* drop safe-to-evict annotations (knative-extensions#1119)

Co-authored-by: dprotaso <dprotaso@gmail.com>

* Disable update test (knative-extensions#1121)

See: knative-extensions#938

Co-authored-by: dprotaso <dprotaso@gmail.com>

* Leave a comment which will trigger a new dot release (knative-extensions#1185)

* Run ./hack/update-deps.sh --upgrade --release 1.10 (knative-extensions#1191)

---------

Co-authored-by: Knative Prow Robot <knative-prow-robot@google.com>
Co-authored-by: dprotaso <dprotaso@gmail.com>
Co-authored-by: Reto Lehmann <retocode@icloud.com>
Co-authored-by: John Doe <johndoe@localhost>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness.
Projects
None yet
Development

No branches or pull requests

3 participants