
Evaluate test frameworks #253

Closed
bobcatfish opened this issue Feb 27, 2018 · 9 comments
Labels: area/build, area/test-and-release

@bobcatfish
Contributor

Our initial conformance test was implemented using Ginkgo; however, we have had feedback from the k8s team that choosing Ginkgo for their conformance tests was the wrong decision, and in hindsight they would rather not be using it.

This task is to evaluate options for test frameworks before we commit to one for conformance tests going forward.

@bobcatfish added this to the M3 milestone Feb 27, 2018
@bobcatfish self-assigned this Feb 27, 2018
@sclevine
Contributor

sclevine commented Feb 28, 2018

If we're looking for a BDD-style testing library, I built spec to address some of the common criticisms of Ginkgo. I'm using it to test my port of the CF buildpacks to the Build CRD.

@bobcatfish
Contributor Author

Thanks @sclevine ! I'll try it out :D

@bobcatfish modified the milestones: M3, M4 Mar 15, 2018
@mattmoor added the area/build and area/test-and-release labels Mar 19, 2018
@bobcatfish
Contributor Author

I've gathered a list of test frameworks I want to compare. I'll try to narrow down the list a little, then take our existing conformance tests, rewrite them using each remaining framework, and compare the results.

I'll also see what it looks like to not use a framework at all. This article was very compelling re. not using any frameworks or even assertion libs, and I find myself agreeing with this article as well:

Go has a perfectly good testing framework built in. Frameworks are also one more barrier to entry for other developers contributing to your code.

These are the frameworks/libs I've found that I want to look into; if anyone knows of any other good candidates I should add to the list, please let me know ASAP (note I've ruled out any projects that haven't been active in the last 6 months):

BDD:

Other test frameworks/libs:

@bobcatfish
Contributor Author

Here is some feedback from the k8s team (@thockin and @ixdy) on troubles they have run into with Ginkgo:

  • Ginkgo is a large dependency vs. the value it provides
  • The control flow in Ginkgo tests is different from most Go code and can be hard to follow, especially when new to it (e.g. registering tests via `var _ =`, BeforeSuite, AfterSuite, Context, etc.); see the sketch at the end of this comment
  • Failures are hard to debug
  • The Ginkgo maintainers sometimes do not respond to issues for years (for example this issue was open for a year before anyone replied)
  • The Ginkgo directives (e.g. It, By, Describe, Context):
    • Are often misunderstood and misused, producing test names which don’t follow BDD best practices. For example:
      • [sig-node] Dockershim [Serial] [Disruptive] [Feature:Docker] When all containers in pod are missing should complete pod sandbox clean up based on the information in sandbox checkpoint
      • [k8s.io] Security Context when creating containers with AllowPrivilegeEscalation should allow privilege escalation when not explicitly set and uid != 0
      • [k8s.io] DynamicKubeletConfiguration [Feature:DynamicKubeletConfig] [Serial] [Disruptive] When a remote config becomes the new last-known-good, and then the Kubelet is updated to use a new, bad config the Kubelet should report a status and configz indicating that it rolled back to the new last-known-good
    • The resulting test names do not display well
    • The tags in the test names are hacks to work around the fact that Ginkgo does not provide any other way to easily filter and run tests, leading to running tests with args such as --ginkgo.focus=\\[Feature:Federation\\] --ginkgo.skip=\\[Slow\\]|\\[Serial\\]|\\[Disruptive\\]|\\[Flaky\\]|\\[NoCluster\\]

See also discussion in kubernetes/kubernetes#3837 and kubernetes/kubernetes#3131.
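
To make the control-flow point concrete, here is a minimal sketch of how a typical Ginkgo suite is wired together (the suite and spec names are illustrative, not taken from our codebase):

    package conformance_test

    import (
        "testing"

        . "github.com/onsi/ginkgo"
        . "github.com/onsi/gomega"
    )

    // The single go test entry point; every spec in the package runs inside it.
    func TestConformance(t *testing.T) {
        RegisterFailHandler(Fail)
        RunSpecs(t, "Conformance Suite")
    }

    // One-time setup, registered at package init time.
    var _ = BeforeSuite(func() {
        // e.g. build clients, wait for the cluster to be ready
    })

    // Specs are also registered at init time via the `var _ =` assignment.
    var _ = Describe("Route", func() {
        Context("when a Revision becomes ready", func() {
            It("serves traffic", func() {
                Expect(true).To(BeTrue())
            })
        })
    })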

@dprotaso
Member

dprotaso commented Apr 9, 2018

This is great feedback - I've relayed it to some of our k8s & ginkgo contributors to see if they can prioritize some of the issues.

This also helped tease out some criteria that we can apply to the other testing frameworks:

  • the ability to filter and run specific tests
  • test suite randomization
  • parallel test execution
  • custom reporting & readability of the test output

My comments/questions on the other points:

The control flow in Ginkgo tests is different from most Go code

Given this control flow, Ginkgo hooks into a single go test func (i.e. func TestXX(t *testing.T) {...}), which has the side effect of not working with subtests, which were introduced in Go 1.7.
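
For comparison, a minimal sketch of the Go 1.7+ subtest pattern that this single-entry-point model can't take advantage of (the test and subtest names are made up):

    package conformance

    import "testing"

    // Subtests let plain `go test` filter and parallelize individual cases,
    // e.g. `go test -run TestRoute/serves_traffic`.
    func TestRoute(t *testing.T) {
        t.Run("creates a revision", func(t *testing.T) {
            t.Parallel()
            // ...
        })
        t.Run("serves traffic", func(t *testing.T) {
            t.Parallel()
            // ...
        })
    }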

For example Expect(err).NotTo(HaveOccurred()) (which is everywhere in Ginkgo based tests) doesn’t provide enough context to debug compared to more thoughtful assertions

Both examples provided ultimately call Expect(err).NotTo(HaveOccurred()); what provides more context is the error being matched. I'm curious: does the k8s framework produce an error with more context automatically?

e.g. test names which don’t follow BDD best practices ... The resulting test names do not display well

I'm curious if this is a side-effect of BDD: would following best practices lead to test names that are digestible at a quick glance?

how ginkgo behaves when parallel tests hang
...
the tags in the test names are hacks

Hopefully I can poke the right people to get these prioritized. In the Cloud Foundry world they've used a config plus some Ginkgo skip logic to help with focusing/selecting tests; see https://github.com/cloudfoundry/cf-acceptance-tests. This config approach helps keep tags out of test names/descriptions.
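
Roughly, that cf-acceptance-tests style looks something like the sketch below; the suiteConfig struct is made up for illustration, while Describe, BeforeEach, It, and Skip are real Ginkgo API:

    package conformance_test

    import . "github.com/onsi/ginkgo"

    // Hypothetical suite-level config, populated from flags or a config file
    // instead of encoding tags like [Slow] into the spec names.
    var suiteConfig = struct {
        IncludeSlowTests bool
    }{}

    var _ = Describe("Autoscaling", func() {
        BeforeEach(func() {
            if !suiteConfig.IncludeSlowTests {
                Skip("slow tests are disabled for this run")
            }
        })

        It("scales to zero when idle", func() {
            // ...
        })
    })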

@ixdy

ixdy commented Apr 12, 2018

One additional useful thing Ginkgo provides (loosely mentioned above) is test output (junit XML) that's easily consumable by CI systems (e.g. Jenkins or Prow).
It's also worth noting that Go 1.10 added a -json output mode for go test (go test -json ./...), though I don't think anything has integrated with it yet, and Bazel natively produces junit, too.

The other aspect of "weird code flow" is that everything is created at init time with the var _ = pattern.
Some folks are also displeased with having to import github.com/onsi/ginkgo and github.com/onsi/gomega into the top-level namespace.

The Expect(err).NotTo(HaveOccurred()) pattern is probably my largest pet peeve, mostly in that it's too easy for it to produce a meaningless error.

Part of the issue is that I can't ever remember which line in the error is the one I care about; e.g. for

/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/instrumentation/monitoring/stackdriver.go:68
Expected error:
    <*errors.errorString | 0xc42009b160>: {
        s: "timed out waiting for the condition",
    }
    timed out waiting for the condition
not to have occurred
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/instrumentation/monitoring/stackdriver.go:115

do I look at line 68 or 115?

Kubernetes additionally makes it hard on ourselves because we use k8s.io/apimachinery/pkg/util/wait approximately everywhere and it's the source of that useless timed out waiting for the condition error message.
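
A sketch of the kind of wrapping that would make those failures actionable; the waitForRevisionReady helper and its condition body are made up for illustration, while wait.PollImmediate is the real apimachinery call:

    package test

    import (
        "fmt"
        "time"

        "k8s.io/apimachinery/pkg/util/wait"
    )

    // Hypothetical helper: poll until a Revision is ready, and wrap the generic
    // "timed out waiting for the condition" error with the context we actually need.
    func waitForRevisionReady(name string) error {
        err := wait.PollImmediate(1*time.Second, 2*time.Minute, func() (bool, error) {
            // check the Revision's status conditions here
            return false, nil
        })
        if err != nil {
            return fmt.Errorf("revision %q did not become ready: %v", name, err)
        }
        return nil
    }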

I think the test spec names end up bad for a few reasons:

  • some developers don't know they're supposed to be readable phrases
  • we end up composing them from disparate code locations
  • the tags end up interfering with the readability

FWIW we proposed adding more structure to tests in onsi/ginkgo#144, but it was decided that it was too hard to do that in a backwards-compatible way, so we ended up with the tag hack.

@bobcatfish
Contributor Author

Thanks for addressing these issues @dprotaso ! I'm glad to hear that these criteria will help make Ginkgo better. After looking at a lot of Go test frameworks, I'm really happy with how mature and usable Ginkgo is, especially the great documentation (which it turns out is super rare).

Expect(err)

Both examples provided ultimately call Expect(err).NotTo(HaveOccurred()). What's providing more context is the error being matched. Curious does the k8s framework produce an error with more context automatically?

If instead of calling Expect(err).NotTo(HaveOccurred()) the user had to use t.Errorf or t.Fatalf, they (hopefully) would write an actionable error message. With the k8s libs we frequently poll resources until they get into a desired state, which often fails by timing out. The difference in errors would be something like:

With Expect(err).NotTo(HaveOccurred()):

Code:

	By("The Revision will be updated when it is ready to serve traffic")
	err := test.WaitForRevisionState(clients.Revisions, revisionName, test.IsRevisionReady(revisionName))
	Expect(err).NotTo(HaveOccurred())

Error:


    Expected error:
        <*errors.errorString | 0xc4200c76c0>: {
            s: "timed out waiting for the condition",
        }
        timed out waiting for the condition
    not to have occurred

    /usr/local/google/home/christiewilson/Code/go/src/github.com/elafros/elafros/test/conformance/route_test.go:99

With t.Errorf/t.Fatalf:

Code:

	log.Println("The Revision will be marked as Ready when it can serve traffic")
	err := test.WaitForRevisionState(clients.Revisions, revisionName, test.IsRevisionReady(revisionName))
	if err != nil {
		t.Fatalf("Revision %s did not become ready to serve traffic: %v", revisionName, err)
	}

Error:

2018/04/12 13:49:31 The Revision will be marked as Ready when it can serve traffic
--- FAIL: TestRouteCreation (120.95s)
	route_test.go:80: Revision prod-00001 did not become ready to serve traffic: timed out waiting for the condition
FAIL

I think the point @jgrafton and co. are making is that in the Expect(err).NotTo(HaveOccurred()) case, it is still possible to debug the error, but the first error you are presented with is missing context (since the timeout could have theoretically happened at any point in the test).

BDD

I'm curious if this a side-effect of BDD - would following best practices lead to resulting test names being digestible at a quick glance?

You might be right! That would mean this is (maybe) one more instance of the maintainers and contributors failing to follow BDD best practices.

This is evidence that following BDD best practices is a challenge, meaning:

  1. Potential contributors will have the additional barrier of needing to come up to speed on BDD
  2. Maintainers will need to enforce those standards.

This is doable but I think we need strong evidence in support of BDD to choose this route.

The main benefit I can see for these tests is that BDD gives us meaningful "step" output as the tests run, and I don't think that's a strong enough reason to commit to BDD.

Prototypes

Following up, I've created prototypes with various frameworks/libs:

  • Godog - A Go implementation of cucumber
  • Ginkgo - New and improved
  • spec - A library (not a framework!) for BDD
  • testify - A popular Go test library (not BDD)
  • No framework - aka just the testing lib

I decided not to prototype these frameworks/libs:

  • goblin - Seems to be almost identical to Ginkgo, but with fewer features.
  • goconvey - The main feature seems to be a web UI, which we don’t need; the maintainers don’t seem to be active anymore; and it doesn’t offer anything we need that Ginkgo doesn’t.
  • gogiven - The main feature seems to be generating output such as this (the inverse of Godog), which we don’t need.
  • endly - Has a far wider scope of features than what we need, e.g. source code downloading, DB initialization, Selenium testing

(If I had to describe my ideal framework/lib for these tests, what I'd really find useful is a way of describing the state transitions we expect custom resources to experience and assert against that.)
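
Purely as a thought experiment, using such a lib might look like the sketch below; AssertStateTransitions and the State values are entirely hypothetical:

    func TestRevisionLifecycle(t *testing.T) {
        // Hypothetical: declare the states a Revision should pass through and fail
        // with a descriptive error if the observed transitions diverge.
        err := test.AssertStateTransitions(clients.Revisions, revisionName,
            []test.State{test.Pending, test.Deploying, test.Ready})
        if err != nil {
            t.Fatalf("Revision %s did not go through the expected states: %v", revisionName, err)
        }
    }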

My conclusions are that:

  1. Ginkgo is the strongest of the BDD frameworks in Golang
  2. spec is a great way to mix BDD tests in with other tests (since it's a library and not a framework)
  3. We don't have a strong reason to use BDD.

∴ I think we should move forward with "no framework" aka using the testing lib.

@josephburnett
Contributor

josephburnett commented Apr 19, 2018 via email

@dprotaso
Member

@bobcatfish @ixdy FYI, there's some work being done on Ginkgo to help with debugging hanging tests.
See: onsi/ginkgo#461 (comment)

bobcatfish added a commit that referenced this issue May 1, 2018
As explored in the prototypes in #253, we have decided that although Ginkgo is the best of the Go test frameworks, we do not want to be tied to BDD and feel these tests will be cleaner and easier to contribute to if we use no framework.

I have also refactored the supporting logic of the tests into a `test` library which can be reused for other e2e tests, documented in test/adding_tests.md

* Improved docs on running tests: all docs on running are now in test/README.md, so there is only one place to look to see how to run all of the tests. Instructions on how to run conformance tests are hopefully clearer.
* Updated links that pointed at CONTRIBUTING.md since the content moved

Fixes #253
markusthoemmes pushed a commit to markusthoemmes/knative-serving that referenced this issue Oct 9, 2019
Fix create-release-branch script.
ReToCode added a commit to ReToCode/serving that referenced this issue Apr 5, 2023
…-pick-251-to-release-v1.8

[release-v1.8] Delete the created golang cache directory during docker build