Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Mocha's hanging CI browser tests #2890

Closed
ScottFreeCode opened this issue Jun 18, 2017 · 9 comments · Fixed by #3019
Closed

Mocha's hanging CI browser tests #2890

ScottFreeCode opened this issue Jun 18, 2017 · 9 comments · Fixed by #3019
Assignees

Comments

@ScottFreeCode
Copy link
Contributor

ScottFreeCode commented Jun 18, 2017

Of late, there has been a failure in CI that looks like a browser flake... except that it's far too consistent to be a flake. Doesn't always happen, but usually does, and always the same browsers (and usually the same point in the test). It seems like the browser hangs or something -- Sauce disconnects with a timeout of 20,000ms partway through the testsuite.

It's blocking #2886, notably, but some testing determined that it affected the commits there even before coverage was actually added, so it seems to be something that coverage makes more likely to happen but isn't the sole cause.

I have had a... cryptic breakthrough, inasmuch as while I was experimenting with something else I accidentally wound up reproducing a hang in PhantomJS with Node 8.0.0 at the same points where Sauce is disconnecting. I do not know what about my local experiment triggerred this, and am going to move further experimentation to a new checkout of Mocha to ensure this one stays where it can reproduce the hangy issue.

Here's what I have been able to discover:

  • In Travis, it occurs for Edge, IE7 and Safari.
  • It seems like it may happen more consistently with my PR of adding coverage to Karma runs, but it has happened to builds that lack those changes as well (e.g. master, other PRs).
  • Both in travis and locally, it happens after printing the success result for should format functions saved in windows style - spaces and for the last unit test.
  • Adding console.logs to later tests indicates that, when it hangs after the "- spaces" test, not only is the next test started, but many more tests are actually run even though no more get their results printed; however, at some point there is a test that is never console.logged, thus indicating that it also stops running tests somewhere along the line.
  • The point at which it actually stops running tests does not seem to be consistent.
  • Adding console.logs may affect the point where it stops running tests and/or whether the issue occurs at all. It's hard to tell for sure. Sometimes it changes after I add a console.log and then changes back when I remove that console.log; other times it changes after I add a console.log and does not change back when I remove it.
  • On the theory that the continued running of tests past the point where the reporter ceases output suggest that output is somehow gummed up and perhaps the system hangs for real when buffers waiting for the output to ungum finally fill up, I tried upgrading the two things that I presume affect output: karma and karma-spec-reporter. Results were much like adding console.logs: it seems like these upgrades may affect whether and when the issue occurs, but niether one (nor even both together) consistently eliminates the issue.
  • By saving the Browserify bundle of the test files, I see that the tests when this issue occurs (where I've been adding console.logs) is not minified or obfuscated, so it is unlikely to be due to a bug in bundling that the exact code affects nondeterministically. At least, I think so anyway.

I feel just as stuck as before, but maybe this information will ring a bell to somebody.

@ScottFreeCode
Copy link
Contributor Author

I have reason to suspect that this might have something to do with console.error and/or util.inspect, or some such formatting behavior under the hood, in Node 8: I get some similar results trying to util.inspect Mocha suite/test objects in a Karma run, and trying to console.error Mocha suite/test objects in a Node (!) run. (Was using .error purely to be able to differentiate it from other output on stdout, but that's a long story; I should double-check whether I get these behaviors with .log too.)

Looked at the Node issue tracker and didn't find any issues about util.inspect that sound both similar and still unfixed/recent; may need to scour for console.<level> issues too.

@ScottFreeCode
Copy link
Contributor Author

#2903 would help debug this (no pun intended).

I believe I have found the root cause of the issue.

return JSON.stringify(this, function (key, val) {
sometimes throws (I believe we have at least one other use of JSON.stringify that could do the same).

if (runnable.state || runnable.isPending()) {
suppresses some of these. (We should read it closely and double-check that there are no other cases in this function where insufficient/no info is printed.) @mochajs/core Does anyone remember why these errors are being suppressed? It strikes me as dangerous in general (this issue is a perfect example) and, without knowing the details yet, as probably not the right solution for whatever it was meant to solve: we shouldn't be emitting spurious errors in the first place that would need to be suppressed, or emitting the same error twice, or anything like that (and even if we were doing so and really could not fix it, there's got to be a better way to determine that that's the case for an individual error so as to avoid suppressing legit errors).

I think fixing the latter (or at least allowing us to shut it off during CI runs) may give us the info we're missing from these browser failures, allowing us to confirm or deny that they're the former issue and if they're not then to get some idea of what they are instead.

@boneskull
Copy link
Member

boneskull commented Jul 26, 2017

Does anyone remember why these errors are being suppressed? It strikes me as dangerous in general (this issue is a perfect example) and, without knowing the details yet, as probably not the right solution for whatever it was meant to solve: we shouldn't be emitting spurious errors in the first place that would need to be suppressed, or emitting the same error twice, or anything like that (and even if we were doing so and really could not fix it, there's got to be a better way to determine that that's the case for an individual error so as to avoid suppressing legit errors).

So, this is two different things. I'll explain the intent:

If runnable.state is truthy, the Runnable has already completed its test.

I think it's because of this:

it('should malinger', function (done) {
  setTimeout(() => {
    done();
    throw new Error('maybe I should have just returned?');
  });
});

So done() has marked the test OK, but we're still running code, and that code happens to throw an exception.

IMO, that should throw an exception (and probably exit) instead of eat it, but I haven't confirmed if that's what's happening here.

If runnable.pending is truthy, the Runnable has already been skipped.

This is necessary because of how an async test can skip itself:

it('should skip, eventually', function (done) {
  setTimeout(() => {
    this.skip();
    done(); // never called!
  }, 500);
});

If a test is async, it must call done with or without an Error as its first parameter. this.skip() is then shorthand for done(new Pending()). A Pending instance is of course an Error.

However, just like with calling this.skip() in a synchronous test:

it('should skip', function () {
  this.skip();
  throw new Error('never thrown');
});

...we must immediately abort test execution. The Error above is never thrown, because this.skip() in fact throws a Pending. In the synchronous case, this.skip() is shorthand for throw new Pending().

Likewise, in the async case, we call done() with the Pending, but we also must abort synchronous execution, since we are not guaranteed to return immediately after the call to done()! So we call done(new Pending()), but we also must throw to interrupt.


Happy to discuss better implementations or strategies for this sort of thing. We could abort execution upon done()--so anything called thereafter would never get called, like in the "pending" case--but this is likely to break tests, and adds more black magic where there's already too much.

@boneskull
Copy link
Member

boneskull commented Jul 26, 2017

I also realize "exceptions are not flow control" are wise words. We can abuse exceptions to halt a test immediately if the user decides it should be skipped, but outside of Zones or async_wrap there's really no way to be sure we've aborted everything:

it('life is pain', function (done) {
  let i = 0;
  setInterval(() => {
    console.log('ha ha ha');
    if (++i > 5) {
      throw new Error('^_^');
    }
  }, 1000);
  setTimeout(() => {
    this.skip();
  }, 500);
});

Throwing a Pending is just a "best effort" for a more common situation. And it can theoretically be problematic! If we find out that one or both of these are causing problems, I'd probably want to soft-deprecate the behavior then eventually hard-deprecate--unless we can leverage async_wrap/zones to ensure there are no remaining tasks in a Runnable.

@ScottFreeCode
Copy link
Contributor Author

So, I just now had a chance to sit down and review this one. If I understood this accurately:


it("should fail, but is suppressed", function(done) {
  setTimeout(function() {
    done()
    throw new Error("Gets thrown but ignored")
  }, 5)
})

...will be marked as passed even though it throws. Assuming that's accurate:

I'd consider this a bug and advocate for stopping suppression of that error (i.e. instead it should be failed with the thrown error -- even if after already printing that the test passed -- rather than considered just plain passed and nothing more). Furthermore, doing so by adding yet another special case would be inferior to doing so by removing a special case.

The main problem with fixing it is that this will break any tests that were incorrectly passing due to that pattern; it's technically a bugfix but we could save it for version 4 to minimize complaints. If we were to save it for verson 4, I would want to start working on a version 4 branch so that we can get these suppressions corrected there, use version 4 (in development) to diagnose the flake/hang, and then fix the flake/hang in master and add the browser coverage etc. so that before we release 4 we can pull in the browser coverage stuff and improve how sure we are of its correctness (especially if we're dropping a bunch of shims, although correspondingly dropping browsers will make it a little easier on the other hand). (Either way, eventually having more browser coverage will help us determine if there are any remaining shims or environment special-cases we can drop.)


it("synchronous pending", function() {
  this.skip()
  throw new Error("Should never be called")
})

it("asynchronous pending", function(done) {
  var test = this
  setTimeout(function() {
    test.skip()
    throw new Error("Should never be called")
  }, 5)
})
  • Both of these should behave the same.
  • They should be marked pending.
  • The throw should not be executed; not just ignored, but not executed, i.e. if there are side-effects they will not occur. (This behavior may be "magic" considering further asynchronous action cannot be stopped unless Zones turn out to not have the problems Domains have* and we integrate them. Allowing tests to continue executing after this.skip for consistency without Zones would be a breaking change we may consider for a future major bump.)
  • All of that is achieved in the current implementation; however...
  • The synchronous one is implemented as throw new Pending()
  • The asynchronous one is implemented as done(new Pending()); throw <ignored error>

Question: assuming the above is accurate, should we not be using throw new Pending() in both implementations and having both the synchronous failure handling and the asynchronous failure handling go through the same "treat Pending as skip rather than failure" branching code?


NB: I have not tried running those tests and confirming their behavior then changing the code and seeing how it changes their behavior. Just wrapping my head around the ideas in the comment.


* Last I checked, Domains made it into Node only to end up deprecated because people found problems with them, but Node won't remove them till there's a suitable replacement, and Zones have been proposed as that replacement but nobody can find info on how they avoid the problems Domains have. I don't have a citation or anything, that's just all the info I was able to bring up on Google last time I tried.

@boneskull
Copy link
Member

honestly I think this has to do with a sauce connect problem and nothing in our codebase. basically I think we gotta go back to running all of the browsers at once.

@boneskull
Copy link
Member

@ScottFreeCode see these commits

I think what's happening is this:

  1. make test-browser-whatever runs
  2. karma starts
  3. karma opens a sauce connect proxy with the job ID as tunnel identifier
  4. karma runs tests & exits, proxy with it
  5. make test-browser-whatevers-next runs
  6. karma opens a sauce connect proxy with the same job ID as tunnel identifier
  7. due to timing issues, the tunnel id may now be considered invalid by saucelabs, because the previous karma process killed the sauce connect process.
  8. fails to launch browser

so that's my theory. the fix would then be:

  1. allow travis (not karma) to open the sauce connect proxy
  2. whenever karma starts, use the job ID as the tunnel identifier
  3. meanwhile, travis should leave the sauce connect process running after karma completes a set of tests

a bonus is that awhile back I talked with SauceLabs and had them increase our concurrency. when we split up the browser targets in the past, it was because we were running into problems. now, we should not have problems running all browsers we want in one fell swoop.

granted, that fell swoop contains multitudes of test-browser-* targets... but either way, if this works, CI will run quicker.

@boneskull
Copy link
Member

fwiw, this build

@boneskull
Copy link
Member

boneskull commented Sep 27, 2017

I understand your concerns about this behavior, but I'm not convinced it's causing any real problems in our build.

This example:

it("should fail, but is suppressed", function(done) {
  setTimeout(function() {
    done()
    throw new Error("Gets thrown but ignored")
  }, 5)
})

Isn't a case that Mocha should be handling, IMO. If code calling done() then doing more stuff in any callback-based JS, then the code is either a) wrong, or b) too clever.

So, as you said, Mocha is eating the error, and to let it bubble, we'd have to add more special cases. I don't think this is necessary. Once you've called done(), that should be the end of the test, and whatever happens after that--which is hopefully nothing--is beyond the scope of Mocha.

Question: assuming the above is accurate, should we not be using throw new Pending() in both implementations and having both the synchronous failure handling and the asynchronous failure handling go through the same "treat Pending as skip rather than failure" branching code?

We call done() with the Pending error, which effectively marks the Runnable as pending, because the only way to mark a Runnable as anything in an async test is to call done(). Yet, the only way to abort execution of the current stack is to throw, so to abort, we throw as well (I could see an argument for throwing Pending in the body of done, but this is just another way of doing the same thing).

If we were to change this behavior, I wouldn't suggest to not attempt to abort, even though this only covers the general case. See:

it('should be a problem', function (done) {
  process.nextTick(() => {
    throw new Error('gotcha'); // thrown, but where does it go?
  });
  this.skip(); // calls done() and finishes up
  throw new Error('nope nope nope nope nope'); // not thrown
});

So, if we aren't going to abort, what should we do instead? It would involve changing the API dramatically. Open to suggestions about what this should really look like, but it should require the user to either return from an synchronous Runnable or otherwise notify an async Runnable that it's pending. In a perfect world?

import {describe, it} from 'mocha';

it('is sync pending', test => test.pending); // or `test.pending();` or `test.pending = true; return;`, etc.

it('is async pending', (test, done) => {
  done(test.pending);
});

it('is promise pending', test => {
  return myPromise.then(() => test.pending);  // or `await myPromise; return test.pending;`
});

At any rate, I'm not really convinced this is the source of browser flake, but this could definitely be improved.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants