This repository has been archived by the owner on Mar 30, 2018. It is now read-only.

[DISCUSS] Roadmap for Humbug 2.0 #228

Open
padraic opened this issue Apr 20, 2017 · 7 comments

Comments

@padraic
Collaborator

padraic commented Apr 20, 2017

Roadmap 2.0

  • [F001] Drop support for PHP 5. Target PHP 7.0 (LTS support?).
  • [F002] Drop support for PHPUnit 4/5. Target PHPUnit 6.0.
  • [F003] Migrate PHPUnit test result parsing from TAP to XML.
  • [F004] Migrate existing mutations to AST.
  • [F005] Assess generation/parsing of code coverage.
  • [F006] Better grouping/configuration of mutations, e.g. default, expanded, experimental.
  • [F007] Improve fastest-first ordering of tests (if feasible).
  • [F008] Improve range of mutations on offer.
  • [F009] Incremental Analysis (down the line for 2.1+?)
  • [F010] Diff result: an XML result from which you could easily see the evolution from one run to the next (would be useful for a SaaS, for example).
  • [F011] Better handling of false positives.
  • [F012] Support phpspec
  • [F013] Support Codeception
  • [F014] Update return value mutators, where NULL is returned, to instead mutate based on return value type where possible in PHP 7.
  • [F015] Document a complete set of desired mutations, and keep track of those implemented, pending or unsupported (with reasons).
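As a rough illustration of what F006's grouping could look like, here is a minimal sketch (Python pseudocode; the group names and mutator names are purely illustrative, not Humbug's actual configuration or API):

```python
# Hypothetical sketch of mutation grouping (F006). Group and mutator
# names are illustrative only, not Humbug's real configuration.
MUTATOR_GROUPS = {
    "default": ["Boolean.True", "Boolean.False", "Number.Integer"],
    "expanded": ["ReturnValue.This", "Arithmetic.Modulus"],
    "experimental": ["Regex.Anchor"],
}

def select_mutators(groups):
    """Flatten the requested groups into a de-duplicated mutator list."""
    selected = []
    for group in groups:
        for mutator in MUTATOR_GROUPS.get(group, []):
            if mutator not in selected:
                selected.append(mutator)
    return selected
```

A run could then default to `select_mutators(["default"])` and opt into noisier groups explicitly.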

These are initial suggestions only, for discussion either here or on Slack.

Edits: Added @theofidry's suggested F010 and F011.
Edits: Added references to supporting phpspec + codeception.
Edits: Added references to extended mutation support.

@padraic
Collaborator Author

padraic commented Apr 20, 2017

A few comments on items by way of explanation:

  • Code coverage: When I say assess, it's because we do some weird stuff to make this work, and it could probably use a few people to review it.
  • Mutation grouping: We currently throw all mutations at code - some of these might be especially messy when it comes to false positives. Do we run everything, or have some division?
  • Test ordering: One of the performance routines is to time test suites, order them fastest-first, and use that order after the initial run. We currently don't/can't analyse individual tests within those suites, which would be even better.
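The fastest-first ordering described above amounts to something like the following (a sketch in Python; the suite names and timing structure are illustrative, not Humbug's internals):

```python
# Sketch of fastest-first suite ordering: after the initial timed run,
# re-order suites so the quickest run first, which lets a mutation be
# killed (and the remaining suites skipped) as early as possible.

def order_fastest_first(timings):
    """timings: {suite_name: seconds, measured during the initial run}."""
    return sorted(timings, key=timings.get)
```

So a 0.4s suite would run before a 2.1s one on every subsequent mutation.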

@theofidry
Member

theofidry commented Apr 20, 2017

Drop support for PHP 5. Target PHP 7.0 (LTS support?).

As PHP 7.0 will have reached its end of active support by the end of the year, I would rather go for PHP 7.1 right away.

Drop support for PHPUnit 4/5. Target PHPUnit 6.0.

I'm less convinced about dropping PHPUnit 5.x. PHPUnit 6.0 may not get adopted as fast as we hope, so if supporting the latest 5.x doesn't require too much extra work we could keep it.

Assess generation/parsing of code coverage.

Actually why is this needed?

Incremental Analysis (down the line for 2.1+?)

I think that would be a very big plus to have it.

I would add a few things:

  • [F010] Diff result: an XML result from which you could easily see the evolution from one run to the next (would be useful for a SaaS, for example).
  • [F011] Better handling of false positives.
  • [F009] Incremental analysis: we could use different strategies to decide which tests to run first depending on the test framework. For PHPUnit, for example, we could make use of the @covers annotation; for phpspec you have a 1-1 relationship between your class and your tests.
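The phpspec 1-1 relationship mentioned above could be exploited with a purely conventional mapping, along these lines (Python sketch; the `spec\` prefix and `Spec` suffix convention shown here is illustrative):

```python
# Sketch of the phpspec-style 1-1 strategy: derive the spec class for a
# source class by naming convention, so no coverage data is needed to
# know which test to run. The exact convention here is an assumption.

def spec_for(source_class):
    r"""Map e.g. 'Acme\Foo\Bar' to 'spec\Acme\Foo\BarSpec'."""
    return "spec\\" + source_class + "Spec"
```

A PHPUnit strategy would instead read the @covers annotation, falling back to coverage data when it is absent.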

@theofidry
Member

@padraic I edited your comment to add numbers to each point, to make it easier to know which point we are talking about.

@padraic
Collaborator Author

padraic commented Apr 20, 2017

Perfect. On code coverage, there are two specific uses for it:
a) It's the devil. Mutation Testing exists to keep it honest, and we do a few comparisons against the coverage metric so we need to generate it at least once. We save subsets for...
b) The output tells us which test suites ran against which lines, so we can narrow the test suites executed per mutation to the minimum, and not execute every single test suite. At present we're literally writing subsets of code coverage to files for later use (so the entire chunky file isn't being reprocessed all the time). The coverage output format is PHP code itself - in an often large file.
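The narrowing in (b) boils down to a line-to-suites lookup, roughly like this (Python sketch; the nested-dict structure is illustrative, not the actual php-code-coverage format, which as noted is serialized PHP):

```python
# Sketch of narrowing test suites per mutation via a line -> suites map
# built once from the coverage run. Structure here is an assumption.

def suites_for_mutation(coverage, filename, line):
    """coverage: {filename: {line_number: set of suite names}}."""
    return coverage.get(filename, {}).get(line, set())

coverage = {"src/Foo.php": {10: {"FooTest"}, 42: {"FooTest", "BarTest"}}}
```

A mutation on line 42 of `src/Foo.php` then only needs FooTest and BarTest, not every suite in the project; an uncovered line yields an empty set and the mutation can be reported as not covered at all.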

Sort of linking to [F007], I don't think we currently filter tests within test suites - just the suites themselves. If you have 100 tests in a TestSuite class, and only 5 apply to a line, we're still stuck executing the lot of them. I'm trying to recall the details of what was blocking that (fewer tests = improved performance, so it would be REALLY nice to have this). It could have been a mix of both identifying tests per line and/or implementing the filter. @covers is one way, but it's not required so we still need a backup - and it may be open to abuse to fool the MSI score.

P.S. Anyone can edit the original comment to add items - I should have mentioned that at the start.

@theofidry
Member

a) For what comparison do we need it? My understanding was that mutation testing allows you to assess the robustness of your test suite. You can have rock-solid coverage, but if the tests are not good your results with Humbug will suck, showing they are weak. On the other hand, you could have lower coverage but tests which are quite robust.

b) looks like a much more interesting reason to me, although it makes me lean more toward being able to run without coverage if we want to. [F009] indeed is redundant with [F007], and your point makes me want to try different strategies: the one I described above and/or the current one relying on code coverage.

@padraic
Collaborator Author

padraic commented Apr 20, 2017

The second reason would be the primary issue - performance. We do a lot of odd, sometimes even messy, things to boost performance. IA would be even faster - it's essentially mapping test suites to files, caching results, and running only a subset of MT for changed files (source or tests). It's lightning fast...well, when it works!
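The "changed files" part of incremental analysis could be as simple as comparing content hashes against the cached run (Python sketch; hashing per file with SHA-1 is an assumption, not necessarily how Humbug's IA implementation worked):

```python
# Sketch of incremental analysis change detection: hash every source and
# test file, compare against the cache from the previous run, and only
# re-run mutation testing for files whose content actually changed.
import hashlib

def changed_files(current_contents, cache):
    """current_contents: {path: bytes}; cache: {path: previous SHA-1 hex}."""
    changed = []
    for path, data in current_contents.items():
        if cache.get(path) != hashlib.sha1(data).hexdigest():
            changed.append(path)
    return changed
```

New files have no cache entry, so they naturally fall into the "changed" set as well.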

On the comparisons, other than the MSI which applies to the entire source, having code coverage lets us generate an MSI for test-covered code in isolation. Even if your code coverage sucks, you can still get some idea of whether the existing tests are any good. Ideally that enables an informed choice of priorities between fixing existing tests vs writing new tests. This is more of a nice-to-have - we likely wouldn't have bothered if we didn't already need coverage for performance reasons.
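The two figures being contrasted can be illustrated with a trivial calculation (the numbers below are made up for illustration):

```python
# Sketch of the two MSI figures: the overall MSI is computed over all
# generated mutations, while the covered-code MSI only counts mutations
# that land on lines exercised by at least one test.

def msi(killed, total):
    """Mutation Score Indicator as a percentage; 0 when nothing ran."""
    return 0.0 if total == 0 else 100.0 * killed / total

overall = msi(60, 100)  # 60 of 100 mutations killed -> 60.0
covered = msi(60, 70)   # only 70 mutations were on covered lines
```

With poor coverage the overall MSI looks bad (60.0) while the covered-code MSI (about 85.7) shows the tests that do exist are fairly strong, which points toward writing new tests rather than fixing existing ones.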

@padraic
Collaborator Author

padraic commented May 5, 2017

Added a few points arising from #191
