Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Regression in executable line identification for match expressions in 9.2.21 #967

Open
hemberger opened this issue Dec 15, 2022 · 18 comments
Labels

Comments

@hemberger
Copy link
Contributor

Q A
php-code-coverage version 9.2.21
PHP version 8.1.7
Driver Xdebug
Xdebug version (if used) 3.1.5
Installation Method Composer
Usage Method PHPUnit
PHPUnit version (if used) 9.5.26

Some valuable precision in the identification of executable lines is lost as a result of #964. I have not reviewed that PR carefully enough to judge its merits, but I see how the concerns raised by @mvorisek can manifest as a regression in real-world test suites. I appreciate the many, many hours of effort by the contributors working on this issue and I hope that we can find a solution that is still friendly to mutation testing without impacting precision.

For example, in 9.2.20, only the branches of a match expression that were actually executed were displayed as covered. In 9.2.21, the entire match expression is now reported as being covered, even though it is not. It was quite useful to know which branches were being exercised by the test.

I'm having a bit of deja vu here, because I reported a nearly identical regression to match expression coverage in #904, which was fixed somewhere between 9.2.13 and 9.2.20! :)

Here is an example class:

<?php declare(strict_types=1);

class MatchExpr {

    public int $result;

    public function __construct(int $value) {
        $this->result = match ($value) {
            0 => 4,
            1 => 5,
            2 => 6,
            3 => 7,
            default => 8,
        };  
    }   

}

And here is the corresponding test:

<?php declare(strict_types=1);

use PHPUnit\Framework\TestCase;
use MatchExpr;

/**
 * @covers MatchExpr
 */
class MatchExprTest extends TestCase {

    /** 
     * @testWith [0, 4]
     *           [2, 6]
     *           [9, 8]
     */
    public function test_happy_path(int $value, int $expected): void {
        self::assertSame($expected, (new MatchExpr($value))->result);
    }   

}

Coverage report in 9.2.20 (only 3 of the 5 branches are covered, as expected):
image

Coverage report in 9.2.21 (all branches are marked as covered):
image

Thank you for your time!

@Slamdunk
Copy link
Contributor

Hi, match expression is indeed the most tricky to produce a code-coverage report for in a deterministic manner.
I found many ways to have inconsistent reports where never executed branches are reported as green by xDebug/PCOV.

For that reason I've highlighted in the main topic #964 that the only deterministic and consistent way to handle match is to mark each branch as executed if any branch is executed, and if you really care about precise feedback over it, enable MatchArmRemoval mutator from Infection, see https://infection.github.io/guide/mutators.html#Removal-Mutators

@hemberger
Copy link
Contributor Author

Thanks for the quick response! I can certainly appreciate that mutation testing may be the only 100% reliable way to assess code coverage for match arms, but that's asking a lot for users who don't have the means or need to set it up. The issue with enums giving false positives is kinda wild; would that be mitigated if the autoloader was lazier in match expressions?

I'm curious what other types of setups result in undesirable behavior for match expressions. Are they a sufficiently small subset that there's still value to providing more precise coverage for simple cases? I'm guessing the answer is no, but I suppose it doesn't hurt to ask. :)

@mvorisek
Copy link
Contributor

@Slamdunk would you be please so kind to post here analysis of xdebug data that are causing this issue?

@Slamdunk
Copy link
Contributor

Slamdunk commented Dec 16, 2022

Given the following command:

$ XDEBUG_MODE=coverage php -r 'require "vendor/autoload.php";xdebug_start_code_coverage(XDEBUG_CC_UNUSED|XDEBUG_CC_DEAD_CODE);(new Foo)->foo("HIT");print_r(xdebug_get_code_coverage()[realpath("src/Foo.php")]);'

For which I need to highlight 2 things:

  1. It uses composer autoload features
  2. The actual code is (new Foo)->foo("HIT");

And the following directory structure, the Foo class receive from xDebug the information in the comment at the end of each line:

// src/Bar.php
class Bar { public const BAR = 'bar'; }

// src/Baz.php
class Baz { public const BAZ = 'baz'; }

// src/Xyz.php
class Xyz { public const XYZ = 'xyz'; }

// src/Foo.php
class Foo
{
    public const FOO = 'foo';

    public function foo($var): string
    {
        return match ($var) {               //  1 : green
            self::FOO   => 'self::FOO',     // -1 :       RED
            Bar::BAR    => 'Bar::BAR',      //  1 : green
            'Baz::BAZ'  => Baz::BAZ,        //  1 : green
            uniqid(1)   => 'uniqid(1)',     //  1 : green
            'uniqid(2)' => uniqid(2),       //  1 : green
            'HIT'       => 'HIT',           //  1 : green
            Xyz::XYZ    => 'Xyz::XYZ',      // -1 :       RED
            uniqid(3)   => 'uniqid(3)',     // -1 :       RED
            'uniqid(4)' => uniqid(4),       // -1 :       RED
            default     => 'default',       //  1 : green
        };
    }
}

So despite only asking for 'HIT' match, all other cases are marked as executed too when they precede the one we are looking for, and also the default one is triggered even if it's at the end.

Line Code-Coverage for match is a total lie.

I can certainly appreciate that mutation testing may be the only 100% reliable way to assess code coverage for match arms, but that's asking a lot for users who don't have the means or need to set it up.

Activating Infection requires 3 steps:

  1. composer require --dev infection/infection
  2. Add a configuration file: https://infection.github.io/guide/usage.html#Configuration
  3. Run vendor/bin/infection --git-diff-lines --min-msi=100 to only run it on DIFF

If you set up PHPUnit and Code-Coverage, you certainly have the means for it.
If you care about match CC precision, you certainly need it.

I understand the psychological burden of adding another tool to the build.
I can only reply that I didn't choose to have such poor Code-Coverage tools from php-src in the first place neither.

@hemberger as far as I can tell, this only affect match by the way, so I'd appreciate if you can change the title to indicate a specific match regression, so other people that face the same issue can find this reply quicker.

@sebastianbergmann
Copy link
Owner

@derickr @krakjoe Is there anything that can be done in Xdebug or PCOV to improve the "precision" of code coverage for match expressions?

@hemberger hemberger changed the title Regression in executable line identification in 9.2.21 Regression in executable line identification for match expressions in 9.2.21 Dec 16, 2022
@derickr
Copy link
Contributor

derickr commented Dec 16, 2022

There seems to be a bug in VLD, it doesn't show the "default" jump target in the list of targets (from the original report):

line      #* E I O op                           fetch          ext  return  operands
-------------------------------------------------------------------------------------
    7     0  E >   RECV                                             !0
    8     1        EXT_STMT
          2      > MATCH                                                    !0, [ 0:->3, 1:->5, 2:->7, 3:->9, ]
    9     3    >   QM_ASSIGN                                        ~3      4
          4      > JMP                                                      ->13
   10     5    >   QM_ASSIGN                                        ~3      5
          6      > JMP                                                      ->13
   11     7    >   QM_ASSIGN                                        ~3      6
          8      > JMP                                                      ->13
   12     9    >   QM_ASSIGN                                        ~3      7
         10      > JMP                                                      ->13
   13    11    >   QM_ASSIGN                                        ~3      8
         12      > JMP                                                      ->13
    8    13    >   ASSIGN_OBJ                                               'result'
   13    14        OP_DATA                                                  ~3
   15    15        EXT_STMT
         16      > RETURN                                                   null

The coverage from PHP-CC 9.2.20 looks correct, as cases 0, 2, and the default are tested, and shown in line coverage. I don't know what you have done in 9.2.21, where it is broken.


the default one is triggered even if it's at the end.

That seems to be because PHP checks whether the return type of the return statement of the whole function on that line. It's not wrong, but confusing. It would be better if that was done on the actual return match line — but this is something for PHP to change.

   18    49    >   INIT_FCALL                                               'uniqid'
         50        SEND_VAL                                                 4
         51        DO_FCALL                                      0  $9      
         52        QM_ASSIGN                                        ~6      $9
         53      > JMP                                                      ->56
   19    54    >   QM_ASSIGN                                        ~6      'default'
         55      > JMP                                                      ->56
         56    >   VERIFY_RETURN_TYPE                                       ~6
         57      > RETURN                                                   ~6

Line Code-Coverage for match is a total lie.

That's a nonsense statement.

In Slamdunks example, there is no jump table, as the LHS has code running required for evaluations. This is similar to how switch works when it can't use a jump table either:

line      #* E I O op                           fetch          ext  return  operands
-------------------------------------------------------------------------------------
    7     0  E >   RECV                                             !0
    9     1        EXT_STMT
          2        IS_IDENTICAL                                             !0, 'foo'
          3      > JMPNZ                                                    ~1, ->29
   11     4    >   FETCH_CLASS_CONSTANT                             ~2      'Bar', 'BAR'
          5        IS_IDENTICAL                                             !0, ~2
          6      > JMPNZ                                                    ~1, ->31
   12     7    >   IS_IDENTICAL                                             !0, 'Baz%3A%3ABAZ'
          8      > JMPNZ                                                    ~1, ->33
   13     9    >   INIT_FCALL                                               'uniqid'
         10        SEND_VAL                                                 1
         11        DO_FCALL                                      0  $3
         12        IS_IDENTICAL                                             !0, $3
         13      > JMPNZ                                                    ~1, ->36
   14    14    >   IS_IDENTICAL                                             !0, 'uniqid%282%29'
         15      > JMPNZ                                                    ~1, ->38
   15    16    >   IS_IDENTICAL                                             !0, 'HIT'
         17      > JMPNZ                                                    ~1, ->43
   16    18    >   FETCH_CLASS_CONSTANT                             ~4      'Xyz', 'XYZ'
         19        IS_IDENTICAL                                             !0, ~4
         20      > JMPNZ                                                    ~1, ->45
   17    21    >   INIT_FCALL                                               'uniqid'
         22        SEND_VAL                                                 3
         23        DO_FCALL                                      0  $5
         24        IS_IDENTICAL                                             !0, $5
         25      > JMPNZ                                                    ~1, ->47
   18    26    >   IS_IDENTICAL                                             !0, 'uniqid%284%29'
         27      > JMPNZ                                                    ~1, ->49
         28    > > JMP                                                      ->54
   10    29    >   QM_ASSIGN                                        ~6      'self%3A%3AFOO'
         30      > JMP                                                      ->56
   11    31    >   QM_ASSIGN                                        ~6      'Bar%3A%3ABAR'
         32      > JMP                                                      ->56
   12    33    >   FETCH_CLASS_CONSTANT                             ~7      'Baz', 'BAZ'
         34        QM_ASSIGN                                        ~6      ~7
         35      > JMP                                                      ->56
   13    36    >   QM_ASSIGN                                        ~6      'uniqid%281%29'
         37      > JMP                                                      ->56
   14    38    >   INIT_FCALL                                               'uniqid'
         39        SEND_VAL                                                 2
         40        DO_FCALL                                      0  $8
         41        QM_ASSIGN                                        ~6      $8
         42      > JMP                                                      ->56
   15    43    >   QM_ASSIGN                                        ~6      'HIT'
         44      > JMP                                                      ->56
   16    45    >   QM_ASSIGN                                        ~6      'Xyz%3A%3AXYZ'
         46      > JMP                                                      ->56
   17    47    >   QM_ASSIGN                                        ~6      'uniqid%283%29'
         48      > JMP                                                      ->56
   18    49    >   INIT_FCALL                                               'uniqid'
         50        SEND_VAL                                                 4
         51        DO_FCALL                                      0  $9
         52        QM_ASSIGN                                        ~6      $9
         53      > JMP                                                      ->56
   19    54    >   QM_ASSIGN                                        ~6      'default'
         55      > JMP                                                      ->56
         56    >   VERIFY_RETURN_TYPE                                       ~6
         57      > RETURN                                                   ~6
   21    58*       EXT_STMT
         59*       VERIFY_RETURN_TYPE
         60*     > RETURN                                                   null

As this code is run, case by case until it matches, these LHS statements
are evaluated/executed, so it shows up correctly in line coverage.

The example doesn't talk about, or show, branch coverage, so I can't comment on
that.


So the only bug here, is that VLD misses showing the default case, and I've
just fixed that.

@sebastianbergmann
Copy link
Owner

Thank you, Derick, for the detailed response.

@sebastianbergmann
Copy link
Owner

@Slamdunk Do you have an idea how we should proceed here? Thanks!

@derickr
Copy link
Contributor

derickr commented Jan 16, 2024

From my POV, the result from 9.2.20 was correct, and no special hacks should be done to make the whole match being marked as covered. The output of Xdebug is, as far as I can tell from this example, the right one.

@sebastianbergmann
Copy link
Owner

@Slamdunk Do you still plan to work on this? Thanks!

@Slamdunk
Copy link
Contributor

In v9.2.20 the match coverage report depended on test execution order.
I haven't been able to find a solution to this issue, and honestly I lost hope on this one, sorry

@uuf6429
Copy link

uuf6429 commented May 20, 2024

I think I've just hit this problem with the following code and test:

<?php

class Updater
{
    public function update(ICLass $item): IClass
    {
        return match (true) {
            $item instanceof ClassA => (
                static fn() => new ClassA($item->number + 1)
            )(),
            $item instanceof ClassB => (static function () use ($item) {
                return new ClassB($item->text . ' updated');
            })(),
            $item instanceof ClassU => new ClassU(array_map($this->update(...), $item->items)),
        };
    }
}

class ClassBTest extends TestCase
{
    public function testThatClassBIsUpdated():void{
        $result = (new Updater())->update(new ClassB('test'));
        $this->assertEquals(new ClassB('test updated'), $result);
    }
}

(the convoluted match branches are only for testing)

In that scenario, I get 100% coverage for Updater even though:

  • the first branch (body) is not really executed
  • the last branch (condition) is not even hit

Specs:

PHP 8.3.2 
PHPUnit 11.1.3
Xdebug 3.3.1

Switching to Infection is a poor excuse in my opinion - it completely invalidates the advantage/feature of having code coverage in PHPUnit (I'd rather go back to switch than rely on yet another tool). But in that case, PHPUnit needs to handle this better - which means if it cannot correctly determine coverage, then it's better to mark it as uncovered (plus some warning that matches are unsupported, maybe) rather than giving a false sense of security.

@sebastianbergmann
Copy link
Owner

sebastianbergmann commented May 21, 2024

@Slamdunk wrote in #967 (comment):

In v9.2.20 the match coverage report depended on test execution order.

Would it be only ede00c7 that needs to be reverted to address this? That commit's message mentions a dependency on "autoloading runtime behavior". Is this what you mean by "test execution order"?

I think it would be better to have a solution that is sometimes wrong due to test execution order / autoloading runtime behavior than having a solution that appears to be always wrong.

@Slamdunk
Copy link
Contributor

I think it would be better to have a solution that is sometimes wrong due to test execution order / autoloading runtime behavior than having a solution that appears to be always wrong.

I disagree on this.
If you are going to revert that commit, I suggest you to add a F.A.Q. like this to prevent a flood a issues:

Q: Every run of phpunit gives a different code coverage result on match expressions, how can I have a stable result?
A: You can't, sorry: CC on match depends on autoloading runtime behavior and test execution order (both), deal with it


Switching to Infection is a poor excuse in my opinion - it completely invalidates the advantage/feature of having code coverage in PHPUnit

@uuf6429 code coverage tells what lines of code have been used by your tests, mutation testing tells if your tests are resilient to code changes and thus effective at all. Your statement doesn't make sense to me.

@uuf6429
Copy link

uuf6429 commented May 21, 2024

@uuf6429 code coverage tells what lines of code have been used by your tests, mutation testing tells if your tests are resilient to code changes and thus effective at all. Your statement doesn't make sense to me.

I know the difference - your comment and followup comment suggests switching to Infection as a workaround to this bug... The only practical workaround is to replace match with switch or ifs.

@Slamdunk
Copy link
Contributor

suggests switching to Infection

Adding rather than switching, actually

The only practical workaround is to replace match with switch or ifs

Been there, done that.
Installing Infection and running it with --git-diff-lines is more practical and beneficial to your code base than trying to fix ede00c7 or replacing every match usage.

@sebastianbergmann
Copy link
Owner

I disagree on this. If you are going to revert that commit, I suggest you to add a F.A.Q. like this to prevent a flood a issues:

Q: Every run of phpunit gives a different code coverage result on match expressions, how can I have a stable result?
A: You can't, sorry: CC on match depends on autoloading runtime behavior and test execution order (both), deal with it

Yes, I would add a note about this to the documentation.

Would it be only ede00c7 that needs to be reverted to address this?

Sorry to bother you, but you did not give an answer to the above question.

Switching to Infection is a poor excuse in my opinion - it completely invalidates the advantage/feature of having code coverage in PHPUnit

@uuf6429 code coverage tells what lines of code have been used by your tests, mutation testing tells if your tests are resilient to code changes and thus effective at all. Your statement doesn't make sense to me.

I understand your point of view and suspect that you must be even more frustrated about this situation than I am, but I also see sense in the argument made by @uuf6429 and the others here on this ticket.

I trust you, Filippo, on your assessment that code coverage of match arms is unreliable due to sensitivity to autoloading runtime behavior and test execution order. The current implementation leads to code coverage that is always wrong. As I wrote in #967 (comment), I would prefer a solution that is at least sometimes correct.

As much as I love Infection, I do not think that it makes sense to suggest to use mutation testing as a workaround for shortcomings when it comes to code coverage for match arms. We can warn about those in the documentation.

Maybe we can even recommend strategies to reduce the sensitivity to autoloading runtime behavior and test execution order. The first idea that comes to mind would be to use a combination of static loading and dynamic autoloading which is what we use for PHPUnit's PHAR.

@Slamdunk
Copy link
Contributor

Would it be only ede00c7 that needs to be reverted to address this?

As far as I remember you are correct

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

6 participants