Skip to content
This repository has been archived by the owner on Nov 15, 2021. It is now read-only.

NPath complexity is a larger value than expected. #716

Open
6 of 24 tasks
jswolf19 opened this issue May 24, 2017 · 21 comments
Open
6 of 24 tasks

NPath complexity is a larger value than expected. #716

jswolf19 opened this issue May 24, 2017 · 21 comments

Comments

@jswolf19
Copy link
Contributor

jswolf19 commented May 24, 2017

My Framework

  • .NET 2
  • .NET 3.5
  • .NET 4
  • .NET 4.5
  • .NET 4.6
  • .NET 4.6.1
  • .NET 4.6.2
  • .NET Core 1.0.0
  • Something else

My Environment

  • Windows 7 or below (not truly supported due to EOL)
  • Windows 8
  • Windows 8.1
  • Windows 10
  • Windows 10 IoT Core
  • Windows Server 2012
  • Windows Server 2012 R2
  • Windows Server 2016

I have already...

  • repeated the problem using the latest stable release of OpenCover.
  • reviewed the usage guide and usage document.
  • have looked at the opencover output xml file in an attempt to resolve the issue.
  • reviewed the current issues to check that the issue isn't already known.

My issue is related to (check only those which apply):

  • no coverage being recorded
  • 32 or 64 bit support
  • feature request

Expected Behavior

According to the original source,

The NPath complexity of a method is the number of acyclic execution paths through that method.
Thus, the following code should have 4 acyclic execution paths, for a nPath complexity of 4.

       public static string NPathResultShouldBe4(bool a, bool b, bool c)
        {
            if(a) 
            {
                if(b)
                {
                    if(c)
                    {
                        return "Path 1";
                    }
                    return "Path 2";
                }
                return "Path 3";
            }
            return "Path 4";
        }

Actual Behavior

The nPathComplexity value in the resulting xml file is 8. It appears that nested BranchPoints are not properly handled.

Steps to reproduce the problem:

  1. build the attached project. (Having trouble attaching a zip file... For now I'll just attach the source file as
    NPathResultTest.txt
    )
  2. run opencover configured with xunit as follows:
    -register:Path32 -target:"%PathToXUnit%\xunit.console.clr4.exe" -targetargs:"nPathTests.dll"
  3. observe the method nPathTests.NPathResultTest::NPathResultShouldBe4(...) as an nPathComplexity attribute equal to 8.
@sawilde
Copy link
Member

sawilde commented May 24, 2017

OpenCover calculates its complexity based on the IL branches encountered not source code (we use a 3rd party library - https://github.com/mono/mono-tools/blob/master/gendarme/rules/Gendarme.Rules.Maintainability/AvoidComplexMethodsRule.cs#L152) and is more of a guide.

If you want source code complexity analysis - you'll need to find another tool.

gah - ignore me - confusing myself :)

@ddur
Copy link
Contributor

ddur commented May 24, 2017

As far as I remember, NPath complexity is computed as 2branches.
8 seems good value.

@jswolf19
Copy link
Contributor Author

I don't see a problem in using IL instead of source as that gives a more accurate result in my mind.

I realize that the source provided in the feature request #363 for nPath gives only a simplistic example that could be interpreted as calculating 2branches, but the request also defines it as

the number of (non-looping) paths through a method. This matches the number of unit tests you would have to write to fully exercise every path through that method.

In other words, it can be used to find a somewhat useful upper bound to the minimum number of tests needed to achieve 100% coverage, so I imagine it is a good measure to report correctly in a coverage tool.

@ddur
Copy link
Contributor

ddur commented May 24, 2017

@jswolf19

Every IF branch has implicit ELSE.
How many tests cases would you have to write to cover all possible branches?
100% coverage of correct code, does not cover all cases of incorrect code.

Feel free to contribute. You are welcome.

@jswolf19
Copy link
Contributor Author

@ddur
An else, implicit or otherwise, does not increase the number of paths through the code because the jump before the else is unconditional. Similarly, a goto should not affect the nPath.

100% coverage of correct code, does not cover all cases of incorrect code.

I'm not sure what you're trying to say. Do you mean that 100% coverage does not mean that your code is correct? If so, I agree with you: minimal coverage does not show that your code is correct. In my mind, coverage is something to tell you that you have a problem, not that you don't have a problem ^^. Having the nPath gives you an idea of if it is even feasible to get to 100% coverage. As you can tell by the overflow I reported in #717, our code very likely has problems.

As far as contributing, I would be very happy to do so. I can't immediately, though, and it's unlikely that I can at work, so I'm trying to supply what information I can from the less-than-ideal code while it's available to me.

Because the endoffset is included in the branch information, I should be able to determine the nesting of branches, so I think I should be able to come up with an algorithm with existing information. However, I've noticed that sometimes branch information is missing, and I haven't figured out if this is intentional or not, but that would also effect the correctness of the calculation.

@sawilde
Copy link
Member

sawilde commented May 24, 2017

Because the endoffset is included in the branch information, I should be able to determine the nesting of branches, so I think I should be able to come up with an algorithm with existing information.

That would be handy as our solution is probably a rough approach but until now other than the original requester (@NigelThorne) I was never sure if anyone else cared about the calculated value. Frankly, I forgot we even calculated it which shows what little value it has for me.

However, I've noticed that sometimes branch information is missing, and I haven't figured out if this is intentional or not, but that would also effect the correctness of the calculation.

We do sometimes remove branches that we deem were created by the compiler and don't match up to the code and sometimes the compiler optimises the IL produced and so it doesn't directly correlate to the source code.

If you are looking to analyse a complex code base I highly recommend NDepend as this sort of analysis is their lifeblood. Whether they have support for npath I don't know but they do analyse the source code as well and so the results you get will match up well. I don't think they support OpenCover coverage files directly but there are tools about that will take OpenCover output files and convert them into NCover format, which they do support.

@jswolf19
Copy link
Contributor Author

I was actually comparing the branch information with the IL from a debug build. If necessary I could fall back to the IL, but it would probably be better to be consistent with the branch coverage.

Thanks for the recommendation. One of these days I'll get around to looking at some of the tools out there ^^. Right now, we don't have any sort of analysis going on, and certainly not automated, so it may be a tough sell, which is why I'm looking into what I can do with OS offerings. I am impressed with what you have put out for the community and want to help to make it better ^_^

@ddur
Copy link
Contributor

ddur commented May 29, 2017

If you explicitly code implicit branches, you will find lines/branches/paths, that are possibly not covered by test, but can be filled in with correct or incorrect code at any time. To cover all branches, one needs 8 test cases.

       public static string NPathResultShouldBe4(bool a, bool b, bool c)
        {
            if(a) 
            {
                if(b)
                {
                    if(c)
                    {
                        return "Path 1";
                    } else {
                        ;   // branch, undiscovered path
                    }
                    return "Path 2";
                } else {
                    ;  // branch, undiscovered path
                }
                return "Path 3";
            } else {
                ;  // branch, undiscovered path
            }
            return "Path 4";
        } else {
            ;  // branch, undiscovered path
        }

@ddur
Copy link
Contributor

ddur commented May 29, 2017

Open cover is not source code analysis tool, does not detects code semantics, so following code will also require 8 test cases. Example code is stupid and it is impossible to cover all branches. That is clear sign that example code requires refactoring :)

       public static string NPathResultShouldBe4(bool a)
        {
            if(a) 
            {
                if(a)
                {
                    if(a)
                    {
                        return "Path 1";  // branch, possibly covered path
                    } else {
                        ;   // branch, undiscoverable path
                    }
                    return "Path 2";  // branch, undiscoverable path
                } else {
                    ;  // branch, undiscoverable path
                }
                return "Path 3";  // branch, undiscoverable path
            } else {
                ;  // branch, undiscoverable path
            }
            return "Path 4";  // branch, undiscoverable path
        } else {
            ;  // branch, possibly covered path
        }

@ddur
Copy link
Contributor

ddur commented May 29, 2017

100% coverage of correct code, does not cover all cases of incorrect code.

Correction or in other words: 100% line coverage of code, does not cover all cases of missing code.

@jswolf19
Copy link
Contributor Author

@ddur Unfortunately, the original paper describing the algorithm is behind a pay wall, so I'm just going by the definition I cited above (the number of acyclic execution paths through the method). I'll use your last example. As far as I know, NPath does not try to eliminate paths that are impossible to traverse, so the NPath of the code would still be 4, even though there are only 2 paths that can actually be traversed.

 1       public static string NPathResultShouldBe4(bool a)
 2        {
 3            if(a) //true->4, false->23
 4            {
 5                if(a) //true->6, false->17
 6                {
 7                    if(a) //true->8, false->11
 8                    {
 9                        return "Path 1";  // branch, possibly covered path
10                    } //goto 15
11                    else
12                    {
13                        ;   // branch, undiscoverable path
14                    }
15                    return "Path 2";  // branch, undiscoverable path
16                } //goto 21
17                else
18                {
19                    ;  // branch, undiscoverable path
20                }
21                return "Path 3";  // branch, undiscoverable path
22            } //goto 27
23            else
24            {
25                ;  // branch, undiscoverable path
26            }
27            return "Path 4";  // branch, undiscoverable path
28        }

Because opencover's output xml doesn't have any info about return statements, I'll pretend that a path that reaches a return statement will still continue to the end of the function (this won't produce ideal results in some cases, but then, NPath won't produce the ideal result of 2 in this case, anyway). The four paths are

  • 1-2-3-4-5-6-7-8-9-10-15-16-21-22-27-28
  • 1-2-3-4-5-6-7-11-12-13-14-15-16-21-22-27-28
  • 1-2-3-4-5-17-18-19-20-21-22-27-28
  • 1-2-3-23-24-25-26-27-28

@ddur
Copy link
Contributor

ddur commented May 30, 2017

@jswolf19
Copy link
Contributor Author

jswolf19 commented May 30, 2017

@ddur, yes, I already provided that link above.

As I said, the example provided there is simplistic and happens to be 22 because all paths through the function encounter both conditions/branches. If you want, I will be happy to provide the analysis for that example function to show that the author is correct about there being 4 paths.

If you think/know my reasoning to be wrong, then I would appreciate it if you would take the time to explain where I am mistaken. You could for example show one of the four other paths you claim exist in the example above. If you don't understand something I've said, then let me know and I will try to explain better. If it doesn't matter what I say, then please save us both some time.

@ddur
Copy link
Contributor

ddur commented May 30, 2017

Yes you are right. It would be 8 paths only if branches are not nested.
Sorry, I jumped into conclusion too fast.

public static string NPathResultShouldBe4(bool a, bool b, bool c) {
    if(a) 
    {
        if(b)
        {
            if(c)
            {
                return "Path 1";
            } else {
                return "Path 2";
            }
        } else {
            return "Path 3";
        }
    } else {
        return "Path 4";
    }
}
public static string NPathResultShouldBe4(bool a, bool b, bool c) {
01    if(a) 
02    {
03        if(b)
04        {
05            if(c)
06            {
07                ;
08            } else {
09                ;
10            }
11        } else {
12            ;
13        }
14    } else {
15        ;
16    }
17
}
Input 01: 1, 1, 1 - path: __,__,__,__,__,07,__,__,__,__,__,__,__,__,__,17
Input 02: 1, 1, 0 - path: __,__,__,__,__,__,__,09,__,__,__,__,__,__,__,17
Input 03: 1, 0, 1 - path: __,__,__,__,__,__,__,__,__,__,12,__,__,__,__,17
Input 04: 1, 0, 0 - path: __,__,__,__,__,__,__,__,__,__,12,__,__,__,__,17
Input 05: 0, 1, 1 - path: __,__,__,__,__,__,__,__,__,__,__,__,__,15,__,17
Input 06: 0, 1, 0 - path: __,__,__,__,__,__,__,__,__,__,__,__,__,15,__,17
Input 07: 0, 0, 1 - path: __,__,__,__,__,__,__,__,__,__,__,__,__,15,__,17
Input 08: 0, 0, 0 - path: __,__,__,__,__,__,__,__,__,__,__,__,__,15,__,17
public static string NPathResultShouldBe4(bool a, bool b, bool c) {
01    if(a)
02    {
03        ;
04    } else {
05        ;
06    }
07    if(b)
08    {
09        ;
10    } else {
11        ;
12    }
13    if(c)
14    {
15        ;
16    } else {
17        ;
18    }
19
}
Input 01: 1, 1, 1 - path: __,__,03,__,__,__,__,__,09,__,__,__,__,__,15,__,__,__,19
Input 02: 1, 1, 0 - path: __,__,03,__,__,__,__,__,09,__,__,__,__,__,__,__,17,__,19
Input 03: 1, 0, 1 - path: __,__,03,__,__,__,__,__,__,__,11,__,__,__,15,__,__,__,19
Input 04: 1, 0, 0 - path: __,__,03,__,__,__,__,__,__,__,11,__,__,__,__,__,17,__,19
Input 05: 0, 1, 1 - path: __,__,__,__,05,__,__,__,09,__,__,__,__,__,15,__,__,__,19
Input 06: 0, 1, 0 - path: __,__,__,__,05,__,__,__,09,__,__,__,__,__,__,__,17,__,19
Input 07: 0, 0, 1 - path: __,__,__,__,05,__,__,__,__,__,11,__,__,__,15,__,__,__,19
Input 08: 0, 0, 0 - path: __,__,__,__,05,__,__,__,__,__,11,__,__,__,__,__,17,__,19

@jswolf19
Copy link
Contributor Author

@ddur No worries. I'm sorry that it seems that we got off on the wrong foot. I'm sure I could have been better with how I created the issue, as well ^^

@ddur
Copy link
Contributor

ddur commented May 31, 2017

Fixing would require a kind of semantic analysis of source-code.
There is not only if statement, we have switch and boolean expressions and loop branches.

MS IL is flat, tracking execution paths and branches would not be easy.

I think that fixing that goes far beyond intended usage of code-coverage tool.

@jswolf19
Copy link
Contributor Author

There's already information on branching with the BranchPoints. It would likely need info on unconditional jumps (br and br.s), and for best analysis, info on early exiting via ret, throw, and rethrow (probably jmp, too). So you may be right that it might be better to separate it in some way.

Maybe allowing for extensions to add attributes/child elements to the report xml would be a better long-term route to take.

@ddur
Copy link
Contributor

ddur commented May 31, 2017

Branches are available. Problem is to detect nesting.

@jswolf19
Copy link
Contributor Author

jswolf19 commented Jun 1, 2017

I may not have spent enough time looking, but it looked like the offsetend in the report told you where the branch would jump to. If that's the case, one path will go to the next offset after the branch, and the other path would go to offsetend and continue from there. Thus, a nested branch would have an offset between the next offset and offsetend and would only be in one of the two paths.

I haven't spent a lot of time looking though, so there may be some nesting that would be missed. If that's the case, then the question is what to do: leave it as it is, get rid of it, improve what you can easily, or try to make it perfect. I do think it can be improved some with the current information, but I'm not going to push the issue if you don't want make the change.

@sawilde
Copy link
Member

sawilde commented Jun 2, 2017

You would probably need to build a node graph of some form based on the IL and the branches and then analyse it.

@jswolf19
Copy link
Contributor Author

jswolf19 commented Jun 2, 2017

Yeah, that's basically what I was imagining: a depth-first search from the entry point on a directed graph until you hit an exit(count one path) or a node already on the current path(a cycle, not counted). I also wanted to see what kind of results I'd get working from the BranchPoints and SequencePoints for consistency, since some of the branches are dropped. It may also be possible/better to reuse the logic used to drop compiler-generated branches, though.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

3 participants