Code Coverage Is Not a Quality Metric

A program with high test coverage, measured as a percentage, has had more of its source code executed during testing, which suggests it has a lower chance of containing undetected software bugs than a program with low test coverage.

“Let’s make it clear, then: don’t set goals for code coverage. You may think that it could make your code base better, but asking developers to reach a certain code coverage goal will only make your code worse.”

— Mark Seemann

Why is it bad to use high code coverage as a goal?

In the snippet below we have a function divide that accepts two float arguments, x and y, and divides the first by the second. Note that we don’t have any kind of guard in our code.

    float divide(float x, float y) {
      return x / y;
    }

Along with the divide function we provide a simple unit test that makes sure the function does its job. With this test, we have 100% code coverage. That means our code is bulletproof, right?

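    // Assumes JUnit 5 and AssertJ, as suggested by the annotations and assertions used in this article.
    import static org.assertj.core.api.Assertions.assertThat;
    import org.junit.jupiter.api.Test;

    @Test
    public void divide_with_valid_arguments() {
      assertThat(new Calculator().divide(10, 2)).isEqualTo(5);
    }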

Nope. We have 100% coverage, that’s a fact. But the code itself is not correct. Also, we’re testing only one scenario, a positive and limited one. We should always test for failure. In this particular case, what happens if we try to divide by zero? We should check whether y is equal to zero and throw a proper exception.

    float divide(float x, float y) {
      if (y == 0) {
        throw new ArithmeticException("Can't divide by zero.");
      }
      return x / y;
    }
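One detail makes the explicit guard more than a nicety: in Java, floating-point division by zero does not throw. It quietly produces Infinity or NaN, while integer division by zero throws ArithmeticException. The plain-Java sketch below illustrates this; the class name FloatDivisionDemo and the method unguardedDivide are made up for the illustration and are not part of the article's Calculator.

```java
public class FloatDivisionDemo {

    // Same body as the unguarded divide shown earlier (hypothetical name).
    static float unguardedDivide(float x, float y) {
        return x / y;
    }

    public static void main(String[] args) {
        // Floating-point division by zero does not throw in Java.
        System.out.println(unguardedDivide(10f, 0f));  // Infinity
        System.out.println(unguardedDivide(-10f, 0f)); // -Infinity
        System.out.println(unguardedDivide(0f, 0f));   // NaN

        // Integer division by zero, in contrast, throws.
        try {
            int zero = 0;
            System.out.println(10 / zero);
        } catch (ArithmeticException e) {
            System.out.println("int division threw ArithmeticException");
        }
    }
}
```

Without the guard, a caller would receive Infinity or NaN and carry on, so a failure-path test has nothing to catch unless the function checks y itself.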

What’s the problem with adding decision branches? Each new branch lowers coverage until a test exercises it, and there isn’t always time to write that test. Good code coverage may be a sign that we have a solid test suite, but if we use it as a mandatory target, the codebase will eventually suffer.

People tend to take shortcuts. Given two possible choices, we usually pick the easier one. If the coverage value is part of the merge process, developers will adapt the code to meet that requirement. A stronger test suite, one that still meets the 100% code coverage criterion, would look like the snippet below.

    @Test
    public void divide_with_valid_arguments() {
      assertThat(new Calculator().divide(10, 2)).isEqualTo(5);
    }

    @Test
    public void divide_with_invalid_arguments_should_throw_exception() {
      assertThatThrownBy(() -> new Calculator().divide(10, 0))
          .isInstanceOf(ArithmeticException.class)
          .hasMessageContaining("Can't divide by zero.");
    }

For a simple division function, we can achieve 100% code coverage with these two tests, but are we really done with the testing? Do our tests cover enough scenarios to make us feel confident about our code? When testing, the hardest question is when to stop. For some, a shiny 100% code coverage is the answer to that question. It is important to look for quality factors other than code coverage. Check whether each test case is useful and is intended to find failures in the system. If you’re looking only at code coverage as a quality criterion, the test below would do the job.

    @Test
    public void divide_with_valid_arguments() {
      assertThat(new Calculator().divide(10, 2)).isNotZero();
    }
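To see why such a test is weak, consider a deliberately broken implementation. The sketch below is plain Java with no test framework; faultyDivide, weakCheck, and strongCheck are hypothetical names that stand in for a buggy divide, the isNotZero() assertion, and the isEqualTo(5) assertion. Both checks execute every line of the function, so they yield the same coverage, yet only one of them notices the fault.

```java
public class WeakAssertionDemo {

    // Deliberately faulty implementation: multiplies instead of dividing.
    static float faultyDivide(float x, float y) {
        return x * y;
    }

    // Stand-in for isNotZero(): runs the code but verifies almost nothing.
    static boolean weakCheck(float result) {
        return result != 0f;
    }

    // Stand-in for isEqualTo(5): pins the expected value.
    static boolean strongCheck(float result, float expected) {
        return result == expected;
    }

    public static void main(String[] args) {
        float result = faultyDivide(10f, 2f); // 20.0 instead of 5.0
        System.out.println("weak check passes:   " + weakCheck(result));       // true
        System.out.println("strong check passes: " + strongCheck(result, 5f)); // false
    }
}
```

The weak check stays green on the broken code, which is exactly the situation a coverage number cannot reveal.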

“I don’t know if they did code coverage analysis on this project, but of course you can do this and have 100% code coverage - which is one reason why you have to be careful on interpreting code coverage data.”

— Martin Fowler

Amplifying the scenarios with parameterized tests

Parameterized tests are a data-driven testing technique that uses test inputs and expected outcomes as data, normally in a tabular format, so that a single driver script can execute all of the designed test cases. Data-driven testing fits well when two or more test cases require the same instructions but different inputs and different expected outcomes.
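As a rough sketch of the "single driver over tabular data" idea, here is a framework-free version in plain Java. The class name TableDrivenDemo, the CASES table, and the runAll method are illustrative assumptions, and the rows are arbitrary values chosen so that the float arithmetic is exact.

```java
public class TableDrivenDemo {

    // The guarded divide from earlier in the article.
    static float divide(float x, float y) {
        if (y == 0) {
            throw new ArithmeticException("Can't divide by zero.");
        }
        return x / y;
    }

    // Each row holds: x, y, expected quotient.
    static final float[][] CASES = {
        {8f, 2f, 4f},
        {-9f, 3f, -3f},
        {1f, 4f, 0.25f},
    };

    // The single driver: runs every row and returns the number of failing rows.
    static int runAll() {
        int failures = 0;
        for (float[] row : CASES) {
            float actual = divide(row[0], row[1]);
            if (actual != row[2]) {
                failures++;
                System.out.println("FAIL: " + row[0] + " / " + row[1]
                        + " = " + actual + ", expected " + row[2]);
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        System.out.println("failures: " + runAll());
    }
}
```

Adding a scenario means adding a row, not another test method. A test framework's parameterized-test support gives the same loop, plus a separate report entry per row.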

A useful technique for choosing test inputs is equivalence class partitioning, where we divide all possible inputs into a finite number of equivalence classes. Once they’re set, we may assume that:

- the program behaves analogously for inputs in the same class;
- one test with a representative value from a class is sufficient;
- if the representative detects a defect, then other class members would detect the same defect.

For x and y we can divide the inputs into four partitions: {1, 20}, {1.0, 20.0}, {-20, -1}, and {-20.0, -1.0}. By combining equivalence class partitioning and parameterized tests we can write a single test with multiple scenarios, like the snippet below.

    @ParameterizedTest
    @CsvSource({
      "10, 5, 2",
      "-10, -5, 2",
      "10.5, 5.25, 2",
      "-5.0, -10.0, 0.5"
    })
    public void divide_with_valid_fields(float x, float y, float z) {
      assertThat(new Calculator().divide(x, y)).isEqualTo(z);
    }

What is the coverage of this parameterized test? 100%. The same coverage as the test below, a test that is not looking for failures in the system.

    @Test
    public void divide_with_valid_arguments() {
      assertThat(new Calculator().divide(10, 2)).isNotZero();
    }

Parameterized tests contribute to a much more solid test suite, since we’re testing multiple scenarios with some edge cases, even though they don’t increase code coverage.

Summary

“Designing your initial test suite to achieve 100% coverage is an even worse idea. It’s a sure way to create a test suite weak at finding those all-important faults of omission.”

— Brian Marick

High code coverage is not directly related to code quality and should be used neither as a key metric nor as a goal. Look at other metrics, expand the test scenarios, and don’t stop when you reach that shiny 100% mark.
