Structure-Based or White-Box Testing Techniques


Using structure-based techniques to measure coverage and design tests

Structure-based techniques serve two purposes: 
  • test coverage measurement
  • structural test case design. 
They are often used first to assess the amount of testing performed by tests derived from specification-based techniques, i.e. to assess coverage. They are then used to design additional tests with the aim of increasing the test coverage.

Structure-based test design techniques are a good way of generating additional test cases that are different from existing tests. They can help ensure more breadth of testing, in the sense that test cases that achieve 100% coverage in any measure will be exercising all parts of the software from the point of view of the items being covered.


What is test coverage?

Test coverage measures in some specific way the amount of testing performed by a set of tests. Wherever we can count things and can tell whether or not each of those things has been tested by some test, we can measure coverage. The basic coverage measure is


             Number of coverage items exercised
Coverage = ----------------------------------------- x 100%
             Total number of coverage items


where the 'coverage item' is whatever we have been able to count and see whether a test has exercised or used this item. 
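The basic measure above can be computed directly, as in this small sketch (the function name is ours, for illustration only):

```python
def coverage_percent(items_exercised, total_items):
    """Basic coverage measure: exercised coverage items / total items x 100%."""
    if total_items == 0:
        raise ValueError("no coverage items to measure")
    return len(items_exercised) / total_items * 100

# e.g. tests exercised 5 of 6 coverage items:
print(round(coverage_percent({1, 2, 3, 4, 6}, 6), 1))  # 83.3
```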
There is a danger in using a coverage measure: 100% coverage does not mean 100% tested! Coverage techniques measure only one dimension of a multi-dimensional concept. 

Two different test cases may achieve exactly the same coverage but the input data of one may find an error that the input data of the other doesn't.

One drawback of code coverage measurement is that it measures coverage of what has been written, i.e. the code itself; it cannot say anything about the software that has not been written. If a specified function has not been implemented, specification-based testing techniques will reveal this. 

If a function was omitted from the specification, then experience-based techniques may find it. But structure-based techniques can only look at a structure which is already there. 


Types of coverage

Test coverage can be measured based on a number of different structural elements in a system or component. Coverage can be measured at component-testing level, integration-testing level, or at system- or acceptance-testing level. 

For example, at system or acceptance level, the coverage items may be requirements, menu options, screens, or typical business transactions. Other coverage measures include things such as database structural elements (records, fields and sub-fields) and files. 

It is worth checking for any new tools, as the test tool market develops quite rapidly. 

At integration level, we could measure coverage of interfaces or specific interactions that have been tested. The call coverage of module, object or procedure calls can also be measured (and is supported by tools to some extent).


We can measure coverage for each of the specification-based techniques as well:
  • EP: percentage of equivalence partitions exercised (we could measure valid and invalid partition coverage separately if this makes sense);
  • BVA: percentage of boundaries exercised (we could also separate valid and invalid boundaries if we wished);
  • Decision tables: percentage of business rules or decision table columns tested;
  • State transition testing: there are a number of possible coverage measures:
    • Percentage of states visited
    • Percentage of (valid) transitions exercised (this is known as Chow's 0-switch coverage)
    • Percentage of pairs of valid transitions exercised ('transition pairs' or Chow's 1-switch coverage) - and longer series of transitions, such as transition triples, quadruples, etc.
    • Percentage of invalid transitions exercised (from the state table)
The coverage measures for specification-based techniques would apply at whichever test level the technique has been used (e.g. system or component level).
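As a hedged sketch of the state transition measures above, here is how state coverage and 0-switch (transition) coverage might be computed for a small state table (the machine, its states and events are a hypothetical example of our own):

```python
# Hypothetical ATM-like state table: (state, event) -> next state
transitions = {
    ("idle", "insert_card"): "waiting_pin",
    ("waiting_pin", "pin_ok"): "menu",
    ("waiting_pin", "pin_bad"): "idle",
    ("menu", "eject"): "idle",
}
states = {"idle", "waiting_pin", "menu"}

# Suppose our tests reached all three states but exercised only two transitions
visited = {"idle", "waiting_pin", "menu"}
exercised = {("idle", "insert_card"), ("waiting_pin", "pin_ok")}

state_coverage = len(visited & states) / len(states) * 100
switch0_coverage = len(exercised & set(transitions)) / len(transitions) * 100
print(state_coverage, switch0_coverage)  # 100.0 50.0
```

So a test run can achieve 100% state coverage while exercising only half of the valid transitions, which is why the two measures are counted separately.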

When coverage is discussed by business analysts, system testers or users, it most likely refers to the percentage of requirements that have been tested by a set of tests. This may be measured by a tool such as a requirements management tool or a test management tool.

However, when coverage is discussed by programmers, it most likely refers to the coverage of code, where the structural elements can be identified using a tool, since there is good tool support for measuring code coverage. We will cover statement and decision coverage shortly.

Statements and decision outcomes are both structures that can be measured in code and there is good tool support for these coverage measures. Code coverage is normally done in component and component integration testing - if it is done at all. 

If someone claims to have achieved code coverage, it is important to establish exactly what elements of the code have been covered, as statement coverage (often what is meant) is significantly weaker than decision coverage or some of the other code-coverage measures.


How to measure coverage


For most practical purposes, coverage measurement is something that requires tool support. However, knowledge of the steps typically taken to measure coverage is useful in understanding the relative merits of each technique. Our example assumes an intrusive coverage measurement tool that alters the code by inserting instrumentation:

  1. Decide on the structural element to be used, i.e. the coverage items to be counted.
  2. Count the structural elements or items.
  3. Instrument the code.
  4. Run the tests for which coverage measurement is required.
  5. Using the output from the instrumentation, determine the percentage of elements or items exercised.


Instrumenting the code (step 3) implies inserting code alongside each structural element in order to record when that structural element has been exercised. 
Determining the actual coverage measure (step 5) is then a matter of analyzing the recorded information.
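The steps above can be sketched in Python using its built-in trace hook as a stand-in for an intrusive instrumentation tool: instead of editing the source, we record each line of the function under test as it executes. This is a simplified illustration (the function names are ours; real coverage tools are considerably more sophisticated):

```python
import sys

def run_with_line_coverage(func, test_inputs):
    """Steps 3-5 in miniature: 'instrument' func via the trace hook,
    run the tests, and report which of func's lines were exercised."""
    executed = set()

    def tracer(frame, event, arg):
        # Stand-in for step 3: record line events inside func instead of
        # physically inserting probe statements into the source.
        if event == "line" and frame.f_code is func.__code__:
            executed.add(frame.f_lineno)
        return tracer

    sys.settrace(tracer)             # step 3: switch instrumentation on
    try:
        for args in test_inputs:     # step 4: run the tests
            func(*args)
    finally:
        sys.settrace(None)
    return executed                  # step 5: analyze the recorded lines

def sample(a, b):
    c = a + 2 * b
    if c > 50:
        return "large C"
    return "small C"

partial = run_with_line_coverage(sample, [(2, 3)])         # True branch never taken
full = run_with_line_coverage(sample, [(2, 3), (20, 25)])  # both outcomes exercised
print(partial < full)  # True: the second test set reaches strictly more lines
```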

Coverage measurement of code is best done using tools and there are a number of such tools on the market. These tools can help to increase quality and productivity of testing. 

They increase quality by ensuring that more structural aspects are tested, so defects on those structural paths can be found. They increase productivity and efficiency by highlighting tests that may be redundant, i.e. testing the
same structure as other tests.

In common with all structure-based testing techniques, code coverage techniques are best used on areas of software code where more thorough testing is required. Safety-critical code, code that is vital to the correct operation of a system, and complex pieces of code are all examples of where structure-based techniques are particularly worth applying. 

For example, DO-178B [RTCA] requires structural coverage for certain levels of safety-critical avionics software. Structural coverage techniques should always be used in addition to specification-based and experience-based testing techniques, rather than as an alternative to them.


Structure-based test case design 

If you are aiming for a given level of coverage (say 95%) but you have not reached your target (e.g. you only have 87% so far), then additional test cases can be designed with the aim of exercising some or all of the structural elements not yet reached. This is structure-based test design. 

These new tests are then run through the instrumented code and a new coverage measure is calculated. This is repeated until the required coverage measure is achieved.

Ideally all the tests ought to be run again on the un-instrumented code. We will look at some examples of structure-based coverage and test design for statement and decision testing below.


Statement coverage and statement testing


Statement coverage is calculated by:

                         Number of statements exercised
Statement coverage = ------------------------------------ x 100%
                         Total number of statements

Studies and experience in the industry have indicated that what is considered reasonably thorough black-box testing may actually achieve only 60% to 75% statement coverage. 

Typical ad-hoc testing is likely to be around 30% - this leaves 70% of the statements untested.

Different coverage tools may work in slightly different ways, so they may give different coverage figures for the same set of tests on the same code, although at 100% coverage they should be the same.


Example 1 code sample

READ A
READ B
IF A > B THEN
     C = 0
ENDIF


To achieve 100% statement coverage of this code segment just one test case is required, one which ensures that variable A contains a value that is greater
than the value of variable B, for example, A = 12 and B = 10.
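As a sketch, translating the Example 1 pseudocode into Python (our own rendering) shows that this single test executes every statement:

```python
def example1(a, b):
    c = None      # READ A and READ B become the parameters a and b
    if a > b:
        c = 0     # this statement is only reached when A > B
    return c

print(example1(12, 10))  # 0 - the assignment statement was exercised
```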

Note that here we are doing structural test design first, since we are choosing our input values in order to ensure statement coverage.

Let's look at an example where we measure coverage first. In order to simplify the example, we will regard each line as a statement. 

A statement may be on a single line, or it may be spread over several lines. One line may contain more than one statement, just one statement, or only part of a statement. Some statements can contain other statements inside them.

Example 2 code sample: we have two READ statements, one assignment statement, and then one IF statement spread over three lines; the IF statement contains another statement (PRINT) as part of it.


1 READ A
2 READ B
3 C = A + 2*B
4 IF C > 50 THEN
5      PRINT "Large C"
6 ENDIF


Although it isn't completely correct to do so, we have numbered each line and will regard each line as a statement, using the numbered lines to illustrate the principle of coverage of statements.

Let's analyze the coverage of a set of tests on our six-statement program:

TEST SET 1 
  • Test 1_1: A = 2, B = 3 
  • Test 1_2: A = 0, B = 25 
  • Test 1_3: A = 47, B = 1


Which statements have we covered?
• In Test 1_1, the value of C will be 8, so we will cover the statements on lines 1 to 4 and line 6.
• In Test 1_2, the value of C will be 50, so we will cover exactly the same statements as Test 1_1.
• In Test 1_3, the value of C will be 49, so again we will cover the same statements.

Since we have covered five out of six statements, we have 83% statement coverage (with three tests).
What test would we need in order to cover statement 5, the one statement that we haven't exercised yet?

How about this one:
Test 1_4: A = 20, B = 25

This time the value of C is 70, so we will print 'Large C' and we will have exercised all six of the statements, so now statement coverage = 100%. 

Notice that we measured coverage first, and then designed a test to cover the statement that we had not yet covered.
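Replaying the analysis above on a Python rendering of code sample 2 (our translation) confirms it:

```python
def example2(a, b):
    c = a + 2 * b
    if c > 50:
        return f"Large C ({c})"  # statement 5, the PRINT
    return None                  # falls through the ENDIF

# Tests 1_1 to 1_3 give C = 8, 50 and 49, never reaching statement 5:
print([example2(2, 3), example2(0, 25), example2(47, 1)])  # [None, None, None]
# Test 1_4 gives C = 70, exercising statement 5 for 100% statement coverage:
print(example2(20, 25))  # Large C (70)
```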

Note that Test 1_4 on its own is more effective (towards our goal of achieving 100% statement coverage) than the first three tests together. Just taking Test 1_4 on its own is also more efficient than the set of four tests, since it has used only one test instead of four. 

Being more effective and more efficient is the mark of a good test technique.


Decision coverage and decision testing

A decision is an IF statement, a loop control statement (e.g. DO-WHILE or REPEAT-UNTIL), or a CASE statement, where there are two or more possible exits or outcomes from the statement. 

With an IF statement,  the exit can either be TRUE or FALSE, depending on the value of the logical condition that comes after IF.
With a loop control statement, the outcome is either to perform the code within the loop or not - again a True or False exit. 


Decision coverage is calculated by:

                      Number of decision outcomes exercised
Decision coverage = ------------------------------------------ x 100%
                      Total number of decision outcomes


What feels like reasonably thorough functional testing may achieve only 40% to 60% decision coverage. 
Typical ad-hoc testing may cover only 20% of the decisions, leaving 80% of the possible outcomes untested.

So even if your testing seems reasonably thorough from a functional or specification-based perspective, you may have exercised only around half of the decision outcomes.

Decision coverage is stronger than statement coverage. It 'subsumes' statement coverage - this means that 100% decision coverage always guarantees 100% statement coverage. Any stronger coverage measure may require more test cases to achieve 100% coverage. 

For example, consider code sample 1 again.
We saw earlier that just one test case (A = 12, B = 10) was required to achieve 100% statement coverage.

However, decision coverage requires each decision to have had both a True and False outcome. Therefore, to achieve 100% decision coverage, a second test case is necessary where A is less than or equal to B.

This will ensure that the decision statement 'IF A > B' has a False outcome. So one test is sufficient for 100% statement coverage, but two tests are needed for 100% decision coverage. 

100% decision coverage guarantees 100% statement coverage, but not the other way around!
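A minimal sketch (the helper name is ours) of the two outcomes of the decision 'IF A > B':

```python
def decision_outcomes(tests):
    """Which outcomes of the decision 'A > B' does a test set exercise?"""
    return {a > b for a, b in tests}

print(decision_outcomes([(12, 10)]))            # {True} - 50% decision coverage
print(decision_outcomes([(12, 10), (10, 12)]))  # {True, False} - 100%
```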

Example 3 code sample 

1 READ A
2 READ B
3 C = A - 2*B
4 IF C < 0 THEN
5     PRINT "C negative"
6 ENDIF


TEST SET 2
Test 2_1: A = 20, B=15

[Control flow diagram of code sample 3]


Which decision outcomes have we exercised with our test? The value of C is -10, so the condition 'C < 0' is True, so we will print 'C negative' and we have exercised the True outcome from that decision statement.

But we have not exercised the decision outcome of False. What other test would we need to exercise the False outcome and to achieve 100% decision coverage? 

Before we answer that question, let's have a look at another way to represent this code. Sometimes the decision structure is easier to see in a control flow diagram.
The dotted line shows where Test 2_1 has gone and clearly shows that we haven't yet had a test that takes the False exit from the IF statement.

Let's modify our existing test set by adding another test:

TEST SET 2
Test 2_1: A = 20, B = 15
Test 2_2: A = 10, B = 2

This now covers both of the decision outcomes, True (with Test 2_1) and False (with Test 2_2). If we were to draw the path taken by Test 2_2, it would be a straight line from the read statement down the False exit and through the ENDIF. 

Note that we could have chosen other numbers to achieve either the True or
False outcomes.
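Code sample 3 rendered in Python (our translation), replaying TEST SET 2:

```python
def example3(a, b):
    c = a - 2 * b
    if c < 0:
        return "C negative"  # True exit of the decision
    return None              # False exit, straight through the ENDIF

print(example3(20, 15))  # C negative (Test 2_1: C = -10, True outcome)
print(example3(10, 2))   # None       (Test 2_2: C = 6, False outcome)
```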


Other structure-based techniques

There are other structure-based techniques that can be used to achieve testing to different degrees of thoroughness.

Some techniques are stronger (they require more tests to achieve 100% coverage and therefore have a greater chance of detecting defects) and others are weaker.

For example, branch coverage is closely related to decision coverage and at 100% coverage they give exactly the same results. Decision coverage measures the coverage of conditional branches; branch coverage measures the coverage of both conditional and unconditional branches. 


Other control-flow code-coverage measures include:
  • linear code sequence and jump (LCSAJ) coverage
  • condition coverage
  • multiple condition coverage (also known as condition combination coverage)
  • condition determination coverage (also known as multiple condition decision coverage or modified condition decision coverage, MCDC)
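As a hedged illustration of why these measures differ in strength, consider a single decision built from the compound condition 'A AND B'. The two tests below achieve 100% condition coverage (each individual condition takes both truth values) yet only 50% decision coverage, and far fewer combinations than multiple condition coverage would demand:

```python
from itertools import product

def decision(a, b):
    return a and b  # one decision built from two individual conditions

tests = [(True, False), (False, True)]

outcomes = {decision(a, b) for a, b in tests}       # decision coverage
a_values = {a for a, _ in tests}                    # condition coverage for A
b_values = {b for _, b in tests}                    # condition coverage for B
all_combos = set(product([True, False], repeat=2))  # multiple condition coverage

print(outcomes)                                  # {False}: the decision is never True
print(a_values == {True, False} == b_values)     # True: 100% condition coverage
print(set(tests) == all_combos)                  # False: only 2 of 4 combinations
```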


Another popular, but often misunderstood, code-coverage measure is path coverage. Sometimes any structure-based technique is called 'path testing'.

However, strictly speaking, for any code that contains a loop, path coverage is impossible since a path that travels round the loop three times is different from the path that travels round the same loop four times. This is true even if the rest of the paths are identical. 

So if it is possible to travel round the loop an unlimited number of times, then there are an unlimited number of paths through that piece of code. For this reason it is more correct to talk about 'independent path segment coverage', though the shorter term 'path coverage' is frequently used.


