Halstead Metrics

Halstead Metrics

CodeSonar computes various Halstead metrics, as defined by Maurice H. Halstead in Elements of Software Science (1977). Halstead metrics are based on definitions of operators and operands. These are language dependent; CodeSonar uses the following definitions when computing Halstead metrics for C and C++. (Note that there is no "official" definition of the Halstead operators and operands for C and C++, so other Halstead Measure tools may use different definitions for one or both.)

  • operators: arithmetic('+'), equality/inequality('<'), assignment('+='), shift ('>>'), logical ('&&'), and unary ('*') operators. Reserved words for specifying control points ('while') and control infrastructure ('else'), type ('double'), and storage ('extern'). Function calls, array references, etc.
  • operands: identifiers, literals, labels, and function names. Each literal is treated as a distinct operand.

CodeSonar ships with an assortment of metrics at function, file, and analysis granularity. An 'x' in the following table indicates that the corresponding metric is computed at the specified granularity.

Click any metric name to learn more about the metric.

Underlying the various Halstead measures are counts of operators and operands.

METRIC GRANULARITIES
Project File Procedure
Total Operators »The total number of operators present. N1 x
Total Operands »The total number of operands present. N2 x
Distinct Operators »The number of distinct operators present. n1 x
Distinct Operands »The number of distinct operands present. As noted above, every constant is treated as distinct. n2 x

Once N1, N2, n1, and n2, have been determined, the remaining metrics can be computed.

Halstead Program Length »N = N1 + N2
Halstead Program Length describes the size of the abstracted program obtained by removing everything except operators and operands from the original program. It has some similarities with the Lines with Code metric, but also some important differences. A function's Lines with Code measure can be altered simply by adding or removing line breaks; Halstead Length is not sensitive to this manipulation. Similarly, Lines with Code does not tell us anything about how complex the lines are: a line containing an extremely complex expression is not treated any differently to one that is very simple. Halstead Length gives a better accounting of the overall statement complexity.
N x
Halstead Program Volume »V = N * log2(n1 + n2)
This models the number of bits required to store the abstracted program of length N if the operators and operands are encoded as binary strings of uniform (and potentially nonintegral) length.
V x
Halstead Program Level »L = (2/n1)*(n2/N2)
L describes the ratio between the volume V of the current program and the volume V* of the "most compact" implementation of the same algorithm (as defined by Halstead). A longer implementation of an algorithm will have a lower program level than a shorter implementation of the same algorithm.
L x
Halstead Program Difficulty »D = (n1/2) * (N2/n2) = 1/L
Difficulty is the inverse of Level: a longer implementation of an algorithm will have a higher difficulty than a shorter implementation of the same algorithm. Difficulty increases as the number of unique operators increases, and as the average number of occurrences per operand increases.
D x
Halstead Programming Effort »E = D * V
Halstead's formulation for the effort required to author (or understand) a program characterizes effort as proportional to both difficulty and volume.
E x
Halstead Programming Time »T = E/18 seconds
Programming time is considered to be directly proportional to programming effort.
T x
Halstead Intelligent Content »I = (V / D)
Halstead intended this metric to be a language-independent measure of algorithmic complexity.
I x

Custom Halstead Metrics

Are there other Halstead-style metrics that you'd like to compute?
CodeSonar's static analysis engine provides a mechanism for you to define derived metrics – mathematical manipulations of one or more already-defined metrics – and have the analysis compute their values.

Do you have different sets of operators and operands that you need to count?
CodeSonar's API provides access to the various code representations generated by its static source code analysis, including the Abstract Syntax Trees (ASTs). By combining AST-based operator and operand counting with the API's functionality for defining arbitrary metrics, you can define custom metrics that match your needs precisely.

Example

We will look at the Halstead metric values for the following example functions.

#include <stdio.h>

int twice (int a){ return 2*a; }

int f_switch(int a){
    switch (a){
      case 0 :
        printf("zero");
      case 10 :
        printf("an even number");
        break;
      default:
        printf("I don't know that one");
        return -1;
        break;
    }
    return 0;
}



int max (int a, int b){
    if (a > b) return a;
    return b;
}

int f_nestedif(int a, int b){
    if (a*a < b*b)
        if (a > 60) return max(a,b);
    else if (b > 60) return 60;
    return b;
}

int f_while(int a){
    int i = 2;
    if (a < 0) return 0;
    while (a > 0){
        a -= 100;
        i *= 2;
    }
    return i;
}

The Halstead metric values for these functions are as follows.

Function N1 N2 n1 n2 N V L D E T I
twice() 5 4 4 3 9 25.3 0.3 2.7 67.4 3.7s 9.5
max() 9 7 5 3 16 48 0.2 5.8 280 15.6s 8.2
f_nestedif() 19 15 9 6 34 132.8 3.6 11.25 1494.4 83.0s 11.8
f_while() 17 14 10 9 31 131.7 6.4 7.8 1024.2 56.9s 16.9
f_switch() 21 10 8 9 31 126.7 7.2 4.4 563.1 31.2s 28.5

Where the operators and operands are as shown in the following table.

Function Operators Operands
twice() int (×2), return, *, semicolon twice, a (×2), 2
max() int (×3), if, >, return (×2), semicolon (×2) max, a (×3), b (×3)
f_nestedif() int (×3), if (×3), * (×2), > (×2), <, return (×3), max(), else, semicolon (×3) f_nestedif, a (×5), b (×6), 60,60,60
f_while() int (×3), =, if, <, >, while, return (×2), -=, *=, semicolon (×5) f_while, a (×4), i (×3), 2,2, 0,0,0, 100
f_switch() int (×2), switch, switch-case (×3), -, printf (×3), break (×2), return (×2), semicolon (×7) f_switch, a (×2), 0,0, 10, 1, "zero", "an even...", "I don't..."