tex/thesis/introduction/introduction.tex


1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129

\chapter{Introduction}

\section{Motivation}

%Main points:
%  - software is taking on more important roles 
%  - software bugs are dangerous
%  - software bugs are expensive
%  - we have to minimise the number of bugs we have

In today's increasingly technological world, software bugs can have
significant consequences, ranging from the relatively minor
frustration caused to average users, to causing deaths. There have
been several incidents in recent years in which a bug in a software
system has led directly to injury or death. X-ray machines which
provide too high a dose of radiation, cars which continue to
accelerate against the driver's wishes and other dangerous situations
have all come about as a direct result of software bugs.

Software bugs also have significant financial costs, with Hailpern and
Santhanam static that debugging, verification and testing can easily
comprise 50\% to 75\% of the total development cost of a
system~\cite{Hailpern01softwaredebugging}.

In order to limit the number of bugs in a software system, it has become commonplace
to employ sophisticated testing approaches. Different testing 
strategies aim to ensure that a software system is bug-free. These
testing strategies, although extremely useful, are inherently incapable of
\emph{guaranteeing} that software is free of bugs. This is especially
a problem in critical software systems, such as those found in
aeroplanes or large industrial machinery, where software failures can
have catastrophic consequences.

In order to provide a guarantee that software is free of bugs we
employ \emph{software verification}, that analyses software without
running it.  While it is possible to write programs amenable for
software verification, it is still a difficult and expensive process
to perform verification on a software system.

There has been a continuous stream of research in
program analysis for finding bugs starting from the late 70s to the
present~\cite{CousotCousot77-1,EasyChair:117}. This work is broadly
categorised as \emph{static program analysis}, and this thesis contributes 
to this research area.

Due to Rice's Theorem~\cite{Rice} we know that it is undecidable to
statically infer properties of programs. In order to overcome this
limitation, many approaches to static analysis have been devised
including the framework of abstract interpretation introduced by
Cousot and Cousot~\cite{CousotCousot77-1}. To overcome the
undecidability issues of static program analysis, a program is
represented in an abstract domain which becomes decidable. A
connection (also known as homomorphism) between the abstract domain
and the concrete domain (actual semantics of the program) is
established.  This connection is called a Galois-connection and
provides a proof framework to show that the static program analysis is
a ``sound'' approximation of the semantics.  The precision of the
analysis depends on the granularity of the approximation. In general,
more precise program analyses result in higher complexity classes.

Two abstract domains which are particularly well-known and useful are
those of \emph{interval} and \emph{zones}. The interval abstract
domain consists of restricting variables in the range $[a,b]$. This
domain can be useful for avoiding such program errors as division by
zero errors and out of bounds memory accesses. The abstract domain of
zones allows us to further place a bound on the difference between
variables $x - y \le c$. This allows us to be more precise in our
analysis, thereby avoiding false-positives, at the cost of speed.

The process of determining appropriate bounds for program variables
has traditionally been performed by performing several Kleene
iterations, followed by a \emph{widening} and \emph{narrowing}. These
widening and narrowing operators ensure termination, but  at
the cost of precision. As an alternative, we employ in this work  
a game-theoretic approach to compute the abstract
semantics of a program~\cite{EasyChair:117}. This approach is less
general than widening and narrowing, but performs precise abstract
interpretation over zones.


\section{Contribution}

In this thesis we present an implementation of the strategy-iteration
based static analyser presented by Gawlitza et
al.~\cite{EasyChair:117}. Our implementation has several enhancements
which significantly improve the practical performance of the analyser
on real-world programs.

We present one major theoretical contribution:
\begin{enumerate}
\item
  We present a demand-driven strategy improvement algorithm for
  solving monotonic, expansive equation systems involving $\min$ and
  $\max$ operators.
\end{enumerate}

We present several systems contributions:
\begin{enumerate}
\item
  We develop a solver for monotonic, expansive equation systems based
  on the work of Gawlitza et al.~\cite{EasyChair:117} with several
  improvements.
\item
  We analyse the performance of our improved solver on a set of
  equation systems to demonstrate the effect of our improvements.
\item
  We integrate our solver into the LLVM/Clang
  framework~\cite{LLVM,LLVM-paper,Clang} to perform analysis over Zones~\cite{mine:padoII}.
\item
  We analyse the performance of our LLVM analysis on various program
  inputs.
\end{enumerate}

\section{Structure of the thesis}

In Chapter \ref{chap:litreview} we review the background literature in
the field of static analysis, laying the basis for the rest of the
thesis. %TODO: more

We present our theoretical contribution in \ref{chap:contribution}: a
demand-driven strategy improvement algorithm. We present the algorithm
along with an evaluation of its performance on realistic problems.

Our implementation is discussed in Chapter
\ref{chap:implementation}. We discuss several aspects of our
implementation before evaluating the results of our analysis on
several programs.

We draw our conclusion in \ref{chap:conclusion}.