\newcommand{\Env}{\mathsf{Env}} \newcommand{\PP}{\mathsf{PP}} \newcommand{\states}{\mathsf{states}} \newcommand{\state}{\mathsf{state}} \renewcommand{\b}[1]{\mathbf{#1}} \newcommand{\semantics}[1]{\llb #1 \rrb} \newcommand{\Stmt}{\mathsf{Stmt}} \newcommand{\Vars}{\mathsf{Vars}} \chapter{Background} \label{chap:litreview} In this chapter we review the relevant literature pertaining to static analysis. We briefly cover the semantics of programs in general before looking more closely at the framework of abstract interpretation for program analysis. The chapter concludes with an explanation of the particular method of abstract interpretation over intervals and zones which is implemented in this thesis. \section{Program Semantics} In order to perform any static analysis of a program it is necessary to have some notion of the \emph{semantics}, or meaning, of a program. The semantics of a program can be captured in several ways, but for our purposes we consider \emph{denotational}, or \emph{mathematical}, \emph{semantics} \cite{SCOTT70B}. Denotational semantics views a program as a (partial) function built up by the composition of many smaller functions. By expressing our expressions, assignments, conditionals and loops as smaller functions we can compose them to determine the semantics of the whole program. The operator $\semantics{\cdot}$ represents the semantics of its argument as a mathematical function. In imperative languages, however, programs are not simply a series of functional transformations: each statement or expression may have side effects, so it is necessary to introduce some notion of ``state'' into our semantics. By defining $\semantics{x}: \states \to \states$ for every statement $x$ we provide a mechanism for statements to perform side effects. Each element of $\states$ represents a possible state for the program to be in.
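This view of statements as state transformers can be sketched directly in code. In the following Python fragment (all names are ours, purely for illustration) a state is modelled as a mapping from variable names to values, and the denotation of an assignment is a function from states to states:

\begin{lstlisting}
# A state "rho" is a mapping from variable names to values.
# denote_assign(I, E) plays the role of [[I := E]]: it takes the
# semantics of the expression E (a function from states to values)
# and returns a function from states to states.
def denote_assign(I, E):
    return lambda rho: {**rho, I: E(rho)}

# [[x := y + 1]] applied to the state {x: 0, y: 4} yields {x: 5, y: 4}.
sem = denote_assign("x", lambda rho: rho["y"] + 1)
print(sem({"x": 0, "y": 4}))
\end{lstlisting}

Composing such functions with ordinary function composition then yields the semantics of statement sequences.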
For example, for an integral assignment statement $I := E$, assigning the value of an expression $E$ to a variable $I$, we have the following denotation: \begin{align*} \semantics{I := E} = \lambda \rho . (\rho \oplus \{I \mapsto \semantics{E}(\rho)\}) \end{align*} where $\rho \in \states$ and $\semantics{E}: \states \to \Z$. Our $\states$ in this case are functions $\rho: \Vars \to \Z$ from variables to values. Denotational semantics gives us a formal, abstract framework in which to analyse programs mathematically, but it does not by itself provide a meaningful way to perform automated analysis of programs. Because it captures the semantics of a program exactly, it does not escape the consequences of Rice's Theorem \cite{Rice} or the Halting Problem \cite{Halting}, which preclude us from performing precise analysis of programs in general. \section{Abstract Interpretation} \label{sec:abstract-interpretation} In order to work around these limitations on precise static analysis, Cousot and Cousot developed a framework for performing sound program analysis called ``abstract interpretation'' \cite{CousotCousot77-1}. In this framework we consider a program's semantics under an abstraction. The original semantics we call the ``concrete semantics'' and the abstracted semantics we call the ``abstract semantics''. The abstract semantics are obtained by replacing $\Z$ and $\states$ from the denotational semantics with ``abstracted'' structures. To provide some intuition for these concepts, consider the following example: \begin{example} Let us assume that our program's state is represented by $\rho: \Vars \to \Z$, with values coming from $\Z$. One possible abstraction would be to instead consider a variable's \emph{sign}.
That is, to consider $x$ as satisfying one of the following (the symbols in parentheses are used to refer to these values from now on): \begin{align*} x &\in \Z & (\pm) \\ x &\ge 0 & (+) \\ x &\le 0 &(-) \\ x &= 0 &(0) \end{align*} From this abstraction we can then determine something about normal arithmetic operations in our program. We know, for instance, that $(+) \times (-) = (-)$, or that $(0) \times (\pm) = (0)$. We also know, then, that division by a variable with abstract value $(0)$ is certainly an error. \label{example:abstract-interpretation} \end{example} Abstraction sacrifices precision of analysis in exchange for decidability. By abstracting our domain in a way which is ``sound'', meaning that it safely over-approximates our concrete semantics, we ensure that any analysis we perform on the abstraction is also valid for the concrete semantics. The price of this abstraction is precision: the analysis may identify errors which are not errors in the concrete semantics. \subsection{General Framework} In order to transform between the concrete and abstract domains we define two functions: $\alpha$ and $\gamma$. The function $\alpha: 2^\states \to \states_\alpha$ transforms a set of concrete states into an abstract state. The function $\gamma: \states_\alpha \to 2^\states$ transforms an abstract state into a set of concrete states. We require these two functions to satisfy the properties of a Galois connection: \begin{align*} a \subseteq \gamma(b) &\iff \alpha(a) \le b & \forall a \in 2^\states, b \in \states_\alpha. \end{align*} \noindent As a consequence of this, concretising the abstraction of a set of states $a \subseteq \states$ must yield a set containing at least $a$; that is, $a \subseteq \gamma(\alpha(a))$. In particular we are looking for the \emph{best} abstract transformers, such that $\alpha(a)$ is as tight as possible for each set of states $a$. Abstract interpretation follows on from the idea of denotational semantics, and it is possible to determine the abstract semantics of an operation from its concrete semantics.
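The sign abstraction of Example \ref{example:abstract-interpretation} can be made concrete with a small sketch. In the following Python fragment (the names are ours), \texttt{alpha} maps a set of integers to the least sign describing it, and \texttt{mul} is a sound abstract multiplication:

\begin{lstlisting}
# Sign values: "0", "+", "-", and "pm" (the unconstrained sign).
def alpha(values):
    if all(v == 0 for v in values): return "0"
    if all(v >= 0 for v in values): return "+"
    if all(v <= 0 for v in values): return "-"
    return "pm"

def mul(a, b):
    # Abstract multiplication, e.g. (+) * (-) = (-), (0) * (pm) = (0).
    if a == "0" or b == "0": return "0"
    if a == "pm" or b == "pm": return "pm"
    return "+" if a == b else "-"

# Soundness on a sample: the abstract product describes every
# concrete product of the two sets.
xs, ys = {2, 5}, {-3, 0}
prods = {x * y for x in xs for y in ys}
print(mul(alpha(xs), alpha(ys)), alpha(prods))  # both "-"
\end{lstlisting}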
We use $\semantics{\cdot}^\#$ to represent the abstract semantics of a statement, and define the transformation as $\semantics{s}^\# = \alpha \circ \semantics{s} \circ \gamma$. We can now formalise the notion of a Control Flow Graph (henceforth CFG) and define its abstraction. We consider a CFG to be a tuple $G = (N, E, s, st)$ consisting of: a finite set $N$ of \emph{control-points}; a finite set $E \subseteq N \times N$ of \emph{edges} between these control-points (i.e.\ if $(u,v) \in E$ then an edge exists in the CFG from $u$ to $v$); a mapping $s: E \to \Stmt$ taking each edge in the CFG to a \emph{statement} in $\Stmt$; and a special \emph{start control-point} $st \in N$. We denote the abstract semantics of a node $v \in N$ of our CFG by $V^\#[v]$, and require that the following hold: \begin{align} V^\# [st] &\ge \alpha(\Z^n) \label{eqn:start-node} \\ V^\# [v] &\ge \llb s_{(u,v)} \rrb ^\# (V^\# [u]) & \forall (u,v) \in E \label{eqn:confluence} \end{align} The first inequality states that the abstract value of the start node must cover the completely unconstrained concrete states: any value is possible for any variable at the first node of the CFG. The second requires that the abstract value of a node $v$ carries the same amount of information or less than the value obtained along each incoming edge from a predecessor of $v$ in the CFG.
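Inequalities \eqref{eqn:start-node} and \eqref{eqn:confluence} suggest a simple way of computing a valid $V^\#$: start every node at the least abstract value and repeatedly apply the edge transformers until nothing changes. The following Python sketch illustrates this round-robin iteration; the solver is generic, and the small interval-like domain used to instantiate it is purely illustrative (all names are ours):

\begin{lstlisting}
# Round-robin solver: the start node begins at top (the start-node
# inequality), every other node at bottom, and each edge's abstract
# transformer is applied until a fixpoint satisfies the confluence
# inequality at every node.
def solve(nodes, edges, transfer, join, bot, top, start):
    V = {v: bot for v in nodes}
    V[start] = top
    changed = True
    while changed:
        changed = False
        for (u, v) in edges:
            new = join(V[v], transfer[(u, v)](V[u]))
            if new != V[v]:
                V[v], changed = new, True
    return V

# A toy instantiation with pairs of bounds; None is the bottom element.
def join(a, b):
    if a is None: return b
    if b is None: return a
    return (min(a[0], b[0]), max(a[1], b[1]))

V = solve(nodes={"A", "B"}, edges=[("A", "B")],
          transfer={("A", "B"): lambda s: (1, 1)},   # models x := 1
          join=join, bot=None,
          top=(float("-inf"), float("inf")), start="A")
print(V["B"])   # (1, 1)
\end{lstlisting}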
\begin{figure} \begin{tikzpicture}[->,>=stealth',shorten >=1pt,auto,node distance=2cm,main node/.style={circle,fill=blue!20,draw},every loop/.style={min distance=1.5cm}] \node[main node] (V) {$v$}; \node[main node] (U) [above left of=V] {$u$}; \node[main node] (W) [above right of=V] {$w$}; \path[every node/.style={fill=none}] (U) edge node[left]{$s_{(u,v)}$} (V) (W) edge node[right]{$s_{(w,v)}$} (V); \end{tikzpicture} \caption{A simple CFG fragment.} \label{cfg:confluence} \end{figure} So, if we have the CFG fragment shown in Figure \ref{cfg:confluence} then we know that $V^\#[v]$ must `contain', or be larger than, the contribution of each of the edges coming in to $v$. In this case there are two incoming edges to $v$: $(u,v)$ and $(w,v)$. The edge $(u,v)$ leads to the state $\llb s_{(u,v)} \rrb^\# (V^\#[u])$, so $V^\#[v]$ must either equal this constraint or be a looser one. This gives us the inequality \eqref{eqn:confluence}: $V^\#[v] \ge \llb s_{(u,v)} \rrb ^\# (V^\#[u])$. By considering the edge $(w,v)$ and following the same argument we find that $V^\# [v] \ge \llb s_{(w,v)} \rrb ^\# (V^\# [w])$. The analysis thus far has avoided fixing any particular abstract domain, but in practice we must choose one before performing our analysis. Much has been written on the topic of abstract domains, ranging from the relatively simple, such as signs and intervals \cite{CousotCousot77-1}, to the more complex zones \cite{mine:padoII}, octagons \cite{mine:hosc06} and general convex polyhedra \cite{DBLP:conf/vmcai/SankaranarayananSM05}. For this thesis we are only concerned with \emph{intervals} and \emph{zones}. \subsection{Intervals} The abstract domain of \emph{intervals} was introduced by Cousot and Cousot \cite{CousotCousot77-1} along with abstract interpretation itself. In this domain we constrain each variable $x$ to lie within a range $[a, b]$, with $a, b \in \CZ$.
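The interval domain can be sketched in a few lines of Python (the function names are ours): $\alpha$ maps a set of concrete values to the tightest enclosing interval, and abstract addition simply adds endpoints:

\begin{lstlisting}
def alpha(values):
    # Best abstraction: the tightest interval containing all values.
    return (min(values), max(values))

def add(a, b):
    # Abstract addition on intervals: add the endpoints.
    return (a[0] + b[0], a[1] + b[1])

# Soundness: the abstract sum covers every concrete sum.
xs, ys = {1, 3, 7}, {-2, 4}
lo, hi = add(alpha(xs), alpha(ys))
assert all(lo <= x + y <= hi for x in xs for y in ys)
print((lo, hi))   # (-1, 11)
\end{lstlisting}

Note the over-approximation at work: the concrete sums form a sparse set, but the abstraction reports the whole range $[-1, 11]$.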
This domain is finer than signs, so it allows more information about the program to be determined. An equivalent formulation of intervals is that we are finding bounds such that \begin{align*} x &\le b \text{ and,} \\ -x &\le -a. \end{align*} %\noindent We are finding \emph{upper bounds} for $x$ and $-x$ at each %location in the program. \subsection{Zones} \emph{Zones} are an abstract domain introduced by Min\'e \cite{mine:padoII} which generalises intervals by also bounding the differences between variables. Min\'e's work on zones builds on work done in the model checking of timed automata \cite{yovine-zones}. For each variable $x$, zones constrain the variable and its negation in the same way as intervals: \begin{align*} x &\le b \text{ and,} \\ -x &\le -a. \end{align*} Zones additionally provide us with bounds on \emph{differences} between variables: for each pair of variables $(x,y)$, zones provide a bound \begin{align*} x - y & \le c. \end{align*} \section{Min/max problem} In \cite{Gawlitza:2007:PFC:1762174.1762203} a conversion is presented from a program's CFG to a system of equations involving only monotonic, expansive operators. This conversion results in an equation system which can be solved in order to perform abstract interpretation over zones.
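As an aside on representation, a zone is commonly stored as a \emph{difference-bound matrix} holding all of these bounds, with implied bounds made explicit by an all-pairs shortest-path computation. The following Python sketch (the representation details are ours, not part of the equation-system approach discussed below) illustrates the idea on a two-variable zone:

\begin{lstlisting}
from itertools import product
from math import inf

def close(m):
    # Floyd-Warshall closure: m[i][j] bounds v_i - v_j, and any path
    # v_i -> v_k -> v_j implies the tighter bound m[i][k] + m[k][j].
    n = len(m)
    for k, i, j in product(range(n), repeat=3):
        m[i][j] = min(m[i][j], m[i][k] + m[k][j])
    return m

# Index 0 is a special "zero" variable, so m[i][0] bounds v_i itself.
# Variables: 1 = x, 2 = y, with x <= 10 and y - x <= 2:
m = [[0, inf, inf],
     [10, 0, inf],
     [inf, 2, 0]]
close(m)
print(m[2][0])   # the implied bound y <= 12
\end{lstlisting}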
\subsection{Partial-ordering Formulation} \begin{example} \label{example:partial-ordered-construction} As an example, let us consider the following program: \begin{figure}[ht] \begin{minipage}{5cm} \begin{lstlisting}
x = 1
while x <= 9:
    x = x + 1
\end{lstlisting} \end{minipage} \hfill \begin{minipage}{0.45\textwidth} \begin{tikzpicture}[->,>=stealth',shorten >=1pt,auto,node distance=2cm,main node/.style={circle,fill=blue!20,draw},every loop/.style={min distance=1.5cm}] \node[main node] (A) {$A$}; \node[main node] (B) [below of=A] {$B$}; \node[main node] (C) [below of=B] {$C$}; \path[every node/.style={fill=none}] (A) edge node[right]{$x = 1$} (B) (B) edge [loop right] node[right]{$x \le 9: x = x + 1$} (B) (B) edge node[right]{$x > 9$} (C); \end{tikzpicture} \end{minipage} \caption{A simple program with its corresponding CFG} \label{fig:partial-ordered-construction} \end{figure} We now attempt to derive from this a system of inequalities for the \emph{least upper bound} of $x$ at each program point: \begin{align*} ub: {} & ~\Vars \to \CZ \\ ub(x) &= \min\{v \in \CZ ~|~ x \le v\}. \end{align*} We wish to find a value for $ub(x)$ at each program point in our original program. We denote these values by $ub(x_A)$, $ub(x_B)$ and $ub(x_C)$. At $A$ the value of $x$ is entirely unknown, so no useful bound can be given: the possible values of $x$ are bounded only by the top value in $\CZ$, which is $\infty$. This gives us the following inequality for node $A$: \begin{align*} ub(x_A) &\ge \infty \end{align*} When we reach $B$ we have two incoming edges: one from $B$ itself, the other from $A$. The edge from $A$ to $B$ is very simple. It has no constraint and sets the value of $x$ to $1$, corresponding to an inequality of the form $ub(x_B) \ge 1$. However, there is also the possibility that $A$ is unreachable (indicated by $ub(x_A) = -\infty$), in which case $B$ is unreachable via this edge.
This makes the inequality \begin{align*} ub(x_B) \ge \left\{\begin{array}{lc} 1 & \text{if } ub(x_A) > -\infty \\ -\infty & \text{otherwise} \end{array}\right. \end{align*} The edge from $B$ to itself is somewhat more complex. To begin with, it changes the value of $x$ depending on the previous value of $x$. It is also executed conditionally (when $x \le 9$), which affects the bound on $x$. Because the condition limits the growth of $x$, the corresponding inequality for this edge is $ub(x_B) \ge \min(ub(x_B), 9) + 1$. This leaves us with two simultaneous constraints on the value of $ub(x_B)$: \begin{align*} ub(x_B) &\ge 1 \\ ub(x_B) &\ge \min(ub(x_B), 9) + 1 \end{align*} The upper bound on the value of $x$ for node $C$ is the same as the upper bound on the value of $x$ in $B$, because the edge from $B$ to $C$ neither changes the value of $x$ nor provides any more information about the upper bound of $x$. (It does change the lower bound of $x$, but we have not considered the lower bound in this analysis.) This gives us a very simple inequality for $C$: \begin{align*} ub(x_C) &\ge ub(x_B) \end{align*} So, the upper bounds for $x$ can be found at every program point in our original program by finding the smallest values of $ub(x_A), ub(x_B), ub(x_C)$ subject to these constraints: \begin{align*} ub(x_A) &\ge \infty \\ ub(x_B) &\ge \left\{\begin{array}{lc} 1 & \text{if } ub(x_A) > -\infty \\ -\infty & \text{otherwise} \end{array}\right. \\ ub(x_B) &\ge \min(ub(x_B), 9) + 1 \\ ub(x_C) &\ge ub(x_B) \end{align*} The smallest values for $ub(x_A), ub(x_B), ub(x_C)$ which satisfy these constraints are: \begin{align*} ub(x_A) &= \infty \\ ub(x_B) &= 10 \\ ub(x_C) &= 10 \end{align*} So, at program point $A$: $x \le \infty$; at $B$: $x \le 10$; and at $C$: $x \le 10$.
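This least solution can also be computed mechanically by Kleene iteration: start every unknown at the bottom element $-\infty$ and re-evaluate the right-hand sides until nothing changes. A Python sketch of this (the variable names are ours):

\begin{lstlisting}
from math import inf

ub = {"A": -inf, "B": -inf, "C": -inf}   # bottom element everywhere
while True:
    new = {
        "A": inf,
        "B": max(1 if ub["A"] > -inf else -inf,
                 min(ub["B"], 9) + 1),
        "C": ub["B"],
    }
    if new == ub:
        break
    ub = new

print(ub)   # {'A': inf, 'B': 10, 'C': 10}
\end{lstlisting}

Each pass can only increase the values, and the $\min(\cdot, 9)$ term caps the growth, so the iteration terminates at the least solution.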
\end{example} \subsection{Lattice construction} \label{subsec:lattice-construction} The partial-ordering formulation considered above is a helpful presentation of the problem in terms of constraints, but an equivalent equation system can be constructed to aid us in locating the \emph{least} solution to the constraint system. Our partially ordered set forms a \emph{lattice}: it possesses a join operator ($\vee$) and a meet operator ($\land$), characterised by the following relationships: \begin{align*} x \le y ~&\iff~ x \vee y = y \\ x \le y ~&\iff~ x \land y = x \end{align*} In the case of Example \ref{example:partial-ordered-construction}, with $x, y \in \CZ$, the join and meet operators correspond with $\max$ and $\min$, respectively. This allows us to simplify the system of inequalities in Example \ref{example:partial-ordered-construction} to the following: \begin{align*} ub(x_A) &= \infty \\ ub(x_B) &= \max(1, \min(ub(x_B), 9) + 1) \\ ub(x_C) &= ub(x_B) \end{align*} The two constraints on the upper bound of $x_B$ can be combined into one constraint by taking their maximum, because $x \ge \max(a, b) ~\iff~ x \ge a \text{ and } x \ge b$. The Knaster-Tarski Fixpoint Theorem \cite{Knaster-Tarski} then permits us to replace our inequalities with equalities: the least solution of the resulting equation system coincides with the least solution of the earlier constraint system. \begin{example} \label{example:both-bounds} In order to determine a lower bound for $x$ at each program point it is sufficient to determine an \emph{upper} bound for $-x$. We again consider the program fragment in Figure \ref{fig:partial-ordered-construction}, and now try to derive an equation system for both the \emph{upper} and \emph{lower} bounds of $x$.
In order to capture the state of the variables at each program point we consider the abstract state as the two-dimensional vector $(ub(x), -lb(x))$: the upper bound of $x$, $ub(x)$, and the \emph{negated} lower bound of $x$, $-lb(x)$. The least fixpoint of our system of equations over $\CZ^2$ then gives us values for $ub(x)$ and $lb(x)$. To begin with, let us once again consider the node $A$. At $A$ there is no information as to the bounds of $x$, so our constraint must contain the entirety of $\CZ$. This corresponds to $ub(x) = \infty, lb(x) = -\infty$, which gives us the following inequality: \begin{align*} \left(\begin{array}{c} ub(x_A) \\ -lb(x_A) \end{array}\right) &\ge \left(\begin{array}{c} \infty \\ \infty \end{array}\right) \end{align*} As before, the node $B$ is the first of particular interest to us. With two incoming edges we must have two constraints for this node. The first constraint comes from the edge $A \to B$. This edge is straightforward: it ignores the preceding value of $x$ and sets $x$ to $1$. This is captured in the following inequality: \begin{align*} \left(\begin{array}{c} ub(x_B) \\ -lb(x_B) \end{array}\right) &\ge \left(\begin{array}{c} 1 \\ -1 \end{array}\right) \end{align*} The constraint from the edge $B \to B$ is somewhat more complicated. The transformation $x = x + 1$ is represented by the matrix/vector operations of multiplication and addition. Multiplication by the identity matrix indicates that $x$ itself is present in the right-hand side of the transformation, while the addition of the vector \(\left(\begin{array}{c} 1 \\ -1 \end{array}\right) \) captures the effect of the addition on each of the bounds (increasing each bound by one). The $x \le 9$ conditional, which is present on the edge, is captured by the restriction on the growth of the upper bound: \(\min(ub(x_B), 9)\) ensures the upper bound's growth is limited by the value $9$.
It should be noted that no restriction is placed on the lower bound, as the program only bounds $x$'s growth from above on this edge. \begin{align*} \left(\begin{array}{c} ub(x_B) \\ -lb(x_B) \end{array}\right) &\ge \left[\begin{array}{cc} 1 & 0 \\ 0 & 1 \end{array}\right] \left(\begin{array}{c} \min(ub(x_B), 9) \\ -lb(x_B) \end{array}\right) + \left(\begin{array}{c} 1 \\ -1 \end{array}\right) \end{align*} The edge $B \to C$ is quite simple, as it provides another restriction on the value of $x$ without any further transformation. The guard $x > 9$ implies $x \ge 10$, so this restriction is placed on the lower bound of $x$ only (leaving the upper bound unchanged), and is captured by the following inequality: \begin{align*} \left(\begin{array}{c} ub(x_C) \\ -lb(x_C) \end{array}\right) &\ge \left(\begin{array}{c} ub(x_B) \\ \min(-lb(x_B), -10) \end{array}\right) \end{align*} This gives us the following inequalities for this program: \begin{align*} \left(\begin{array}{c} ub(x_A) \\ -lb(x_A) \end{array}\right) &\ge \left(\begin{array}{c} \infty \\ \infty \end{array}\right) \\ \left(\begin{array}{c} ub(x_B) \\ -lb(x_B) \end{array}\right) &\ge \left(\begin{array}{c} 1 \\ -1 \end{array}\right) \\ \left(\begin{array}{c} ub(x_B) \\ -lb(x_B) \end{array}\right) &\ge \left[\begin{array}{cc} 1 & 0 \\ 0 & 1 \end{array}\right] \left(\begin{array}{c} \min(ub(x_B), 9) \\ -lb(x_B) \end{array}\right) + \left(\begin{array}{c} 1 \\ -1 \end{array}\right) \\ \left(\begin{array}{c} ub(x_C) \\ -lb(x_C) \end{array}\right) &\ge \left(\begin{array}{c} ub(x_B) \\ \min(-lb(x_B), -10) \end{array}\right) \end{align*} By combining these expressions as discussed in Section \ref{subsec:lattice-construction} we can obtain the following system of equations: \begin{align*} \left(\begin{array}{c} ub(x_A) \\ -lb(x_A) \end{array}\right) &= \left(\begin{array}{c} \infty \\ \infty \end{array}\right) \\ \left(\begin{array}{c} ub(x_B) \\ -lb(x_B) \end{array}\right) &= \max\left( \left(\begin{array}{c} 1 \\ -1 \end{array}\right), \left[\begin{array}{cc}
1 & 0 \\ 0 & 1 \end{array}\right] \left(\begin{array}{c} \min(ub(x_B), 9) \\ -lb(x_B) \end{array}\right) + \left(\begin{array}{c} 1 \\ -1 \end{array}\right) \right) \\ \left(\begin{array}{c} ub(x_C) \\ -lb(x_C) \end{array}\right) &= \left(\begin{array}{c} ub(x_B) \\ \min(-lb(x_B), -10) \end{array}\right) \end{align*} \end{example} \section{Min-cost flow} This matrix formulation is sufficient to calculate bounds on individual program variables, as in the interval domain, but in doing so there is a significant loss of precision. In order to reclaim some of this precision we can use a relational domain. \endinput \subsection{Example solve} \label{sec-1-3-5} Reconsidering our earlier equation system, we now attempt to find the least fixpoint by means of \emph{policy iteration} and \emph{fixpoint iteration}. \begin{align*} x &= \max(1, \min(x, 9) + 1) \end{align*} At the beginning and end of each player's turn we have a variable assignment $\{ x \to ? \}$, which starts with the value $\{ x \to -\infty \}$. The $\max$ player then gets a chance to play. They choose to improve their strategy by choosing for $x$ to take the value $1$. The $\min$ player then has no choice but to produce the variable assignment $\{ x \to 1 \}$, as the $\max$ player has left them no alternative. Play then returns to the $\max$ player, who chooses, again, to improve their strategy to instead map $x$ to $\min(x, 9) + 1$. This time the $\min$ player must make a decision. It solves $x = \min(x, 9) + 1$ for its maximal fixpoint, deciding to use the $9$ value (as the $x$ value would otherwise grow without bound under the $+ 1$, making $9$ the minimum of the two options) and returning $\{ x \to 10 \}$ as the new assignment. The $\max$ player has no further way to improve their strategy, as $1 < 10$, so the game is over. The least fixpoint of the original system has been found as $x = 10$.
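This game can be sketched compactly in Python (the strategy representation is ours): each $\max$-strategy fixes one argument of the $\max$, the $\min$ player's move is modelled by iterating the chosen right-hand side to a fixpoint, and the $\max$ player switches strategy whenever doing so improves the current value:

\begin{lstlisting}
from math import inf

strategies = [
    lambda x: 1,                 # max player picks the constant 1
    lambda x: min(x, 9) + 1,     # max player picks the loop expression
]

def fix(f, x):
    # The min player's move: iterate upwards from the current value
    # until a fixpoint of the chosen right-hand side is reached.
    while f(x) != x:
        x = f(x)
    return x

x = -inf
improved = True
while improved:
    improved = False
    for f in strategies:
        if f(x) > x:             # strategy improvement step
            x = fix(f, x)
            improved = True

print(x)   # 10
\end{lstlisting}

Note that this simplified inner solver relies on the iteration reaching a fixpoint; the full method of \cite{Gawlitza:2007:PFC:1762174.1762203} computes the $\min$ player's answer exactly rather than by naive iteration.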