\documentclass{article} \usepackage[a4paper]{geometry} \usepackage{hyperref} \usepackage{amsmath} \title{INFO5993 \\ Literature Review} \author{Carlo Zancanaro} \begin{document} \maketitle \section{Introduction} Static analysis is concerned with determining as much as possible about a program from it without actually running the program. Being able to analyse the program statically can provide a wealth of information about the program's characteristics without executing it at all. The uses for static analysis vary significantly, from program optimisation (eg. constant propagation) to verification of program semantics and bug-finding. In critical applications (for example: medial equipment and flight computers) static analysis techniques are used to ensure that programs are free from errors which would cause catastrophic failure. Of particular use in static analysis is the determination of integral values within a program. By determining which possible values any integer variables in the program can take at each code point it is possible to detect a significant number of errors relating to improper integral values (divide-by-zero, array-out-of-bounds, etc.). Such errors are among the most common in real-world software development, so their detection is of much use to software developers. \section{Abstract Interpretation} \subsection{General Framework} Abstract interpretation is a technique introduced in \cite{CousotCousot77-1} in which a program's semantics are examined within an abstract domain. This abstract domain is a simplification of the concrete domain in which the program actually executes, but is constructed in such a way as to retain some properties of the original program for analysis. We perform our analysis by "executing" our program within this abstract domain which we have constructed. An example of an abstract domain is considering numbers as either positive $(+)$, negative $(-)$, zero $(0)$, unknown $(\pm)$ or undefined $(?)$. Using this abstraction we can still define some semantics which reflect the program itself (as an example: $(+) \times (-) = (-)$, or the product of a positive number with a negative number is a negative number). Each abstract domain must also contain a "top" $(\top)$ and a "bottom" $(\bot)$ element. The existence of these two elements turn our abstract domain into a complete lattice, which ensures some helpful properties of our abstract domain. By constructing appropriate abstract domains we can infer properties of the program's concrete semantics which are relevant to our application. The above abstract domain, for example, can be used to detect divide-by-zero errors (although it is clear from the above that not all divide-by-zero errors will be detected) or negative array-indexing (one form of an out of bounds error). A number of abstract domains have been conceived and formalised in literature. These are considered in the following subsection. An important factor to consider in the following is that as abstract domains get more "detailed" they become more computationally expensive to analyse. \subsection{Abstract Domains} Cousot and Cousot, in \cite{CousotCousot77-1} introduce an abstract domain providing an upper and lower bound to each integral value. This domain allows them to reason about numbers with greater precision than simply considering positive/negative/zero. The abstract domain of "zones", are handled in \cite{mine:padoII}. These zones provide a way to perform analysis on the bounds of variable differences (ie. to place bounds as $a - b \leq c$) and are also referred to as Difference Bound Matrices (which are used in other model-checking applications \cite{Yovine98modelchecking}). The domain of zones is then extended by the same author into the "Octagon Abstract Domain" in \cite{mine:hosc06} which additionally allows bounds on the sum of variables ($\pm a \pm b \leq c$). All of these above domains are actually shown to be instances of the polyhedra abstract domain in \cite{DBLP:conf/vmcai/SankaranarayananSM05}. The polyhedra domain expresses bounds on $a_1 x_1 + a_2 x_2 ... + a_n x_n + c \ge 0$, where $a_i$ are fixed in advance (a predefined polyhedral shape). This allows for ranges to be expressed as $1 \times x_1 + c \ge 0$ provides a single bound for a single variable (two such bounds is, of course, a range). Similarly for zones one can define a polyhedra $1 \times x_1 + (-1) \times x_2 + c \ge 0$ which is equivalent. The octagon domain is $(\pm 1) \times x_1 + (\pm 1) \times x_2 + c \ge 0$. \section{Fixpoints} When performing static analysis there is a great need to perform some sort of fixpoint solving to determine the possible resultant values for some operations. When looping, for instance, it is necessary to "iterate" (in some sense) to find what values each variable may take during a loop iteration. The most basic method for doing so is simply to perform a complete iteration in the abstract domain, but doing so has no guarantee of termination and would not be expected to be performant. In \cite{CousotCousot77-1} an approach of "widening" and "narrowing" is introduced. This widening/narrowing method performs some fixed number of iteration steps. If, after this fixed number, the iteration has not stabilised then it assumes that it will not stabilise and will instead continue infinitely, so any affected variables are "widened" to infinity. A "narrowing" step is then carried out wherein variables are brought back into the finite domain by some terminating condition on the loop. This approach is quite general, and so is applicable to many instances of real programs, but is a crude approximation of the reality of the program. In specific instances it is possible to perform a better approximation. In particular the case of monotonic, expansive operators is considered in \cite{}. % something from thomas here With this approach code points are converted into fixpoint equations with $minimum$ and $maximum$ expressions. These fixpoint equations can then be solved in order to give information about the program's abstract semantics. These equations are solved using the technique of strategy iteration \cite{} % something about strategy iteration here combined with a fast fixpoint-iteration algorithm (such as the one presented in \cite{DBLP:tr/trier/MI96-11}). %Other test citation: \cite{DBLP:tr/trier/MI96-11} \section{Conclusion} \bibliographystyle{abbrv} \bibliography{references.bib} \end{document}