From 608cf2e6a2ccc967e82b2b7a9693fd0bafb778fb Mon Sep 17 00:00:00 2001 From: Carlo Zancanaro Date: Wed, 5 Sep 2012 10:25:19 +1000 Subject: Delete a lot of tex stuff, better debug info Some other stuff, too, I think. Oh well. No biggie! --- tex/lit-review/main.tex | 159 ------------------------------------------------ 1 file changed, 159 deletions(-) delete mode 100644 tex/lit-review/main.tex (limited to 'tex/lit-review/main.tex') diff --git a/tex/lit-review/main.tex b/tex/lit-review/main.tex deleted file mode 100644 index 28a9495..0000000 --- a/tex/lit-review/main.tex +++ /dev/null @@ -1,159 +0,0 @@ -\documentclass{article} - -\usepackage[a4paper]{geometry} -\usepackage{hyperref} -\usepackage{amsmath} - -\title{INFO5993 \\ Literature Review} -\author{Carlo Zancanaro} - -\begin{document} -\maketitle - -\section{Introduction} - -Static analysis is concerned with determining as much as possible about a -program from it without actually running the program. Being able to analyse the -program statically can provide a wealth of information about the program's -characteristics without executing it at all. - -The uses for static analysis vary significantly, from program optimisation -(eg. constant propagation) to verification of program semantics and bug-finding. -In critical applications (for example: medial equipment and flight computers) -static analysis techniques are used to ensure that programs are free from errors -which would cause catastrophic failure. - -Of particular use in static analysis is the determination of integral values -within a program. By determining which possible values any integer variables in -the program can take at each code point it is possible to detect a significant -number of errors relating to improper integral values (divide-by-zero, -array-out-of-bounds, etc.). Such errors are among the most common in real-world -software development, so their detection is of much use to software developers. - -\section{Abstract Interpretation} - -\subsection{General Framework} - -Abstract interpretation is a technique introduced in \cite{CousotCousot77-1} in -which a program's semantics are examined within an abstract domain. This -abstract domain is a simplification of the concrete domain in which the program -actually executes, but is constructed in such a way as to retain some properties -of the original program for analysis. We perform our analysis by "executing" our -program within this abstract domain which we have constructed. - -An example of an abstract domain is considering numbers as either positive -$(+)$, negative $(-)$, zero $(0)$, unknown $(\pm)$ or undefined $(?)$. Using this -abstraction we can still define some semantics which reflect the program itself -(as an example: $(+) \times (-) = (-)$, or the product of a positive number with -a negative number is a negative number). Each abstract domain must also contain -a "top" $(\top)$ and a "bottom" $(\bot)$ element. The existence of these two -elements turn our abstract domain into a complete lattice, which ensures some -helpful properties of our abstract domain, including the existence of solutions -to fixpoint-equations (\cite{tarski}). - -By constructing appropriate abstract domains we can infer properties of the -program's concrete semantics which are relevant to our application. The above -abstract domain, for example, can be used to detect divide-by-zero errors -(although it is clear from the above that not all divide-by-zero errors will be -detected) or negative array-indexing (one form of an out of bounds error). - -A number of abstract domains have been conceived and formalised in literature. -These are considered in the following subsection. An important factor to -consider in the following is that as abstract domains get more "detailed" they -become more computationally expensive to analyse. - -\subsection{Abstract Domains} - -Cousot and Cousot, in \cite{CousotCousot77-1} introduce an abstract domain -providing an upper and lower bound to each integral value. This domain allows -them to reason about numbers with greater precision than simply considering -positive/negative/zero. - -The abstract domain of "zones", are handled in \cite{mine:padoII}. These zones -provide a way to perform analysis on the bounds of variable differences (ie. to -place bounds as $a - b \leq c$) and are also referred to as Difference Bound -Matrices (which are used in other model-checking applications, such as -\cite{Yovine98modelchecking}). - -The domain of zones is then extended by the same author into the "Octagon -Abstract Domain" in \cite{mine:hosc06} which additionally allows bounds on the -sum of variables ($\pm a \pm b \leq c$). - -All of these above domains are actually shown to be instances of the polyhedra -abstract domain in \cite{DBLP:conf/vmcai/SankaranarayananSM05}. The polyhedra -domain expresses bounds on $a_1 x_1 + a_2 x_2 ... + a_n x_n + c \ge 0$, where -$a_i$ are fixed in advance (a predefined polyhedral shape). This allows for -ranges to be expressed as $1 \times x_1 + c \ge 0$ provides a single bound for a -single variable (two such bounds is, of course, a range). Similarly for zones -one can define a polyhedra $1 \times x_1 + (-1) \times x_2 + c \ge 0$ which is -equivalent. The octagon domain is $(\pm 1) \times x_1 + (\pm 1) \times x_2 + c -\ge 0$. - -Non-integral abstract domains have also been investigated in papers such as -\cite{springerlink}, which considers the abstract domain -of strings with abstract values being regular expressions, but it seems the -majority of the literature is concerned with providing bounds for integral, or -at the very least numerical, expressions. The reasons for this are fairly -obvious when one considers the ubiquity of integers and the relative simplicity -of the numerical domain compared to many others. - -\section{Fixpoints} - -When performing static analysis there is a great need to perform some sort of -fixpoint solving to determine the possible resultant values for some operations. -When looping, for instance, it is necessary to "iterate" (in some sense) to find -what values each variable may take during a loop iteration. - -The most basic method for doing so is simply to perform a complete iteration in -the abstract domain, but doing so within the integers has no guarantee of -termination (due to the integers being an unbounded set) and would not be -expected to be performant. In \cite{CousotCousot77-1} an approach of "widening" -and "narrowing" is introduced. - -This widening/narrowing method performs some fixed number of iteration steps. -If, after this fixed number, the iteration has not stabilised then it assumes -that it will not stabilise and will instead continue infinitely, so any affected -variables are "widened" to infinity. A "narrowing" step is then carried out -wherein variables are brought back into the finite domain by some terminating -condition on the loop. - -This approach is quite general, and so is applicable to many instances of real -programs, but is a crude approximation of the reality of the program (albeit, -with surprisingly good results). In specific instances it is possible to perform -a better approximation. In particular the case of addition, multiplication (by -positive constants), minimum and maximum is considered in -\cite{Gawlitza:2007:PFC:1762174.1762203}. With this approach code points are -converted into fixpoint equations with $minimum$ and $maximum$ expressions. -These fixpoint equations can then be solved in order to give information about -the program's abstract semantics. These equations are solved using a form of -policy iteration (referred to as strategy iteration) over the $maximum$ -expressions present in each term. A second step consisting of ordinary -fixpoint-iteration over the resulting $minimum$-expressions is then performed. - -When fixpoint-iteration is known to terminate then it is possible to get -significant speed gains by processing each expression according to their -dependencies. The approach presented in \cite{DBLP:tr/trier/MI96-11}, in -particular, improves the real-world performance of their fixpoint-equation -solver. It should be noted, however, that such evaluation strategies can do -nothing to affect the worst-case complexity as it is always possible to -construct a system in which evaluation of each variable is necessary at every -stage until the system stabilises. - -\section{Conclusion} - -Static analysis is of great practical importance to real world developers. It -gives us the ability to create tools which automatically identify common errors -in programs due to developer errors and so allows us to provide greater -guarantees about the behaviour of our software. - -Unfortunately complete static analysis of a program is an undecidable problem, -and so we must perform abstract interpretation of a program's semantics in order -to provide useful results of analysis. Many different abstract domains exist in -which to consider programs, with varying precision and computational complexity. -In particular the abstract domain of polyhedra (and the related domains of -zones and ranges) are of great use when performing analysis of a program's -integral semantics. - -\bibliographystyle{abbrv} -\bibliography{references.bib} -\end{document} -- cgit v1.2.3