diff options
Diffstat (limited to 'tex/lit-review')
-rw-r--r-- | tex/lit-review/Makefile | 8 | ||||
-rw-r--r-- | tex/lit-review/main.tex | 159 | ||||
-rw-r--r-- | tex/lit-review/references.bib | 117 |
3 files changed, 0 insertions, 284 deletions
diff --git a/tex/lit-review/Makefile b/tex/lit-review/Makefile deleted file mode 100644 index 66eb399..0000000 --- a/tex/lit-review/Makefile +++ /dev/null @@ -1,8 +0,0 @@ -all: - pdflatex main.tex - bibtex main - pdflatex main.tex - pdflatex main.tex - -clean: - rm -f *.bbl *.aux *.blg *.log *.pdf *.out diff --git a/tex/lit-review/main.tex b/tex/lit-review/main.tex deleted file mode 100644 index 28a9495..0000000 --- a/tex/lit-review/main.tex +++ /dev/null @@ -1,159 +0,0 @@ -\documentclass{article} - -\usepackage[a4paper]{geometry} -\usepackage{hyperref} -\usepackage{amsmath} - -\title{INFO5993 \\ Literature Review} -\author{Carlo Zancanaro} - -\begin{document} -\maketitle - -\section{Introduction} - -Static analysis is concerned with determining as much as possible about a -program from it without actually running the program. Being able to analyse the -program statically can provide a wealth of information about the program's -characteristics without executing it at all. - -The uses for static analysis vary significantly, from program optimisation -(eg. constant propagation) to verification of program semantics and bug-finding. -In critical applications (for example: medial equipment and flight computers) -static analysis techniques are used to ensure that programs are free from errors -which would cause catastrophic failure. - -Of particular use in static analysis is the determination of integral values -within a program. By determining which possible values any integer variables in -the program can take at each code point it is possible to detect a significant -number of errors relating to improper integral values (divide-by-zero, -array-out-of-bounds, etc.). Such errors are among the most common in real-world -software development, so their detection is of much use to software developers. - -\section{Abstract Interpretation} - -\subsection{General Framework} - -Abstract interpretation is a technique introduced in \cite{CousotCousot77-1} in -which a program's semantics are examined within an abstract domain. This -abstract domain is a simplification of the concrete domain in which the program -actually executes, but is constructed in such a way as to retain some properties -of the original program for analysis. We perform our analysis by "executing" our -program within this abstract domain which we have constructed. - -An example of an abstract domain is considering numbers as either positive -$(+)$, negative $(-)$, zero $(0)$, unknown $(\pm)$ or undefined $(?)$. Using this -abstraction we can still define some semantics which reflect the program itself -(as an example: $(+) \times (-) = (-)$, or the product of a positive number with -a negative number is a negative number). Each abstract domain must also contain -a "top" $(\top)$ and a "bottom" $(\bot)$ element. The existence of these two -elements turn our abstract domain into a complete lattice, which ensures some -helpful properties of our abstract domain, including the existence of solutions -to fixpoint-equations (\cite{tarski}). - -By constructing appropriate abstract domains we can infer properties of the -program's concrete semantics which are relevant to our application. The above -abstract domain, for example, can be used to detect divide-by-zero errors -(although it is clear from the above that not all divide-by-zero errors will be -detected) or negative array-indexing (one form of an out of bounds error). - -A number of abstract domains have been conceived and formalised in literature. -These are considered in the following subsection. An important factor to -consider in the following is that as abstract domains get more "detailed" they -become more computationally expensive to analyse. - -\subsection{Abstract Domains} - -Cousot and Cousot, in \cite{CousotCousot77-1} introduce an abstract domain -providing an upper and lower bound to each integral value. This domain allows -them to reason about numbers with greater precision than simply considering -positive/negative/zero. - -The abstract domain of "zones", are handled in \cite{mine:padoII}. These zones -provide a way to perform analysis on the bounds of variable differences (ie. to -place bounds as $a - b \leq c$) and are also referred to as Difference Bound -Matrices (which are used in other model-checking applications, such as -\cite{Yovine98modelchecking}). - -The domain of zones is then extended by the same author into the "Octagon -Abstract Domain" in \cite{mine:hosc06} which additionally allows bounds on the -sum of variables ($\pm a \pm b \leq c$). - -All of these above domains are actually shown to be instances of the polyhedra -abstract domain in \cite{DBLP:conf/vmcai/SankaranarayananSM05}. The polyhedra -domain expresses bounds on $a_1 x_1 + a_2 x_2 ... + a_n x_n + c \ge 0$, where -$a_i$ are fixed in advance (a predefined polyhedral shape). This allows for -ranges to be expressed as $1 \times x_1 + c \ge 0$ provides a single bound for a -single variable (two such bounds is, of course, a range). Similarly for zones -one can define a polyhedra $1 \times x_1 + (-1) \times x_2 + c \ge 0$ which is -equivalent. The octagon domain is $(\pm 1) \times x_1 + (\pm 1) \times x_2 + c -\ge 0$. - -Non-integral abstract domains have also been investigated in papers such as -\cite{springerlink}, which considers the abstract domain -of strings with abstract values being regular expressions, but it seems the -majority of the literature is concerned with providing bounds for integral, or -at the very least numerical, expressions. The reasons for this are fairly -obvious when one considers the ubiquity of integers and the relative simplicity -of the numerical domain compared to many others. - -\section{Fixpoints} - -When performing static analysis there is a great need to perform some sort of -fixpoint solving to determine the possible resultant values for some operations. -When looping, for instance, it is necessary to "iterate" (in some sense) to find -what values each variable may take during a loop iteration. - -The most basic method for doing so is simply to perform a complete iteration in -the abstract domain, but doing so within the integers has no guarantee of -termination (due to the integers being an unbounded set) and would not be -expected to be performant. In \cite{CousotCousot77-1} an approach of "widening" -and "narrowing" is introduced. - -This widening/narrowing method performs some fixed number of iteration steps. -If, after this fixed number, the iteration has not stabilised then it assumes -that it will not stabilise and will instead continue infinitely, so any affected -variables are "widened" to infinity. A "narrowing" step is then carried out -wherein variables are brought back into the finite domain by some terminating -condition on the loop. - -This approach is quite general, and so is applicable to many instances of real -programs, but is a crude approximation of the reality of the program (albeit, -with surprisingly good results). In specific instances it is possible to perform -a better approximation. In particular the case of addition, multiplication (by -positive constants), minimum and maximum is considered in -\cite{Gawlitza:2007:PFC:1762174.1762203}. With this approach code points are -converted into fixpoint equations with $minimum$ and $maximum$ expressions. -These fixpoint equations can then be solved in order to give information about -the program's abstract semantics. These equations are solved using a form of -policy iteration (referred to as strategy iteration) over the $maximum$ -expressions present in each term. A second step consisting of ordinary -fixpoint-iteration over the resulting $minimum$-expressions is then performed. - -When fixpoint-iteration is known to terminate then it is possible to get -significant speed gains by processing each expression according to their -dependencies. The approach presented in \cite{DBLP:tr/trier/MI96-11}, in -particular, improves the real-world performance of their fixpoint-equation -solver. It should be noted, however, that such evaluation strategies can do -nothing to affect the worst-case complexity as it is always possible to -construct a system in which evaluation of each variable is necessary at every -stage until the system stabilises. - -\section{Conclusion} - -Static analysis is of great practical importance to real world developers. It -gives us the ability to create tools which automatically identify common errors -in programs due to developer errors and so allows us to provide greater -guarantees about the behaviour of our software. - -Unfortunately complete static analysis of a program is an undecidable problem, -and so we must perform abstract interpretation of a program's semantics in order -to provide useful results of analysis. Many different abstract domains exist in -which to consider programs, with varying precision and computational complexity. -In particular the abstract domain of polyhedra (and the related domains of -zones and ranges) are of great use when performing analysis of a program's -integral semantics. - -\bibliographystyle{abbrv} -\bibliography{references.bib} -\end{document} diff --git a/tex/lit-review/references.bib b/tex/lit-review/references.bib deleted file mode 100644 index d746821..0000000 --- a/tex/lit-review/references.bib +++ /dev/null @@ -1,117 +0,0 @@ -@article{DBLP:tr/trier/MI96-11, - author = {Christian Fecht and - Helmut Seidl}, - title = {An Even Faster Solver for General Systems of Equations}, - journal = {Universit{\"a}t Trier, Mathematik/Informatik, Forschungsbericht}, - volume = {96-11}, - year = {1996}, - bibsource = {DBLP, http://dblp.uni-trier.de} -} -@inproceedings{CousotCousot77-1, - author = {Cousot, P{.} and Cousot, R{.}}, - title = {Abstract interpretation: a unified lattice model for static - analysis of programs by construction or approximation of - fixpoints}, - pages = {238--252}, - booktitle = {Conference Record of the Fourth Annual ACM SIGPLAN-SIGACT - Symposium on Principles of Programming Languages}, - address = {Los Angeles, California}, - publisher = {ACM Press, New York, NY}, - year = 1977, -} -@article{mine:hosc06, - author = {Min\'e, A{.}}, - title = {The Octagon Abstract Domain}, - journal = {Higher-Order and Symbolic Computation}, - editor = {O. Danvy}, - publisher = {Springer}, - year = {2006}, - volume = {19}, - number = {1}, - pages = {31--100}, - note = {\url{http://www.di.ens.fr/~mine/publi/article-mine-HOSC06.pdf}} -} -@inproceedings{mine:padoII, - author = {Min\'e, A{.}}, - title = {A New Numerical Abstract Domain Based on Difference-Bound Matrices}, - booktitle = {Proc. of the 2d Symp. on Programs as Data Objects (PADO II)}, - series = {Lecture Notes in Computer Science}, - year = {2001}, - month = {May}, - pages = {155--172}, - volume = {2053}, - publisher = {Springer}, - location = {Aarhus, Danemark}, - note = {\url{http://www.di.ens.fr/~mine/publi/article-mine-padoII.pdf}} -} -@inproceedings{DBLP:conf/vmcai/SankaranarayananSM05, - author = {Sriram Sankaranarayanan and - Henny B. Sipma and - Zohar Manna}, - title = {Scalable Analysis of Linear Systems Using Mathematical Programming}, - booktitle = {VMCAI}, - year = {2005}, - pages = {25-41}, - ee = {http://dx.doi.org/10.1007/978-3-540-30579-8_2}, - bibsource = {DBLP, http://dblp.uni-trier.de} -} -@inproceedings{Dor:2001:CCS:647170.718291, - author = {Dor, Nurit and Rodeh, Michael and Sagiv, Shmuel}, - title = {Cleanness Checking of String Manipulations in C Programs via - Integer Analysis}, - booktitle = {Proceedings of the 8th International Symposium on Static - Analysis}, - series = {SAS '01}, - year = {2001}, - isbn = {3-540-42314-1}, - pages = {194--212}, - numpages = {19}, - url = {http://dl.acm.org/citation.cfm?id=647170.718291}, - acmid = {718291}, - publisher = {Springer-Verlag}, - address = {London, UK, UK}, -} -@INPROCEEDINGS{Yovine98modelchecking, - author = {Sergio Yovine}, - title = {Model checking timed automata}, - booktitle = {In European Educational Forum: School on Embedded - Systems}, - year = {1998}, - pages = {114--152}, - publisher = {Springer-Verlag} -} -@inproceedings{Gawlitza:2007:PFC:1762174.1762203, - author = {Gawlitza, Thomas and Seidl, Helmut}, - title = {Precise fixpoint computation through strategy iteration}, - booktitle = {Proceedings of the 16th European conference on Programming}, - series = {ESOP'07}, - year = {2007}, - isbn = {978-3-540-71314-2}, - location = {Braga, Portugal}, - pages = {300--315}, - numpages = {16}, - url = {http://dl.acm.org/citation.cfm?id=1762174.1762203}, - acmid = {1762203}, - publisher = {Springer-Verlag}, - address = {Berlin, Heidelberg}, -} -@article{tarski, - author = {Tarski, Alfred}, - title = {A lattice-theoretical fixpoint theorem and its - applications.}, - year = {1955} -} -@incollection {springerlink, - author = {Christensen, Aske and Mø, Anders and Schwartzbach, Michael}, - affiliation = {BRICS Denmark}, - title = {Precise Analysis of String Expressions}, - booktitle = {Static Analysis}, - series = {Lecture Notes in Computer Science}, - editor = {Cousot, Radhia}, - publisher = {Springer Berlin / Heidelberg}, - isbn = {978-3-540-40325-8}, - keyword = {Computer Science}, - pages = {1076-1076}, - volume = {2694}, - year = {2003} -} |