Personal tools
You are here: Home VSR Private Verified Software Roadmap 2006 Technical Background
[Overview] >>

Technical Background

Document Actions
last edited 3 years ago by shankar

Verification has been a fertile and active area of research over the last forty five years. There are several well-developed strands of research with deep inter-relationships. This section provides a brief background to the technical ideas that are relevant to the road map. The United States Food and Drug Administration publishes a glossary of technical terms related to software development. We focus on the background that is relevant to the formal development and verification of software. Rushby's FAA Handbook chapter on Formal Methods and the Certification of Critical Systems also provides extensive background.

The key technical areas of formal software verification can be classified as

  1. Problem and domain modeling
  2. Semantics of computation
  3. Programming methodology
  4. Verification tools

Problem and Domain Modeling

A typical software development project is based on a V-diagram where the left half of the V is the development phase and the right half consists of validation. The development phase starts with requirements gathering. These requirements are turned into specifications which are used to develop the high-level design which is finally implemented in software. The validation phase consists of unit tests on the individual software components, integration tests on the combined software, and system tests on the software together with the other system components. Formal methods have a role to play in all phases of the design lifecycle.

Studies reveal that many costly design errors are introduced during the early part of the software lifecycle where requirements are gathered and made precise. The requirements here can include domain models that capture aspects of the larger system or context within which the software operates. These formal models can be analyzed by means of symbolic calculation for

  1. Inconsistencies: The requirements are contradictory and cannot be satisfied by any software.
  2. Incompleteness: There are gaps in the requirement where the desired behavior is left unspecified.
  3. Ambiguities: The informal requirement is not precise enough so that the formal model may not correspond to the intended requirement.
  4. Anomalies: The requirements are complete, consistent, and unambiguous, but the corresponding formal model reveals undesirable behavior suggesting further requirements.

The formal models also serve as the basis for

  1. Test case generation: The formal models are used to generate test cases for checking the software for conformance.
  2. Analysis: Expressing and deriving properties of the software such as the ranges and types of values assigned to variables, relationships between variables, datatype invariants, execution time, and space usage.
  3. Refinement: Systematic construction of the software from the formal specifications through the construction of a series of formal models of greater specificity.
  4. Verification: The demonstration that the software meets its specification through the construction of a formal argument.
  5. Validation: The process of gaining assurance that the software delivers the intended functionality through testing, verification, and analysis of failure scenarios (fault-tree analysis, failure modes and effects analysis).

Formal models can be operational so that they specify a reference implementation for the intended behavior of the software, or declarative so that they circumscribe the accepted range of behaviors that the software can exhibit. Model-based design emphasizes operational models of software using state machines and flow diagrams. Declarative models are expressed in specification languages based on logic. A variety of logics are used in specification. These include Boolean logic, equational logic, rewriting logic, relational algebras, Hoare logic, modal logic, dynamic logic, linear-time and branching-time temporal logics, first-order logic, and higher-order logic. General-purpose logics like first-order and higher-order logic can be augmented with theories given by specific sets of axioms, including those for sets, numbers, finite and infinite sequences, arrays, bit-vectors, recursive datatypes like lists and trees, among others. In addition to these, there are a host of specialized theories for dealing with concurrent interaction (process algebras), security (including access control and information flow), and timed and hybrid systems.

Semantics of Computation

The challenge of describing the meaning of a program written in a high-level programming language in a machine-independent manner has occupied computer scientists from the beginning. Such a semantics can serve as the basis for an implementation as well as a foundation for reasoning about the correctness of programs. The main approaches to semantics can be expressed as

  1. Denotational: The meaning of a program is composed from the meaning. For example, the meaning of a function application is given in terms of the meaning of the function and those of its argument, and a meaning of an imperative program is built from those of its statements.
  2. Operational: The rules for executing the program is given by its operational semantics. These rules provide a reference semantics for evaluating an actual implementation. It is customary to present this semantics by means of inference rules given using Plotkin's Structured Operational Semantics. A big-step semantics presents rules in terms of the final value of a given program, whereas a small-step semantics is presented in terms of a transition from one program state to the next. Other frameworks for presenting small-step semantics include Abstract State Machines and fair transition systems.
  3. Axiomatic: The axiomatic semantics of a program captures the meaning of a program statement in terms of the valid post-condition assertions corresponding to a pre-condition assertion. Such a Hoare triple of the form {p} S {q} expresses the judgment that program S when executed in a state satisfying assertion p terminates in a state satisfying post-condition q. Dijkstra's weakest pre-condition calculus presents the semantics of a statement by means of a weakest pre-condition operator wp so that wp(S)(q) is the set of those state from which any execution of S terminates in a state satisfying q.
  4. Algebraic: The algebraic semantics of a program is given by operators for composing programs and rules for proving equalities between program fragments.

The above semantic approaches are used heavily in designing verification and analysis tools and in framing verification problems.

Programming Methodology

The core issue in programming methodology is that of constructing programs that can be proved to meet their specifications. The key techniques for constructing correct programs are abstraction, composition, and refinement. Abstraction masks the details of any particular instance to achieve a general result that is applicable to many other instances as well. Abstraction mechanisms in programming languages include functions, procedures, and modules. Such abstractions can be verified once for all possible instances. Abstraction is also a technique for obtaining a reduced model that retains all the essential features of a particular model for the purpose of verifying a specific property. Composition is another key technique in constructing correct programs using modules with functional, relational, sequential, and parallel composition. The structure of the proof must follow the composition structure of the program. Compositional verification methods allow properties of complex programs to be derived from those of the components. Assume-guarantee proof techniques prove component properties by making assumptions about the environment in which they are placed. Component properties are combined to obtain proofs for the properties of the composition of these components. Conversely, decompositional proof methods are used to break the verification of a large program into those of its sub-components.

Verification Tools

Formal verification is an activity that requires tremendous attention to detail. There would be no hope of scaling this activity to large software examples without automated support. Automated verification runs the gamut from static analysis and dynamic analysis which look for specific classes of bugs, to model checking as a way of exploring the state space of a model and deductive verification as a technique for constructing a proof of correctness for a program. These techniques can be applied to software as well as models of the software.

Static analysis examines a program for type correctness, type inference, aliasing, null-pointer dereferences, uncaught exceptions, overflows/underflows, out-of-bounds array references, and memory leaks. The advantage of static analysis is that the specification is generic and there is no need to annotate the program with assertions. Abstract interpretation is a typically technique for static analysis where an abstract lattice, such as the signs of integers, replaces the data values, such as integer values, and the program is evaluated to a fixpoint through the aggressive use of approximation (e.g., widening) to achieve finite termination. Static analysis yields information about the set of values that a variable can assume at each program point.

Dynamic analysis (Klaus will provide material).

Model checkers explore the state space of a system to either detect violations of a putative property or to establish the absence of such violations. One of the strengths of model checking is that when a verification fails, a counterexample is generated in the form of an execution trace that highlights the error. Even for small systems with concurrent interaction, model checking can find bugs that have eluded careful human examination. Explicit state model checking explores the reachable state space of a system by enumerating the successors of each examined state in a depth-first or a breadth-first manner. Many temporal properties including refinement can be verified in this manner as long as the reachable state space is finite. Symbolic model checkers represent subsets of the state space in a symbolic form such as a Boolean function represented as a binary decision diagram or a polyhedron formed from a conjunction of linear inequalities. For example, the symbolic representation of clock differences using difference bounded matrices (DBMs) is employed in model checking real-time temporal properties of timed automata. Real-time model checkers can be used to verify timing behavior, untimed properties such as mutual exclusion which follow from timing considerations, and also to synthesize schedulers. Model checking can also be used to generate test cases, synthesize plans, check for state machine equivalence, language containment, and to generate controllers or environments that ensure that a system can operate safely. The latter capability can be used for compositional verification. The biggest challenge for model checking approaches to verification is state explosion. Large models cannot be explored either because the transition relation or the set of reachable states cannot be represented compactly within the available memory. The state explosion problem is a stubborn one, but there are many ways to ameliorate it using decomposition and abstraction. The counterexample-guided abstraction refinement (CEGAR) approach works by generating an abstract model to which model checking is applied. If a counterexample is found that has a concrete counterpart, then this yields a genuine error. Otherwise, the failure to find a concrete counterpart is used to iteratively refine the abstraction.

SAT solvers are a class of tools that solve Boolean satisfiability problems. These tools take their input either as arbitrary Boolean formulas or in a specific normal form like the conjunctive normal form (CNF) as a conjunction of clauses. Each clause is a disjunction of literals, and each literal is either a Boolean variable or its negation. These solvers either return a Boolean assignment satisfying the given formula or the signal the unsatisfiability of the formula. Modern SAT solvers have achieved dramatic speedups due to careful attention to efficient backtracking and cache-aware data representation. SAT solvers can be applied to a range of verification problems involving finite-state systems. For example, a bounded model checker uses a SAT solver to explore the entire computation tree of a finite-state model or program up to a bounded depth for violations of temporal properties. Bounded model checking can also be used to verify system invariants by means of k-induction by showing that if a property holds for k steps in a computation, then it holds for k+1 steps. Bounded model checking is also useful for generating test cases. SAT solvers also suffer from state explosion but they are less sensitive to irrelevant information. SAT solvers can be instrumented to generate proofs corresponding to unsatisfiability as well as interpolants which can decompose a proof of unsatisfiability for two sets of clauses such that the interpolant is implied by one set of clauses and contradicts the other set of clauses. Interpolants have many applications such as in the refinement phase of the CEGAR approach.

While satisfiability problems involving scalar types and arrays over finite types can be directly reduced to Boolean form, many software problems involve datatypes such as integers, rationals, lists, and trees, involve terms built from uninterpreted functions, and predicates. such as equality and inequality. Boolean satisfiability can be extended to satisfiability modulo theories (SMT) by combining decision procedures for these datatype theories together with Boolean SAT solving. SMT solvers leverage the speed of SAT solvers and can be similarly used for bounded model checking in richer software models, including those with real-time behavior. These procedures can also generate proofs and counterexamples, and also explanations in terms of the set of core formulas needed to achieve unsatisfiability. SMT solvers can also be used for test case generation, extended static checking, translation validation, predicate abstraction, and plan synthesis.

Deductive approaches to verification are implemented in various automated theorem proving frameworks that employ induction, case analysis, simplification, rewriting, and quantifier instantiation to establish deeper properties of software that may not be amenable to automated approaches such as model checking and static analysis. Deductive approaches have been applied to such critical system properties such as cache coherence, fault tolerance, and distributed algorithms.

It is important to note that all of the approaches to automated verification can be applied at all levels of abstraction from high-level specification to hardware designs. They can also be used for verifying properties of the software as well as for relating different levels of abstraction through refinement.

« November 2009 »
Su Mo Tu We Th Fr Sa
1 2 3 4 5 6 7
8 9 10 11 12 13 14
15 16 17 18 19 20 21
22 23 24 25 26 27 28
29 30
What's up ?
Be notified when a document is published in this folder or below.
 

Powered by Plone, the Open Source Content Management System

This site conforms to the following standards: