Working outline
Introduction
This report outlines a road map for research in the semantics and axiomatics of notations for programs, designs, specifications, and requirements models. This report has been written by Theory Panel with the following membership: Egon Boerger, Xavier Leroy, Markus Mueller-Olm, David Naumann, Peter O'Hearn, John Reynolds, Peter Sestoft and Steve Zdancewic. Additional contributors include: Mark-Oliver Stehr.
Goals of the Panel
- To create a road map that identifies research challenges in the theory of program construction and analysis that are critical for the overall goals of the verification grand challenge.
- To interact with the other Verified Software panels in order to refine these challenges as increasingly specific goals and activities emerge.
- Coordinating the activities of researchers who seek to meet those challenges.
- Generating publicity to broaden the participation in the effort.
Many challenges and activities in the Verified Software roadmap involve theory topics. In many cases it is neither necessary nor feasible to include a separate discussion in the Theory section of the roadmap. Therefore, our aim is for the Theory roadmap to focus on areas that are of significant independent interest and/or which can be pursued by theorists not necessarily working closely with developers of a particular tool or case study. Moreover, the theory roadmap is organized so as to facilitate inclusion of the theory challenges and milestones under the main research headings in the overall roadmap.
Focus area of this panel:
- Semantics and logics of "realistic" programming and modeling languages used in experiments
- Correctness of compilers, compilation schemes, and program analysis tools
- Correctness of refinement and analysis tools for models
- Semantics for multi-stage notations (meta-programming, program generation, scripting for tool interoperation,...)
- Linking theories, as unifying foundation for integration of disparate tools
Background on theory for verified software
Theory is needed to support the following.
- Construction and analysis of ``ground models'', i.e. abstract rigorous system blueprints, to correctly and completely capture and document system requirements as starting point (contract) for further design leading to code.
- Derivation of one system description from another, e.g. definition of a refinement of an abstract system model, and proving the desired correctness properties (e.g. equivalence or preservation of behavioral properties), either in the meta-theory ("correctness by construction") or as post hoc mathematical verification or experimental validation of a particular refinement step.
- Composition and modularization techniques, together with corresponding proof principles, needed to support the analysis of run-time features.
These items encompass design validation, correctness of ordinary programs, and what might be called correctness at the meta level, e.g., soundness of a proof rule or correctness of a compiler with respect to the language and hardware semantics. Theory provides the semantics that underlies notations that express system requirements, design specifications, and implementations. It provides definitions for notions such as satisfaction and refinement. It provides general laws and theorems that justify methods for reasoning about particular specifications, designs, and implementations. The qualifier "meta" is sometimes used for theories about other theories. Such theories bridge gaps such as between a semantics embodied in a verification condition generator (VCG) and a more operational semantics used to reason about compilation, or between sub-logics of two theorem provers.
Theory has to provide methods for linking in a verifiable way the application-centric meaning of a system, which is expressed by a ground model prior to coding, to the meaning of the implementing code that is verified by the compiler. In particular, theory has to define provably correct refinement schemes that support further detailing of abstract specifications to their implementation by compilable code (read: the design process leading from system requirements to executable code). Theory also has to justify the integration of heterogeneous tools necessary to construct and validate links in the refinement chain.
Current state of theory
CS Theory started in the first third of the 20-th century with computability theory, whose main discovery was Turing's notion of a programmable universal computing machine. With the advent of computers CS Theory quickly focussed
- on the hardware side on the theory of circuits and computer architectures,
- on the software side on the theory of algorithms, data structures and programming languages.
Numerous theoretical achievements in these fields are nowadays incorporated into advanced tools for the design and the analysis of modern programming languages and compilable programs. This includes powerful static and dynamic analysis techniques, e.g. type inference engines, mechanical verification of type correctness and similar program properties, etc.
With the growing complexity and diversity of computer-based systems one more abstraction level came into view, where prior to coding a system has to be specified by an abstract model to capture the system requirements in a faithful and complete manner, which can be used as starting point for its implementation by executable code. CS Theory has provided a large variety of such specification languages. Some of them come with tools supporting the development of abstract system models and their refinements to code.
Various tools are based on coherent individual theories (e.g., higher order logic theorem provers; first order SMT tools; analyzers based on abstract interpretation) but large experiments in verified software will require integration of disparate tools. The software to be verified is itself implemented in a variety of languages. A central challenge is for unification of theories; this can take the form of broad general theories but also linking theories that bridge particular gaps. This depends on linking theorists, one purpose of the roadmap. Tony Hoare has drafted Goals of Unifying Theories.
(Contributors are encouraged to insert mention of additional lines of research here, each linked to a page with highlights including who are the active researchers.)
Theory challenges
Open problems, research areas, and challenge problems are enumerated in a separate document, the Outline of Theory Challenges. It is organized according to the research categories in the overall Verified Software roadmap, to facilitate inclusion of selected theory challenges and milestones therein.
Planning and roadmap
The following questions need to be answered. Which challenge problems should be addressed first? In which time frames (say, 1-5, 5-10, 10-15 years)? Who are the researchers/groups interested? What concrete activities are to be taken to address the challenges and how will these activities be integrated with the other verified software activities?
The list below was presented by at the SRI Workshop, April 2006, but is expected to be folded into the overall roadmap. This list includes challenges that belong elsewhere in the roadmap, as they go beyond theory.
Five years goals
- Type theories to support tool checking of multistage notations
- Specifications for a range of interesting security properties.
- Linking theory for sound embedding ACL2 and HOL
- "POPLmark challenge": http://fling-l.seas.upenn.edu/~plclub/cgi-bin/poplmark/index.php?title=Main_Page (TODO Appel and Leroy's list-machine benchmark)
- Runtime verifier challenge, consisting in instrumenting current high-level model execution tools (e.g. interpreters for ASMs or event-B models or model checkers for TLA+ models) to monitor the truth of selected properties at runtime, enabling in particular the exploration of ground models to detect undesired or hidden effects or missing behavior. (Link to runtime verification community.)
- Modular reasoning about frame conditions in OO programs (separation logic meets JML)
- Certifying compiler for information flow.
- Develop today's experimental automatic verification technology to maturity and integrate it into standard program development environments such that it can be routinely used by average programmers.
Ten years goals
- Verification of an individual analysis tool (e.g., SAT solver, or Java type checker), from high level requirements in terms of program logic down to the code of a production implementation. This involves specification of the analyzed language and its analyis and also the semantics of the implementation language.
- The refinement tool support. This splits into three subgoals:
refinement generator goal: define practical model refinement schemes, which capture established system development and programming knowledge, together with justifications of their correctness. Where the refinement process is highly creative, interactive schemes can still be helpful to guide the refinement process by a combination of user-insight and mechanized tactics. In particular, these refinement schemes should allow the verifier to turn model properties into software interface assertions comprising behavioral component aspects, to be used where state-based run-time features are crucial for a satisfactory semantically founded correctness notion for code.
refinement verifier challenge: enhance current computer-based verification systems by means to prove the correctness of practical model refinement schemes.
refinement validator challenge: link the refinement of ground models to model simulators to make the generation and systematic comparison of corresponding test runs of abstract and refined machines possible. In particular relating system and unit level test results should be supported by this enhancement of model simulation tools.
- Verification of a compiler for a widely used language.
- Specifying and verifying intensional properties. E.g., time and space in lazy functional programs; generation of executables based on resource parameters for embedded components.
Fifteen years goals
- A well developed theory of design and reasoning principles to design and verify system components and their compositions (say, a web service model, design, or implementation) using only the interface specifications. By system component we emphasize components in the large, with complex interfaces and behaviors. A sufficiently general notion of composition could subsume the next item.)
- A unifying theory of theorem proving serving as a framework for linking theories connecting different logics and subsets. This is the basis to justify particular integrations of tools that are sound in virtue of using only subsets of the logics as needed for a particular verification. TODO link to tool interoperability roadmap items: bridging semantic gaps. (E.g., verify a compiler in HOL and a processor in PVS; justify the combination, which may be sound only owing to the subsets of PVS and HOL being used.) Logosphere.org is relevant.
- Theories and tools that permit complete verification of a realistic runtime system, for instance a real-time runtime system for Java, including its intensional properties such as space usage (code will not overflow the stack) and time usage (the code is fast enough, and will meet its deadlines).