Compiler Verification
Context
Formal verification methods such as program proof, model checking and static analyses increase tremendously the confidence we can have in critical software. However, the code that is formally verified is rarely, if ever, identical to the code that is eventually executed. The latter is generally machine code for a hardware processor. Formal verification methods are almost never applied directly to machine code, with the notable exceptions of Foundational Proof-Carrying Code and Typed Assembly Language. Instead, the form of the code that is formally verified -- the model of the code -- is generally one of:
- Source code written either in a general-purpose programming language (C, C++, Java, etc) or a domain-specific language (Scade, Esteral, Simulink diagrams, etc)
- An intermediate representation such as Java or .Net virtual machine code.
- An abstract presentation of the program as algorithmic pseudo-code, labeled transition systems, functional or relational specifications in constructive logics, TLA+ specifications, B abstract machines, etc.
In the first two cases, the executable code is obtained from the model of the code by compilation, that is, automatic translation of a high-level or intermediate language to machine code. While globally well understood, compilation -- and especially optimizing compilation, where the generated code must meet demanding speed and size constraints -- is a complex process involving delicate code transformations backed up by static analyses. Bugs in compilers can and do happen, causing the generated machine code to crash or behave differently from the source code. An incorrect compiler therefore has the potential to completely invalidate the guarantees obtained by formal verification of the source code.
In the third case above, the model of the code used for verification is further removed from the actual executable code. Sometimes, source code is generated automatically from the model of the code (e.g. code generation in the Atelier B; program extraction in the Coq theorem prover). This can be viewed as an additional compilation step, with the same risks described above for traditional compilation. In other cases, the model of the code is extracted mechanically from the actual source code, using abstract interpretation or static analyses. The soundness of this extraction is again critical for the formal guarantees to carry over to executable code. Often, the model of the code and the source code are both written by hand, leaving ample room for human error when one is transcribed from the other.
In the remainder of this text, we will assume that no such manual transcription takes place and that formal verification takes place over source code of one form or another, from which executable code is automatically generated by a compiler. The question, therefore, is: how to establish trust in this compilation process? how can we ensure that compilation preserves the guarantees obtained by formal verification of the source code?
Approaches to compiler verification
Two approaches have been consider to gain confidence in compilation processes:
- Compiler verification. Formal methods are applied to the compiler itself to prove that it meets its specifications. These specifications can be full correctness properties of the form "the generated code is observationally equivalent to the source code", or weaker safety properties such as "the generated code is type- and memory-safe".
- A posteriori code verification. Here, the desired correctness property
is established by a separate tool, the verifier, which takes
as inputs the source and compiled code, plus annotations (also
called certificates) also produced by the compiler. The verifier
replies "yes" or "no" depending on whether it was able to establish
the desired property for this particular run of the compilation
process with the help of the compiler-generated annotations. This
approach encompasses:
- Translation validation. The annotation is empty or minimal (debugging annotation) and the verifier proceeds by static analysis over the source code and the generated code, e.g. to construct and check valid a simulation relation between the two codes.
- Proof-carrying code (PCC). Here, the compiler-generated certificate is more substantial: in the original presentation of PCC, it is a proof term in a constructive logic. The verifier is correspondingly simpler: it checks this proof against a verification condition generated from the code.
- Type-preserving compilation and typed intermediate or assembly languages (TIL, TAL). In this approach, popularized by Java bytecode verification, the desired property is type- and memory-safety of the generated (virtual) machine code; it is established by type-checking of the machine code using type annotations left by the compiler.
The two approaches --a priori verification of the compiler or a posteriori verification of the results of one run of the compiler-- are not exclusive. In particular, a verifier that was itself formally verified (we have a proof that whenever it says "yes", the desired property of the compiled code indeed holds) gives formal guarantees as strong as those of a verified compiler. For each pass of a compiler, we have a choice between verifying the compiler code for this pass, or performing it by non-verified code and checking its results a posteriori using a verified verifier.
Challenges
- Formal verification of realistic compilers usable for critical
embedded software.
Many proofs of compiler correctness have been carried out already, both on paper and on machine, but for non-optimizing compilers operating on simplified source and target languages. The first challenge is to scale these efforts all the way to the verification of a realistic compiler that can be used in the context of critical embedded software. The source and target languages must be among those used in this context, e.g. C as source language and machine code for popular embedded microprocessors as targets. The compiler must generate code that is efficient enough and compact enough to meet application constraints, meaning that a number of optimizations must be deployed and proved correct.
- Formal verification of static analyses.
Static analysis plays a crucial role both in optimizing compilers and in code verification tools such as static bug finders. Formal verification of static analyzers is needed both to prove the correctness of optimizing compilers and to establish trust in these code verification tools. An early success in this area was the verification of type-checkers for machine-level languages, such as Java/.Net bytecode verification and Typed Assembly Language. It remains to extend this success to finer, more diverse static analyses.
- Adequate semantics.
Verification of compilers and static analyzers is carried out relative to the semantics for the source and target languages. It is therefore crucial -- and definitely not obvious -- that these semantics match reality.
For the target languages, the formal semantics of machine-level languages are well understood and relatively easy to validate against real hardware by testing. However, this is true only for sequential code: the semantics of parallel machine code is often unclear, especially for multiprocessors with shared memory and relaxed consistency between each processor's view of the memory.
For source languages, it is essential that the semantics used to prove compilers and static analyzers match, together and with the semantics used to verify source code by program proof or model-checking; otherwise, there is no hope that guarantees established by source code verification carry through to the executable code. In particular, the operational semantics typically used for compiler proofs must be validated against the axiomatic semantics typically used for program proofs. More generally, a systematic effort should be made to share and reuse semantics across compilers and verification tools, instead of the current practice of developing one's own semantics for one's own purposes.
Another dimension is modularity. Good axiomatic semantics supports modular reasoning in accord with program and specification structure. Operational semantics needs to model things like separate compilation and multi-stage computation.
- Extension to finer properties of the generated code.
Past and current work on compiler verification concentrates on two classes of guarantees over the generated code: type- and memory-safety in the context of security of mobile code, and observational equivalence or refinement with the source code in the context of high-assurance software. Some applications demand stronger guarantees. For instance, consider the compilation of high-level security constructs (such as secure channels, authentication logic, etc) down to low-level cryptographic operations. To ensure that no attacks are possible on the generated code that are not also possible on the source code, semantic preservation is insufficient: a full abstraction result between the source and compiled code is needed. Another example is the compilation of code fragments written in different languages and eventually combined by linking the generated machine code. Here, semantic preservation results that apply only to whole programs are insufficient and must be refined into properties that take the context into account.
- Extension to code generators from higher-level source languages and
from specification languages.
As mentioned previously, what we consider as "the source code" (written in a conventional programming languages) is, in many cases, automatically generated from higher-level, possibly domain-specific programming or specification languages. Examples include the generation of C code from reactive languages or Simulink diagrams, or the code extraction facilities of proof assistants such as Atelier B and Coq. This approach is to be encouraged as it automates error-prone encodings of high-level specifications into low-level code. The formal verification of the correctness of such code generators is important for the same reasons that compiler verification is important. More generally, all tools that participate in code generation for a critical application should ideally be formally verified at appropriate levels of confidence. Taking a broad view on tool verification in general, verification of code generation tools is more important than verification of code verification tools, as a bug in the latter can only mask the existence of a bug in the software, while a bug in the former can introduce additional bugs.
TODO integrate the following items into preceding text.
- Building proven to be correct compilers for real-life languages and target machines (using prover-based theoretical methods and development tools).
- Proving existing compilers to be correct. As part of this challenge one may consider the further development of proof carrying code methods to make them fit real-life languages. (An example is the certifying compilation using type carrying code developed for the completeness proof of Java compilers with respect to the JVM bytecode verifier, see Ch.16.5 of the Jbook and also the JBook Milestone
- Develop a theory of compiler verification that is generic as far as possible for different target processors and (less important as changes occur less often) source languages. Such a theory should provide general theorems that are applicable in many contexts. Current state-of-the-art are rather specific considerations for specific scenarios.
- Develop the theory in such a way that compiler verification
interfaces directly to
- the specification from which the target processor is designed/generated
- the programming language model that is used in verifying source programs relative to program specifications.
There is clearly a relationship to "unifying theories" here. (Some older work in this direction was done by the ProCoS project in the early 90ties by the groups of Hoare/Langmaack/Olderog/Ravn/Bjoerner, see e.g. LNCS 1283, and by a national German project, Verifix).
If the compilers translate to an intermediate format like to the JVM the target processor becomes less critical for the compiler proof. However, we then have to study how to come to provably correct execution mechanisms/interpreters for the JVM.