Personal tools
You are here: Home The QPQ Manifesto
Document Actions

The QPQ Manifesto

by shankar last modified 2006-12-06 14:35

This document outlines the underlying purpose of the QPQ deductive software repository.

The QPQ Manifesto
An Open Source Repository for Deductive Software Components

Summary

Science, technology, and commerce are increasingly dependent on software. The production of high-quality scientific software is a legitimate and valuable contribution to the scientific infrastructure, to science itself, and eventually to society. Scientific software can involve both numeric and symbolic calculations. Deductive software consists of symbolic routines for logical manipulations. This includes decision procedures, solvers, rewriters, model checkers, and theorem provers. Such software is used in a number of scientific and engineering applications in areas such as hardware and software verification, program synthesis and analysis, data and knowledge based systems, artificial intelligence, and linguistics.

We outline an initiative for an open source repository for deductive software components called QPQ (QED Pro Quo, pronounced ``quid pro quo'') as a marketplace for publishing, exchanging, and refining deductive software components. The refereed publication and distribution of such scientific software components in open-source form will yield higher quality, greater visibility, and accelerated productivity.

What is Deductive Software?

Deductive software is one of the oldest endeavors of computing. The 17th century philosopher and polymath Gottfried Leibniz had expressed the hope that with the development of symbolic logic, mechanized formal calculation could be used to settle scientific and legal arguments. He predicted that this would give humanity an instrument more powerful than the telescope. Computing pioneers such as Alan Turing and John von Neumann already understood that computing in the symbolic realm was at least as important as its use in numeric calculation. Symbolic computing includes artificial intelligence systems, compilers, interpreters, and computer algebra packages. Deductive software is a critical part of the software infrastructure and includes symbolic algorithms based on logic such as those used in logic programming, constraint satisfaction, term rewriting, theorem proving, model checking, and program analysis and verification.

The efficient implementation of deductive algorithms is an extremely complex and delicate task. It usually requires a serious investment in basic infrastructure such as concrete and abstract syntax, syntactic tools (parsers, typecheckers, pretty-printers), primitive operations (substitution, matching, unification, normalization), and advanced operations (constraint satisfaction, forward chaining, backward chaining, rewriting, simplification, decision procedures, induction methods). Since many of the algorithms are of exponential or super-exponential complexity, implementations must exploit complicated heuristics and nontrivial optimizations. Correctness in the form of soundness and completeness is a paramount concern.

Formalizations are also an important component of a deductive software. Many deductive software systems are built around specific formalizations of the relevant domain knowledge. Formalizations play an important role as challenge problems and benchmarks for deductive software systems, often driving the technology.

Who needs deductive software?

Deductive software is a critical and increasingly valuable part of the software infrastructure. The safety and functionality of hardware and software can be enhanced through the greater use of deduction. High-level programming languages such as logic and constraint languages are based on deductive operations. Advances in deductive software have led to dramatic increases in the execution speeds of such high-level languages. We are continuing to see dramatic improvements in deduction speeds to a point where deduction of high-level descriptions can become a viable execution framework for many applications. The accuracy of program tools can be improved through the use of deduction. For example, typecheckers and compilers can use theorem provers to accurately detect array bound violations, aliasing, and common subexpressions, and to implement numerous optimizations. Model checking is routinely used to find subtle bugs in hardware and concurrent software. Model checkers for timed and hybrid systems are crucial tools for the analysis of real-time and control software. Theorem proving is an important tool in the specification and verification of safety-critical systems. Large software and hardware design houses are increasingly reliant on deductive tools for finding design errors. The semantic web requires capabilities for producing and consuming logical descriptions and proofs.

Educational software aids for scientific subjects like physics, mathematics, and computing, use deduction (solvers, simplifiers, proof checkers) as a teaching tool. Several teaching tools for logic and mathematics already use deductive software.

The dissemination of high-quality deductive software can greatly improve the safety, reliability, and efficacy of the software infrastructure. It can also lead to novel uses of deduction in education and industry.

The availability of careful formalizations of significant bodies of mathematical, scientific, and engineering knowledge can be a valuable shared resource for research and education.

Why build a deductive software repository?

We have argued that deductive software is a critical component of the software infrastructure. Deductive software has made rapid strides in the last decade. Algorithmic and software solutions for many difficult problems in theorem proving and model checking have been developed. Powerful and efficient technologies are now available in the form of propositional satisfiability (SAT) solvers, model checkers, rewrite engines, decision procedures for arithmetic and geometry, and high-level theorem provers. However, there are some serious impediments to the further development of deductive software. One major impediment is that the capabilities of this technology are not that widely known. Even when the need for deductive software is apparent, such software is not easily accessible. Much of the technology is tucked away inside large, monolithic, closed systems. Another impediment is that the effort involved in building and maintaining such systems is monumental. Students are discouraged from working in this area because of the high start-up cost. Hard-won code is often rendered unusable through lack of maintenance.

Currently, most deductive software is available only in the form of closed systems. These systems are closed in two ways: their source code is not openly available, and the interfaces to the components of these systems are not open to external access. The developers of new deductive software systems must reinvent not merely the wheel, but the road, traffic lights, and much else, because there is no shared infrastructure base. Student projects, even at the Ph.D. level, are immensely daunting because of the enormous preparatory effort needed, and most student-generated code is seldom maintained over a long period. Industrial applications of deductive software are also thwarted by the lack of openness. The large, closed systems have only a limited industrial appeal. Most users are solely interested in augmenting their existing software with deductive capabilities. This requires availability of the source code so that the software can be ported or adapted as needed for industrial use. The interoperability between different deductive software systems is also hindered by the lack of openness in the interfaces and the code.

The deductive software discipline has not achieved the level of organization and reuse of fields like computer algebra, graphics, and numerical analysis. Deductive software chose to organize itself along the lines of disparate closed systems, whereas these other disciplines have pursued the development of libraries of utilities with simple application programmer interfaces (API). Since deductive components can interact with each other in quite rich ways, their APIs must be designed with great care. Some subfields within deductive software have been more active about presenting their software as library utilities. The popular BDD (binary decision diagram) packages share a uniform API so that they can be used interchangeably and the BDD community has been active in developing benchmarks that reflect their intended use.

How will a repository help?

Some of the potential contributions of the repository to the development and distribution of deductive software components are listed below.

Availability.

Software components that are currently locked up in proprietary systems will be made available for wider use. These components can be used as building blocks for larger systems.

Reliability.

Software that is published in open source form can be debugged by a large community of sophisticated users.

Quality.

Scientific software must meet high standards of rigor and elegance. Its design should reflect engineering principles like hierarchy, abstraction, information hiding, and separation of concerns. A repository can serve as a forum for grading the quality of contributed software along these criteria.

Durability.

The software available through the repository will be maintained by the primary developers or their surrogates, but others are free to extend, modify, adapt, refine, port, and improve the software and publish their results through the repository. As a result, the software components published in the repository will remain available over a longer duration. A useful piece of software can thus survive the apathy of its original developer or the obsolescence of its implementation medium.

Productivity.

Since software components are not easily available, a lot of effort is wasted in reimplementing functionality for which the software exists but has not been made available. A library of interoperating components can be used to implement systems with greater productivity and functionality than is currently possible.

Visibility.

A centralized repository can serve as a marketplace for software components. Claims regarding the capabilities of the software can be examined by potential users. Misconceptions regarding the technology can be quickly dispelled since there is an extraordinary range of powerful deductive software that is ready to be made available.

Uniformity.

A repository can serve as the basis for developing standardized interfaces to related software components. Gratuitous differences in the interfaces and representations used by similar software components can be avoided. Similar pieces of software can thus be compared and benchmarked. Components can be made more interoperable through better channels of communication between the users and developers. Uniformity leads to competition, which promotes quality and excitement. New deductive software can build on the infrastructure made available through the repository. Deductive software is especially suitable for organization into a repository because the field is rich in algorithmic ideas, the software is complex but admits rigorous study, and the interfaces can usually be crisply characterized.

Synergy.

The software components can themselves become the focus of collaboration between different research agendas. One group can build highly efficient decision procedures while another group works on proof objects for these procedures. In this way, the software itself becomes the raw material for further research.

Serendipity:

Once such components become available, novel, exciting, and unforeseen combinations of these components become viable. We can imagine potential uses in engineering, sciences, government, law, linguistics, commerce, and in computation itself.

History:

The repository will include the source code from historically significant pieces of deductive software so that researchers can trace the origin and evolution of specific ideas.

Why now?

This is an opportune moment for the construction of a deductive software repository. There has been tremendous recent progress in both the theory and implementation of deductive software. Many new and potential applications for such software have been identified. Computers have become fast enough so that we can contemplate novel applications of deduction in databases, artificial intelligence, compilers, schedulers, and symbolic interpreters. The component-based architecture of deductive software is now well understood, and powerful component technologies have become available through libraries for logical, algebraic, and geometric manipulations. Various scripting and markup languages make it easier to distribute executable content over the worldwide web. Technologies for integrating components written in different programming languages are now widely available. The developers have realized the need for making deductive software available in open source form so that it can be carefully scrutinized, effectively maintained, and easily adapted. Deductive software is a critical part of the software infrastructure, and the open source movement has argued convincingly that such code should be freely and openly available. The growth of the Internet and the reduced cost of storage have made it technologically feasible to gather and disseminate software through a web-based repository.

The costs of not building QPQ are high, since it solidifies the current Tower of Babel of closed, proprietary software that leads inevitably to a decay of the deductive infrastructure. Novel applications of deduction will be missed, and the opportunity for stimulating greater academic interest in this important discipline will be lost. Exciting avenues for studying the interoperability of software components will remain unexplored.

What are some of the obstacles?

Software is an extremely complex, engineered artifact. It does not easily lend itself to effective management and manipulation. Deductive software manipulates logical context in particularly complicated ways. A few of the obstacles to achieving a useful repository of deductive software components are addressed below. These obstacles raise very interesting research challenges. The construction of the repository might spur researchers to address these challenges.

Different deductive components may rely on incompatible logical foundations. QPQ does not attempt to reconcile the logical foundations underlying deductive software. That would be well beyond the scope of such a repository. Each component must clearly specify its underlying logical assumptions. Users of these components are responsible for employing them in a logical coherent manner. Software for bridging gaps between different semantic foundations through mappings will be an important part of the repository.

Extracting deductive software components is hard. It is not easy to break deductive software into library subroutines as can be done with numerical or algebraic software. The interfaces for deductive software are much more complicated. For example, a decision procedure is usually regarded as a black-box that takes a conjecture and returns a valid or invalid judgment. In practice, a decision procedure has a complicated ask/tell interface that can be used to create and manipulate a context of assertions. The evolution of standards for the APIs of components and families of components is an important aspect of the development of the QPQ repository. As with any kind of software component, the interface is the message. A well-designed interface reveals the useful functionality of a software component without exposing the implementation details.

Achieving interoperability between deductive components is hard. For reasons similar to those above, there have been very few successful instances of interoperating deductive components. Even subtle differences in syntax and semantics can pose significant barriers to interoperability. However, it might be hoped that a repository will provide the context for a different component design philosophy that is more flexible with respect to such differences. The availability of the source code makes it possible to adapt a component so as to achieve certain kinds of interoperability.

Integrating components implemented in disparate languages is challenging. While most languages have mechanisms for foreign calls, and there are protocols such as CORBA for mediating between different components, there are many complex challenges in achieving a multi-language integration. Especially important are mechanisms for achieving control over interrupt and exception handling and garbage collection. It is hoped that these are well-confined problems that can be solved in a permanent and reusable manner. The challenges here are, by and large, not specific to deductive software.

How will the repository be organized?

Any such repository requires the voluntary cooperation of a large number of people, particularly the developers and users of the software. The repository has to be nurtured carefully so that it reflects careful organization with minimal rigidity. Since there is very little prior experience with repositories that one can build on, much of the organization and ground rules will have to evolve out of a small set of principles. These principles are outlined below.

Uniqueness.

There will be exactly one QPQ repository. The repository might be mirrored at many sites, but their contents may not diverge. Sites deviating from this guideline will have their authorization revoked.

Access.

There will be no restrictions on access to the software components on the repository. Anyone will be allowed to browse software components from the repository in conformance with the licensing terms for the software. The opportunity to submit a software component will also be unrestricted, though acceptance to the repository will be based on peer reviews. The authors of the software are free to choose the licensing terms under which their software is published. It will be recommended (but not required) that software be published under terms similar to those for published academic works and in accordance with guidelines in the open source definition given in http://www.opensource.org/docs/definition_plain.html.

Relevance.

Loose guidelines will be developed for judging whether a software contribution is relevant to deduction.

Format.

The submission of software must conform to format guidelines that are evolved for this purpose. At a minimum, submissions must include the names and affiliations of the authors, a system description paper (that may be published elsewhere), API documentation, the source files, and benchmarks. The choices of operating systems and programming languages will have to be restricted to certain standard and popular versions since it will not be feasible to support all the available options.

Quality control.

The repository is intended to serve as an online journal for deductive software. Submissions will be peer-reviewed for quality and will have to meet minimum standards of clarity, efficiency, and reliability. However, software publication is different from journal publication in that software can be continuously improved. From this point of view, any published software in the repository will be graded as part of the refereeing process so that improvements can yield a higher grade. Each component will maintain its own help and bug bulletin boards.

Version control.

Open source development often involves collaboration between developers who are not co-located. Version control is needed to ensure the consistency of the code under development. It is also needed to ensure that prior versions of a software component remain indefinitely available.

Academic credit.

Even though software is a serious contribution to science, it is very difficult to obtain tangible credit for it. The source code can contain ideas and solutions to problems that are of wider interest. If the source code can be cited in the scientific literature, such a reference can be verified by others and the sources of ideas can be properly credited.

Related Projects

The open source movement grew out of academia as an attempt to promote software as a scientific and cultural artifact that is not shackled by proprietary ownership. Richard Stallman and the Free Software Foundation have been among the most vocal proponents of the idea that software should be open and shared. The movement took off with the arrival of the World Wide Web as a communication and collaboration medium, and the spread of Linux. Open source development has many advantages. The ability to view the sources makes it easier for a large number of people with a wide range of skills to contribute ideas, improvements, and bug fixes. Source code contains ideas and components that are more widely useful The QPQ repository builds on these tenets of the open source movement but also acknowledges that software does contain scientific contributions and ideas that independently deserve scrutiny, publication, dissemination, and archiving. We list some of the relevant efforts that are related to QPQ.

Source Forge (http://sourceforge.net) offers support for locating and managing open source development projects. CollabNet (http://www.collab.net) is a similar effort that helps companies develop software in a collaborative (possibly, open source) process.

The Open Channel Foundation (http://www.openchannelfoundation.org) is an effort from the academic community to provide a forum for publishing open source software but without any editorial supervision. The netlib repository (http://www.netlib.org) is a library of numerical software packages. The Repository in a Box project (http://www.nhse.org/RIB/) provides software tools for creating and managing a distributed repository. The bioinformatics community has set up an open source repository for bioinformatics software at http://bioinformatics.org. The National Partnership for Advanced Computational Infrastructure (http://www.npaci.edu) is an NSF-funded effort to coordinate the scientific computational resources across a number of centers.

The VIS system (http://www-cad.eecs.berkeley.edu/Respep/Research/vis/) has an exemplary presentation of its source code, but such presentations are quite rare in the deductive software community. Several BDD packages and associated benchmarks are available through http://www.bdd-portal.org. The QED project (http://www-unix.mcs.anl.gov/qed/) has the goal of finding a common semantic foundation for mathematical discourse that unifies all the different systems of automated reasoning. The MathWeb initiative at http://www.mathweb.org is a repository for sharing and exchanging formalized mathematics. The QPQ project will be closely associated with the MathWeb project. The HE$\Lambda$M project (http://www.CS.UniBO.it/helm/) is aimed at building a hypertext library for mathematics that can be used for integrating different tools. The PROSPER project (http://www.dcs.gla.ac.uk/prosper/ ) is an attempt to provide a tool bus for deductive software tools. The ETI project (http://www.eti-service.org) is another such tool bus effort. Tool integration is an important aspect of the QPQ repository. QPQ is not solely aimed at tool integration, but it is meant to facilitate it.

Many repositories have pointers to the locations of software packages. Only a few of these attempt to provide editorial supervision, organization, and coordination. We will rely on the software and experience of the popular repositories while attempting to create a model for software publishing that is worthy of wider emulation.

Conclusions

Deductive software is a core part of the scientific software infrastructure. Deductive mechanisms can be used for the manipulation of semantically rich data, for high-level computation, and for assuring the safety and reliability of software. As science and technology become increasingly reliant on software, the need for high quality deductive software becomes more urgent. We therefore need efficient means for the production, maintenance, and distribution of deductive software components. The QPQ repository will serve as an online journal for the publication of deductive software. We hope that QPQ will become a model for the development and dissemination of scientific software.


« July 2008 »
Su Mo Tu We Th Fr Sa
1 2 3 4 5
6 7 8 9 10 11 12
13 14 15 16 17 18 19
20 21 22 23 24 25 26
27 28 29 30 31
What's up ?
Be notified when a document is published in this folder or below.
 

Powered by Plone, the Open Source Content Management System

This site conforms to the following standards: