Research Themes in Programming Language Semantics
The words pi and π can both mean, or denote, the well-known irrational number 3.14159265... Syntax is the study of the principles through which words (such as pi), or sentences, of languages are defined. A programming language manual tells you how to construct "syntactically correct" programs. Semantics is the study of meaning. It concerns the relationship between things that "signify", such as pi, and what is "signified", such as 3.14159265... We could have a different relationship: I could tell you that pi means 42. The word "pi" now has a different semantics. Syntax and semantics are at the heart of what we broadly term programming language semantics. You may have written a syntactically correct program, but is the semantics correct? Does it do the right thing? What does the program "mean"?
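The distinction can be made concrete with a minimal Python sketch. The names WORDS, STANDARD, ALTERNATIVE, is_well_formed and meaning are hypothetical, chosen only for this illustration: the syntax is a set of accepted words, and a semantics is a mapping from those words to what they denote.

```python
import math

# Syntax: the words that this toy language accepts.
WORDS = {"pi", "π"}


def is_well_formed(word: str) -> bool:
    """A word is syntactically correct if it belongs to the language."""
    return word in WORDS


# Semantics: a mapping from each word to the thing it signifies.
STANDARD = {"pi": math.pi, "π": math.pi}
# A different relationship: here the word "pi" means 42.
ALTERNATIVE = {"pi": 42, "π": math.pi}


def meaning(word: str, semantics: dict):
    """Look up what a syntactically correct word denotes."""
    if not is_well_formed(word):
        raise ValueError(f"not a word of the language: {word!r}")
    return semantics[word]
```

Under STANDARD the two words have the same meaning; under ALTERNATIVE the syntax is unchanged but the word "pi" denotes something else.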
The broad aim of my research is to study and develop semantic models, principles and theories of programming languages and data. The rationale for doing this is
I use mathematical tools to specify programming language semantics. A particular feature of my work is the use of categorical logic and categorical type theory: in these subjects, roughly speaking, one develops universal principles for giving the semantics of programming logics and type theories in categories with suitable semantic structure.
I am involved with programs and programming in a broad sense. I work with existing tools such as C#, F#, Haskell, HOL, Isabelle, Java, ML and Python. I sometimes develop and study new systems of computation, programming and reasoning within these frameworks, especially Isabelle/HOL.
The descriptions of my work that appear below are intended for the research community, and especially for potential PhD students who may be seeking supervision in one of my research themes or related areas.
This is a new project in 2021, with details to follow.
Language interoperability is the ability of two or more programming languages to interact as a unique, integrated system. A multi-language is a programming language arising from a combination of two or more programming languages. But how can we formally combine programming languages? What is the meaning (semantics) of multi-language programs?
We would like to create a system in which, starting from two (or more) programming languages specified according to some given design principles, we obtain a single combination of these languages with its own syntax and semantics and satisfying the same design principles: a multi-language. Why could this be useful? Here is one example. Static analysers are tools that provide analyses of program behaviour at compile time. They usually rely on mathematical frameworks such as abstract interpretation or data-flow equations. But it is very difficult to apply these techniques to situations where two separate languages only interoperate, that is, communicate and exchange data. What would be far more useful and robust is to work with a (single) multi-language. The same goes for static or dynamic analysers, interpreters, compilers and so on, and these are all fundamental tools used every day by computer scientists and engineers. We believe there will be considerable interest in multi-languages since practitioners find it useful to be able to blend key features from different languages. However, as noted, building multi-languages is not easy. There has been some practical work in recent years, but there is no theory of multi-languages. This is an area that is relatively under-developed. A solid theoretical body of knowledge would greatly improve our understanding of this area, and would enable the further development and enhancement of practical programming.
The key aim of the proposed research programme is to deliver a comprehensive theory and practice of multi-languages, and in tandem, demonstrate how multi-languages can be designed, implemented and reasoned about. This work involves the Department of Computer Science, Verona, Italy, and The Department of Computer Science and Technology, Cambridge, UK.
See On Multi-language Abstraction - Towards a Static Analysis of Multi-language Programs; Equational Logic and Set-Theoretic Models for Multi-Languages; and Equational Logic and Categorical Semantics for Multi-Languages.
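To give a rough feel for the idea of a multi-language, here is a small Python sketch. It is an assumption-laden toy and not the construction developed in the papers above: two tiny languages (arithmetic terms and string terms) are blended by adding hypothetical boundary constructs, LenOf and Show, so that a term of one language may appear inside a term of the other, and a single pair of mutually recursive evaluators gives meaning to the combined terms.

```python
from dataclasses import dataclass
from typing import Union


# Language A: arithmetic terms.
@dataclass(frozen=True)
class Num:
    value: int


@dataclass(frozen=True)
class Add:
    left: "ATerm"
    right: "ATerm"


# Boundary (hypothetical): a string term used inside arithmetic, via its length.
@dataclass(frozen=True)
class LenOf:
    term: "STerm"


# Language B: string terms.
@dataclass(frozen=True)
class Str:
    value: str


@dataclass(frozen=True)
class Concat:
    left: "STerm"
    right: "STerm"


# Boundary (hypothetical): an arithmetic term used inside a string term.
@dataclass(frozen=True)
class Show:
    term: "ATerm"


ATerm = Union[Num, Add, LenOf]
STerm = Union[Str, Concat, Show]


def eval_a(term: ATerm) -> int:
    """Meaning of an arithmetic term, which may contain string terms."""
    if isinstance(term, Num):
        return term.value
    if isinstance(term, Add):
        return eval_a(term.left) + eval_a(term.right)
    if isinstance(term, LenOf):
        return len(eval_s(term.term))
    raise TypeError(f"not an arithmetic term: {term!r}")


def eval_s(term: STerm) -> str:
    """Meaning of a string term, which may contain arithmetic terms."""
    if isinstance(term, Str):
        return term.value
    if isinstance(term, Concat):
        return eval_s(term.left) + eval_s(term.right)
    if isinstance(term, Show):
        return str(eval_a(term.term))
    raise TypeError(f"not a string term: {term!r}")
```

For example, Concat(Str("n="), Show(Add(Num(2), LenOf(Str("abc"))))) is a single term mixing both languages. Because the blended terms live in one syntax with one semantics, a tool such as an analyser can be written once over that syntax, rather than separately for two languages that merely exchange data.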
Category theory has played a key role in programming language semantics for many years. The idea that (theories in) both type theory and logic correspond to categories with structure is especially important. You will find an account of basic category theory in Categories for Types. The category theory/type theory correspondence, for theories in algebra, higher order functions, second order polymorphic functions, and higher order polymorphic functions, can be found in this book. The idea that one can derive a categorical semantics for a type theory, based on certain assumptions such as the way in which syntactic substitution is modelled by categorical composition, is explored in detail in Deriving Category Theory from Type Theory. Given a category with a specified structure, the corresponding theory is (sometimes) known as its internal language. For an account of the theories that correspond to interaction categories see An Internal Language for Interaction Categories. Since around 1999 there has been considerable interest in the use of nominal techniques to study properties of names. We demonstrate a category theory/type theory correspondence for the nominal lambda calculus and equivariant cartesian closed categories in A Sound and Complete Categorical Semantics for a Nominal Lambda Calculus.
Theorem provers can be used to specify the operational semantics of programming languages; the logic of the system can then be used to verify properties of the language. In Mechanised Operational Semantics via (Co)Induction we code up the semantics of a small functional language, and formally define notions of bisimulation of programs, coinductively, and contextual equivalence, inductively. By making use of Howe's method, we then show that the two notions of semantic equality coincide. This work led us to begin to think about the problems of encoding languages with variable binding. We wanted to combine (co)inductive methods with a formulation of higher order abstract syntax, which was known at the time to present technical difficulties. We developed the Hybrid system in response to these challenges; it is presented in Combining Higher Order Abstract Syntax with Tactical Theorem Proving and (Co)Induction, and we discuss alternative methodologies in A Comparison of Formalizations of the Meta-Theory of a Language with Variable Bindings in Isabelle. There is a mathematical model of Hybrid, presented using the theory of logical frameworks. In the paper Representational Adequacy of Hybrid I develop this model, and prove that Hybrid is well-behaved by proving it is representationally adequate. I also give the first detailed proof of the adequacy of locally nameless de Bruijn expressions for the lambda calculus. More recently I worked on a dependently typed version of the Hybrid system.
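As a small illustration of the locally nameless de Bruijn representation mentioned above, the following Python sketch translates named lambda terms into a form where bound variables become indices and free variables keep their names. It is only a sketch under stated assumptions: the tuple encoding and the function name to_locally_nameless are hypothetical, and this is neither the Hybrid system nor the adequacy proof.

```python
def to_locally_nameless(term, bound=()):
    """Translate a named lambda term to locally nameless form.

    Input terms are tuples (an encoding assumed for this sketch):
      ("var", name), ("lam", name, body), ("app", function, argument).
    Output terms are tuples:
      ("bvar", index), ("fvar", name), ("lam", body), ("app", f, a).
    `bound` lists the enclosing binders, innermost first.
    """
    kind = term[0]
    if kind == "var":
        name = term[1]
        if name in bound:
            # The index counts binders between the variable and its lambda;
            # the first match is the innermost binder, so shadowing works.
            return ("bvar", bound.index(name))
        return ("fvar", name)
    if kind == "lam":
        _, name, body = term
        return ("lam", to_locally_nameless(body, (name,) + bound))
    if kind == "app":
        _, function, argument = term
        return (
            "app",
            to_locally_nameless(function, bound),
            to_locally_nameless(argument, bound),
        )
    raise ValueError(f"unknown term: {term!r}")
```

Terms that differ only in the names of their bound variables translate to the same expression, which is one reason such representations are convenient when encoding languages with variable binding.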
With Andrew Gordon, we solved the problem of how to formally integrate the semantics of Input/Output with higher order functions, by using labelled transition systems. This work continued a theme of using monads (in this case the I/O monad of Plotkin). Our paper A Sound Metalogical Semantics for Input/Output Effects presents a neat operational equivalence of programs which is characterised by a novel domain-theoretic denotational semantics defined using the minimal invariants of Freyd and Pitts. Some preliminary work appears in Factoring an Adequacy Proof. The CSL results were extended in the Relating Operational and Denotational Semantics for Input/Output
Gluing is a categorical construction that has its origins in topological Artin gluing. Amongst various applications, it has been used to prove the existence and disjunction properties of intuitionistic logic, and also the conservativity of various type theory extensions. Freyd pioneered these proofs, and referred to his construction as sconing. In On Fixpoint Objects and Gluing Constructions it was shown that extensions of equational theories over nat, fix, +, x, T are all conservative at ground type. The gluing construction that I define is a novel variation of the functional sconing methods of Freyd; functions are replaced by (categorical) logical relations that are more powerful than sconing, yet at the same time easier to manipulate. Recently I have been thinking about formulating a version of gluing that involves categories of nominal sets. Since this requires some results about the Yoneda lemma for nominal sets, our findings appear in a dedicated paper, The Yoneda Lemma and Cartesian Closure in the FM-World.
In my thesis, Programming Metalogics with a Fixpoint Type, I further developed the theory of computational monads due to Moggi by studying the notion of a fixpoint type fix, first presented in New Foundations for Fixpoint Computations, with a complete account in New Foundations for Fixpoint Computations: FIX Hyperdoctrines and the FIX Logic. An equational theory with types nat, fix, +, x, T is presented in which all endofunctions of type Tα have fixpoints. Such theories are shown to have classifying categories (FIX-categories). A modal predicate logic is defined, expressing properties of terms, with predicates that express properties of computation terms. This logic was used to analyse the static and dynamic semantics of languages similar to PCF, and the results, mainly concerning computational adequacy, appear in Computational Adequacy of the FIX-Logic. A dependent type theory with a universal type was also developed, in which fixpoints of equations over types ("domain equations for recursive types") can be obtained as fixpoints of equations over terms ("ordinary equations"); see Recursive Types via Fixpoint Objects.
Author: Roy Crole (R.Crole at mcs.le.ac.uk), T: +44 (0)116 252 3404.