A Correspondence betweenType Checking via Reduction and Type Checking via Evaluation
Ilya Sergey
∗
,a,1
, Dave Clarke
a
a
DistriNet & IBBT, Dept. Computer Science, Katholieke Universiteit LeuvenCelestijnenlaan 200a, bus 2402, B3001 LeuvenHeverlee, Belgium
Abstract
We describe a derivational approach to proving the equivalence of different representations of a type system. Different ways of representing type assignments are convenient for particular applications such asreasoning or implementation, but some kind of correspondence between them should be proven. In thispaper we address two such semantics for type checking: one, due to Kuan et al., in the form of a termrewriting system and the other in the form of a traditional set of derivation rules. By employing a set of techniques investigated by Danvy et al., we mechanically derive the correspondence between a reductionbased semantics for typechecking and a traditional one in the form of derivation rules, implemented as arecursive descent. The correspondence is established through a series of semanticspreserving functionalprogram transformations.
Key words:
compositional evaluators, type checkers, continuationpassing style, defunctionalization,refunctionalization
1. Introduction
Awelldesignedtypesystemmakesatradeoffbetweentheexpressivenessofitsdeﬁnitionandtheeffectiveness of its implementation. Traditionally, type systems are described as collections of logical inferencerules that are convenient to reason about. Computationally, such a model, implemented straightforwardlyin a functional programming language, corresponds to a recursive descent over the inductivelydeﬁned language syntax. However, other more algorithmic representations allow one to reason about computationalaspects of a type inference procedure. As an example of such a system, we consider a reduction semanticsfor type checking, proposed initially by Kuan et al. [14]. Deﬁned as a set of termreduction rules, sucha termrewriting system gives an operational view on the semantics of type checking, which is useful fordebugging complex type systems since the developer can trace each step of the type computation. Dealingwith such a termrewriting system requires that one explicitly shows that the underlying type inferencealgorithm is equivalent to the traditional system described as a set of derivation rules. For this purpose,appropriate soundness and completeness theorems need to be proven [12].In this paper a correspondence between a traditional type system and a corresponding reductionbasedsemantics for type inference is provided by the construction and interderivation of their computationalcounterparts. Thus no soundness and completeness theorems need to be proven: they are instead corollariesof the correctness of interderivation and of the initial speciﬁcation. Starting from the implementationof a reductionbased semantics, we employ a set of techniques, investigated by Danvy et al. [1, 4, 6,7, 8, 9], to eventually obtain a traditional recursive descent for typechecking via a series of semanticspreserving functionalprogram transformations. The transformations we use are offtheshelf [5] and weinvite an interested reader to take a look on the overview of the available techniques with references to thecorresponding correctness proofs [4].
∗
Corresponding author
Email address:
ilya.sergey@cs.kuleuven.be
(Ilya Sergey)
1
This work was carried out in September 2010 while the ﬁrst author was visiting the Department of Computer Science at AarhusUniversity.
Preprint submitted to Elsevier September 12, 2011
SLC H
e
::
=
n

x

λ
x
:
τ
.
e

e e

τ
→
e

numCTX
T
::
=
T e

τ
T

τ
→
T

[ ]
TYPE
τ
::
=
num

τ
→
τ
n
::
=
number T
[
n
]
→
t
T
[
num
]
[tcconst]
T
[
λ
x
:
τ
.
e
]
→
t
T
[
τ
→ {
τ
/
x
}
e
]
[tclam]
T
[(
τ
1
→
τ
2
)
τ
1
]
→
t
T
[
τ
2
]
[tc
τβ
]
Hybrid language and typechecking contexts Typechecking reduction rules
Figure 1: Reduction semantics of
λ
H
(
x
:
τ
∈
Γ
)
Γ
⊢
x
:
τ
[tvar]
Γ
,
x
:
τ
1
⊢
e
:
τ
2
Γ
⊢
λ
x
:
τ
1
.
e
:
τ
1
→
τ
2
[tlam]
Γ
⊢
e
1
:
τ
1
→
τ
2
Γ
⊢
e
2
:
τ
1
Γ
⊢
e
1
e
2
:
τ
2
[tapp]
Γ
⊢
number
:
num [tnum]
Figure 2: Type system for the simply typed lambda calculus
1.1. Starting point: a hybrid language for type checking
We consider a reduction system for type checking the simply typed lambda calculus. The system wassrcinally proposed by Kuan et al. [14] and is presented as a case study in the scope of PLT Redex framework [13]. The approach scales to CurryHindley type inference and HindleyMilner letpolymorphism.The techniques presented in the current paper can be adjusted for these cases by adding uniﬁcation variables, so for the sake of brevity we examine only the simplest model. The hybrid language
λ
H
and itssemantics are described in Figure 1. The reduction system introduces a typechecking context
T
that induces a leftmost, innermost order of reduction. Variable occurrences are replaced by their types at themoment a
λ
abstraction is reduced according to rule
[tclam]
. Rule
[tclam]
also introduces the arrow typeconstructor. Finally, rule
[tc
τβ
]
syntactically matches the function parameter type against an argumenttype.The classical way to represent type checking is via a collection of logical derivation rules. Such rulesfor the simply typed lambda calculus are given in Figure 2. According to Kuan et al., a complete typereduction sequence is one that reduces to a type, which corresponds to a welltyped term. The followingtheorem states that a complete type reduction sequence corresponds to a complete type derivation proof tree for a welltyped term in the host language and vice versa.
Theorem 1.1.
[14] (Soundness and Completeness for
→
t
)
For any e and
τ
,
/0
⊢
e
:
τ
iff e
→
∗
t
τ
The question we address in this paper is whether a natural correspondence between these semanticsexists which avoids the need for the soundness and completeness theorems. The answer to this question ispositive and below we show how to derive a traditional typechecker mechanically from the given rewritingsystem. The contribution of the paper is a demonstation of how the application of wellstudied programderivations to type checkers is helpful to show an equivalence between them.
1.2. Paper outline
The remainder of the paper is structured as follows. Section 2 gives an overview of our method,enumerating the techniques involved. Section 3 describes an implementation of the hybrid language and itsreduction semantics in Standard ML. Section 4 describes a set of program transformations correspondingto the transition from the reductionbased semantics for type inference to a traditional recursive descent.Section 5 provides a brief survey of related work. Section 6 concludes.
2. Method overview
The overview of the program metamorphoses is shown in Figure 3. We start by providing the implementation of a hybrid language for the simply typed lambda calculus, a notion of closures in it and2
.
ReductionBasedType Checker
Refocusing (
§
4.1)+Contraction inlining (
§
4.2)
RecursiveDescentReductionFreeType Checker
Lightweight Fusion (
§
4.3)Transition Compression (
§
4.4)
BigStepCEK machine
DirectStyle Transform (
§
4.8)+Refunctionalization (
§
4.7)+Switching domains (
§
4.6)
Figure 3: Interderivation
a corresponding reduction semantics via contraction as a starting point for further transformations (Section 3). The reductionbased normalization function is transformed to a family of reductionfree normalization functions, i.e., ones where no intermediate closure is ever constructed. In order to do so, we ﬁrstrefocus the reductionbased normalization function to obtain a smallstep abstract machine implementingthe iteration of the refocus function (Section 4.1). After inlining the contraction function (Section 4.2), wetransform this smallstep abstract machine into a bigstep one applying a technique known as “lightweightfusion by ﬁxedpoint promotion” [7] (Section 4.3). This machine exhibits a number of corridor transitions,which we compress (Section 4.4). We then ﬂatten its conﬁgurations and rename its transition functionsto something more intuitive (Section 4.5). We also switch domains of evaluator functions to factor outartifacts of the hybrid language (Section 4.6). The resulting abstract machine is in defunctionalized form,so we refunctionalize it (Section 4.7). The result is in continuationpassing style, so we transform it intodirect style (Section 4.8). The ﬁnal result is a traditional compositional typechecker.Standard ML (SML) [15] is used as a metalanguage. SML is a staticallytyped, callbyvalue languagewith computational effects. In Section 4.8 we rely on the library of undelimited continuations to modeltoplevel exceptions. For the sake of brevity, we omit most program artifacts (sometimes only giving theirsignature), keeping only essential parts to demonstrate the corresponding program transformation.
2
Ateach transformation stage the trailing index of all involved functions is incremented.
3. A ReductionBased Type Checker
This section provides the initial implementation of
λ
H
in SML, which will be used for further transformations in Section 4.
3.1. Reductionbased hybrid term normalization
The reductionbased normalization of hybrid terms is implemented by providing an abstract syntax, anotion of contraction and a reduction strategy. Then we provide a onestep reduction function that decomposes a nonvalue closure into a potential redex and a reduction context, contracts the potential redex, if itis actually one, and then recomposes the context with the contractum. Finally we deﬁne a reductionbasednormalization function that repeatedly applies the onestep reduction function until a value (i.e., an actualtype of an expression) is reached.In the speciﬁcation of
λ
H
, the contraction of lambda expressions (rule
[tclam]
) is speciﬁed using ametalevel notion of captureavoiding substitutions. However, most implementations do not use actualsubstitutions and keep an
explicit representation
of what should be substituted on demand, leaving theterm untouched [10, pages 100–105]. To model
explicit substitutions
, we chose the applicative orderversion of Curien’s calculus, which uses closures, i.e, terms together with their lexical environment [3]
3
.The environments map variables to values (i.e., types in this case) while reducing an expression, whichcorresponds do the captureavoiding substitution strategy [5, Section 6]. The chosen calculus allows usto come eventually in Section 4 to a wellknown representation of a typechecking algorithm with anenvironment
Γ
, which predictably serves the same purpose.
2
The accompanying code and the technical report with detailed listings are available from
http://people.cs.kuleuven.be/ilya.sergey/typereduction/
3
The cited paper also relates values in the language of closures with values in
λ
calculus (see Section 2.5).
3
3.2. Abstract syntax of
λ
H
: closures and values
The abstract syntax for
λ
H
, which is presented in Figure 1, is described in SML below. It includesinteger literals, identiﬁers, lambdaabstractions, applications as well as “hybrid” elements such as numerictypes and arrows
τ
→
e
. Types are either numeric types or arrow types. The special value
T_ERROR s
is usedfor typing errors; it cannot be a constituent of any other type.
datatype
typ = T_NUM T_ARR
of
typ * typ T_ERROR
of
string
datatype
term = LIT
of
int IDE
of
string LAM
of
string * typ * term APP
of
term * term
datatype
hterm = H_LIT
of
int H_IDE
of
string H_LAM
of
string * typ * hterm H_APP
of
hterm * hterm H_TARR
of
typ * hterm H_TNUM
Typing environments
TEnv
represent bindings of identiﬁers to types, which are values in the hybridlanguage. In order to keep to the uniform approach for different semantics for type inference [18], weleave environments parametrized by the type parameter
’a
, which is instantiated with
typ
in this case.
signature
TEnv =
sigtype
’a gamma
val
empty : (string * ’a) gamma
val
extend : string * ’a * (string * ’a) gamma > (string * ’b) gamma
val
lookup : string * (string * ’a) gamma > ’a option
end
We introduce closures into the hybrid language in order to represent the environmentbased reductionsystem. A closure can either be a number, a ground closure pairing a term and an environment, a combination of closures, a closure for a hybrid arrow expression, or a closure for a value arrow element, namelyan arrow type. A value in the hybrid language is either an integer or a function type. Environments bindidentiﬁers to values.
datatype
closure = CLO_NUM CLO_GND
of
hterm * bindings CLO_APP
of
closure * closure CLO_ARR
of
typ * closure CLO_ARR_TYPE
of
typ
withtype
bindings = typ TEnv.gamma
We also specify the corresponding embeddings values to closures and terms to hybrid terms (the deﬁnitions are omitted):
val
type_to_closure : typ > closure
val
term_to_hterm : term > hterm
3.3. Notion of contraction
A potential redex is either a numeric literal, a ground closure pairing an identiﬁer and an environment,an application of a value to another value, a lambdaabstraction to be typereduced, an arrow type, or aground closure pairing a term application and an environment.
datatype
potential_redex= PR_NUM PR_IDE
of
string * bindings PR_APP
of
typ * typ PR_LAM
of
string * typ * hterm * bindings PR_ARR
of
typ * typ PR_PROP
of
hterm * hterm * bindings
A potential redex may trigger a contraction or it may get stuck. These outcomes are captured by thefollowing datatype:4
datatype
contract_or_error = CONTRACTUM
of
closure ERROR
of
string
The string content of
ERROR
is an error message.The contraction function
contract
reﬂects the typechecking reduction rules for
λ
H
. For instance, anyinteger literal contracts to a number type
T_NUM
, a lambda expression contracts to an arrow expression of the hybrid language, and the contraction of a potential redex
PR_APP
checks whether its ﬁrst parameter is afunction type and its parameter type matches the argument of the application.
(* contract: potential_redex > contract_or_error *)
fun
contract PR_NUM= CONTRACTUM CLO_NUM contract (PR_ARR (t1, t2))= CONTRACTUM (type_to_closure (T_ARR (t1, t2))) contract (PR_IDE (x, bs))= (
case
TEnv.lookup (x, bs)
of
NONE => ERROR "undeclared identifier" (SOME v) => CONTRACTUM (type_to_closure v)) contract (PR_LAM (x, t, e, bs))= CONTRACTUM (CLO_GND (H_TARR (t, e), TEnv.extend (x, t, bs))) contract (PR_APP (T_ARR (t1, t2), v))=
if
t1 = v
then
CONTRACTUM (type_to_closure t2)
else
ERROR "parameter type mismatch" contract (PR_PROP (t0, t1, bs))= CONTRACTUM (CLO_APP (CLO_GND (t0, bs), CLO_GND (t1, bs))) contract (PR_APP (t1, t2))= ERROR "nonfunction application"
A nonvalue closure is stuck when an identiﬁer does not occur in the current environment or nonfunction type is used in a function position or a function parameter’s type does not correspond to the actualargument’s type.
3.4. Reduction strategy
Reduction contexts are deﬁned as follows:
datatype
hctx = CTX_MT CTX_FUN
of
hctx * closure CTX_ARG
of
typ * hctx CTX_ARR
of
typ * hctx
A context is a closure with a hole, represented insideout in a zipperlike fashion. Following the description of
λ
H
’s reduction semantics we seek the leftmost innermost potential redex in a closure. Inorder to reduce a closure, it is ﬁrst decomposed. The closure might be a value and not contain any potential redex or it can be decomposed into a potential redex and a reduction context. These possibilities arecaptured by the following datatype:
datatype
type_or_decomposition = VAL
of
typ DEC
of
potential_redex * hctx
A decomposition function recursively searches for the leftmost innermost redex in a closure. Examples of some speciﬁc decomposition functions may be found in recent work of Danvy [5]. In our implementation we deﬁne decomposition (
decompose
) as a bigstep abstract machine with two statetransitionfunctions,
decompose_closure
and
decompose_context
. The former traverses a given closure and accumulatesthe reduction context until it ﬁnds a value and the latter dispatches over the accumulated context to determinewhetherthegivenclosureisavalueorapotentialredex. Thefunction
decompose
startsbydecomposinga closure within an empty context. For the full deﬁnition of the decomposition functions, see the accompanying code. The recomposition function
recompose
takes a context and a value to embed, peels off contextlayers and iteratively constructs the resulting closure. The signatures of these functions are:
val
decompose_closure : closure * hctx > type_or_decomposition
val
decompose_context : hctx * typ > type_or_decomposition
val
decompose : closure > type_or_decomposition
val
recompose : hctx * closure > closure
5