Type-based modeling
Design of DSLs
In our DSLs, we have tree important components: constructors, combinators, and run functions
Sometimes, the DSL type-signature for combinators might allow to construct objects such that have no semantics
As a result, run functions need to dynamically check that objects "make sense" before running them
A simple (non-monadic) DSL for integers and booleans expressions
The DSL
-- Type of objects data Expr -- Constructors int :: Int -> Expr bool :: Bool -> Expr -- Combinators plus :: Expr -> Expr -> Expr equal :: Expr -> Expr -> Expr if :: Expr -> Expr -> Expr -> Expr -- Run type Value eval :: Expr -> Value
Observe from the interface that the combinators can compose elements of different nature, which might trigger errors in the run function
What is the semantics of the following expressions?
ifc (lit 3) (lit 42) (bool False)
eqc (lit 3) (bool True)
The DSL allows to build both well- and ill-typed expressions!We can start given a complex (and arguably wrong) semantics to bad expressions (JavaScript does that!), e.g.,
1 + True == 1
, etc.We assume that ill-typed expressions have no semantics!Therefore, the run function needs to trigger errors when expressions are bad.
An EDSL implementation
We implement
Expr
as followsdata Expr where LitI :: Int -> Expr LitB :: Bool -> Expr (:+:) :: Expr -> Expr -> Expr (:==:) :: Expr -> Expr -> Expr If :: Expr -> Expr -> Expr -> Expr
The interesting case is the run function
eval :: Expr -> Value
Since expressions can be reduced to integers or booleans, an element of type
Value
is either an integer or a boolean.data Value = VInt Int | VBool Bool
The run function then checks that expressions are evaluated to the expected type of values at every combinator.
eval :: Expr -> Value eval (LitI n) = VInt n eval (LitB b) = VBool b eval (e1 :+: e2) = plus (eval e1) (eval e2) where plus (VInt n) (VInt m) = VInt $ n + m plus _ _ = error "plus: type error" eval (e1 :==: e2) = eq (eval e1) (eval e2) where eq (VInt n) (VInt m) = VBool $ n == m eq (VBool a) (VBool b) = VBool $ a == b eq _ _ = error "equal: type error" eval (If e1 e2 e3) = case eval e1 of VBool True -> eval e2 VBool False -> eval e3 _ -> error "if: type error"
Ill-typed expressions raise errors when evaluating them
eBad2 = If (LitI 20) (LitI 1) (LitI 7)
> eval eBad2 *** Exception: if: type error!
The picture
- Can we avoid constructing bad expressions at all?
A EDSL with type-aware combinators
Now, for every type, we have a new kind of expressions
data ExprI -- Integer expressions data ExprB -- Boolean expressions
Combinators can be defined on the right "kind of expressions"
(:+:) :: ExprI -> ExprI -> ExprI If :: ExprB -> ExprI -> ExprI -> ExprI
While this looks attractive, and indeed helps, it requires to introduce many new constructors to deal with all the possible "good" ways to build expressions.
data ExprI where LitI :: Int -> ExprI (:+:) :: ExprI -> ExprI -> ExprI IfI :: ExprB -> ExprI -> ExprI -> ExprI data ExprB where LitB :: Bool -> ExprB EqI :: ExprI -> ExprI -> ExprB EqB :: ExprB -> ExprB -> ExprB IfB :: ExprB -> ExprB -> ExprB -> ExprB
Observe the reproduction of equality constructors (
EqI
andEqB
) and branches (IfI
andIfB
) depending on the type of arguments they take.There are two run functions, one for each type of expressions
evalI :: ExprI -> Int evalI (LitI n) = n evalI (e1 :+: e2) = evalI e1 + evalI e2 evalI (IfI b t e) = if evalB b then evalI t else evalI e evalB :: ExprB -> Bool evalB (LitB b) = b evalB (EqI e1 e2) = evalI e1 == evalI e2 evalB (EqB e1 e2) = evalB e1 == evalB e2 evalB (IfB b t e) = if evalB b then evalB t else evalB e
Observe that
evalI
andevalB
do not raise type-errors.We can use this EDSL to validate a value of type
Expr
In other words, we take a value of type
Expr
and map it into eitherExprI
orExprB
.If successful, we can run a value of type
Expr
usingevalA
orevalB
, so no type errors!We need to write a function which infers the type for a value of type
Expr
infer :: Expr -> ?
There are three possible outcomes for
infer e
, i.e., it might produce an error (e
has no type) or either a value of typeExprI
orExprB
.import qualified Expr as E data TypedExpr = TInt ExprI | TBool ExprB infer :: E.Expr -> Maybe TypedExpr
The easy cases are the constructors.
infer (E.LitB b) = return $ TBool (LitB b) infer (E.LitI i) = return $ TInt (LitI i)
For addition, we need both argument to be integer expressions.
infer (e1 E.:+: e2) = do te1 <- infer e1 te2 <- infer e2 inferPlus te1 te2
Function
inferPlus
checks that bothte1 :: TypedExpr
andte2 :: TypedExpr
are integers and, if that is the case, it constructs the corresponding value of typeExprI
, i.e., a sum.inferPlus :: TypedExpr -> TypedExpr -> Maybe TypedExpr inferPlus (TInt e1) (TInt e2) = Just (TInt (e1 :+: e2)) inferPlus _ _ = Nothing
In a similar manner, we write the cases for equality and branches.
infer (e1 E.:==: e2) = do te1 <- infer e1 te2 <- infer e2 inferEq te1 te2 infer (E.If b t e) = do tb <- infer b tt <- infer t te <- infer e inferIf tb tt te
**Exercise**: write `inferEq` and `inferIf`We write
eval
only for typed expressions so we guarantee no type errorseval :: TypedExpr -> E.Value eval (TInt ei) = E.VInt $ evalI ei eval (TBool eb) = E.VBool $ evalB eb
Problems with this approach:
Duplication of constructors to denote "the same operation" but on different type of expressions
It does not scale up for type constructors (e.g., lists, tuples, monads, etc.)
For instance, if we want to introduce lists, we need to define a type for list expressions
List of integers.
data ExprLI where EmptyLI :: ExprLI ConsLI :: ExprI -> ExprLI -> ExprLI
data ExprB where ... EqLI :: ExprLI -> ExprLI -> ExprB
List of booleans.
data ExprLB where EmptyLB :: ExprLB ConsLB :: ExprB -> ExprLB -> ExprLB
data ExprB where ... EqLI :: ExprLI -> ExprLI -> ExprB EqLB :: ExprLB -> ExprLB -> ExprB
What about having lists of lists?
List of lists of integers.
data ExprLLI where EmptyLLI :: ExprLLI ConsLLI :: ExprLI -> ExprLLI -> ExprLLI
data ExprB where ... EqLI :: ExprLI -> ExprLI -> ExprB EqLB :: ExprLB -> ExprLB -> ExprB EqLLI :: ExprLLI -> ExprLLI -> ExprB
List of lists of booleans.
data ExprLLB where EmptyLLB :: ExprLLB ConsLLB :: ExprLB -> ExprLLB -> ExprLLB
data ExprB where ... EqLI :: ExprLI -> ExprLI -> ExprB EqLB :: ExprLB -> ExprLB -> ExprB EqLLI :: ExprLLI -> ExprLLI -> ExprB EqLLB :: ExprLLB -> ExprLLB -> ExprB
What about lists of lists of lists of integers/booleans?
Enter GADTs
We use GADTs of the form
Expr t
, wheret
is a Haksell type which indicates the value that the expression denotesdata Expr t where LitI :: Int -> Expr Int LitB :: Bool -> Expr Bool (:+:) :: Expr Int -> Expr Int -> Expr Int (:==:) :: Eq t => Expr t -> Expr t -> Expr Bool If :: Expr Bool -> Expr t -> Expr t -> Expr t
Observe constructors
LitI
andLitB
produce integer and boolean expressions, respectively. The combinator(:+:)
takes integer expressions, i.e., arguments of typeExpr Int
, and produces an integer expression as well. Equality of expressions of typeExpr t
requires thatt
supports equality in Haskell -- see the type constraintEq t
. ConstructorIf
takes a boolean expression as the guard (Expr Bool
) and both branches need to return the same type of expressions (Expr t
).There is no way to construct ill-typed expression in `Expr t`The run function is simply defined as follows.
eval :: Expr t -> t eval (LitI n) = n eval (LitB b) = b eval (e1 :+: e2) = eval e1 + eval e2 eval (e1 :==: e2) = eval e1 == eval e2 eval (If b t e) = if eval b then eval t else eval e
We can forget that an expression is typed
To go from typed expressions to untyped ones is easy! (type elimination)To go from untyped expressions to typed ones is not straightforward! (type inference)Type elimination
forget :: Expr t -> E.Expr forget e = case e of LitI n -> E.LitI n LitB b -> E.LitB b e1 :+: e2 -> forget e1 E.:+: forget e2 e1 :==: e2 -> forget e1 E.:==: forget e2 If e1 e2 e3 -> E.If (forget e1) (forget e2) (forget e3)
By doing so, we can reuse the pretty printing for the untyped version of
Expr
instance Show (Expr t) where showsPrec p e = showsPrec p (forget e)
Type inference is a bit more involved
If
e
is an expression which produces a boolen, theninfer e :: Maybe (Expr Bool)
If
e
is an expression which produces an integer, theninfer e :: Maybe (Expr Int)
So, what is the type signature for
infer
?infer :: E.Expr -> Maybe ?
The returning type depends on the (value of the) argument of the function!
Challenge
Our inference function will have to convince the Haskell type checker to allow us to construct an element of `Expr t` from an untyped expression `E.Expr`.
Existential types
When the returning type of a function depends on the (value of the) argument of a function, we can use existential types to make them type-check
data TypedExpr where (:::) :: forall t. Eq t => Expr t -> TypedExpr
It is counter intuitive to use
forall t
to express an existential type. The contructor(:::)
can be instantiated with any typet
which supports equality.Observe that type
t
does not show up in the data type declarationTypedExpr
. If we have a valued :: TypedExpr
, we know that it is built with the constructor(:::)
. Observe, however, we cannot guess which typet
has been used, thus we only know that there exists a typet
(with equality) which buildsd
!Typing
infer
infer :: E.Expr -> Maybe TypedExpr
We add more information to the constructor
(:::)
to include not only the expression but its type as welldata Type t where TInt :: Type Int TBool :: Type Bool data TypedExpr where (:::) :: forall t. Eq t => Expr t -> Type t -> TypedExpr
Observe that
(e ::: te) :: TypedExpr
indicates that expressione :: Expr t
has typete :: Type t
.Observe that types
Type Int
andType Bool
have only one inhabitant, respectively (we excludeundefined
).This kind of types are known as singleton types. Why is it useful?
Singleton types
There is a one-to-one correspondence between terms and types
This correspondence is useful to guide Haskell's type-system by coding using the singleton's types inhabitants.
In our example, function
infer
is going to use them to indicate that (i) expressions are either typed asExpr Int
orExpr Bool
, and (ii) some expressions should have the same type.
Implementing type inference
We start by defining a mechanisms to tell Haskell that two types must be the same.
data Equal a b where Refl :: Equal a a
Equal a b
is a type of proofs that two typesa
andb
are equal. The only way to prove two types equal is if they are in fact the same, and then the proof isRefl
.Evaluating one of these proofs to
Refl
will convince GHC's type checker that the two type arguments are indeed equal (how else could the proof beRefl
?).For instance, if we have a proof
p :: Equal Int Int
, then we know thatp
isRefl
sinceRefl :: Equal Int Int
and it is the only inhabitant for that type.On the other hand,
p :: Equal Int Float
cannot be constructed. Do you see why? The typeEqual Int Float
has no inhabitant, i.e., there is no constructor to build it (Refl
does not work for this case).We construct a function which takes two types for expressions and hints Haskell if they have the same type
(=?=) :: Type s -> Type t -> Maybe (Equal s t)
This function will only succeed when type
s
is equal to typet
.How are we going to indicate that in the body of the function? After all, functions implementations do not talk about types!
We have singleton types, so we can use their inhabitants to refer to types from the function body.
(=?=) :: Type s -> Type t -> Maybe (Equal s t) TInt =?= TInt = Just Refl TBool =?= TBool = Just Refl _ =?= _ = Nothing
In the
infer
function, we can make heavy use of the fact that pattern matching on aType t
or anEqual s t
will tell GHC's type checker interesting things abouts
andt
Inferring types for constructors
infer :: E.Expr -> Maybe TypedExpr infer e = case e of E.LitI n -> return (LitI n ::: TInt) E.LitB b -> return (LitB b ::: TBool)
Because the use of
LitI
(LitB
), GHC knows thatLitI n :: Expr Int
(LitB b :: TBool
), and consequently has typeType Int
. Consequently, we could have placeundefined
instead ofTInt
(TBool
) -- after all, GHC's type system already knows everything -- andinfer
will still type checks. However, it might produceinfer
to crash later on (see other cases)Inferring types for addition
infer :: E.Expr -> Maybe TypedExpr infer e = case e of ... r1 E.:+: r2 -> do e1 ::: TInt <- infer r1 e2 ::: TInt <- infer r2 return (e1 :+: e2 ::: TInt)
By doing pattern-matching using
TInt
(i.e.,e1 ::: TInt
ande2 ::: TInt
), we are telling GHC thatinfer r1 :: Maybe TypedExpr
andinfer r2 :: Maybe TypedExpr
needs to construct integer expressionse1 :: Expr Int
ande2 :: Expr Int
, respectively.Inferring types for equality
This case is interesting.
infer :: E.Expr -> Maybe TypedExpr infer e = case e of ... r1 E.:==: r2 -> do e1 ::: t1 <- infer r1 e2 ::: t2 <- infer r2 ... return (e1 :==: e2 ::: TBool)
We know that the result is something of type
Expr Bool
. We indicate by using(:===:)
andTBool
in the returned value.However, the expressions
e1
ande2
might have any type (e.g.,Expr Int
, orExpr Bool
) as long as they are the same. We hint that to GHC's type system by using function(=?=)
and pattern-matching on its result.infer :: E.Expr -> Maybe TypedExpr infer e = case e of ... r1 E.:==: r2 -> do e1 ::: t1 <- infer r1 e2 ::: t2 <- infer r2 Refl <- t1 =?= t2 return (e1 :==: e2 ::: TBool)
Inferring types for branches
We hint GHC that the guard needs to be boolean (by applying pattern-matching), and that the branches have the same type (by applying function
(=?=)
and pattern-matching on its result).infer :: E.Expr -> Maybe TypedExpr infer e = case e of ... E.If r1 r2 r3 -> do e1 ::: TBool <- infer r1 e2 ::: t2 <- infer r2 e3 ::: t3 <- infer r3 Refl <- t2 =?= t3 return (If e1 e2 e3 ::: t2)
Type checking and evaluation
We can do type checking by inferring a type and comparing it to the type we expect.
check :: E.Expr -> Type t -> Maybe (Expr t) check r t = do e ::: t' <- infer r Refl <- t' =?= t return e
Evaluation
As we did before, we first type-check an expression and then run it.
evalT :: E.Expr -> Maybe E.Value
Recall that
E.Value
has the constructorsVBool Bool
andVInt Int
.In this light, if an expression
E.Expr
has typeType Int
, thenevalT
should returnVInt n
(for somen
).evalTI :: E.Expr -> Maybe E.Value evalTI e = do i <- check e TInt return (E.VInt $ eval i)
Observe the use of
TInt
to hint GHC's type system.Similarly, we have a function for expression with type
Type Bool
.evalTB :: E.Expr -> Maybe E.Value evalTB e = do b <- check e TBool return (E.VBool $ eval b)
Observe that we are calling
eval
from our original approach (the one that might report type-errors). However, this function is not going to raise any of them!Function
evalT
simply tries to type expressions asType Bool
orType Int
and returns the appropriated value.evalT :: E.Expr -> Maybe E.Value evalT e = evalTB e `mplus` evalTI e
Summary
Adding type-inference / type-checking to DSLs
- Avoid constructing semantically bad entities
- No type-errors reported by the run function
GADTs
- It allows to express some invariants about how constructors and combinators should work (e.g., the guard of a branch is a boolean expression)
We verify that entities built with the DSL interface are well-typed, i.e., they accommodate to the GADTs invariants
We implement type-inference for the elements built with the DSL interface
We use existential types to give a type to the inference function
We use singleton types to hint GHC's type-system about the expected types for the GADTs elements
We use a proof of equality to hint GHC's type-system when two GADTs values should have the same type.