Library STypes

Types: Type Systems

From Software Foundations, modified by Alberto Momigliano to increase automation.
Our next major topic is type systems -- static program analyses that classify expressions according to the "shapes" of their results. We'll restrict to a typed version of the simplest imaginable language, to introduce the basic ideas of types and typing rules and the fundamental theorems about type systems: type preservation and progress.

Set Warnings "-notation-overridden,-parsing,-deprecated-hint-without-locality".
From Coq Require Import Arith.Arith.
From Hammer Require Import Hammer.

Typed Arithmetic Expressions

To motivate the discussion of type systems, let's begin as usual with a tiny toy language. We want it to have the potential for programs to go wrong because of runtime type errors: just two kinds (numbers and booleans) gives us enough material to tell an interesting story.

Syntax

Here is the syntax, informally:
    t ::= true
        | false
        | if t then t else t
        | 0
        | succ 0
        | pred t
        | ? t
And here it is formally:

Inductive tm : Type :=
  | tru : tm
  | fls : tm
  | ite : tm -> tm -> tm -> tm
  | zro : tm
  | scc : tm -> tm
  | prd : tm -> tm
  | iszro : tm -> tm.

Some parsing/pretty printing magic you can ignore

Declare Custom Entry tm.
Declare Scope tm_scope.
Notation "'true'" := true (at level 1): tm_scope.
Notation "'true'" := (tru) (in custom tm at level 0): tm_scope.
Notation "'false'" := false (at level 1): tm_scope.
Notation "'false'" := (fls) (in custom tm at level 0): tm_scope.
Notation "<{ e }>" := e (e custom tm at level 99): tm_scope.
Notation "( x )" := x (in custom tm, x at level 99): tm_scope.
Notation "x" := x (in custom tm at level 0, x constr at level 0): tm_scope.
Notation "'0'" := (zro) (in custom tm at level 0): tm_scope.
Notation "'0'" := 0 (at level 1): tm_scope.
Notation "'succ' x" := (scc x) (in custom tm at level 90, x custom tm at level 80): tm_scope.
Notation "'pred' x" := (prd x) (in custom tm at level 90, x custom tm at level 80): tm_scope.
Notation "'iszero' x" := (iszro x) (in custom tm at level 80, x custom tm at level 70): tm_scope.
Notation "'if' c 'then' t 'else' e" := (ite c t e)
                 (in custom tm at level 90, c custom tm at level 80,
                  t custom tm at level 80, e custom tm at level 80): tm_scope.
Local Open Scope tm_scope.

Values are the observable of our computations. Here we have boolean <{true}>, <{false}>, and numeric values...

Inductive bvalue : tm -> Prop :=
  | bv_True : bvalue <{ true }>
  | bv_false : bvalue <{ false }>.

Inductive nvalue : tm -> Prop :=
  | nv_0 : nvalue <{ 0 }>
  | nv_succ : forall t, nvalue t -> nvalue <{ succ t }>.

Definition value (t : tm) := bvalue t \/ nvalue t.
 Hint Unfold value : core.

Operational Semantics

Here is the single-step relation, first informally...
   (ST_IfTrue)  

if true then t1 else t2 --> t1
   (ST_IfFalse)  

if false then t1 else t2 --> t2
t1 --> t1' (ST_If)  

if t1 then t2 else t3 --> if t1' then t2 else t3
t1 --> t1' (ST_Succ)  

succ t1 --> succ t1'
   (ST_Pred0)  

pred 0 --> 0
numeric value v (ST_PredSucc)  

pred (succ v) --> v
t1 --> t1' (ST_Pred)  

pred t1 --> pred t1'
   (ST_IsZero0)  

iszero 0 --> true
numeric value v (ST_IszeroSucc)  

iszero (succ v) --> false
t1 --> t1' (ST_Iszero)  

iszero t1 --> iszero t1'
... and then formally:

Reserved Notation "t '-->' t'" (at level 40).

Inductive step : tm -> tm -> Prop :=
  | ST_IfTrue : forall t1 t2,
      <{ if true then t1 else t2 }> --> t1
  | ST_IfFalse : forall t1 t2,
      <{ if false then t1 else t2 }> --> t2
  | ST_If : forall c c' t2 t3,
      c --> c' ->
      <{ if c then t2 else t3 }> --> <{ if c' then t2 else t3 }>
  | ST_Succ : forall t1 t1',
      t1 --> t1' ->
      <{ succ t1 }> --> <{ succ t1' }>
  | ST_Pred0 :
      <{ pred 0 }> --> <{ 0 }>
  | ST_PredSucc : forall v,
      nvalue v ->
      <{ pred (succ v) }> --> v
  | ST_Pred : forall t1 t1',
      t1 --> t1' ->
      <{ pred t1 }> --> <{ pred t1' }>
  | ST_Iszero0 :
      <{ iszero 0 }> --> <{ true }>
  | ST_IszeroSucc : forall v,
       nvalue v ->
      <{ iszero (succ v) }> --> <{ false }>
  | ST_Iszero : forall t1 t1',
      t1 --> t1' ->
      <{ iszero t1 }> --> <{ iszero t1' }>

where "t '-->' t'" := (step t t').

Notice that the step relation doesn't care about whether the expression being stepped makes global sense -- it just checks that the operation in the next reduction step is being applied to the right kinds of operands. For example, the term <{ succ true }> cannot take a step, but the almost as obviously nonsensical term
[ <{ succ (if true then true else true) }>]
can take a step (once, before becoming stuck).

Normal Forms vs Values

The first interesting thing to notice is that there are terms that are normal forms (they can't take a step) but are not values (they are not included in our definition of possible "results of reduction"). Such terms are said to be stuck.

Definition stuck (t : tm) : Prop :=
 ~ (exists t', step t t' ) /\ ~ value t.

Example some_term_is_stuck :
  exists t, stuck t.
Proof.
  exists <{ succ false }>.
  sauto lq:on unfold: stuck.
Qed.

However, although values and normal forms are not the same in this language, the set of values is a subset of the set of normal forms. This is important because it shows we did not accidentally define things so that some value could still take a step.

Lemma value_is_nf : forall t,
    value t -> ~ exists t', step t t'.
  Proof.
  intros.
  destruct H.
  - sauto lq:on .
  - induction H;
      sauto lq: on .
Qed.

Determinism of the step relation

Although this is irrelevant wrt type soundness, it's always a good idea to check that the step relation is deterministic. This proof, by induction on the first judgment and inversion on the second is rather brittle to automate and we supplement sauto with some Ltac code to remove some impossible inversion cases.

Ltac solve_by_inverts n :=
  match goal with | H : ?T |- _ =>
  match type of T with Prop =>
    solve [
      inversion H;
      match n with S (S (?n')) => subst; solve_by_inverts (S n') end ]
  end
  end.

Theorem step_deterministic:
  forall x y1: tm, x --> y1 -> forall y2, x --> y2 -> y1 = y2.
Proof.
  intros x y1 Hy1.
  induction Hy1;
    intros y2 Hy2; inversion Hy2; subst;
    try solve_by_inverts 1;
    sfirstorder use: value_is_nf, nv_succ unfold: value.
Qed.

Typing

The next critical observation is that, although this language has stuck terms, they are always nonsensical, mixing booleans and numbers in a way that we don't even want to have a meaning. We can easily exclude such ill-typed terms by defining a typing relation that relates terms to the types (either numeric or boolean) of their final results.

Inductive ty : Type :=
  | Bool : ty
  | Nat : ty.

In informal notation, the typing relation is often written |- t \in T and pronounced "t has type T." The |- symbol is called a "turnstile." There are richer typing relations where one or more additional "context" arguments are written to the left of the turnstile. Here, the context is always empty.
   (T_True)  

|- true \in Bool
   (T_False)  

|- false \in Bool
|- t1 \in Bool    |- t2 \in T    |- t3 \in T (T_If)  

|- if t1 then t2 else t3 \in T
   (T_0)  

|- 0 \in Nat
|- t1 \in Nat (T_Succ)  

|- succ t1 \in Nat
|- t1 \in Nat (T_Pred)  

|- pred t1 \in Nat
|- t1 \in Nat (T_Iszero)  

|- iszero t1 \in Bool

Reserved Notation "'|-' t '\in' T" (at level 40).

Inductive has_type : tm -> ty -> Prop :=
  | T_True :
       |- <{ true }> \in Bool
  | T_False :
       |- <{ false }> \in Bool
  | T_If : forall t1 t2 t3 T,
       |- t1 \in Bool ->
       |- t2 \in T ->
       |- t3 \in T ->
       |- <{ if t1 then t2 else t3 }> \in T
  | T_0 :
       |- <{ 0 }> \in Nat
  | T_Succ : forall t1,
       |- t1 \in Nat ->
       |- <{ succ t1 }> \in Nat
  | T_Pred : forall t1,
       |- t1 \in Nat ->
       |- <{ pred t1 }> \in Nat
  | T_Iszero : forall t1,
       |- t1 \in Nat ->
       |- <{ iszero t1 }> \in Bool

where "'|-' t '\in' T" := (has_type t T).

Example has_type_1 :
  |- <{ if false then 0 else (succ 0) }> \in Nat.
Proof.
  sauto.
Qed.

It's important to realize that the typing relation is a conservative (or static) approximation of the operational semantics: it does not consider what happens when the term is reduced.
Some examples:

Example succ_hastype_nat_hastype_nat : forall t,
  |- <{succ t}> \in Nat ->
  |- t \in Nat.
Proof.
  sauto lq: on.
Qed.

Example has_type_not :
  ~ ( |- <{ if false then 0 else true}> \in Bool ).
Proof.
  sauto lq: on.
Qed.


Canonical forms

The following two lemmas capture the fundamental property that the definitions of boolean and numeric values agree with the typing relation.

Lemma bool_canonical : forall t,
  |- t \in Bool -> value t -> bvalue t.
Proof.
  sauto.
Qed.

Lemma nat_canonical : forall t,
  |- t \in Nat -> value t -> nvalue t.
Proof.
  sauto.
Qed.

Progress

The typing relation enjoys two critical properties. The first is that well-typed normal forms are not stuck -- or conversely, if a term is well typed, then either it is a value or it can take at least one step. We call this progress.

Ltac progt :=
  match goal with |h: ?H1 \/ ?H2|- _ =>
                     destruct h; sauto use:bool_canonical, nat_canonical
  end.

Theorem progress : forall t T,
  |- t \in T ->
  value t \/ exists t', t --> t'.
Proof.
  intros t T HT.
  induction HT; auto using bvalue, nvalue;
  try clear IHHT2 IHHT3; progt.
Qed.

A sketch of the informal proof (some cases only):
Theorem: If |- t \in T, then either t is a value or else t --> t' for some t'.
Proof: By induction on a derivation of |- t \in T.
  • If the last rule in the derivation is T_If, then t = if t1 then t2 else t3, with |- t1 \in Bool, |- t2 \in T and |- t3 \in T. By the IH, either t1 is a value or else t1 can step to some t1'.
    • If t1 is a value, then by the canonical forms lemmas and the fact that |- t1 \in Bool we have that t1 is a bvalue -- i.e., it is either true or false. If t1 = true, then t steps to t2 by ST_IfTrue, while if t1 = false, then t steps to t3 by ST_IfFalse. Either way, t can step, which is what we wanted to show.
    • If t1 itself can take a step, then, by ST_If, so can t.
  • as they say: the other cases are similar ...


Type Preservation

The second critical property of typing is that, when a well-typed term takes a step, the result is a well-typed term (of the same type).

Theorem preservation : forall t t' T,
  |- t \in T ->
  t --> t' ->
  |- t' \in T.
Proof.
  intros t t' T HT HE.
  generalize dependent t'.
  induction HT;
    sauto lq: on.
Qed.

We can prove the same property again, by induction on the evaluation derivation instead of on the typing derivation. .

Theorem preservation' : forall t t' T,
  |- t \in T ->
  t --> t' ->
  |- t' \in T.
Proof.
  intros t t' T HT HE.
    generalize dependent T.
    induction HE;
      sauto lq: on.
Qed.

Type Soundness

Putting progress and preservation together, we see that a well-typed term can never reach a stuck state.

Definition relation (X : Type) := X -> X -> Prop.

Inductive multi {X : Type} (R : relation X) : relation X :=
  | multi_refl : forall (x : X), multi R x x
  | multi_step : forall (x y z : X),
                    R x y ->
                    multi R y z ->
                    multi R x z.

Notation "t1 '-->*' t2" := (multi step t1 t2) (at level 40).

Corollary mpreservation : forall t t' T,
  |- t \in T ->
  t -->* t' ->
  |- t' \in T.
Proof.
  induction 2;
    sauto use: preservation.
  Qed.

Corollary soundness : forall t t' T,
  |- t \in T ->
  t -->* t' ->
  ~(stuck t').
Proof.
  induction 2;
        hauto lq: on use: progress, preservation unfold: stuck.
Qed.