Vouivre Digital

Dependent type infrastructure for machine learning

2024/06/12 – Dependent Type System Update

Two months have pasted since the last update. If you are still here, it's probably to complain about the dependent type system infrastructure that doesn't support dependent types. No more, because here it comes → Π. et le but! Follow along by downloading the latest files, vouivre/misc.scm, vouivre/system.scm, and vouivre/rules.scm, from the wip branch. You will see the following new types: dependent functions `Π', dependent pairs `Σ', identities `=', natural numbers `ℕ', generalized booleans `Bool', lists `List', and vectors/arrays `Vector'. It also includes derivations for: the ordinary function type `→' (yes, that's right, it's now derivable), the constant function `constant', the identity type inverse operation `=-inverse', the addition function on natural numbers `add', and the length of lists `length'. Below, we provide an overview of the salient features.

Dependent Functions

Maybe (sometimes?) after a long night of programming you start having dreams about functions with types that check although the elements don't... like a function to add two vectors, but no matter how hard you try, the type can't make the difference if the two inputs have the same length or not... the stuff of nightmares, really. Ordinary function sleep is a serious condition, and here's what can be done about it.

The formation, introduction, and elimination rules for dependent functions are given by:

scheme@(guile-user)> (define-rule Π (Γ ⊢ (type (Π (: x A) (B x)))) (Γ (: x A) ⊢ (type (B x))))
scheme@(guile-user)> (define-rule λ
                       (Γ ⊢ (: (λ x (b x)) (Π (: x A) (B x))))
                       (Γ (: x A) ⊢ (: (b x) (B x))))
scheme@(guile-user)> (define-rule evaluation
                       (Γ (: x A) ⊢ (: (f x) (B x)))
                       (Γ ⊢ (: f (Π (: x A) (B x)))))

What was the codomain for ordinary function is now a type (B x) dependent on the element x of A.

TIP: The "literals" argument to `define-rule' was removed, and judgments are no longer nested in a superfluous nesting level.

The corresponding `define-rule' for ordinary functions is no longer needed, and instead we derive:

scheme@(guile-user)> (define-derivation → ((Γ ⊢ (:= (→ A B) (type (Π (: x A) B))))
                                           (Γ ⊢ (type A))
                                           (Γ ⊢ (type B)))
                       (Π)
                       (weakening)
                       (exact)
                       (exact))
scheme@(guile-user)> (define-derivation →-λ ((Γ ⊢ (: (λ x (b x)) (→ A B)))
                                             (Γ (: x A) ⊢ (: (b x) B))
                                             (Γ ⊢ (type A))
                                             (Γ ⊢ (type B)))
                       (element-conversion A (Π (: x A) B))
                       (λ)
                       (exact)
                       (type-symmetric)
                       (definition)
                       (exact)
                       (exact))
scheme@(guile-user)> (define-derivation →-evaluation ((Γ (: x A) ⊢ (: (f x) B))
                                                      (Γ ⊢ (: f (→ A B)))
                                                      (Γ ⊢ (type A))
                                                      (Γ ⊢ (type B)))
                       (evaluation f f)
                       (element-conversion A (→ A B))
                       (exact)
                       (definition)
                       (exact)
                       (exact))

TIP: To evaluate along, derivations like `weakening' and other rules/derivations we talked about in the previous post need to be declared. Therefore, whenever you get an error like "parsing error: define-derivation: rule RULE is undefined." Simply copy the corresponding RULE from rules.scm into the REPL.

The new syntax `(:= X Y)' tells the system that, upon successful derivation of Y, X should be defined to be judgmentally equal to Y. Effectively, the system adds for us the formation rule for `→' with the same name and a judgmental type equality rule available at appropriate points under the name "definition", as you will see shortly.

TIP: The same syntax can be used to define named constants as elements of a type. See rules.scm for an example.

Natural Numbers

While the natural numbers don't form a dependent type, the induction principle does, and thus many familiar functions involving the natural numbers are derived using all the notions we want to talk about today: dependent types, derived (ordinary function) types, and proof by induction. Follows the formation, introduction, elimination, and computation rules for ℕ.

scheme@(guile-user)> (define-rule ℕ-formation (⊢ (type ℕ)))
scheme@(guile-user)> (define-rule ℕ-introduction-1 (⊢ (: 0 ℕ)))
scheme@(guile-user)> (define-rule ℕ-introduction-2 (⊢ (: succ (→ ℕ ℕ))))
scheme@(guile-user)> (define-rule ℕ-elimination
                       (Γ ⊢ (: (ind-ℕ p0 ps) (Π (: n ℕ) (P n))))
                       (Γ (: n ℕ) ⊢ (type (P n)))
                       (Γ ⊢ (: p0 (P 0)))
                       (Γ ⊢ (: ps (Π (: n ℕ) (→ (P n) (P (succ n)))))))
scheme@(guile-user)> (define-rule ℕ-computation-1
                       (Γ ⊢ (≐ ((ind-ℕ p0 ps) 0) p0 (P 0)))
                       (Γ (: n ℕ) ⊢ (type (P n)))
                       (Γ ⊢ (: p0 (P 0)))
                       (Γ ⊢ (: ps (Π (: n ℕ) (→ (P n) (P (succ n)))))))
scheme@(guile-user)> (define-rule ℕ-computation-2
                       (Γ (: n ℕ) ⊢ (≐ ((ind-ℕ p0 ps) (succ n))
                                       (ps n ((ind-ℕ p0 ps) n))
                                       (P (succ n))))
                       (Γ (: n ℕ) ⊢ (type (P n)))
                       (Γ ⊢ (: p0 (P 0)))
                       (Γ ⊢ (: ps (Π (: n ℕ) (→ (P n) (P (succ n)))))))

The formation rule says that ℕ is a type without artifice. Then, we introduce zero, and the successor function, which takes a natural number to the next. In the last sentence, the wording is indicative of intention only. That is, we state what these constructors turn out to be. That zero has no predecessor or that all natural numbers have a successor needs to be derived. Next, we have the elimination rule stated in terms of a type family over the natural numbers. A property of the natural numbers, for short. It says that if we have a witness of the given property on zero and a function that, for all natural numbers, takes a witness of the property on that number to a witness on its successor, then we have a witness of the property for all naturals, a.k.a. induction. The computation rules state that induction gives the expected results on zero and on a successor. We have everything we need to derive the addition on natural numbers. Hold on to your sombrero.

scheme@(guile-user)> (define-derivation add (((: m ℕ) ⊢ (: (ind-ℕ m ((λ n succ) m)) (→ ℕ ℕ))))
                       (element-conversion #:start 1 A (Π (: x ℕ) ℕ))
                       (ℕ-elimination #:start 1 (P 0) ℕ (P n) ℕ (P (succ n)) ℕ)
                       (weakening #:start 1)
                       (weakening)
                       (ℕ-formation)
                       (ℕ-formation)
                       (weakening)
                       (ℕ-formation)
                       (ℕ-formation)
                       (generic-element)
                       (ℕ-formation)
                       (element-conversion #:start 1 A (→ ℕ ℕ))
                       (→-evaluation f (λ n succ))
                       (→-λ)
                       (weakening)
                       (ℕ-formation)
                       (ℕ-introduction-2)
                       (ℕ-formation)
                       (→)
                       (ℕ-formation)
                       (ℕ-formation)
                       (ℕ-formation)
                       (→)
                       (ℕ-formation)
                       (ℕ-formation)
                       (definition #:start 1)
                       (weakening)
                       (ℕ-formation)
                       (ℕ-formation)
                       (weakening)
                       (ℕ-formation)
                       (ℕ-formation)
                       (weakening)
                       (ℕ-formation)
                       (type-symmetric)
                       (definition)
                       (ℕ-formation)
                       (ℕ-formation))

Element conversion is used to change between → and Π types. This works as long as we can prove that these two types are judgmentally equal (a hypothesis of element conversion). This is the case by definition! Remember earlier when we used the symbol `:=' in `(:= (→ A B) (type (Π (: x A) B)))'? The system created a new rule for `(≐ (→ A B) (Π (: x A) B))' and we invoke it in this derivation with "definition". The system will know what to do based on the type or throw an error if the judgmental equality doesn't exist. For the first use of "definition", the types come out in the right order, but for the second one, we need to use `type-symmetric'. `ℕ-elimination', induction on ℕ, is the other novel aspect of this derivation.

The formation and introduction rules for vectors of verifiable lengths are:

scheme@(guile-user)> (define-rule vector-formation
                       (Γ ⊢ (type (Vector A n)))
                       (Γ ⊢ (type A))
                       (Γ ⊢ (: n ℕ)))
scheme@(guile-user)> (define-rule vector-introduction-1 (Γ ⊢ (: #() (Vector A 0))) (Γ ⊢ (type A)))
scheme@(guile-user)> (define-rule vector-introduction-2
                       (Γ ⊢ (: :: (Π (: n ℕ)
                                     (→ A (→ (Vector A n)
                                             (Vector A (succ n)))))))
                       (Γ ⊢ (type A)))

as you can see, things are really coming together in the last rule with a dependent function on the natural numbers, a polymorphic type, and types for vectors of different lengths.

2024/04/04 – Type System Update

It has been a while since the last update, but this is a good one. First, you can visit this news section to stay up-to-date on the latest developments. Second, use the Git repository to access the latest source code. Clone with git://git.vouivredigital.com/vouivre.git. Third, the main topic of this post, progress towards a dependent-type system worthy of its name.

Your Type, My Type, Everyone's Type

The type system presented in the last release is just a placeholder for something better. Starting today, the wip branch enables cool new features:

There are two pieces: system.scm and rules.scm. The former contains meta stuff to create type systems, while the latter defines the type system presented in E. Rijke, Introduction to Homotopy Type Theory. 2022. We only go over some examples here; look at the code for more. Here are six rules presented in the book and defined with our system that we will use next:

scheme@(guile-user)> (define-rule formation-3 () ((Γ ⊢ (type B)) (Γ ⊢ (≐ A B))))
scheme@(guile-user)> (define-rule type-symmetric () ((Γ ⊢ (≐ B A)) (Γ ⊢ (≐ A B))))
scheme@(guile-user)> (define-rule variable-conversion ()
                       (Γ (: x A') Δ ⊢ J)
                       (Γ ⊢ (≐ A A'))
                       (Γ (: x A) Δ ⊢ J))
scheme@(guile-user)> (define-rule substitution ()
                       (Γ (/ Δ a x) ⊢ (/ J a x))
                       (Γ ⊢ (: a A))
                       (Γ (: x A) Δ ⊢ J))
scheme@(guile-user)> (define-rule weakening () ((Γ (: x A) Δ ⊢ J) (Γ ⊢ (type A)) (Γ Δ ⊢ J)))
scheme@(guile-user)> (define-rule generic-element () ((Γ (: x A) ⊢ (: x A)) (Γ ⊢ (type A))))

The syntax is simple. To define a rule, we must give it a name, a list of literals (ignored at the moment), and a s-expression with the conclusion at the root and the hypotheses as leaves. The four kinds of judgments are expressed using the reserved symbols `:', `≐', `⊢', and `type'. Substitution uses the reserved symbol `/', and the generic judgment thesis (`J' here) can be any unreserved symbol as long as it is used as above. For example, the formation-3 rule says that, given two judgmentally equal types `A' and `B', we conclude that `B' is a type.

Now, let's see how to derive the "element conversion rule" that says: "In some context, if you have an element of a type and that type is judgmentally equal to another, then, in conclusion, the element must also be an element of that other type." We start by creating a "derivation" with the conclusion:

scheme@(guile-user)> (d '(Γ ⊢ (: a A')))
* = #<derivation
      tree: ((Γ ⊢ (: a A')))
      points: (((Γ ⊢ (: a A'))))>

A derivation consists of a tree and a list of derivation points, i.e., incomplete points of the derivation. Then, we apply various rules, thus expanding the derivation tree and adding new points. When we achieve a hypothesis that suits our needs, we invoke `exact' to terminate the current derivation point and move on to the next one:

scheme@(guile-user)> (apply-rule * 'substitution 'a 'a 'x 'a' 'A 'A)
* = #<derivation
      tree: ((Γ ⊢ (: a A')) ((Γ ⊢ (: a A))) ((Γ (: a' A) ⊢ (: a' A'))))
      points: (((Γ ⊢ (: a A))) ((Γ (: a' A) ⊢ (: a' A'))))>
scheme@(guile-user)> (apply-rule * 'exact)
* = #<derivation
      tree: ((Γ ⊢ (: a A')) ((Γ ⊢ (: a A)) . #{.}#) ((Γ (: a' A) ⊢ (: a' A'))))
      points: (((Γ (: a' A) ⊢ (: a' A'))))>
scheme@(guile-user)> (apply-rule * 'variable-conversion #:index 1 'A 'A')
* = #<derivation
      tree: ((Γ ⊢ (: a A'))
                 ((Γ ⊢ (: a A)) . #{.}#)
                 ((Γ (: a' A) ⊢ (: a' A'))
                      ((Γ ⊢ (≐ A' A)))
                      ((Γ (: a' A') ⊢ (: a' A')))))
      points: (((Γ ⊢ (≐ A' A))) ((Γ (: a' A') ⊢ (: a' A'))))>
scheme@(guile-user)> (apply-rule * 'type-symmetric)
* = #<derivation
      tree: ((Γ ⊢ (: a A'))
                 ((Γ ⊢ (: a A)) . #{.}#)
                 ((Γ (: a' A) ⊢ (: a' A'))
                      ((Γ ⊢ (≐ A' A)) ((Γ ⊢ (≐ A A'))))
                      ((Γ (: a' A') ⊢ (: a' A')))))
      points: (((Γ ⊢ (≐ A A'))) ((Γ (: a' A') ⊢ (: a' A'))))>
scheme@(guile-user)> (apply-rule * 'exact)
* = #<derivation
      tree: ((Γ ⊢ (: a A'))
                 ((Γ ⊢ (: a A)) . #{.}#)
                 ((Γ (: a' A) ⊢ (: a' A'))
                      ((Γ ⊢ (≐ A' A)) ((Γ ⊢ (≐ A A')) . #{.}#))
                      ((Γ (: a' A') ⊢ (: a' A')))))
      points: (((Γ (: a' A') ⊢ (: a' A'))))>
scheme@(guile-user)> (apply-rule * 'generic-element #:index 1)
* = #<derivation
      tree: ((Γ ⊢ (: a A'))
                 ((Γ ⊢ (: a A)) . #{.}#)
                 ((Γ (: a' A) ⊢ (: a' A'))
                      ((Γ ⊢ (≐ A' A)) ((Γ ⊢ (≐ A A')) . #{.}#))
                      ((Γ (: a' A') ⊢ (: a' A')) ((Γ ⊢ (type A'))))))
      points: (((Γ ⊢ (type A'))))>
scheme@(guile-user)> (apply-rule * 'formation-3 'A 'A)
* = #<derivation
      tree: ((Γ ⊢ (: a A'))
                 ((Γ ⊢ (: a A)) . #{.}#)
                 ((Γ (: a' A) ⊢ (: a' A'))
                      ((Γ ⊢ (≐ A' A)) ((Γ ⊢ (≐ A A')) . #{.}#))
                      ((Γ (: a' A') ⊢ (: a' A'))
                           ((Γ ⊢ (type A')) ((Γ ⊢ (≐ A A')))))))
      points: (((Γ ⊢ (≐ A A'))))>
scheme@(guile-user)> (apply-rule * 'exact)
* = #<derivation
      tree: ((Γ ⊢ (: a A'))
                 ((Γ ⊢ (: a A)) . #{.}#)
                 ((Γ (: a' A) ⊢ (: a' A'))
                      ((Γ ⊢ (≐ A' A)) ((Γ ⊢ (≐ A A')) . #{.}#))
                      ((Γ (: a' A') ⊢ (: a' A'))
                           ((Γ ⊢ (type A')) ((Γ ⊢ (≐ A A')) . #{.}#)))))
      points: ()>

When no points remain, we are done. We can summarize, using the same format as `define-rule', the hypotheses that lead to the conclusion:

scheme@(guile-user)> (list (conclusion *) (list-hypotheses *))
* = ((Γ ⊢ (: a A')) ((Γ ⊢ (: a A)) (Γ ⊢ (≐ A A'))))

Finally, we can internalize a derivation as a new rule with a name that can then be used in other derivations:

scheme@(guile-user)> (define-derivation element-conversion ((Γ ⊢ (: a A'))
                                                            (Γ ⊢ (: a A))
                                                            (Γ ⊢ (≐ A A')))
                       (substitution a a x a' A A)
                       (exact)
                       (variable-conversion #:index 1 A A')
                       (type-symmetric)
                       (exact)
                       (generic-element #:index 1)
                       (formation-3 A A)
                       (exact))

The difference between `define-rule' and `define-derivation' is that the former interns a fundamental rule that cannot be derived from anything else, whereas the latter checks, prior to adding a new rule, that the provided conclusion is derivable from the listed hypotheses.

Bools and Whistles

In this section, we show how to create common non-dependent programming constructs (these will be generalized once dependent types are fully working to the corresponding type indicated in between parenthesis): ordinary functions (dependent product type), booleans, pairs (dependent sum type), and records (dependent records). To define these, we need four kinds of rules: formation, introduction, elimination, and computation. Often, these come associated with some congruence rules, ignored in this short exposé.

Ordinary Functions

Ordinary functions are your usual function type in languages like Haskell. It has one rule of formation, introduction, and elimination and two computation rules:

scheme@(guile-user)> (define-rule → () ((Γ ⊢ (type (→ A B))) (Γ ⊢ (type A)) (Γ ⊢ (type B))))
scheme@(guile-user)> (define-rule λ ()
                       (Γ ⊢ (: (λ x (b x)) (→ A B)))
                       (Γ ⊢ (type B))
                       (Γ (: x A) ⊢ (: (b x) B)))
scheme@(guile-user)> (define-rule ev () ((Γ (: x A) ⊢ (: (f x) B)) (Γ ⊢ (: f (→ A B)))))
scheme@(guile-user)> (define-rule β ()
                       (Γ (: x A) ⊢ (≐ ((λ y (b y)) x) (b x) B))
                       (Γ ⊢ (type B))
                       (Γ (: x A) ⊢ (: (b x) B)))
scheme@(guile-user)> (define-rule η () ((Γ ⊢ (≐ (λ x (f x)) f (→ A B))) (Γ ⊢ (: f (→ A B)))))

These (or their dependent generalizations) are used ubiquitously in type theory. See the source code for a derivation of the ordinary identity function.

Booleans

scheme@(guile-user)> (define-rule bool-formation () ((⊢ (type Bool))))
scheme@(guile-user)> (define-rule bool-introduction-1 () ((⊢ (: true Bool))))
scheme@(guile-user)> (define-rule bool-introduction-2 () ((⊢ (: false Bool))))
scheme@(guile-user)> (define-rule bool-elimination ()
                       (⊢ (: if (→ Bool (→ A (→ A A)))))
                       (⊢ (type A)))
scheme@(guile-user)> (define-rule bool-computation-1 () ((⊢ (≐ (((if true) x) y) x Bool))))
scheme@(guile-user)> (define-rule bool-computation-2 () ((⊢ (≐ (((if false) x) y) y Bool))))

Now we can derive the `not' function!

scheme@(guile-user)> (define-derivation not ((⊢ (: (λ x (((if x) false) true)) (→ Bool Bool))))
                       (λ)
                       (bool-formation)
                       (substitution a true x y A Bool)
                       (weakening #:index 0)
                       (bool-formation)
                       (bool-introduction-1)
                       (ev #:index 1)
                       (substitution a false x y A Bool)
                       (weakening #:index 0)
                       (bool-formation)
                       (bool-introduction-2)
                       (ev #:index 1)
                       (ev #:index 0)
                       (bool-elimination)
                       (bool-formation))

Pairs

scheme@(guile-user)> (define-rule pair-formation ()
                       (⊢ (type (Pair A B)))
                       (⊢ (type A))
                       (⊢ (type B)))
scheme@(guile-user)> (define-rule pair-introduction ()
                       (⊢ (: (pair a b) (Pair A B)))
                       (⊢ (: a A))
                       (⊢ (: b B)))
scheme@(guile-user)> (define-rule pair-elimination-1 ()
                       (⊢ (: first (→ (Pair A B) A)))
                       (⊢ (type A))
                       (⊢ (type B)))
scheme@(guile-user)> (define-rule pair-elimination-2 ()
                       (⊢ (: second (→ (Pair A B) B)))
                       (⊢ (type A))
                       (⊢ (type B)))
scheme@(guile-user)> (define-rule pair-computation-1 ()
                       (⊢ (≐ (first (pair a b)) a A))
                       (⊢ (: a A))
                       (⊢ (: b B)))
scheme@(guile-user)> (define-rule pair-computation-2 ()
                       (⊢ (≐ (second (pair a b)) b B))
                       (⊢ (: a A))
                       (⊢ (: b B)))

Records

Record types can be created in the same way as pairs, with the intended types and accessors for each slot.