Hoof! Lots of papers, lots of writing!

Taking a peek into the Firefox verified crypto, and a bunch of Coq libraries and frameworks.

I looked at the paper on HACL*, that fancy-pants verified cryptographic library in Firefox Quantum.

There were some concerns, like those voiced by Perry Metzger. As there should be! Paranoia is great!

Basically there was some conversation on the #coq IRC channel which lead to a little bit of investigating. I don’t really remember how it started, but there were some questions about undefined behaviours in CompCert. Using my trusty (rusty?) memory of the DSSS17 lectures, I was pretty sure that Xavier Leroy mentioned that CompCert will make assumptions about undefined behaviour, which makes sense, because of course it does! Every compiler has to.

Anyway, in this particular case we were wondering about the semantics of CompCert with respect to signed integer overflow. I quickly pulled up the link to the manual which happens to say “yep, CompCert wraps the result!”

http://compcert.inria.fr/man/manual004.html

Perry pointed out that LLVM definitely doesn’t make this same assumption on higher optimization levels, and since the CompCert semantics might differ from those of another compiler with respect to undefined behaviour, you could verify code against CompCert’s semantics and just completely ruin your guarantees by compiling with another compiler. Obviously this is the case anyway, as Clang doesn’t necessarily produce correct code, but it’s a little scary to consider that differences in the treatment of undefined behaviours could also result in issues even if Clang did produce correct code according to the spec.

So, who knows, maybe there’s a bug in HACL* because of this! Although, I have heard tell that the developers avoided signed integers, there could be something similar somewhere!

Additionally after going through the paper, I was a little bit surprised! After seeing claims of a “formally verified” cryptographic library in Firefox, I had initially made the assumption that this was an end-to-end proof of correctness, with a small trusted codebase (like the F* kernel, or whatever).

This isn’t quite the case, though! The paper itself is quick to point this out — the trusted codebase is still pretty large, at least compared to that ideal.

HACL* is implemented in a DSL embedded in F* called low*. This low* DSL is related to CompCert’s Clight by a manual proof, and the extraction of low* goes through a compiler called Kremlin, which is probably a pretty straightforward pass, but it’s not proven correct. There could be mistakes in this, there could be mistakes in the manual proof linking low* to the semantics of Clight.

This is still a great development. It’s certainly much MUCH better than trusting an entire large C codebase. There’s still lots of work that we can do to make this project even better, though!

I reviewed a couple of papers this week. The paper on Fiat, and a paper on using Fiat for writing fast Haskell code.

Fiat is refinement framework in Coq. The initial paper documents its usage in verifying abstract data types, and in particular query structures, which may be used in an SQL like way.

The basic idea of refinement is that you start out with a specification, and then through various small steps you refine that specification into an implementation by replacing subterms of the program. Each small step will preserve the behavior of the previous specification / artifact from the previous step. Equivalence in this case is the usual “the resulting program is at least as deterministic”, i.e., any possible behaviour of the newly refined program must be a subset of the behaviours from the previous step. A refined program introduces *no additional behaviours*.

In Fiat much of the refinement rules can be automated, and since refinement is transitive a proof that the final program implements the initial spec falls out naturally. This is what is meant by “correct by construction”.

Refinements interact really nicely with monads in Fiat. This is what you would expect. If you have a command `a`

which refines to `aᵣ`

and you have a sequence `x ← a;b(x)`

, then `x ← aᵣ;b(x)`

is obviously a refinement of this as well.

Similarly, if you know that `bᵣ(x)`

is a refinement of `b(x)`

, then `x ← a;bᵣ(x)`

is a refinement of `x ← a;b(x)`

.

This is great, because it means you should be able to refine different portions of a monadic computation independently of one another, which should allow you to build up an imperative / procedural program, which is correct by construction, with as little effort as possible.

Refined computations are expressed as computations paired with proofs that this computation is in fact a refinement of the spec, Fiat calls a pair like this a `SharpenedComputation`

. And again, since refinement is transitive we can use these to make itty-bitty single steps towards a final implementation, and fiat provides a bunch of so-called “honing” tactics for making things itty-bitty refinements.

One realization that I had was that refining is actually very similar to the “Type Driven Development” approach advocated in Idris. This approach in Idris puts a bit more emphasis on changing the type of a function as requirements change, or as you develop and realize you want more restrictions, and so on… But, ultimately, the approach of taking a specification (type), and then finding a program which matches the specification (type) through successive refinements is very similar. In Idris this is done interactively through some keyboard shortcuts to automatically case split variables, and run proof search on holes. Fiat, however, relies upon Coq tactics to do this kind of thing. What’s interesting to me is that the tactics are kind of a more powerful way to do this kind of development, since a developer can easily write their own little pieces of Ltac to change how a proof search is done.

Fiat also has a library for handling “query structures”, which are ADTs which can support some SQL-like operations. Fiat would allow you to create constraints, like say a Foreign key constraint in SQL, and is actually able to automate these away, allowing a user to refine a program without having to consider the constraints.

Abstraction relations are used heavily for justifying refinements. My understanding is that this is just a mapping from the initial set-theoretic implementation onto the implementation?

The paper “Using Coq to Write Fast and Correct Haskell” extends upon Fiat in an interesting way, allowing the developer to reason about memory allocations on a heap. Within Fiat all of the data types used to build up the data structures are managed by a garbage collector, which might not be good enough if you want to write something that needs to be lightning-fast, like Haskell’s `ByteString`

data type.

“Using Coq to Write Fast and Correct Haskell” uses a heap data type within Fiat to allow a developer to reason about things which will ultimately not be managed by the garbage collector. The idea uses a representation type for a heap, which exposes things like allocating memory, peeking at memory, and poking memory. Then when extracting the code to Haskell these calls can be replaced with appropriate calls to the Haskell heap manipulation functions.

A problem that has to be addressed is how to keep track of this state, different heaps and such. Initially the paper looks at incorporating the heap along with the ADTs, but this ends up causing problems if you have methods which interact with multiple ADTs — which heap do you choose! The solution was to just use a state monad to automatically weave this heap through operations.

This approach does not currently make any static guarantees with respect to the heap, and assumes the client is well behaved and doesn’t do things like free memory which it shouldn’t.

Read a bit about Bedrock as well, which is a framework for verifying low level systems code in Coq. It’s actually a pretty cute DSL, and it looks like it’s a nice and simple way to play around with verified low-level code. There’s a strong emphasis on high levels of automation — as is the standard with Dr. Chlipala’s projects. It seems chocked full of separation logic goodness. Functions are specified using preconditions and postconditions.

This was a fairly quick glance at Bedrock, so I’m not familiar with its handling of global mutable state or anything of the sort.

One aspect of Bedrock which I might revisit in the near future is that it actually has its own intermediate language. Might be interesting to look more deeply into how that’s designed.

I am hoping to look into VST in the near future, and it may be worth comparing how Bedrock compares to VST. I imagine it may actually be quite a lot more difficult to produce a verified program with VST, since it has to deal with a lot of C baggage.

I have had some fears about the automation of proofs in the past. I really enjoy writing proofs, so I find it kind of sad that at some point this will be a joy taken away from me by the fact that computers will probably be much better at proofs than most people in the future. It gives me some existential dread to have no reason to do this, and to think of the fact that we may start to make “human thinking” obsolete. Whether or not this will actually happen is a matter of speculation, but if nothing else a computer can already try a large number of proof strategies a heck of a lot faster than I can with a pencil and paper!

But here’s the thing… I don’t know that I think it’s sad if this happens anymore. Why? Because formal verification is such a niche and expensive thing right now that it’s essentially not really done anyway. Having computers be able to automatically prove theorems quickly could really change that, and give us better software much faster. It is more sad to me to not have any verified software.

]]>This week has largely been administrative, with letters written, and servers managed.

I went through the Tapir/LLVM paper, and had some thoughts about theorem proving…

This paper introduces a very elegant way of expressing fork-join parallelism in an extended version of the LLVM IR. It does so by introducing three instructions, `detach`

, `reattach`

, and `sync`

.

`detach`

is how you introduce parallelism, it takes two labels. One for the detached block, and one for the continue block. Tapir has a really slick set of semantics for the parallel code – it should be the same as running the detached block, and then the continue block sequentially.

In practice the detached block will be run in parallel with the continue block, and it will terminate when it hits a `reattach`

instruction which links bank to the continue block. The `sync`

command can be used by the continue block to wait for this reattach.

What’s impressive is that Tapir/LLVM is a very small extension of the IR, but it allows for many of LLVMs optimizations to work on the parallel code with little to no modification.

I’ve been thinking a bit about why I should care about theorem proving, and I think it’s best summarized with a few points.

Any advance in theorem proving is essentially an advance in:

- Being able to write programs. If you can automatically find proofs, you can automatically write programs.
- Being able to test programs. If you can automatically find proofs / counterexamples then this makes testing better.
- Being able to optimize programs. If compilers can automatically try to prove properties about your program, then they can use this knowledge to better optimize this program. Better proof search might mean we can write better compilers.

I find that it’s quite valuable to be able to reason about my code in a formal system. Helps to keep things organized, and makes a lot of things easier to write in a way. Having proof search available, like in Idris, when writing code can make development so much nicer and faster. I really do think that with just a bit more development it can be way better and faster to write programs.

I was having a number of problems proving properties with `Rpower`

in Coq. I couldn’t find any theorems or lemmas which could help me prove even the simplest thing like `ln (Rpower a b) = b * ln a`

.

Turns out in these situations it’s worth printing the definition of the offending function:

```
Print Rpower.
```

```
Rpower = fun x y : R => exp (y * ln x)
: R -> R -> R
```

Ah. Yeah. That would help. I couldn’t find anything relating `Rpower`

to `exp`

or `ln`

with `Search`

! Turns out I just needed to unfold the definition this entire time.

This is one thing that’s worth keeping in mind when writing Coq. You really need to know how the things you are working with are defined. Makes it much more obvious how things will simplify, and can also make it way more obvious how to approach a problem.

But seriously, maybe we should make `Rpower a b = exp (b * ln a)`

show up in the search or something?

Coqplexity is ultimately a project in Ltac, at least I’m hoping to automate as much as possible with it.

Previously I was just working on a `big_O`

tactic which can automatically solve Big O problems using some common tricks, like recognizing that if $f1 \in O(g)$, and $f2 \in O(g)$, then $\lambda \; n \rightarrow f1 \; n + f2 \; n \in O(g)$. This works very well in a lot of cases, but this approach is kind of rigid and the tactics currently can only solve fairly trivial Big O problems.

What I am trying to do now is write a much more general set of tactics which encompass the usual path that I take when trying to solve these problems. This should allow me to automate a lot more, and possibly even automate theorems that I would otherwise need to write custom proofs for.

However, I’m finding Ltac is a bit fiddly. I’m ending up with situations where it’s hard to match on what I actually need, and keep track of the current state and any backtracking I might want to do. Often Ltac seems to rely upon heavy-handed tactics like `inversion`

and `subst`

which drastically alter the context and goal. I want to be a lot more principled, so I can print out the steps taken. We’ll have to wait and see how this goes!

This has been a productive week with my Coqplexity project really starting to get off of the ground. Coqplexity is at the point now where it can automatically prove pretty much any polynomial Big O relation (as long as it’s true).

Some more work needs to be done in Coqplexity in order to make it more useful, but the foundation is there, and it’s been a good bit of experience in writing tactics and dealing with real numbers.

So, more stuff about subset types, tactics, and more!

I had a brief conversation with Paolo G. Giarrusso while complaining about not being able to replace / rewrite a subterm in Coq. Initially I thought it was a notation thing, as I essentially had a goal like this:

```
a <= b <= c
```

Which uses a notation which unfolds to this:

```
a <= b /\ b <= c
```

I wanted to change, say, `b <= c`

with `b <= c + 0`

, and I was having a lot of problems with this.

Turns out this had nothing to do with the notation. I actually had a subset type that I was using.

```
sc : {c : R | (0 < c)%R}
```

And my goal was this:

```
0 <= f n <= proj1_sig sc * g n
```

But I could not change `proj1_sig sc * g n`

with ```
proj1_sig sc * g
n + 0
```

using:

```
replace (proj1_sig sc * g n) with (proj1_sig sc * g n + 0) by lra.
```

Which should have worked, dammit! And it turns out it does work if you have something like:

```
a, b, c : R
============================
a <= b <= c
```

However, because `sc`

is some fancy-pants dependent type I think the `rewrite`

/ `replace`

tactics can’t actually tell that ```
proj1_sig sc
= proj1_sig sc
```

(in theory they could, but it might be a bug / limitation).

However, if you destruct the subset type to get `c : R`

, and ```
Hc : 0
< c
```

, and simplify so you just have:

```
f, g : nat -> R
n : nat
c : R
Hc : 0 < c
============================
0 <= f n <= c * g n
```

Then replacements work as expected. Go figure! It does make some sense because equality between dependent types can be very complicated, and I guess that’s what’s going on here.

Coq seems to have a lot of wonderful (and probably annoying when you delve deep enough) features for extending the environment. One example of this is the remarkably powerful notation feature, which I have used in Coqplexity to provide $f \in O(g)$ style notation. Awesome stuff! Another thing in a similar vein that I have discovered is the coercion feature.

Basically you can define an automatic coercion between types, which is super useful if you want to use numerical literals and whatnot to convert between things. Also just useful to be able to specify what implicit coercions can be made as a user of the programming language, because the built in implicit coercions are never going to be perfectly what you want!

Coq’s aforementioned notation feature also extends into tactics. I have been writing a number of tactics for Coqplexity. One of the things that I wanted to be able to do with my tactics was implement the behavior that you see with tactics like `simpl`

, where you can say `simpl in H`

to apply simplification in the hypothesis `H`

, or `simpl in *`

which applies the tactic to every hypothesis and the current goal.

It took me a little while to find out how to do this because searching for something like “Coq in Ltac” isn’t great!

Turns out this is done with tactic notations. I don’t know that I have done this perfectly, but this kind of thing is working well for me, though a bit repetitive:

```
Ltac unfold_ord :=
unfold ge_ord; unfold gt_ord; unfold le_ord; unfold lt_ord; simpl; ineq_fix.
Ltac unfold_ord_in H :=
unfold ge_ord in H; unfold gt_ord in H; unfold le_ord in H; unfold lt_ord in H; simpl in H; ineq_fix.
Ltac unfold_ord_all :=
unfold ge_ord in *; unfold gt_ord in *; unfold le_ord in *; unfold lt_ord in *; simpl in *; ineq_fix.
Tactic Notation "unfold_ord" "in" hyp(l) := unfold_ord_in l.
Tactic Notation "unfold_ord" "in" "*" := unfold_ord_all.
```

You would like to be able to write `3.14`

as a “real number” in Coq… I don’t think this is supported, though, since `.`

is used in the tactic language. You can probably make notation that will work, but it doesn’t seem to exist yet, at least not in Coquelicot.

Current recommendation: `314/100`

Whatever, that’s fine for now!

Somebody briefly on the `#coq`

IRC channel on Freenode was trying to solve problems about rings. They wanted to know why the `ring`

tactic (essentially `omega`

for rings) could not solve this problem:

```
Require Import Coq.setoid_ring.Ring_theory.
Require Import Ring.
Require Import Omega.
Lemma SRNat : semi_ring_theory 0 1 plus mult eq.
Proof.
constructor;
intros;
(omega ||
apply mult_comm ||
apply mult_assoc ||
apply mult_plus_distr_r).
Qed.
Add Ring RNat : SRNat.
Goal forall (a b : nat), (a + b) * (a + b) = a * a + 2 * a * b + b * b.
Proof.
ring.
```

This is the kind of thing that I have some familiarity with after working on Coqplexity! I managed to figure out that:

- You need to introduce the variables
`2 * a * b`

is problematic

Why is `2 * a * b`

problematic? Well, you see… Not all rings have a “2”, so to speak! The `ring`

tactic is going to be operating with only the axioms provided by rings / semirings. `ring`

knows about things like `0`

, `1`

, `plus`

, and `mult`

, but it doesn’t know that `2 * a * b = a * b + a * b`

. There’s no ring / semiring axiom that this falls out from.

However, `replace (2 * a * b) with (1 + 1) * a * b by omega`

will get you fixed up :).

I have attempted to do more scheduling in org-mode, and have finally figured out some things about how org-mode is supposed to work? org-mode has the concept of “scheduling” and “deadlines”, but “scheduling” is **not** for scheduling when to work on something, or scheduling things like on a calendar. This has caused me some confusion. Apparently you are supposed to use plain timestamps for this (though this is not without problems either). More info here.

This is where Orgzly has forsaken me. Orgzly currently does not support plain timestamps. Which is unfortunately quite a glaring difference between how org-mode wants to work, and how orgzly wants to work.

]]>This week more or less concludes my search for a real number library in Coq, allowing me to get another project underway which needed real numbers!

Additionally I have been messing a little bit more with subset types, which should prove useful in the very near future.

I was desperately searching for a real number library in Coq, and finally happened upon this one. What’s nice about this one is that it works with the built in real number library, essentially just extending it to have some of the stuff that I want, and it works with Coq 8.7! C-CoRN is probably a better option if you need anything heavyweight, but it seems to only work for Coq 8.5.2 at the moment, and I was actually having some problems finding the documentation (I know it exists, but I can’t find it for some reason…).

Coquelicot is a fairly simple library with a very straightforward Coqdoc. My only complaint so far is that it doesn’t seem to be on opam, but installing it manually was painless!

One thing that this library offers, which I’m hoping to use, is limits. So, I spent a bit of time unfolding the limit definitions so that I could understand it.

Coquelicot implements limits in terms of filters. The limits in Coquelicot are a bit more general than I need – they work on a number of spaces, and not just over the real numbers. This stuff is vaguely familiar from the topology classes of yore, but it has been a while!

So, my goal is to roughly understand the limit definition used in Coquelicot with respect to real numbers:

```
Definition filterlim {T U : Type} (f : T -> U) F G :=
filter_le (filtermap f F) G.
```

This definition relies very heavily on filters, which are described here.

```
Class Filter {T : Type} (F : (T -> Prop) -> Prop) := {
filter_true : F (fun _ => True) ;
filter_and : forall P Q : T -> Prop, F P -> F Q -> F (fun x => P x /\ Q x) ;
filter_imp : forall P Q : T -> Prop, (forall x, P x -> Q x) -> F P -> F Q
}.
```

A filter (`F`

in this case) is any predicate of type ```
(T -> Prop)
-> Prop
```

An example of a filter is `locally x`

for some `x`

.

```
Definition locally (x : T) (P : T -> Prop) :=
exists eps : posreal, forall y, ball x eps y -> P y.
```

It has to be shown that this is a filter, i.e., the library defines locally to be an instance of filter, and there is a proof that it satisfies the axioms of `Filter`

.

It’s easy to see that the types match up, since ```
locally x : (T ->
Prop) -> Prop
```

, which is exactly the type that `Filter`

takes.

`ball`

is part of the `UniformSpace`

module…

```
Record mixin_of (M : Type) := Mixin {
ball : M -> R -> M -> Prop ;
ax1 : forall x (e : posreal), ball x e x ;
ax2 : forall x y e, ball x e y -> ball y e x ;
ax3 : forall x y z e1 e2, ball x e1 y -> ball y e2 z -> ball x (e1 + e2) z
}.
```

So, `ball`

is a predicate which takes a centre in the space, a real number for the radius, and another point which is within the space. The proposition holds when the second point is within the ball around the first point of the given radius.

`ax1`

, `ax2`

, `ax3`

are all axioms for dealing with balls:

`ax1`

: the centre of a ball is within the ball.`ax2`

: if`y`

is within a ball of radius`e`

centred at`x`

, then`x`

is within a ball of radius`e`

centred at`y`

.`ax3`

: if`y`

is within a ball of radius`e1`

centred at`x`

and`z`

is within a ball of radius`e2`

centred at`y`

, then we can enlarge the ball at`x`

to have a radius of`e1 + e2`

to ensure that`z`

is within this ball as well.

Right now we don’t really care about these axioms, since we’re just trying to figure out how `Filter`

’s are used in limits. So, again, we have this thing which makes a filter:

```
Definition locally (x : T) (P : T -> Prop) :=
exists eps : posreal, forall y, ball x eps y -> P y.
```

Where:

```
ball : M -> R -> M -> Prop ;
```

`locally x`

restricts the use of this predicate to the neighbourhood around `x`

. I.e., locally says that we can find a small enough ball around `x`

such that the predicate `P`

always holds for every point within the ball.

This leads us to this definition, which comes up in our limit.

```
Definition filter_le {T : Type} (F G : (T -> Prop) -> Prop) :=
forall P, G P -> F P.
```

We want to know, much like with the epsilon delta definition of a limit, that when we get close to a point (within a sufficiently small ball), the result of the function is within epsilon of the limit point. So, it becomes useful to be able to map the function over the ball, giving us the image of the function on the domain of the ball. We would then want to be able to tell if this image is contained within the ball around the limit point. So, it makes sense to want a predicate like `filter_le`

which will show that one filter is entirely contained within another.

I’m reading this as: a filter `F`

is less than or equal to a filter `G`

if for any predicate `P : T -> P`

…

`G P -> F P`

This seems to mean that every point in `G`

is in `F`

. I suspect `F`

might be considered less than `G`

since it filters *out* a smaller or equal portion?

Doesn’t really matter, though, because I think its use here is fairly intuitive. This is the Coquelicot for $\lim_{t \rightarrow x} c = c$.

```
filterlim (fun t => c) (locally x) (locally c).
```

Which unfolds to these:

```
filter_le (filtermap (fun _ : R => c) (locally x)) (locally c)
filter_le (fun P : R -> Prop => locally x (fun _ : R => P c)) (locally c)
```

`filtermap`

maps every element of the space `T`

onto some space `U`

using a function `f : T -> U`

. I.e., every element goes through this map before being filtered.

`filter_le (filtermap (fun _ : R => c) (locally x)) (locally c)`

means that everything in the neighbourhood of `x`

, which is then passed through our constant function which maps to `c`

will approach `c`

. Where approach `c`

means that the `ball`

of `G`

(`locally c`

) is contained within the ball of `F`

(the neighbourhood around `x`

which is then mapped to `c`

), for sufficiently sized balls.

Okay, so, in sum, the filters are basically just used to select a neighborhood around a point. The limit definition just wants the neighborhood of the function as values in the domain approach a point to be contained within a neighborhood of the limit point. This is essentially just the epsilon delta definition of a limit, so it’s really not too surprising!

I’m working on a small project for analyzing computational complexity in Coq, which I have dubbed Coqplexity.

While working on this project I have made some observations about subset types, and some additional benefits they have.

To begin with, I’m working with complexity classes, which have some constraints upon values. For instance many of the real constants have to be greater than 0.

I started out doing this:

```
Definition BigO (f : nat -> R) (g : nat -> R) :=
exists (c : R) (n0 : nat), forall (n : nat),
(c > 0) % R ->
n > n0 ->
(0 <= f n /\ f n <= c * g n) % R.
```

which is more or less the standard definition of big O, but I have encountered some problems, which lead me to reformulate this with subset types like so:

```
Definition BigO (f : nat -> R) (g : nat -> R) :=
exists (sc : {c : R | (c > 0) % R}) (n0 : nat), forall (n : nat),
let c := proj1_sig sc in
n > n0 ->
(0 <= f n /\ f n <= c * g n) % R.
```

It’s a little annoying having to destruct sigmas and whatnot, but there are a couple of big advantages to this definition. To start with, consider something like big theta:

```
Definition BigTheta (f : nat -> R) (g : nat -> R) :=
exists (c1 c2 : R), exists (n0 : nat), forall (n : nat),
(c1 > 0) % R ->
(c2 > 0) % R ->
n > n0 ->
(0 <= c1 * g n <= f n /\ f n <= c2 * g n) % R.
```

You end up with more and more hypotheses. One thing that’s worth noting is that these are all irritatingly similar! If you put this condition within the subset type then you don’t have to repeat the condition multiple times:

```
Definition BigTheta (f : nat -> R) (g : nat -> R) :=
exists (sc1 sc2 : {c : R | (c > 0) % R}), exists (n0 : nat), forall (n : nat),
let c1 := proj1_sig sc1 in
let c2 := proj1_sig sc2 in
n > n0 ->
(0 <= c1 * g n /\ c1 * g n <= f n <= c2 * g n) % R.
```

It’s a small example, but this could be a big deal if your predicate is particularly complicated, or if you have a lot of variables.

Another thing that I have noticed, which is probably a much bigger deal, is that these definitions are not quite the same!

See, without the subset types I have a theorem which takes two hypotheses `c1 > 0`

, and `c2 > 0`

. Which means that if I want to use this theorem, even just to prove something that relies only upon one of these constants, then I have to prove that both of these constants (which are already bound) are greater than 0, whereas if I use the subset types I get a proof along with the constants that they are greater than 0. Thus the second theorem is actually much more useful!

Interleave mode is an Emacs mode for taking notes alongside a PDF, and it is absolutely amazing! I find that it really helps me stay focused when reading through a paper / textbook, and switching between the buffers to take notes / read the PDF is actually just fiddly enough that it’s a kind of fidget. I highly recommend this if you ever need to really focus and get through something.

]]>Let’s keep this going! The goal of this is again, to write down some of my thoughts and what I did. This is not necessarily going to be accurate, but maybe it will be useful! Contact me if you find mistakes :).

The past week I have managed to get through another chunk of papers on the following topics:

- Removing information leaks through the progress covert channel
- Ensuring data erasure in programs which use untrusted data stores

Additionally, I have started to research real number libraries in Coq for use in a project…

Each of the papers I read deal with the notion of attacker knowledge. In these papers attacker knowledge is defined as the set of all possible initial memories that the attacker can determine based on the observed output of the program, knowing the program text. So essentially if you have something like this, where $x, y \in \{0, 1\}$, and let’s say that these are actually read off of the disk so that an attacker does not know what they are based on the text of the program:

```
x = 1; y = 0;
output(x);
```

Then an attacker observing the output would no longer have any possible value for `x`

in its set of knowledge. The attacker would be able to determine that `x`

was initialized to `1`

. So the attacker’s knowledge is something like:

$\{x = 1 \wedge y = 0, x = 1 \wedge y = 1\}$

The attacker knows exactly what `x`

is, but is unable to narrow down `y`

in the above case. However, consider something like:

```
x = 1; y = 0;
output(x || y);
```

In this case the attacker would be able to determine a knowledge set which looks something like this, if they observed an output of `1`

:

$\{x = 1 \wedge y = 0, x = 1 \wedge y = 1, x = 0 \wedge y = 1\}$

The attacker does not know if `x = 0`

or `x = 1`

in this case, since there are ways for the output to be `1`

regardless (if `y = 1`

then `x`

can be 0). However, the attacker would know that both `x`

and `y`

can not be `0`

, since that would lead to an output of `0`

instead.

This is a fairly intuitive definition of attacker knowledge, since it classifies all possible things which the attacker is able to determine, but it does seem a bit strange at first glance since the smaller the attacker knowledge set is, the more information the attacker knows. A smaller set means that the attacker is able to remove more possibilities that it knows are not true, giving it more specific information.

This week I went through the Precise Enforcement of Progress-Sensitive Security paper. I think this nicely compliments the paper that I went through last week on timing-sensitive garbage collection, where I expressed some concerns over how one might be able to leak information when `at`

blocks crash.

The basic idea is to use a termination checker (in this paper, a runtime termination checker) to see if you can determine whether or not a loop will terminate or diverge based only on low security information. If this can be determined just with low security information, then any attacker at the low security level will not gain any information about high data based on whether or not the loop terminates. If this can’t be determined, then you just terminate the program. It seems like this could be slightly extended, since there may be situations where you can determine that the divergence / termination of the loop only depends upon low information, but when encountering the loop at runtime the termination checker can’t tell which it would be. This might not matter much at all, though, since this check is only needed for loops which handle high security data anyway. A loop which already deals only with low data is already going to have a low security level.

This approach could obviously be applied directly to the timing sensitive garbage collector I read about last week, as you could use a termination checker on the `at`

command in theory.

Additionally I went through this paper: Cryptographic Enforcement of Language-Based Information Erasure.

The gist of this paper was using techniques for ensuring information erasure, i.e., that data is erased after a certain point of execution such that an attacker is unable to collect that information if it observes the program after the data should have been erased. The paper combined this with cryptographic techniques for erasing data from an untrusted data store. Motivating example is a Snapchat-like application, which should delete messages after a certain point in time.

The paper talked about applications using untrusted data stores, where you can’t necessarily guarantee that the data is removed. A common example of this is something like cloud storage. In theory if you’re putting a bunch of information in Amazon or Dropbox, they probably keep backups of that data. Maybe when you delete the data they retain it anyway, maybe the data is recoverable by the next person to use that portion of the disk… You don’t know. Obviously the solution is to encrypt the data sent to this data store prior, in which case it’s essentially just a collection of garbage random bits. What I had not thought about before was that your own personal filesystems on your drives might be considered untrusted as well. One of the motivating examples was that a bug in the android filesystem left these snapchats available for later retrieval. Modern disks can also duplicate data and move it around unknown to you. However, by encrypting the information prior to storing it to disk you can ensure that everything on that disk is essentially useless to anybody without the private key to decrypt that data. In a sense it’s already deleted. Then you only need to ensure that the private key is thoroughly and securely deleted, and not gigs and gigs of data.

An interesting part of this paper was the classification of “equivalence up to formal randomness”. If an attacker is able to determine when cyphertexts are distinct, but not the contents, then it’s very easy to leak information unintentionally.

For instance:

```
u = encrypt(pk, 0);
output(u);
if secret then v = u; else v = encrypt(pk, 0);
output(v);
```

In this case the attacker could not determine the value of `u`

or `v`

, but they could determine the value of `secret`

by checking whether or not the outputs are the same. If `secret`

is `false`

then the second encryption will use different random values to encrypt the value `0`

than the first, so the ciphertext will appear different. The possible output sequences are said to *not* be equivalent up to formal randomness, since the attacker can tell that one looks like `uu`

and the other looks like `uv`

where `v`

is distinct from `u`

.

```
u = encrypt(pk, 0);
v = encrypt(pk, 0);
if secret then {output(u); output(v);} else {output(v); output(u);}
```

In this case we say the possible output sequences `uv`

and `vu`

are equivalent up to formal randomness. Since `u`

and `v`

are just random garbage the attacker can’t determine the values of them, only that they’re different. So it’s possible to think of a permutation of these random values as being equivalent. If you remap `u`

to `v`

and `v`

to `u`

then you get the other possible output sequence. There’s no way for the attacker to tell which branch was actually taken by looking at the outputs.

There seems to be multiple real number libraries for Coq. I have been directed at a few possibilities.

My understanding is that C-CoRN uses the math classes library behind the scenes. However, it seems math-classes does not actually have real numbers defined in it, so that’s a bust.

I have heard mixed things about the built in real number library. It obviously introduces a bunch of axioms for real numbers, and some people have groaned about it. My initial dive into this has been a bit odd. For instance the reals library seems to be a bit oddly spotty. No functions for logarithms, but hey limits are defined on arbitrary metric spaces! Can the limits diverge? Doesn’t seem like that’s possible… Which is kind of unfortunate. Can you have a limit that tends towards infinity? Nope.

It sounds like C-CoRN / the math classes library have things set up in a fairly principled way, and you should be able to substitute different real number implementations, so this is probably the way to go…

The main concern I have is that C-CoRN seems to still need Coq 8.5.2, and we’re on Coq 8.7. The math classes library works on 8.7, but as previously mentioned, it lacks real numbers, making this a bit of a moot point.

Orgzly has proven to be useful to have on my phone, so that I can see all of my org-mode stuff on the go, and possibly make small changes.

A number of things have been very annoying with it, but it’s still a net win! What I find could be improved at the moment:

- No built in git sync.
- Synchronizing with git manually on a phone is so painful!

- Checkboxes don’t work, though they will in the next major version!

I have attempted to set up beeminder.el to have org-mode automatically able to submit data for my beeminder goals. I find a couple of details don’t work well for me — it sets the deadline of the task to when you would derail from the goal in Beeminder. This is a good feature, but often I like having deadlines separate from that! Additionally, if I want to use Orgzly on my phone to submit tasks, then it won’t update Beeminder… So, it’s probably best to just manually record a lot of data for now.

In typical agile buzzwords fashion I have started doing a biweekly review of the tasks left in my org files. Currently I still have a bunch of tasks which I have *no idea* how to sort out off the top of my head. However, I have begun to organize them using org’s handy `M-<up>`

and `M-<down>`

, which lets me reorder headings very quickly.

The increased scheduling that I have been doing has been very helpful as well! I have been slowly going through the Slack logs left over from the DeepSpec Summer School in 2017 to make a resource of all of the great questions and answers that have come out of that. Previously progress had been more or less halted, but now that I have a scheduled time to go through it every week it’s getting done! Soon we’ll have another resource of a bunch of answers to beginner questions for Coq!

I have also managed to get a bit ahead on some of my Beeminder goals. Previously I have been skirting the edge with these goals, but I am finding that Beeminder works much better with the extra buffer. For those interested in Beeminder, this is an important piece of advice. Getting ahead is significantly less stressful and can grant you much better flexibility if you’re on top of things, which ultimately helps you be even more productive, I think. You can take an hour off of a task one day and then make up that time over a longer period of time, which can be quite helpful. Falling behind with a day buffer can be incredibly stressful, being able to amortize this is immensely valuable.

]]>I’m starting a small series. The gist of it? What I learned in the past week.

Each post will be a brief summary of some things which I studied in the past week. Not necessarily an in depth exploration of the topics, but you may find useful insight, or at least useful references if something is a topic of interest.

If you find a mistake in my understanding, or have a question, feel free to contact me!

With that out of the way, let’s talk about some of the things I read about this week. Coinduction in Coq, dependent pattern matching in Coq, and some papers in langsec.

I had messed with coinductive data types in the context of Idris before, while going through Type-Driven Development with Idris. I haven’t used them in Coq however, and I ended up reading about this in the coinductive chapter in CPDT. While going through this a couple of things clicked for me.

With my first pass through Type-Driven Development with Idris it was not immediately obvious to me why coinductive definitions have to be productive. It makes a lot of sense when you start thinking about them in terms of, say, generators in Python. For instance a stream of values is essentially just a structure for producing “next” values in a sequence. If it’s not known whether or not you will get a next value in the sequence (e.g., if you `filter`

a stream), then it’s possible that asking for the next value in the sequence would not terminate.

Coq uses the guardedness condition, which is a syntactic check to guarantee that coinductive definitions are actually productive. This is very similar to how Coq normally uses simple structural recursion to check for termination of definitions, with the same goal of preventing inconsistencies in the logic.

I have been working on a project using dependent typing in Coq. I have previously worked with dependent types in Agda and Idris, so I thought that adding dependent types to everything in Coq would be easy peasy! I was wrong.

As it turns out Coq does not include the so called “Axiom K” which makes working with dependent types much easier. I don’t actually know too much about what “Axiom K” is, or what it does yet, but my understanding is roughly that it allows for some coercion between types. I also assume that while you can add “Axiom K” to Coq, it’s probably still not as convenient as working with dependent pattern matching in something like Agda which has this assumption built in, so it can take this into account to give nicer syntax and whatnot. I could be wrong about these assumptions (let me know!) but this was my understanding from one of the student talks on this topic at the DeepSpec Summer School in 2017.

Anyway! So, what’s the problem I ran into? Well, in my project I figured I would work a lot with vectors – list types which depend upon the length of the list as well. I’m using the builtin Coq vector library, but a vector basically looks like this:

```
Inductive Vec A : nat -> Type :=
| nil : Vec A 0
| cons : forall (h:A) (n:nat), Vec A n -> Vec A (S n).
```

Since the `Vec`

type keeps track of the length of the vector it can very conveniently prevent out of bounds array access, and simplify some functions which would normally be partial, like `hd`

.

Somewhat foolishly I assumed that this would make a lot of the proofs in my project easier. After all a vector type has more information than a list type, which is more information that you have access to for a proof.

However, while this is true, a vector also has more restrictions than a list type, and sometimes these restrictions can get in the way. I was somewhat surprised to find that this even had an effect on the theorems that I was allowed to write down.

For instance this theorem

```
Theorem vec_n_0_nil :
forall {A n} (v : Vec A n),
n = 0 -> v = nil A.
```

Coq complains, rightfully so, that `nil A`

is going to have type `Vec A 0`

, but `v`

has type `Vec A n`

.

```
Error:
In environment
A : Type
n : nat
v : Vec A n
The term "nil A" has type "Vec A 0" while it is expected to have type "Vec A n".
```

Coq won’t take into account our hypothesis that `n = 0`

when trying to unify the types, and I currently have no clue how to do this. Often these theorems can be rewritten to something reasonable, however:

```
Theorem vec_0_nil :
forall {A} (v : Vec A 0),
v = nil A.
```

Which is probably the way to go, since you should be able to rewrite with a hypothesis in most places.

`sig`

I encountered the `sig`

type unexpectedly while working on my project. It was simple enough to work with, but I wasn’t sure what it meant exactly…

It means sigma. It’s a sigma type. So, it handles existential quantification. There’s some really good documentation about how to use `sig`

for subtypes in this CPDT chapter. In particular, I like the use of Coq’s notation features to almost completely eliminate the overhead of using subset types in Coq.

There’s a couple of tricks to pattern matching with dependent types in Coq, since it doesn’t have as much magic as Agda. One thing that wasn’t immediately obvious to me was how to deal with “absurd” branches, where a branch in a match would lead to a contradiction, so it should be ignored.

In Agda you might use the absurd pattern to deal with branches of a program which should never be executed due to a condition which will never hold. I.e., if the condition to enter the branch is true, then that condition implies `False`

.

```
hd {X} (xs : list X) (length xs > 0) : X
```

For instance, in the case of a `hd`

function, which takes the first element of the list, you might have an argument that’s a proof that the length of the list is greater than 0. But, you still have to provide a case to the match on the list for when it’s empty, after all, otherwise you’ll have an incomplete pattern match.

Well, that’s okay. Because in Coq we can use an empty pattern match on False. So you might think you can do something like this:

```
Require Import List.
Lemma length_nil_not_gt_0 :
forall {X}, @length X nil > 0 -> False.
Proof.
intros X H. inversion H.
Qed.
Definition hd {X} (xs : list X) (pf : length xs > 0) : X :=
match xs with
| nil => match length_nil_not_gt_0 pf with end
| h :: t => h
end.
```

But this still doesn’t quite work.

```
Error:
In environment
X : Type
xs : list X
pf : length xs > 0
The term "pf" has type "length xs > 0" while it is expected to have type
"length nil > 0".
```

Coq does not seem to be recognizing that in this branch `xs = nil`

, and so it’s not replacing `xs`

with `nil`

when it’s typechecking.

So, how can we get around this? What seems to be happening is that `pf`

is getting tied to the type `length xs > 0`

at the beginning, at the very top level, where `xs`

is left very general.

The way to get around this is to make `hd`

actually return a function, which takes a proof as an argument. This lets the `pf`

have a different, more specific type in each branch.

```
Require Import List.
Lemma length_nil_not_gt_0 :
forall {X}, @length X nil > 0 -> False.
Proof.
intros X H. inversion H.
Qed.
Definition hd {X} (xs : list X) : (length xs > 0) -> X :=
match xs with
| nil => fun pf => match length_nil_not_gt_0 pf with end
| h :: t => fun _ => h
end.
```

I’m still getting my head fully around `match`

annotations, but I have found the following resources useful on my quest to enlightenment:

I read this paper on timing sensitive garbage collection. The problem that the paper is aiming to solve is the fact that you can leak information through timing differences in garbage collection.

Simply put, if you branch on a bit of information, and one branch allocates a bunch of memory on the heap, while the other doesn’t, a lower security level section of the program can cause the garbage collector to trigger depending on the branch taken, and timing the amount of time that this takes can reveal a bit of information from the higher security level section of the program.

Separate heaps are used for each security level, and garbage collection of a heap is only allowed when the program is currently in the same security level. This stops a low section of the program from triggering collections based on what’s in a high heap.

Ultimately the result is that you can’t observe different times for anything running at a higher security level.

One aspect of this which I do find problematic is the `at`

command, which bounds the execution time. If it takes less time to execute the `at`

block than the execution time, then the time to complete the execution of this is padded out. I.e., it will always take the same amount of time to get through the `at`

command. While this is inefficient, I’m not sure there’s much you can do about that while maintaining the same security guarantees. However, what I do find slightly more problematic is when the execution time exceeds the bound — the program just crashes, since execution can’t continue while keeping the security guarantees. After all, if you could tell that the `at`

command did not finish in time you could use that to leak information to the lower security section of the program. The paper does briefly mention that it doesn’t handle this, or similar problems like heap exhaustion. It instead suggests that one should be able to use tangential techniques for handling “termination or progress-sensitive security”.

That said, a nice aspect of the `at`

command is that the bound can be any expression, and does not have to be a constant value. This means that you can calculate an upper bound dynamically, which on one hand gives you a lot more leeway while allowing you to make your bounds as tight as possible for efficiency. However, I suspect that this aspect is itself fairly complicated to reason about, and making this an arbitrary expression could mean that problems with the bounds could be complicated and difficult to spot while testing. Although, the thing that you’re timing is of course Turing complete anyway, so maybe it doesn’t matter that the bounds can change according to some arbitrary calculation either.

I’m very curious how this plays out in a real world implementation. Compiler optimizations could make it more difficult to judge how long things will take, and it’s possible that the bounds could be fairly complicated as timings could be very different depending on whether or not your array fits in cache, memory, or swap. These bounds may have to be different for different computers, and may be subject to interference from different processes.

I do like this, though! It’s something that comes up with a lot of security features that operating systems can provide anyway – if the system sees something that means it can no longer guarantee security, then maybe it should just crash to ensure the user’s safety! I suspect that crashing is sufficient to leak a bit of information each time the program executes, though, as long as it’s able to talk to the outside world, but this is still a marked improvement over the exploit pointed out in the paper which would let the program leak a bit of high security information per iteration of a loop.

One thing that I have learned this week is that pdf-tools for Emacs is invaluable for getting through papers. Having everything within Emacs that I need to get through is much less distracting, and being able to easily write notes in org-mode as I go is a huge win.

Trying to go through a paper in Zathura with Emacs on the side, or god forbid a PDF viewer in a browser, was sufficiently distracting, and ruined the flow of reading through the paper just enough.

]]>I’m giving a talk as an introduction to dependently typed programming languages. The contents of this talk can be found in this post, and the slides can be found here.

Hi all!

My name is Calvin and I’m here to talk to you about a couple of things – mostly types, and how they can help you write correct programs! This talk aims to address some of the following:

Firstly: You should care, and you NEED to care about your software. Computers are integral to nearly

**every**aspect of our lives, and their influence is growing daily. Whether it’s a voting machine, banking software, or a component of a vehicle, there’s always a programmer behind it somewhere.And this fact should probably terrify you if you have ever programmed before.

An incorrect program may be annoying and lose a business a customer, or it could be life threatening if it’s something vital, like the software for a pacemaker.

We need to be careful!

Types can help us be careful, and are a good way to deal with how scary this is!

Secondly, not only are types great for writing correct programs, but they can make it a lot easier to write programs in the first place! They help the compiler help you!

This is going to be somewhat of a whirlwind introduction to the world of types, so if you get lost along the way that’s okay. Just let me know and hopefully I can clarify things. If all else fails talk to me after the fact! Sometimes it takes a while to grok mathy stuff!

You think that this is a normal, and necessary thing:

That’s bad. Runtime exceptions don’t actually *have* to exist, and it can be very bad if our pacemaker segfaults. So maybe we should avoid these problems in the first place!

I’ll talk a bit about what we can do to avoid these problems. Later we’ll look a little into how you can guarantee that the logic in your program is correct too, and not just that it does not explode at runtime.

All of this is going to involve a little help from types.

A type describes what a value “is”.

You’ve probably heard of types in programming languages, and probably even used them. Most of the time you see them in the context of “it tells the compiler how to store the value in memory”, and while that’s true it’s not the entire picture.

Over the years in computing science and mathematics we’ve learned that types are good for a lot more than just figuring out how to lay out bits in memory.

Types tell us how you can use values – what operations are defined on the type? Can you add things of that type? Can a function take a value of a certain type as an argument?

This can really help us write programs that make sense! And these types are excellent documentation which the compiler can ensure is accurate!

A type checker can be used to reject programs which consist of nonsense like $357^{circles}$, and in fact types can eliminate entire classes of errors if designed properly. No more null reference exceptions!

You might have already seen how a few languages use types. Let’s discuss some of them quickly!

In Python we might have something like this:

```
def my_sort(xs):
if xs == []:
return xs
else:
first_elem = xs[0]
rest = xs[1:]
smaller = my_sort([x for x in rest if x <= first_elem])
larger = my_sort([x for x in rest if x > first_elem])
return smaller + [first_elem] + larger
def my_factorial(n):
if n == 0:
return 1
else:
return n * my_factorial(n-1)
```

Python likes to pretend that types aren’t a thing, so Python doesn’t tell us anything about what a function like this can do. We can pass this function whatever we want as an argument, and it may or may not fail – we don’t know until we run the program or read it very carefully.

Naming and documentation can help, but in practice, since it can’t be done automatically, enforcing good naming and documentation is really damn hard.

In a large code base it’s difficult to even know what you should pass to a function. Should it take a list, or a set, or an integer?

This factorial function only works with ints (a non-integer number will never trigger the base case), but you might not realize you’re calling it with floats until it’s too late!

Can you simply pass the result of another function to this one, or might that function return None, which this factorial function can’t handle? You can’t know for sure until you read what that other function does, and what every function that function calls does. That’s a lot of work! You can run your program, perhaps with a suite of tests, but that can easily miss a special case.

Another concern is that this function could do a bunch of secret stuff. It could throw away the argument, and read in an integer from some file – maybe it will crash if that file doesn’t exist! It could change the value of some global variable, causing it to return different values depending on the last time it was called – and this might cause other functions to behave differently as well! This can make your program a complicated web of states, which is really difficult to wrap your head around because you need to understand it in its entirety – calling any function could have a drastic effect on the behavior of your program. We’ve all been here, and it’s awful! Often better to rewrite the program than it is to debug it! It would be nice to keep things separated into nice modular compartments that don’t affect each other. That’s what functions are supposed to do, but very often they rely upon outside state so they’re not actually compartmentalized.

What if we could force functions to be compartmentalized so we can’t make these mistakes!? What if we could express what a function can and can’t do in a concise format, and then have the compiler or interpreter tell us when something could go wrong! Why should we accept runtime exceptions when we can catch these problems early on!?

Just a hint, but this is very possible! And we’re going to do it with types!

In languages like Java you have to specify the types of things:

```
Integer factorial(Integer n) {
if (n == 0) {
return 1;
}
else {
return n * factorial(n - 1);
}
}
ArrayList<Integer> my_sort(ArrayList<Integer> xs) {
if (xs.size() == 0) {
return new ArrayList<Integer>();
}
else {
...
}
}
```

This little bit of added verbosity actually helps us a lot! We don’t run into issues with non-termination when we accidentally pass in a floating point value like 3.1, and we get to know a little bit about what this function can do – we can see from the types that it takes an integer value, and returns an integer value.

Some languages that do this kind of thing will perform implicit type conversions. If we call `factorial(3.1)`

these languages might convert the floating point number 3.1 to the integer value 3 without telling us about it. This might seem convenient, but sometimes this can lead to really nasty and hard to track down bugs when you think you’re doing one thing, but the language is hiding these sneaky conversions behind the scenes. I’m of the opinion that it’s better to explicitly convert the values – you don’t actually want to do conversions that often, and when you do it’s better to know when it’s happening, otherwise you might end up with unexpected behavior which is really difficult to debug.

Even this Java example has problems. For instance Java is a language with null references. A variable of any type in Java (save for some primitive types) can have the value `null`

assigned to it. You’ve probably seen `null`

in languages before, even Python sort of has this with `None`

. The problem with `null`

inhabiting every type is that it behaves very poorly with almost every operation. Comparing `null`

to 0 could lead to a runtime exception. Subtracting 1 from `null`

would lead to a runtime exception. We don’t want runtime exceptions, since we might not catch them until our application is running in production! It would be great if the compiler could tell us when we’re doing something that doesn’t make sense like comparing a null value to an integer. Sometimes it makes sense to have `None`

values, since a computation could have no solution, or fail for some reason, but we need the compiler to ensure that we check for these cases. We are notoriously bad at checking for null references, and it’s particularly difficult and verbose when every variable can be `null`

.

Which leads us to the issue that a lot of people don’t like declaring types for all of their variables, thinking that this is a tedious task when the compiler can clearly see that 3 is an integer. We’ll see shortly that this extra syntax can be avoided most of the time with “type inference”, and that when we do choose to write types it can actually make writing our programs easier and quicker. There’s really no excuse not to have types!

Languages like Java are what you might think of when you think of types, and maybe that makes you think types are bad. I assure you that it’s Java that’s wrong, and not the types!

Alright, so there are a few things that can make types better for us. First of all we should identify some important qualities that we want.

- Catch errors at compile time. If something is “wrong”, why wait for the program to run to tell us?
- Ease reading and writing programs.
- Allow us to specify properties, and guarantees within our programs. E.g., this function does not alter global state, or read from a file.
- Less verbosity when writing types. Should be easy and clean!

So, our trip through the land of types brings us to Haskell. Haskell is a programming language which treats types well. The syntax may be a little different than what you’re used to, but it’s surprisingly clean, concise, and precise. Haskell is quite a mathematical language. Haskell is a pure language meaning that whenever you call a function with the same inputs, it produces the same outputs, which can help you understand your programs a lot better. In Haskell there is no immutable state, when you give a variable a value that value can’t change. This sounds limiting, but it’s really not, you won’t even notice in the examples. But it helps you undestand your programs a lot better. You only have to look for where `x`

is assigned to understand what value `x`

has; you need not scrutinize the entire program.

Recall the Python programs from earlier:

```
def my_sort(xs):
if xs == []:
return xs
else:
first_elem = xs[0]
rest = xs[1:]
smaller = my_sort([x for x in rest if x <= first_elem])
larger = my_sort([x for x in rest if x > first_elem])
return smaller + [first_elem] + larger
def my_factorial(n):
if n == 0:
return 1
else:
return n * my_factorial(n-1)
```

These might look like this in Haskell

```
mySort :: Ord a => [a] -> [a]
mySort [] = []
mySort (first_elem::rest) = smaller ++ [first_elem] ++ larger
where smaller = mySort [x | x <- rest, x <= first_elem]
larger = mySort [x | x <- rest, x > first_elem]
factorial :: Integer -> Integer
factorial 0 = 1
factorial n = n * factorial (n - 1)
```

This actually looks pretty nice! In each of these functions it does what’s called pattern matching to break down the different cases. You hardly have to write any type signatures at all, but it’s useful to write the top level signatures that you see here as it helps guide you when writing the function – it acts as a little specification and the compiler can tell you if you deviate from it. In Haskell even these can be avoided, and the compiler can still infer what the types of variables should be in most cases. After all if you write 3, then it’s probably a number. If you multiply a variable by another floating point number, then that variable has to be a float too, so the compiler could figure this out for us. This lets us be as explicit with our types as we want, but the compiler can still catch issues even if you don’t tell it the type of something. You’ll probably find that writing type signatures for functions in Haskell really helps you figure out how to write the function. It’s kind of like test driven development in a way, it gives you an idea of how you would use the function right away, which makes it easier to write the logic later.

In the sort function you’ll see what’s called a typeclass constraint, “Ord”, which stands for “ordered”, and a type variable “a”. This means that “a” can be any type as long as it implements the functions in “Ord”, which contains things like “less than”, “equal to”, and “greater than” comparisons.

This is great, because now we know exactly what we can do with the elements of the list passed into the sort function! We can compare them, and since they have an ordering we can sort them!

Now if you try to sort a list of unorderable things, like functions, the compiler will complain.

`mySort [factorial, (*2), lambda x] -- Causes a type error, because it doesn't make sense.`

Whereas in python it will just cause a runtime exception, which we might not know about until it’s too late!

```
# This causes an error when the program is running...
# We might not catch something like this until it hits production!
sorted([lambda x: x * 2, lambda x: x ** 2])
```

Additionally, we do need the `Ord`

constraint in Haskell. Otherwise we have something like this:

```
-- Instead of: Ord a => [a] -> [a]
mySort :: [a] -> [a]
mySort [] = []
mySort (first_elem::rest) = smaller ++ [first_elem] ++ larger
where smaller = mySort [x | x <- rest, x <= first_elem]
larger = mySort [x | x <- rest, x > first_elem]
```

Which causes a type error, since `a`

could be **any type** without this constraint, which also includes unorderable types like functions, or pictures. If the compiler lets you call mySort on a list of something, then that list can actually be sorted, and you’re guaranteed that things will just work! So that’s one less thing to worry about at runtime!

Haskell is also a bit more strict about what its types mean. For instance we know that these functions can’t return “None” or “null”. In the case of the factorial function it MUST return an integer value of some kind, and in Haskell there is no “None” or “null” value under the Integer type.

These “Nothing” values are encoded in so-called “Maybe” types, i.e., types which may contain just a value of a given type, or may yield Nothing.

```
-- Find out where a value is in a list.
whichIndex :: Eq a => a -> [a] -> Maybe Integer
whichIndex = whichIndexAcc 0
-- Helper function that remembers our position in the list.
whichIndexAcc :: Eq a => Integer -> a -> [a] -> Maybe Integer
whichIndexAcc pos value [] = Nothing
whichIndexAcc pos value (x::xs) = if x == value
then Just pos
else whichIndexAcc (pos+1) xs
-- A dictionary of all the important words.
dictionary :: [String]
dictionary = ["cats", "sandwiches", "hot chocolate"]
main :: IO ()
main = do entry <- getLine
case whichIndex entry dictionary of
(Just pos) => putStrLn "Your entry is at position " ++ show pos ++ " in the dictionary."
Nothing => putStrLn "Your entry does not appear in the dictionary."
```

In this case you know that `getIndex`

can return something like a `null`

value called `Nothing`

, but it could also return “Just” an Integer. You have to explicitly unwrap these values to get at the possible value, like in the case statement in `main`

. This might seem tedious, but if you’re a fancy Haskell person you might use “do” notation, which does this automatically.

```
-- Look up a word in the same position in a different dictionary.
dictionary :: [String]
dictionary = ["cats", "sandwiches", "hot chocolate"]
synonyms :: [String]
synonyms = ["meows", "bread oreos", "sweet nectar"]
moreSynonyms :: [String]
moreSynonyms = ["floofs", "subs", "hot coco"]
getIndex :: Integer -> [a] -> Maybe a
getIndex _ [] = Nothing
getIndex 0 (x:xs) = Just x
getIndex n (_:xs) = getIndex (n-1) xs
lookupSynonyms :: String -> Maybe (String, String)
lookupSynonyms word = do index <- getIndex word dictionary
-- Lookup my synonyms, if anything fails return Nothing.
firstSynonym <- getIndex index synonyms
secondSynonym <- getIndex index moreSynonyms
-- Success! Return Just the synonyms.
Just (firstSynonym, secondSynonym)
-- lookupSynonyms essentially desugars to this.
-- The compiler can help avoid this tedium!
painfulLookupSynonyms :: String -> Maybe (String, String)
painfulLookupSynonyms word = case getIndex word dictionary of
Nothing -> Nothing
(Just index) -> case getIndex index synonyms of
Nothing -> Nothing
(Just first) -> case getIndex index moreSynonyms of
Nothing -> Nothing
(Just second) -> Just (first, second)
main :: IO ()
main = do word <- getLine
case lookupSynonym word of
Nothing -> putStrLn ("Hmmm, I don't know a synonym for " ++ word)
(Just synonym) -> putStrLn ("I think " ++ word ++ "'s are a lot like " ++ synonym ++ "'s!")
```

Types never really add any extra tedium, and they can often relieve it because the compiler can automatically do stuff for you.

These examples also show how input and output are encoded in the types. For example:

```
-- putStrLn :: IO ()
-- getLine :: IO String
main :: IO ()
main = do putStrLn "What is your name?"
name <- getLine
putStrLn ("Hello, " ++ name)
```

The `()`

’s essentially mean “void” or “no return value,” we’re just printing stuff so there’s nothing valuable to return. An ```
IO
String
```

, like `getLine`

, is something which gets a string value using IO. A function which computes its return value based on an IO action will be forced to have an IO type as well, so you can’t hide IO actions in functions which supposedly don’t rely upon IO.

It seems that Haskell satisfies most of our goals.

- We can catch errors at compile time. If something is “wrong”, why wait for the program to run to tell us?
- The type system lets us describe values in a fair amount of detail.
- Types don’t contain values like
`null`

which cause explosions at runtime.

- It eases reading and writing programs. It’s nice to know what a function can do based on a small type.
- Types help in much the same way as test driven development. * Makes you think about the arguments you function takes, and what it returns.
- Thinking about what you can actually compute with restricted types is helpful. * Keeps focus. * Helps you know what a function can possibly do.
- Types point out errors when developing, such as forgetting to unwrap a Maybe value and check each of the cases.

- It allow us to specify properties, and guarantees within our programs. E.g., this function does not alter global state, or read from a file.
- Functions are “pure”, meaning they always produce the same output for the same input.
- Special actions, like IO, are labeled in the type. So you can’t use an IO value in a non-IO function. * The IO action would cause the calling function to have an IO type. IO taints values, and can’t be hidden.

This is really great, and it’s super helpful. There’s a saying that “if a Haskell program compiles, then it’s probably correct” because the type system ends up preventing a lot of errors. For instance, you never end up trying to index `None`

like you would in Python. Think how much time you would save if you never ran into that problem! Quite a lot!

However, we can do even better!

There are some things that we just can’t do even with Haskell’s types. I can write a function to index a list

```
index :: Integer -> [a] -> Maybe a
index 0 [] = Nothing
index 0 (x::xs) = Just x
index n (x::xs) = index (n-1) xs
```

But I can’t write one that the compiler can ensure is never called with an index outside the range of our list.

```
-- Want the integer argument to always be in range so we don't need
-- Maybe!
index :: Integer -> [a] -> a
index 0 [] = error "Uh... Whoops, walking off the end of the list!"
index 0 (x :: xs) = x
index n (x :: xs) = index (n-1) xs
```

We need to somehow encode the length of the list into the type so we can only call index when the position provided is in range. We can’t do this in Haskell because it doesn’t let us have types which depend upon values (e.g., the length of a list).

It’s also not possible to encode other properties which depend upon values in the types. For instance I can’t say that a function returns a list of values which are sorted in ascending order, I can only say that a sort function also returns a list with values of the same type…

```
mySort :: Ord a => [a] -> [a]
mySort [] = []
mySort (first_elem::rest) = smaller ++ [first_elem] ++ larger
where smaller = mySort [x | x <- rest, x <= first_elem]
larger = mySort [x | x <- rest, x > first_elem]
```

It’s nice that we can specify that this function only works on lists which have orderable elements, but it would be even better if we could also say things like…

- The output list must have the same length as the input list.
- The list in the output must contain the same elements as the input list.
- The output list must be sorted in ascending order.

If we could encode these properties in the types, then if the program type checks it would prove that our sort function does the right thing.

In fact, that’s an interesting idea, isn’t it? Why don’t we make it so we can encode essentially any set of properties in our types, any proposition we can think of, and then make it so our program only type checks if it satisfies these properties. That would be a very powerful tool for ensuring the correctness of our programs! Maybe we can even use such a type checker to help us with our proofy math homework? We’ll look into this idea very shortly, but first let’s look at some basic dependent types in Idris, a programming language that is essentially Haskell with dependent types.

The classic example of a dependent type is a vector. A vector is a lot like a list, but the length of the list is included in the type.

So, for example, a vector of 2 strings is a different type from a vector of 3 strings.

```
two_little_piggies : Vect 2 String
two_little_piggies = ["Oinkers", "Snorkins"]
-- This would be a type error, caught at compilation:
three_little_piggies : Vect 3 String
three_little_piggies = two_little_piggies
```

And one thing that’s cool about this is you can actually do some computations at the type level to make more complicated, generalized functions. A classic example is appending two vectors together.

`append : Vect n elem -> Vect m elem -> Vect (n + m) elem`

The lower case identifiers in the type are “variables” again, in this case meaning `n`

and `m`

can be any natural numbers, and `elem`

can be any type, this is because `Vect`

is defined as follows:

```
data Vect : Nat -> Type -> Type where
Nil : Vect 0 a
(::) : (x : a) -> Vect k a -> Vect (S k) a
```

Meaning that the type constructor `Vect`

takes a natural number, and another type, in order to make a full vector type.

Idris has a lot of built in tools for generating your programs based on their types. Since this type for `append`

is actually pretty specific, Idris is able to do a lot of the work for us. Let’s walk through how that might work.

```
append : Vect n elem -> Vect m elem -> Vect (n + m) elem
append xs ys = ?append_rhs
```

The thing on the right hand side is known as a “hole”, and this is a stand in for a value which Idris can potentially fill in for us, or it can at least tell us the type of what we should put in the hole.

Since Idris knows how types are constructed, we can have it automatically perform a case split on the first argument, leading to this:

```
append : Vect n elem -> Vect m elem -> Vect (n + m) elem
append [] ys = ?append_rhs_1
append (x :: xs) ys = ?append_rhs_2
```

Which gives us two cases, with two holes. Idris helpfully tells us about these holes:

```
- + Main.append_rhs_1 [P]
`__ elem : Type
m : Nat
ys : Vect m elem
------------------------------------------
Main.append_rhs_1 : Vect (0 + m) elem
- + Main.append_rhs_2 [P]
`__ elem : Type
x : elem
m : Nat
ys : Vect m elem
k : Nat
xs : Vect k elem
----------------------------------------------
Main.append_rhs_2 : Vect ((S k) + m) elem
```

Above the dashed line you can see what variables are in scope where the hole is, and what types they have. Underneath we have our hole, and the type that it has.

Idris is smart, so it can automatically find values that match a hole of a given type. For the first hole we know that it has type `Vect (0 + m) elem`

, but Idris evaluates this to `Vect m elem`

, and the only vector of length `m`

that it has in scope is `ys`

, so it just happily fills this in for us, if we ask nicely!

```
append : Vect n elem -> Vect m elem -> Vect (n + m) elem
append [] ys = ys
append (x :: xs) ys = ?append_rhs_2
```

The second hole is a bit more interesting.

```
- + Main.append_rhs_2 [P]
`__ elem : Type
x : elem
m : Nat
ys : Vect m elem
k : Nat
xs : Vect k elem
----------------------------------------------
Main.append_rhs_2 : Vect ((S k) + m) elem
```

We can see that `xs`

has been given the type `Vect k elem`

, which means that `n = S k`

, since `xs`

is a part of the `Vect n elem`

argument, just with one less element since `x`

is split from it. `S`

means successor, so `S k`

is just the next natural number from `k`

, so it’s `k+1`

.

Our goal is to make a vector with length `S k + m`

, which we can happily ask Idris to do, and it finds:

```
append : Vect n elem -> Vect m elem -> Vect (n + m) elem
append [] ys = ys
append (x :: xs) ys = x :: append xs ys
```

… Which is exactly what we want. So how did Idris do this? Well, it realized a couple of things.

```
data Nat : Type where
0 : Nat -- Zero
S : Nat -> Nat -- Successor (+1)
(+) : Nat -> Nat -> Nat
(+) 0 m = m
(+) (S k) m = S (k + m)
data Vect : Nat -> Type -> Type where
Nil : Vect 0 a
(::) : (x : a) -> Vect k a -> Vect (S k) a
```

First, it evaluated `(S k) + m`

, which turns out to be ```
S (k +
m)
```

. It looked at the type constructor for a vector and saw that in order to get a `Vect (S (k + m)) elem`

it would need to concatenate an element with a `Vect (k+m) elem`

, which gets us two holes. One for the element to concatenate, and one for the rest of the vector.

```
append : Vect n elem -> Vect m elem -> Vect (n + m) elem
append [] ys = ys
append (x :: xs) ys = ?elem_to_concat :: ?rest_of_vect
```

```
- + Main.elem_to_concat [P]
`__ elem : Type
x : elem
m : Nat
ys : Vect m elem
k : Nat
xs : Vect k elem
-----------------------------------
Main.elem_to_concat : elem
- + Main.rest_of_vect [P]
`__ elem : Type
x : elem
m : Nat
ys : Vect m elem
k : Nat
xs : Vect k elem
------------------------------------------
Main.rest_of_vect : Vect (k + m) elem
```

So, Idris knows of one element with the type `elem`

, and that’s `x`

, so it can fill that in.

```
append : Vect n elem -> Vect m elem -> Vect (n + m) elem
append [] ys = ys
append (x :: xs) ys = x :: ?rest_of_vect
```

It also knows about recursion, so it knows it has this function `append`

which it could call that has a type ```
Vect n elem -> Vect m
elem -> Vect (n + m) elem
```

. And since Idris has a vector ```
xs : Vect
k elem
```

, and a vector `ys : Vect m elem`

, it knows that

`append xs ys : Vect (k + m) elem`

Which is exactly the type of thing we need in this hole, so it can fill it in as well.

So what you just witnessed is Idris essentially writing a program, albeit a small one, for you based on a type which specified the behaviour of this program. That’s awesome, and super helpful!

We can even see how this would not work as well if we were just using lists, which don’t have the length in their type.

```
append : List elem -> List elem -> List elem
append [] ys = []
append (x :: xs) ys = []
```

If you try to fill this in automatically, Idris will just make the function return empty lists, because it’s the easiest way to satisfy the type. If your types are not precise enough, then a number of functions will type check just fine, and Idris can’t tell which one of these possible functions you would want, it just gives you the first one it can find.

We can actually guarantee that a function indexing a vector stays within the bounds of the vector at compile time, too!

```
index : Fin len -> Vect len elem -> elem
index FZ (x :: xs) = x
index (FS n) (_ :: xs) = myIndex n xs
```

`Fin len`

is a type which represents natural numbers strictly less than `len`

. So, given a vector of length `len`

, if we provide a natural number greater than or equal to `len`

as the index, then it would not be an element of the `Fin len`

type, so the program would not type check, catching any potential bugs where you might walk off the end of an array at compile time. Here’s a quick example:

```
cats : Vect 2 String
cats = ["The Panther", "Smoke Smoke"]
-- "The Panther" : String
index 0 cats -- This type checks.
-- (input):1:9:When checking argument prf to function Data.Fin.fromInteger:
-- When using 2 as a literal for a Fin 2
-- 2 is not strictly less than 2
index 2 cats -- This is out of bounds, so the program won't even compile!
```

There are lots of cool guarantees we can make with dependent types! As alluded to earlier, we can even use them to make specifications for how our program should behave with arbitrary propositions, and then use the type checker to ensure that our program actually follows these specifications.

In order to get into this we need to do a quick primer on logic and logical proofs. In logic you have things known as propositions. A proposition is just a statement, such as “the sky is blue”, or “2 + 2 is 4”. These propositions happen to be true, but we can also have propositions which are false, such as “2 + 2 is 27”. A proposition is just something that you can propose. I might propose to you the notion that “2 + 2 is 27”, but using logical proofs we can determine that this proposition is in fact not a true statement.

So! These propositions are often represented by variables, for instance:

`P`

`P`

is a proposition. It could be anything, really…

`P = "ducks are fantastic"`

And I might have another proposition:

`Q = "ducks are truly the worst"`

Right now I’m using plain English to convey these propositions to you, but often they’ll be more mathematical statements, such as:

$\forall n \in \mathrm{N}, \exists m \in \mathrm{N} \text{ such that } m > n$

Propositions are built up from a set of axioms, which are just rules describing your mathematical objects, and propositions can be combined in a number of ways.

- Implications
- $P \rightarrow Q$, meaning “if P is true, then Q must be true.”

- Conjunctions
- $P \wedge Q$, meaning “both P and Q are true.”

- Disjunctions
- $P \vee Q$, meaning “at least one of P or Q is true.”

- Negation
- $\neg P$, meaning “P is false.”

- Universal quantification
- $\forall x, P(x)$, meaning whenever we substitute any value for
`x`

in`P`

, the proposition`P(x)`

holds true.

- $\forall x, P(x)$, meaning whenever we substitute any value for
- Existential quantification
- $\exists x, P(x)$, meaning we can find an
`x`

that we can substitute into`P(x)`

to make the proposition hold.

- $\exists x, P(x)$, meaning we can find an

There are some basic axioms for how you can work with these propositions. These are just rules that “make sense”. Such as modus ponens

$p \rightarrow q, p \vdash q$

Or conjunction elimination

$p \wedge q \vdash p$ $p \wedge q \vdash q$

Or conjunction introduction

$p, q \vdash p \wedge q$

As it turns out when you start to think of your types as propositions some interesting things start to pop up…

For instance if we look at something like implication in logic…

$P \rightarrow Q$

This means that if I have a proof of the proposition P, then I can produce a proof of the proposition Q.

That’s very similar to a function type in something like Haskell or Idris. If I’m given a value of type P, then I can produce a value of type Q. So function application seems to be identical to modus ponens.

`p -> q`

Similarly in logic I might have

$P \wedge Q$

Which means that I have a proof of P and a proof of Q.

If you squint that’s kind of similar to:

`(p, q)`

Which means that I have a value of P, and a value of Q. Conjunction elimination is then just the projection of either the first or second value in the tuple:

```
-- P /\ Q -> P
fst :: (p, q) -> p
fst (a, b) = a
-- P /\ Q -> Q
snd :: (p, q) -> q
snd (a, b) = b
```

What are the values of a type then? Well, they look a lot like an existence proof of a given proposition. For instance:

```
const : p -> q -> p
const a b = a
```

The value, in this case the function which returns the first element, can be seen as a proof of the proposition $p \rightarrow q \rightarrow p$. You take the proof of the first proposition in the chain of implications, and return it as a proof of the same proposition at the end of the implication chain. So, given a proof of $p$ and a proof of $q$, if we take the proof of $p$ and discard the proof of $q$, then we can prove $p$. Which makes sense to me!

We can also see how the type checker can prevent us from proving false propositions. For instance, you can’t prove $p \rightarrow q$, because there would be no way to get a proof of $q$ from a proof of another proposition $p$, when both $q$ and $p$ could be any random proposition!

```
bogus : p -> q
bogus p = -- What can I put here that would type check? :(
```

We can’t find a value of type `q`

, since we only have a value of type `p`

!

I’m going to quickly show you some basic proofs in Idris. With any luck you can at least imagine how these proofs might be extended to more complicated programs!

Idris has a type which represents equality between two things. This type is constructed, as you might expect, with the equals sign.

```
equality_good : 2+3 = 5
equality_good = Refl
-- This fails to type check
equality_bad : 2+3 = 7
equality_bad = Refl
```

An equality like this has only one constructor, `Refl`

. This equality type is roughly defined as:

```
data (=) : a -> b -> Type where
Refl : x = x
```

Which looks a little obtuse, but really all this means is that if we want to put `Refl`

in a hole with some equality type, then Idris needs to be able to determine that whatever is on the left and right of the equals signs will evaluate to the exact same value. If Idris can determine that, then the left and the right side are considered to be identical, and we can replace whatever is on the left side with whatever is on the right side and vice versa. This is reflexivity, and it’s what `Refl`

stands for.

Now, with this in mind lets walk through a small, but mind bending example:

```
cong : (f : a -> b) -> x = y -> f x = f y
cong f prf = ?cong_rhs
```

`cong`

stands for congruence, and has a type which represents the proposition that, if you are given a function `f`

, and you know that some `x`

and `y`

are equal, then `f x = f y`

.

This might seem really odd and scary right now, because you have equals signs in your types. But remember, types are propositions of theorems, and these equals signs just means that we should be able to use the `Refl`

constructor to show that both things are equal using Idris’s internal notion of the equality of terms.

Here’s what we see when we ask Idris about our goal, `cong_rhs`

:

```
- + Main.cong_rhs [P]
`__ b : Type
a : Type
x : a
f : a -> b
y : a
prf : x = y
---------------------------
Main.cong_rhs : f x = f y
```

So, it looks like we need to be able to show that `f x = f y`

. In the list of known values it seems that we have a proof of `x=y`

from one of the arguments to `cong`

. And since we have a proof that `x=y`

, we should be able to rewrite `y`

to be `x`

using reflexivity. In Idris this is done by deconstructing the proof of `x=y`

by pattern matching on the argument.

```
cong : (f : a -> b) -> x = y -> f x = f y
cong f Refl = ?cong_rhs_1
```

That looks really unimpressive, but let’s see what it did to our goal:

```
- + Main.cong_rhs_1 [P]
`__ b : Type
a : Type
x : a
f : a -> b
-----------------------------
Main.cong_rhs_1 : f x = f x
```

Perfect! If we have a proof that `x = y`

, then Idris knows that they’re interchangeable, and it automatically replaced `y`

with `x`

everywhere. Now we just need something with the type `f x = f x`

, which is trivial, since if you look at the definition of `Refl`

, that’s pretty much exactly what it does. We just need to substitute `f x`

for the `x`

in `Refl`

.

```
Refl : x = x
-- So, if we just replace this general "x" with our "f x" we would get...
Refl : f x = f x
```

`Refl`

actually

In Idris `Refl`

uses implicit arguments, since it can often infer what it should use in context, so we could just write `Refl`

:

```
cong : (f : a -> b) -> x = y -> f x = f y
cong f Refl = Refl
```

But we could also give it an argument explicitly.

```
cong : (f : a -> b) -> x = y -> f x = f y
cong f (Refl {x}) = Refl {x = f x}
```

I realize this is a bit confusing because there are `x`

’s in both places, but the `x`

in the definition of `Refl`

is in a different scope, and we’re just substituting our `f x`

for that `x`

, like an argument to a function.

Now that we have a proven congruence theorem we can construct some more complex proofs. Let’s write a function to do addition on natural numbers and prove that it’s associative.

In Idris natural numbers look like this:

```
data Nat : Type where
0 : Nat
S : Nat -> Nat
-- 0 = 0
-- S 0 = 1
-- S (S 0) = 2
-- etc...
(+) : Nat -> Nat -> Nat
(+) 0 y = y
(+) (S x) y = S (x + y)
```

The 0 represents 0 (it’s actually `Z`

, but I think writing 0 is less confusing), and `S`

stands for successor, which means “plus one”. So we have defined the set natural numbers recursively, by adding one to the previous natural number. This gives us a unary representation of the natural numbers that’s very nice to work with, it’s similar to tallies.

Similarly we can define addition recursively:

- Zero plus any number is that number.
- One plus x added to y is x + y with one added to it.

Now let’s define our theorem:

```
plus_assoc : (x, y, z : Nat) -> x + (y + z) = (x + y) + z
plus_assoc x y z = ?plus_assoc_rhs
```

This just says that for all `x`

, `y`

, and `z`

in the natural numbers, `x`

added to `y + z`

is the same as `x + y`

added to `z`

.

To prove this kind of thing we often use induction. We’ve actually already seen induction in Idris. It’s just recursion. So we’ll case split on `x`

, which gives us a base case where `x = 0`

, and a case where `x = S k`

for some natural number `k`

.

```
plus_assoc : (x, y, z : Nat) -> x + (y + z) = (x + y) + z
plus_assoc Z y z = ?plus_assoc_rhs_1
plus_assoc (S k) y z = ?plus_assoc_rhs_2
```

We have some interesting holes now.

```
- + Main.plus_assoc_rhs_1 [P]
`__ y : Nat
z : Nat
---------------------------------------------------------------
Main.plus_assoc_rhs_1 : 0 + (y + z) = (0 + y) + z
- + Main.plus_assoc_rhs_2 [P]
`__ k : Nat
y : Nat
z : Nat
-----------------------------------------------------------------------
Main.plus_assoc_rhs_2 : (S k) + (y + z) = ((S k) + y) + z
```

For the first one we have to just realize that when we use `Refl`

, Idris will try to evaluate both sides of the equals sign completely. Because of how plus is defined, it can actually evaluate these partially even though we don’t know what `y`

and `z`

are. This just triggers the first case of our definition of plus, where 0 is on the left side. So this goal is really:

```
- + Main.plus_assoc_rhs_1 [P]
`__ y : Nat
z : Nat
---------------------------------------------------------------
Main.plus_assoc_rhs_1 : y + z = y + z
```

And we can satisfy this with reflexivity.

```
plus_assoc : (x, y, z : Nat) -> x + (y + z) = (x + y) + z
plus_assoc Z y z = Refl
plus_assoc (S k) y z = ?plus_assoc_rhs_2
```

The second hole is more complicated.

```
- + Main.plus_assoc_rhs_2 [P]
`__ k : Nat
y : Nat
z : Nat
-----------------------------------------------------------------------
Main.plus_assoc_rhs_2 : (S k) + (y + z) = ((S k) + y) + z
```

Again, Idris can still evaluate this partially, so this goal is really this:

```
- + Main.plus_assoc_rhs_2 [P]
`__ k : Nat
y : Nat
z : Nat
-----------------------------------------------------------------------
Main.plus_assoc_rhs_2 : S (k + (y + z)) = S ((k + y) + z)
```

So it looks like we need to prove associativity… Which is what we’re trying to do.

But since Idris knows about recursion, we can actually call `plus_assoc`

on `k`

, `y`

, and `z`

to get something with the type…

`k + (y + z) = (k + y) + z`

So we’re almost there, we just need to be able to apply the successor function on both sides of the equality. This is exactly what congruence does:

```
cong : (f : a -> b) -> x = y -> f x = f y
cong f (Refl {x}) = Refl {x = f x}
```

If we give `cong`

a function, and something with an equality type, then `cong`

gives us an equality type with the function applied to both sides. So we can do this:

```
plus_assoc : (x, y, z : Nat) -> x + (y + z) = (x + y) + z
plus_assoc Z y z = Refl
plus_assoc (S k) y z = cong S (plus_assoc k y z)
```

Which completes our proof! It’s interesting how applying a theorem, like `cong`

, is literally just applying a function.

It’s also neat how recursion and induction are really just the same thing, and you can see that pretty clearly when working with something like Idris.

Sometimes building up proof terms in this functional programming style is a bit tedious. Once you get more complicated proofs on the go it gets pretty hard to keep track of all of the types. There are other languages which use a different style of proof based on tactics, which are little commands that build up these proof terms behind the scenes for you.

These languages are interesting to work with, but the proofs are actually pretty hard to read without the proof state shown, which editors for these languages will display nicely. The proof state is just your current goal type, and it displays it in much the same fashion as Idris displays its goals.

Here’s an example of the associativity proof we just did in Coq:

```
Inductive nat : Type :=
| O : nat
| S : nat -> nat.
Fixpoint plus (n m : nat) : nat :=
match n with
| O => m
| S n' => S (plus n' m)
end.
Theorem plus_assoc : forall (x y z : nat), plus x (plus y z) = plus (plus x y) z.
Proof.
intros x y z. induction x as [| k].
- reflexivity.
- simpl. (* Simplify with evaluation *)
rewrite IHk. (* Use induction hypothesis to rewrite terms *)
reflexivity.
Qed.
```

It’s actually pretty similar, and you can maybe get some idea of how the tactics translate into the functions from before. We break things into cases much the same way with the induction tactic.

- The base case is just handled with reflexivity.
- Then, after simplifying the types by evaluation much like in Idris, we apply the theorem inductively, and then use reflexivity to handle the case of
`x = S k`

.

These tactics look pretty weird when you first see them, but if you start thinking about how they get turned into proof terms like in the Idris examples, it becomes a lot easier to understand.

So, that’s the end of the talk. It’s just a rough overview of why types are so magical, and why you should care about them.

I realize that this was quite the whirlwind introduction to this topic, so if you have any questions feel free to ask!

]]>I wrote a post about setting up Haskell development for ARM over on the Haskell Embedded blog that I’m a part of. You should check it out!

]]>Program optimization is strange.

We naturally want our programs to run as quickly and efficiently as possible, but in some sense I have no idea what that actually means. Or, rather, I have no idea what “computation” actually entails.

The problem of optimization is at its simplest when you have a finite number of inputs to the program, and every input results in termination. In this case everything at runtime can be done in a constant amount of time. All runtime computation boils down to a bland lookup table (arguably you may be able to compute some values faster than you might look them up, but we’ll ignore such details).

This result is incredibly anticlimactic. It just means that all of the fanciful computation is pushed back to compile time. It doesn’t actually go away! Now instead of calculating what you need at runtime you calculate *EVERYTHING* at compile time. Perhaps this is fine. If you have a program that is compiled once, and then executed every nanosecond of every day on millions of computers then this may well be the best option, even for programs with relatively large state spaces.

Of course, that’s not always the case, and the problem seems most interesting at its core: compile time + execution time for a single instance of a given program. This is how much time it *actually* takes to compute your results, after all!

Yet this is more confusing still, because then what constitutes a program? Surely a programmer can just decide to write the entire damn thing as a lookup table, and then the compiler doesn’t have to do much work at all! But then it’s the programmer that’s doing the “real” computation. It seems we can just keep pushing the computation back into different layers, and it’s all horrendously meta!

Just what the hell are we doing? How do we describe a computation *without* doing computation? It seems so intimately linked, and I’m not sure that there is a way to distinguish between the two entirely. I think this is an important question.

So, what can we do? The one thing that comes to mind is that we might choose to describe algorithms with as little state as possible. The idea is simply that by minimizing the amount of information needed to describe an algorithm, we can limit the amount of a priori computation that is done. This can allow us to measure computation time much more reasonably, since we have a common starting point for every algorithm (although, I would wager that the minimal state algorithm description is actually not unique).

For instance consider the Fibonacci numbers. We might describe the algorithm for computing them something like:

$F_n = \begin{cases} 0 & n = 0 \\ 1 & n = 1 \\ F_{n-1} + F_{n-2} & \text{otherwise} \end{cases}$

Which has two integers worth of predetermined state. But some crafty programmer who wanted to shave a little bit of effort off of the compiler / running program might describe the algorithm as such:

$F_n = \begin{cases} 0 & n = 0 \\ 1 & n = 1 \\ 1 & n = 2 \\ F_{n-1} + F_{n-2} & \text{otherwise} \end{cases}$

Gasp! Now no computation aside from a lookup needs to be done for $F_2$, since it was sneakily done beforehand. Now we have no idea how long this actually takes to compute! Of course, now this has more a priori state, which means it’s not in the “minimal” format.

Perhaps we could try to transform this into some kind of stateless representation, eliminating precomputation all together. Say we replace $F_0$ with $a$ and $F_1$ with $b$. That surely fixes things for us, right?

$F_n(a,b) = \begin{cases} a & n = 0 \\ b & n = 1 \\ F_{n-1}(a,b) + F_{n-2}(a,b) & \text{otherwise} \end{cases}$

So $a$ and $b$ are given at runtime and the program is not allowed to guess at what these values might be ahead of time. It can’t say, “oh, I’ll just precompute the first 100 values for when $a = 0$ and $b = 1$”. This is forbidden! Now those crafty programmers can’t do any additional optimization!

But I’m not sure that this actually saves us.

$F_n(a,b) = \begin{cases} a & n = 0 \\ b & n = 1 \\ a + b & n = 2 \\ a + 2 b & n = 3 \\ 2 a + 3 b & n = 4 \\ F_{n-1}(a,b) + F_{n-2}(a,b) & \text{otherwise} \end{cases}$

We can still precompute and optimize the actual computation that’s going on. So we have to consider these cases for each $n$ as part of the state in the algorithm description. Thus this actually takes more predetermined information. Again we may continue to make this program faster and faster by doing some of these steps beforehand ad infinitum, but this still seems like cheating to me. This is just sweeping computation under the carpet again. This is _{-funroll-loops} to the extreme, and I find this unsatisfactory!

It feels like we’re missing a layer of abstraction here. Something is missing, and I’m not sure exactly what it is.

]]>Hello!

I now apparently have a blog. I don’t know if this will ever see an update, or if this will ever be read by anybody, but the intention is to put some ramblings up here.

Why is there this blog? Because I might as well have something on the webby places, and it’s as good a time as any to actually do some web related stuff. Perhaps this will be a convenient place to put ramblings that I will direct at multiple people.

This blog will most likely contain anything that pertains to my interests. This includes, but is not limited to:

- Computing Science
- Programming languages (C, Haskell, etc…)
- Electronics (Arduinos and such)
- Mathematics
- Stenography / Writing / Pens

Maybe it will be interesting, maybe it won’t!

The blog itself is a static website generated with Hakyll, which is written in glorious Haskell. Hakyll is a very neat little tool, and I am quite happy with what it can do!

Currently the blog uses Bootstrap, and I have blatantly stolen the Bootstrap blog example. I am not a web designer, and I certainly don’t know how to make things pretty! I hope to actually do a better job of this in the future, but it’s functional at least!

Comments are being provided by Disqus. It’s a super easy way to get a comment section plopped into a statically generated website, so hopefully it works alright!

MathJax is also set up so I can write mathy things in LaTeX. For instance, here is a nonsense formula:

$\frac{\int_a^b x^2 \, \mathrm{d} x}{\partial \phi \varphi \psi}$

We should also be able to insert code:

```
-- Sum of all of the even numbers in a list.
sumEvens = sum . filter even
```

All of this is hosted on an incredibly cheap VPS, so I am not sure if I expect much, and the website is updated automatically with some git hooks. Hurray.

That’s about all I have to say for now!

]]>