What I Learned This Week: Tapir

WILTW #5

This week has largely been administrative, with letters written, and servers managed.

I went through the Tapir/LLVM paper, and had some thoughts about theorem proving…

Tapir/LLVM

This paper introduces a very elegant way of expressing fork-join parallelism in an extended version of the LLVM IR. It does so by introducing three instructions, detach, reattach, and sync.

detach is how you introduce parallelism, it takes two labels. One for the detached block, and one for the continue block. Tapir has a really slick set of semantics for the parallel code – it should be the same as running the detached block, and then the continue block sequentially.

In practice the detached block will be run in parallel with the continue block, and it will terminate when it hits a reattach instruction which links bank to the continue block. The sync command can be used by the continue block to wait for this reattach.

What’s impressive is that Tapir/LLVM is a very small extension of the IR, but it allows for many of LLVMs optimizations to work on the parallel code with little to no modification.

Theorem Proving Wisdoms

I’ve been thinking a bit about why I should care about theorem proving, and I think it’s best summarized with a few points.

Any advance in theorem proving is essentially an advance in:

Being able to write programs. If you can automatically find proofs, you can automatically write programs.
Being able to test programs. If you can automatically find proofs / counterexamples then this makes testing better.
Being able to optimize programs. If compilers can automatically try to prove properties about your program, then they can use this knowledge to better optimize this program. Better proof search might mean we can write better compilers.

I find that it’s quite valuable to be able to reason about my code in a formal system. Helps to keep things organized, and makes a lot of things easier to write in a way. Having proof search available, like in Idris, when writing code can make development so much nicer and faster. I really do think that with just a bit more development it can be way better and faster to write programs.

Rpower

I was having a number of problems proving properties with Rpower in Coq. I couldn’t find any theorems or lemmas which could help me prove even the simplest thing like ln (Rpower a b) = b * ln a.

Turns out in these situations it’s worth printing the definition of the offending function:

Print Rpower.

Rpower = fun x y : R => exp (y * ln x)
     : R -> R -> R

Ah. Yeah. That would help. I couldn’t find anything relating Rpower to exp or ln with Search! Turns out I just needed to unfold the definition this entire time.

This is one thing that’s worth keeping in mind when writing Coq. You really need to know how the things you are working with are defined. Makes it much more obvious how things will simplify, and can also make it way more obvious how to approach a problem.

But seriously, maybe we should make Rpower a b = exp (b * ln a) show up in the search or something?

Tactics

Coqplexity is ultimately a project in Ltac, at least I’m hoping to automate as much as possible with it.

Previously I was just working on a big_O tactic which can automatically solve Big O problems using some common tricks, like recognizing that if \(f1 \in O(g)\), and \(f2 \in O(g)\), then \(\lambda \; n \rightarrow f1 \; n + f2 \; n \in O(g)\). This works very well in a lot of cases, but this approach is kind of rigid and the tactics currently can only solve fairly trivial Big O problems.

What I am trying to do now is write a much more general set of tactics which encompass the usual path that I take when trying to solve these problems. This should allow me to automate a lot more, and possibly even automate theorems that I would otherwise need to write custom proofs for.

However, I’m finding Ltac is a bit fiddly. I’m ending up with situations where it’s hard to match on what I actually need, and keep track of the current state and any backtracking I might want to do. Often Ltac seems to rely upon heavy-handed tactics like inversion and subst which drastically alter the context and goal. I want to be a lot more principled, so I can print out the steps taken. We’ll have to wait and see how this goes!