## Catamorphisms, part seven

Posted by Brian on June 7, 2008

The code in the previous blog entry was difficult to read due to the use of explicit continuations. Today I’ll show how to make the code more readable using a continuation monad. I’ll also discuss F# "computation expressions" (a.k.a. "workflows") in a little more depth.

(Today’s blog entry is probably the most "advanced" that I’ve written, as it deals with three advanced functional programming concepts (catamorphisms, continuations, and monads) simultaneously. I’ll try to revert back to some beginner/intermediate F# topics (that may attract a wider audience) in future blog entries… but today I am just writing about things that tickle my fancy.)

### The trouble with continuations

Continuations are an almost indispensable mechanism for achieving certain goals. For example, in the previous entry I showed how to use KFold to process arbitrary tree-like data structures without using the call stack. This is a win when you need to process large, arbitrarily-shaped (e.g. unbalanced trees) data. However the resulting code suffered a bit in terms of readability. Last time we ended up with code like

KFoldTree (fun x kl kr (Node(_,oldL,oldR)) k ->

if x < 5 then

kr (fun newR ->

k (Node(x, oldL, newR)))

elif x > 5 then

kl (fun newL ->

k (Node(x, newL, oldR)))

else

k (Node(0,oldL,oldR))

) (fun t k -> k t) tree

for changing one value to another in a binary search tree. The weird bits now are that our lambdas take an extra "k" parameter at the end of the parameter list, and we write code using the "k" variables rather awkwardly. Compare the code above to a simple, non-tail-recursive implementation of the same method which doesn’t use a fold:

match tree with

| Node(x,oldL,oldR) ->

if x < 5 then

let newR = Change5to0bstUsingStack oldR

Node(x, oldL, newR)

elif x > 5 then

let newL = Change5to0bstUsingStack oldL

Node(x, newL, oldR)

else

Node(0, oldL, oldR)

| Leaf -> tree

The main difference I want to highlight is this

k (Node(x, oldL, newR)))

versus this

Node(x, oldL, newR)

The latter code looks much more familiar, whereas the former code that explicitly uses continuations looks kind of odd/backwards and uses weird lambdas and parens that are hard to read/parse at a glance.

It turns out that we can leverage F# computation expressions to get better syntax for the continuation-version of the code. By coding up the continuation monad, we can use continuation workflows, so that that little code snippet becomes

return Node(x, oldL, newR)

In the long run, this is much more readable an understandable (especially if there are more than 2 lines of code) than the explicit continuations version. But the "let!" and "return" keywords are pretty new to us (we saw them in passing in a previous blog about async workflows), so let’s pause and talk a little about F# computation expressions first.

### How to read F# computation expressions

F# computation expressions are a syntax sugar for writing code where a portion of that code deals with a particular *monad*. (I’m continuing to defer the discussion of exactly what a monad is, but if you’ve been keeping up with previous blog entries I’ve written, you know that parsers can be expressed as a monad, and asynchronous computations can be expressed through monads, and today we’ll see a monad of continuations. Monads have many uses, but they are very abstract, and thus hard to explain without a lot of examples already under your belt.) A computation expression takes this general form:

*comp_expr_code*}

where "foo" is a "builder" object for a particular monad, and then *comp_expr_code* is code written using a special subset of F# that can be used inside computation expressions.

Consider this "normal" F# code:

f()

let y = g(x)

h(y)

x + y

Assuming that all these functions and values have nothing to do with the "foo" monad, we can lift this code into a computation expression for foos thusly:

let x = 3

do f()

let y = g(x)

do h(y)

return x + y

}

When moving code into a computation expression, "let"s are unchanged, functions that are called just for effects are prefixed with "do", and the final value is prefixed with the "return" keyword. The result is code with pretty much the same meaning, except that the whole expression has been lifted into the "foo" monad, which may change its final value and type. If in this example, "x" and "y" had type "int", then the normal code results in an "int", whereas the computation expression results in a value of type "Foo<int>" where "Foo" is some data type associated with the monad described by builder "foo".

A concrete example will help – consider asynchronous computations. In the example above, if "foo" is replaced with "async", then the result of the whole expression is an "Async<int>" – an asynchronous computation that will eventually yield a value of type "int". Recall that "Async<‘a>" is the data type that describes asynchronous computations, and "async" is the corresponding "builder" object for the asynchronous monad ("async" and "Async<‘a>" are defined in the F# Control library).

In addition to "let", "do" and "return", which operate on "normal" values, there are corresponding keywords "let!", "do!", and "return!" which operate on monadic values. In our previous example, "g" was a function that returned an "int", and "h" was a function that returned "unit". Suppose, instead, that "g" returns a "Foo<int>" and "h" returns "Foo<unit>". In that case, we could write

let x = 3

do f()

let! y = g(x)

do! h(y)

return x + y

}

where we have added the "!" (usually pronounced "bang") to the lines using "g" and "h". The "!" says that the expression being processed is monadic, and should be processed according to the semantics given by the corresponding builder object. The results of these monadic computations still have "normal" values – so for example, even though "g" returns a "Foo<int>", the "let!" ensures that the type of "y" is simply "int".

Again, a concrete example helps – consider again async computations. Suppose "g" returns an Async<int>. The "let!" says to run that async computation to completion until it yields an "int" value, and then bind that value to "y". Similarly, if "foo" were "async", then "do!" would run its async computation to completion to produce an effect.

In short, inside a computation expression, you use "let", "do", and "return" for "normal" values, and you use "let!", "do!", and "return!" for values with a monadic type that need special processing specific to the particular monad we are in (as described by the builder object). As with LINQ in C#, it’s all just syntax sugar – the compiler desugars the code snippet above into

let x = 3

do f()

foo.Bind(g(x), fun y ->

foo.Bind(h(y), fun () ->

foo.Return(x+y))))))

Now, it just so happens that, given particular interesting definitions of "Delay", "Bind", and "Return", you can make that little code snippet have all kinds of interesting effects (like async, or parsing, or continuations, or …) – that is the magic of monads. But we’re getting ahead of ourselves – for now, at least you have a cursory introduction to the F# computation expression syntax.

### Defining a continuation monad

A continuation monad in F# can be defined like so:

member this.Return(x) = (fun k -> k x)

member this.ReturnFrom(x) = x

member this.Bind(m,f) = (fun k -> m (fun a -> f a k))

member this.Delay(f) = f()

let K = new ContinuationBuilder()

Unless you are trying to understand the mechanics of what is going on very deeply, you can pretty much just gloss over that definition – continuation monads are well-understood, and that’s what it looks like to write one in F#. Generally this is code that would be off in a library somewhere, but this code isn’t in any of the standard F# libraries, so I’m writing it myself.

What is important is that, with this thing in hand, we can now write "continuation workflows", which are just computation expressions that use the "K" builder object to produce values in the monad of continuations. Let’s see how to use it.

### Leveraging the continuation monad to make our code more readable

Consider again our code that was traversing a binary search tree to replace values:

KFoldTree (fun x kl kr (Node(_,oldL,oldR)) k ->

if x < 5 then

kr (fun newR ->

k (Node(x, oldL, newR)))

elif x > 5 then

kl (fun newL ->

k (Node(x, newL, oldR)))

else

k (Node(0,oldL,oldR))

) (fun t k -> k t) tree

We can rewrite this code using the continuation monad thusly:

KFoldTree (fun x kl kr (Node(_,oldL,oldR)) -> K {

if x < 5 then

let! newR = kr

return Node(x, oldL, newR)

elif x > 5 then

let! newL = kl

return Node(x, newL, oldR)

else

return Node(0,oldL,oldR)

}) (fun t -> K { return t } ) tree

Here are the changes relative to the previous version:

- The final "k" parameter to each lambda has been eliminated. Since the body of each lambda is now a continuation workflow (code of the form "K { … }"), the result will be a monadic value of the general form "fun k -> …", and so extra "k" parameter is now implicitly handled by the monad.
- The "continuation-ness" of the "kl" and "kr" parameters is gone. Recall that these parameters represent recursive calls on the left and right subtrees, respectively. Whereas previously we had to call them using continuation-style syntax "kr (fun newR -> …", now we call them with more natural syntax "let! newR = kr". We use the "!" form of "let" because "kr" is a value in the monad ("kr" is a continuation).
- We don’t have to call a continuation on the final result, we just need to "return" it. We are returning Nodes, which are "normal" (non-monadic) values, so we use the non-"!" form ("return" rather than "return!").

The types and behavior all work out just as in the original version – this code has the same semantics as the original version, it’s just easier on the eyes (especially so once you are already familiar with the F# computation expression syntax).

### One more example

Let’s take a look at one more example of applying the continuation monad – using it on the "Eval" function, another running example in this blog series. (Recall: that example has a data type called "Expr" to represent a tiny expression language, comprising integer constants, binary operators (Plus and Minus), if-then-else expressions, and "print" expressions.) Here is the previous version of the "Eval" code, using explicit continuations (though slightly reformatted to make for easier comparison with the new version I’m about to show):

KFoldExpr (fun x _ k -> k x)

(fun kl op kr _ k -> kl (fun l ->

kr (fun r ->

match op with

| Plus -> k (l+r)

| Minus -> k (l-r))))

(fun kc kt ke _ k -> kc (fun c ->

if c <> 0 then

kt (fun t -> k t)

else

ke (fun e -> k e)))

(fun ke _ k -> ke (fun e ->

printf "<%d>" e

k e))

expr

Now here it is rewritten using the continuation monad:

KFoldExpr (fun x _ -> K { return x } )

(fun kl op kr _ -> K { let! l = kl

let! r = kr

match op with

| Plus -> return l+r

| Minus -> return l-r } )

(fun kc kt ke _ -> K { let! c = kc

if c <> 0 then

return! kt

else

return! ke } )

(fun ke _ -> K { let! e = ke

do printf "<%d>" e

return e } )

expr

This reads so much better. In the case of binary operators (the second lambda), for example, I can read the code "aloud" as something like "Let ‘l’ be the value of the recursive call on the left-hand subtree, and let ‘r’ be the value of the recursive call on the right-hand subtree. If the operator is ‘Plus’, return ‘l+r’, else return ‘l-r’." The previous version with explicit continuations, though equivalent, is much harder to decipher, and doesn’t let you "read it aloud" nearly so easily.

Another thing worth pointing out in this example is the use of the "return!" keyword. In the case of if-then-else expressions (the third lambda), the "read aloud" version goes something like "Let ‘c’ be the value of the recursive call on the condition expression. If ‘c’ is non-zero, then the result is the value of recursing on the ‘then’ subtree, else the result is the value of recursing on the ‘else’ subtree." Since "kt" and "ke" are monadic values (they are continuations that represent the recursive calls into the ‘Then’ and ‘Else’ subtrees), we use the "!" version of "return" to return the value.

### The source code

Here’s the code from today’s blog entry.

member this.Return(x) = (fun k -> k x)

member this.ReturnFrom(x) = x

member this.Bind(m,f) = (fun k -> m (fun a -> f a k))

member this.Delay(f) = f()

let K = new ContinuationBuilder()

open System

// handy operator

let (===) = fun x y -> Object.ReferenceEquals(x,y)

type Tree<‘a> =

| Node of (*data*)‘a * (*left*)Tree<‘a> * (*right*)Tree<‘a>

| Leaf

// 4

// 2 6

// 1 3 5 7

let tree7 = Node(4, Node(2, Node(1, Leaf, Leaf), Node(3, Leaf, Leaf)),

Node(6, Node(5, Leaf, Leaf), Node(7, Leaf, Leaf)))

let KFoldTree nodeF leafV tree =

let rec Loop t k =

match t with

| Node(x,left,right) -> nodeF x (Loop left) (Loop right) t k

| Leaf -> leafV t k

Loop tree (fun x -> x)

// explicit continuations version

let Change5to0bstExplicitContinuations tree =

KFoldTree (fun x kl kr (Node(_,oldL,oldR)) k ->

if x < 5 then

kr (fun newR ->

k (Node(x, oldL, newR)))

elif x > 5 then

kl (fun newL ->

k (Node(x, newL, oldR)))

else

k (Node(0,oldL,oldR))

) (fun t k -> k t) tree

// plain recursive version (blows stack)

let rec Change5to0bstUsingStack tree =

match tree with

| Node(x,oldL,oldR) ->

if x < 5 then

let newR = Change5to0bstUsingStack oldR

Node(x, oldL, newR)

elif x > 5 then

let newL = Change5to0bstUsingStack oldL

Node(x, newL, oldR)

else

Node(0, oldL, oldR)

| Leaf -> tree

// version using continuation workflow

let Change5to0bst tree =

KFoldTree (fun x kl kr (Node(_,oldL,oldR)) -> K {

if x < 5 then

let! newR = kr

return Node(x, oldL, newR)

elif x > 5 then

let! newL = kl

return Node(x, newL, oldR)

else

return Node(0,oldL,oldR)

}) (fun t -> K { return t } ) tree

// CreateZeroRightTree : int -> Tree<int>

let CreateZeroRightTree size =

let rec Loop t n =

if (n < size) then

Loop (Node(0,Leaf,t)) (n+1)

else

t

Loop Leaf 0

// make a big tree of 2 million nodes all going to the right

let bigTree = CreateZeroRightTree (2 * 1000 * 1000)

// call our tail-recursive function on it, to prove we get no StackOverflowException

Change5to0bst bigTree

///////////////////////////////////////////////////////////////////////////////////

// types capable of representing a small integer expression language

type Op =

| Plus

| Minus

type Expr =

| Literal of int

| BinaryOp of Expr * Op * Expr // left, op, right

| IfThenElse of Expr * Expr * Expr // cond, then, else; 0=false in cond

| Print of Expr // prints, then returns that value

let exprs = [Literal(42)

BinaryOp(Literal(1), Plus, Literal(1))

IfThenElse(Literal(1), Print(Literal(42)), Print(Literal(0)))

]

let KFoldExpr litF binF ifF printF expr =

let rec Loop ex k =

match ex with

| Literal(x) -> litF x ex k

| BinaryOp(l,op,r) -> binF (Loop l) op (Loop r) ex k

| IfThenElse(c,t,e) -> ifF (Loop c) (Loop t) (Loop e) ex k

| Print(e) -> printF (Loop e) ex k

Loop expr (fun x -> x)

// EvalExplicitContinuations : Expr -> int

let EvalExplicitContinuations expr =

KFoldExpr (fun x _ k -> k x)

(fun kl op kr _ k -> kl (fun l ->

kr (fun r ->

match op with

| Plus -> k (l+r)

| Minus -> k (l-r))))

(fun kc kt ke _ k -> kc (fun c ->

if c <> 0 then

kt (fun t ->

k t)

else

ke (fun e ->

k e)))

(fun ke _ k -> ke (fun e ->

printf "<%d>" e

k e))

expr

// Eval : Expr -> int

let Eval expr =

KFoldExpr (fun x _ -> K { return x } )

(fun kl op kr _ -> K { let! l = kl

let! r = kr

match op with

| Plus -> return l+r

| Minus -> return l-r } )

(fun kc kt ke _ -> K { let! c = kc

if c <> 0 then

return! kt

else

return! ke } )

(fun ke _ -> K { let! e = ke

do printf "<%d>" e

return e } )

expr

exprs |> List.iter (fun expr -> printfn "%d" (Eval expr))

// 42

// 2

// <42>42

printfn "press a key"

System.Console.ReadKey() |> ignore

## Leave a Reply