Inside F#

Brian's thoughts on F# and .NET

On lambdas, capture, and mutability

Posted by Brian on November 12, 2008

This blog entry is a little more esoteric than I would like, so the author apologizes up front.  :)

A C# brain teaser

Last time I posed a teaser question: what does this C# code print?

    List<Func<int>> actions = new List<Func<int>>();  
    for (int i = 0; i < 5; ++i) 
        actions.Add( () => i * 2 );  
    foreach (var act in actions)  
        Console.WriteLine( act() );  

At a glance it looks like it may print 0, 2, 4, 6, 8.  But in fact it prints 10, 10, 10, 10, 10.

The reason is that all five instances of the Func objects created by the lambda are capturing a reference to the same mutable variable instance "i".  By the end of the "for" loop, the value of "i" is 5, so invoking the Func objects returns "i*2", which is 10.

If you want to write code like that that prints 0, 2, 4, 6, 8, you can do it by changing the "for" loop to this:

    for (int i = 0; i < 5; ++i)
        int copy = i;
        actions.Add( () => copy * 2 ); 

There is a different instance of "copy" for each iteration of the loop.  As a result, each lambda captures a different "int" object.

If you feel that this is rather subtle, you’re not alone.  This example is based off a question on StackOverflow.  But nearly everyone trips over this at some point.  (The astute reader will note that I’ve even run into it myself on this blog – see the C# code at the end of this blog entry.)  For other examples, see this question or this one.  And this isn’t just limited to C#.  People run afoul of this in Python as well, as seen here and here.  (I expect there are more such questions; those are just the ones I found when glancing quickly.  EDIT: There are dozens more instances of this question on StackOverflow; it’s one of the most common questions.  See also Eric’s blog for a nice description.)

In general, with every language that has both mutable variables and closures (lambdas that can capture their environment), you run into this issue.  The lambda captures the mutable variable now, but gets evaluated later, after further mutations may have occurred.  This is an instance of the more general notion of how "lazy evaluation" and "side effects" don’t always mix nicely.  (You can read lots more on the topic by doing a web search for e.g. "closure capture value reference".)

Ok, fine, so this is a gotcha in most languages with mutables and closures.  So, you will ask, how does F# deal with this?  I’m so glad you asked!

Capture in F#

Let’s try transliterating that first C# example into F#.  I write

let FirstAttempt() =
    let actions = new List<unit->int>()
    let mutable i = 0
    while i < 5 do
        actions.Add( fun () -> i * 2 )
        i <- i + 1
    for act in actions do
        printf "%d " (act())    
    printfn ""

What do you think it will print?  The answer: it doesn’t compile!  The compiler points inside the lambda ("fun") and tells me

The mutable variable ‘i’ is used in an invalid way. Mutable variables may not be captured by closures. Consider eliminating this use of mutation or using a heap-allocated mutable reference cell via ‘ref’ and ‘!’.

One interpretation of this error message is "you are about to fall headlong into this messy gotcha of mutables+closures, so I (the compiler) am going to stop you before you have an accident".  Ok, great, so the compiler stopped me before I incautiously drove off a cliff.  How do I do what I want?

Well, that depends on what I want.  The compiler has stopped me from perhaps accidentally capturing mutable state, and now I have to choose.  Do I intend to capture mutable state or not?  That is, do I want to print "0 2 4 6 8" or "10 10 10 10 10"?  If the former, then I need to ensure that the identifier captured by the lambda is not a mutable, which leads naturally to the analogue of the revised C# example:

let Option1() =
    // prints "0 2 4 6 8"
    let actions = new List<unit->int>()
    let mutable i = 0
    while i < 5 do
        let copy = i
        actions.Add( fun () -> copy * 2 )
        i <- i + 1
    for act in actions do
        printf "%d " (act())    
    printfn ""

The new identifier "copy" is not declared mutable, so now the lambda can capture it.  In each iteration of the loop, "copy" is bound to a different value, so each lambda captures a different value.

On the other hand, if I did intend to capture mutable state, then I can use "ref".  As suggested by the compiler diagnostic (and discussed in the previous blog entry), one can use ref cells to hold mutable state that can be captured by closures.  So to write code like the original C# example I could say

let Option2() =
    // prints "10 10 10 10 10"
    let actions = new List<unit->int>()
    let i = ref 0
    while !i < 5 do
        actions.Add( fun () -> !i * 2 )
        i := !i + 1
    for act in actions do
        printf "%d " (act())    
    printfn ""

Now "i" has type Ref<int> – an object with a single integer field.  "i" itself is immutable (it’s a reference to that object in memory; "i" never changes to point to some other object), but the int stored inside the object is mutable.  The lambda captures "i", an immutable reference to the mutable object, and so when it finally executes "!i" it will read the current value of the int field in that object.  (There’s nothing magic about "ref" here, any heap-allocated object would do.  The point is I needed to add a layer of indirection so as to get ‘reference semantics’ rather than ‘value semantics’.)

Language design commentary

Any time a language designer adds closures to a language, he has to decide the semantics for the symbols being closed over.  Ought closures capture by value or by reference?  If you’re in a totally pure (side-effect-free) language, then it doesn’t matter – both "by value" and "by reference" yield the same results.  But one you introduce effects, the distinction matters.  That’s the reason that the programming "teaser" that started this blog entry is, in fact, a brain-teaser; the human reader can easily imagine the program being executed as though the lambdas capture by-value, though in fact C# captures by reference.

F# takes an interesting tack here.  Rather than choose to capture "by value" or "by reference", the F# compiler notices if these two different capture modes would yield distinctive behavior(*) (simply by noting if any of the captured symbols are declared as local mutables), and if they would, then the program doesn’t compile.  The advantage of this strategy is that it makes you far less likely to write brain-teaser code (code that you cannot easily predict the behavior of just by reading it).  The disadvantage is that when you do really want to capture mutable state, you may have to rewrite a bit of "let mutable" code to use "ref"s instead.  Scenarios for capturing mutable state like this are relatively uncommon, so the F# approach seems like a good trade-off.

(*) This is not entirely true, as F# lambdas can capture "let mutable" symbols bound at module-scope or class-scope, but not those bound at expression-scope.  Why this distinction?  Modules and classes are OO software engineering structures designed to encapsulate members in a way that makes it easy to reason about their lifetime, concurrent access & mutability, etc.  As a result, capturing these entities is allowed.  Expression bindings, on the other hand, are less appropriate software-engineering site for such purposes, and thus the ad-hoc capture of let-bound mutable variables from expression bindings is not part of the F# design.

There are a variety of choices available to language designers and compiler implementers… As Jared pointed out in a comment on my prior blog entry, another option is for compilers to emit a warning diagnostic when closing over mutables; apparently this is what VB does (I haven’t tried it myself).

A final note

Of course, we could have avoided all of these issues by eschewing mutation from the start.  When everything is immutable, the code in F# is both shorter to write and harder to misinterpret:

let CompletelyImmutableOption() =
    // prints "0 2 4 6 8"
    let actions = List.init 5 (fun i -> fun() -> i*2)
    for act in actions do
        printf "%d " (act())    
    printfn ""

When nothing is mutable, you don’t have to reason about whether the lambda will be called "now" or "later" – the value will always be the same.  This is the principle motivation for reducing the number of side-effects in your code; "pure" code without side-effects is immune to changes elsewhere in the program, and thus is easier to reason about (since many considerations, such as ‘timing’, can be ignored in effect-free code).


6 Responses to “On lambdas, capture, and mutability”

  1. Stephan said

    I\’d really prefer if there was a way to override the F# default and capture mutable locals. I\’ve more than once come across a situation where the need to convert mutable locals into ref cells effectively created a barrier towards adopting higher-order functions, which I guess it the polar opposite of what you guys intended. Take for instance your typical imperative algorithm: several loops iterating over a data structure and modifying some local book-keeping information. Once you want to replace one of the loops by a higher-order function taking a lamda argument you have to replace all the referenced mutable locals by ref cells. Since ref cells are so damn ugly (with their \’!\’ in front of every use), you don\’t make this change if you can avoid it.

  2. Dan said

    Re: (no name)Without a concrete example, I might be misunderstanding you. But I would say that the appropriate functional pattern for passing around local context is to pass around an immutable version of it. Consider the fold functions, for example.

  3. Stephan said

    Don\’t know why my previous post carries "(no name)" as the name. Sorry for that.I\’m fully aware of functional alternatives to imperative implementations. Efficiency issues aside, my point wasn\’t that you can\’t reimplement imperative algorithms in a functional way, but that the transition between the two styles is discontinuous and unnecessarily difficult because you are forced to convert mutable locals to refs, or even more awkwardly, put all the state into a tuple, thread it though a fold and then unpack it at the end. In my view this makes it unnecessarily hard to mix and match imperative and functional styles on a pragmatic case-by-case basis.

  4. Dan said

    Sephan, I don\’t dispute that. But there\’s also a good reason to inflict a little pain: mutable data are dangerous. Haskell restricts statefulness through a far more onerous process (though, arguably, a more elegant one), F# merely discourages its use. On the one hand, this means things are syntactically unpleasant and, as you say, mixing styles can be difficult. On the other hand, I think it encourages the user to write functional code when possible and imperative code when necessary. Depending on your point of view, this may not be a good bargain, but from my experience, it results in generally cleaner and safer code. But yes, I\’d say your observation is correct.

  5. Gates VP said

    I think that you\’ve hit on one of the biggest problems with moving people to functional languages and I can see it in the comments (and even in the very existence of the example).Honestly, I\’m just learning F#, but I did some functional in University and that first example just grated on me.  The whole "mutable i" and "while loop" reeked of "I\’m trying to convert C#" rather than "I\’m trying to write this in F#".  Even "no-name\’s" comments with the "book-keeping" seems problematic.I mean, anything that\’s a "series of nested loops" is definitely in for a major change in F#.  I\’d personally love to see one of his examples in action, b/c I suspect that the immutable answer for such a problem is actually nested functions and a modified data object.

  6. […] has its own little gotchas; at work those on the C# team still fall into the trap described here, and so it is fun to josh each other about the minor warts in our respective languages.  […]

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )


Connecting to %s

%d bloggers like this: