A week ago today, on this very same stage, Chris got served. Well, the poor fool danced back. So you know what happens now. It’s on!

### The overview

(If you haven’t been keeping up, then go glance at the prior posts before starting here.)

My implementation of the Bookworm trie took less code, less memory, and less time to run than Chris’. So of course Chris, in a desperate bid to restore his tarnished image, had to try to one-up me *somehow*. He chose to focus on memory: the original trie implementation used about 400,000 nodes to represent the dictionary. By starting with my superior implementation, and then adding about 100 lines of his own code, Chris reduced the trie to about 55,000 nodes by sharing common suffixes. In the process, though, he manages to write code that takes about four *minutes* to initialize the data structure. Can you imagine if your favorite Flash/iPhone game took *four minutes *to load? Good luck selling that at the app store, bro.

So today I’ll walk you through what Chris did right (shortest… subsection… ever!), as well as what he did wrong (get comfy in your chair), and then show my own implementation. Like Chris’, my added code also tallies about 100 lines. In contrast, my implementation uses less than a quarter of the memory of his. And it runs in four *seconds*. (Some refer to me as “Dr. McNamara”, but you can just call me ”Mac”.)

### What Chris did right

Chris used a DAWG, and this is a good data structure.

### What Chris did wrong

Sigh. Where to start.

Let’s start small, then work our way up.

*Idioms*. At one point in his code, Chris writes “ignore(expr)” rather than “expr |> ignore”. The latter is the preferred idiom in F#; the former just looks funny. (In two other places, Chris does this “right”.) Also, Chris calls Option.isNone and Option.get, which is completely non-idiomatic. Hello! It’s called “pattern-matching” – look into it.

*Equality.* Chris does a lot of comparing object references for equality using the builtin ‘=’ and ’<>’ operators. If you just want to do reference equality, it is faster to call obj.ReferenceEquals than to dynamically dispatch through the virtual .Equals() method. This is a somewhat subtle point, but the mark of a true evil genius is attention to such detail. In my implementation, I experimented using the wrong Equals call, and it resulted in about a factor-of-two slowdown (there are a *lot* of comparisons to do, as we’ll see).

*Creating needless garbage*. Chris seems to enjoy making sure the GC always has plenty of work to do. For example, he creates a Dictionary that uses a tuple type (“int*int”) as a key. Tuple is a reference type, which means every time he looks something up in the Dictionary, he allocates a fresh tuple, which then immediately becomes garbage. The tuple was at least somewhat well-motivated; Chris is trying to create a decent ‘hash’ function to minimize collisions and speed lookup, and is using “height*numChildren”. Since the height it at most 16–

And, by the way, Chris’ blog and code just magically pull the number ‘16’ out of the air with no explanation. So I’ll be the one to step up and explain it. It just so happens that the longest word in our particular dictionary has 15 letters. And this means the tree ends up being at most 16 tall (one extra node at the end representing “end of word”). Of course, with a different word dictionary, Chris’ code may silently fail, since he’s hardcoded this number in–not one–but

threeplaces. Argh! Anyway, enough digressing…

–like I was saying, since the height is at most 16, and the number of children is at most 26, there’s plenty of bits in a single ‘int’ to store all these combinations. But Chris just sloppily goes allocating tuples all over the place. That might be fine for an everyday blog post or app, but if the *whole point* of your blog post is to out-do me on the “memory” front… Sheesh! I know the final goal is to optimize only the trie structure, but come on, try not to look like a complete chump along the way.

*Choosing a good hash code.* Speaking of that Dictionary, since we’re doing a lot of lookups to find equivalent trees, we really want the best hash code to minimize collisions. Rather than just *count* the number of children a node has (and fail to distinguish e.g. a node with ‘ABC’ children from a node with ‘XYZ’ children), why not incorporate ‘which children’ into the hash? Guess what, with 26 letters in the alphabet, we only need 26 bits to store whether there’s a child for each letter, which leaves us 6 more bits (plenty) to also store the height (in a 32-bit hash code). We’ll see that in my implementation; it only takes 10 lines of code to do the job; some jobs are worth doing *well*.

*Recomputing state.* Chris defines some new properties on the TrieNode class, namely IsLeafNode and NumChildren. Since they are ‘getter’ properties, their bodies are run every time they are called. There is not a ton of work happening in these properties (just traversing the 26-wide array), but it is still wasteful to recompute every time; better to cache the result. I presume Chris was trying to avoid putting new fields in the class (to keep the memory low), but as we’ll see in a moment, that’s being penny-wise but pound-foolish.

*Choosing the right data structure.* Wait, what? Earlier I said the DAWG was *right*! Yes, a directed acyclic word graph is a great structure to be able to efficiently do prefix word searches through a big English dictionary. But just because the thing is called a “graph” doesn’t mean you have to realize it as an *actual object graph*. Objects are expensive! If the whole goal is to reduce memory, then one should not choose a CLR runtime representation that includes more than a hundred thousand separate objects (55K nodes, and each node also has an array field, which is its own separate object). Why waste all that, when just *one *object will do?

(Yes. That’s right… wait for it…)

If you want to represent this directed graph, there’s a good data structure for this. It’s not rocket science. You’ve heard of this data structure before. They even have it in C. It’s called a *2-D array*. Check it out!

(Pat yourself on the back if you saw it coming.)

Ok, enough of the blow-by-blow, Chris looks like he’s had enough. Let’s move on to my version.

### Doing it right – the strategy

Any directed graph of nodes-pointing-to-nodes can be realized with an array of integers. After all, when we get close to the hardware level, that’s what the computer is doing anyway. Memory is just a big array of bytes, and a pointer or object reference is just an index into that array of bytes. The ‘problem’ is that runtimes like CLR inject a lot of overhead in that model (as suggested here; these are the costs associated with the benefits of having a GC, managed code, …), but this means there’s an opportunity to save memory by avoiding using the runtime to *implicitly *encode a graph as an array, and instead *explicitly *do it yourself.

So I’ll do just that. I’ll build the original trie nodes as before (not caring too much about size, we’ll throw these objects away once we know final structure), combine common suffixes (create the DAWG), and then re-encode the DAWG structure as an array. Along the way I’ll be sure to avoid all the pitfalls Chris fell victim to.

Before I do this, though, I ask myself, is it worth it? I did some back-of-the-envelope math, and worked out that Chris’ structure occupies almost 15 megs of memory on a 64-bit machine, whereas the array structure I’m going to implement will only take up a little more that 3 megs. So that’s more than a four-fold savings, which is a big enough win that I want to do it to ensure a FLAWLESS VICTORY in this grudge match.

If it weren’t for my desire to crush Chris’s soul, would it still be worth it? My gut reaction is ‘probably not’ – if the goal is just to cheat at Bookworm, then even Chris’ (original, first) bloated solution is probably sufficient. But if I imagine I’m the author of the Bookworm app, I can see wanting to do this kind of optimization to ensure that my app can run in a constrained memory environment. And a smaller memory footprint means more spatial locality, which may also reap dividends regarding how fast the app runs. (If you want a program to run faster, using less memory can be one way to make it happen.) So I do think this is useful to explore, as this is the kind of thing you will occasionally want to do in the real world. So let’s hop to it.

### Doing it right – the implementation

I’ll just walk through the code that’s different since last time. First, I define

`let (===) x y = obj.ReferenceEquals(x,y)`

I’ve defined this operator before; “obj.ReferenceEquals” is a mouthful, and I like choosing triple-equals-sign as the operator to compare whether two things are really, really equal (the same object). I’ll be using it a fair bit.

Next I tweak the TrieNode class a few ways. First,

```
[<NoEquality; NoComparison; AllowNullLiteral>]
type TrieNode() =
```

I added the NoEquality and NoComparison attributes; these will prevent me from accidentally calling the slower ‘=’ operator on two TrieNodes (the code would not compile – F# is great at detecting various errors at compile-time).

Next, in the body of the class, I add a couple more fields, and publicly expose the 26-wide array:

let mutable height = -1 let mutable fastHash = 0 member this.NextChar = nextChar

What are the two new fields for? I’m so glad you asked. It’s probably obvious what “height” will be: it will store the height of the current subtree. And “fastHash” will store a great hash code for this TrieNode. Until the whole trie is built, however, I can’t assign these guys the right values, so I made them mutable and initialized them as above. After the tree is built, then I can populate and access those values, thanks to these new members:

// don't call this until the entire tree is built; it caches its results member this.Height = if -1 = height then height <- 1 + (nextChar |> Array.map (fun t -> if t===null then 0 else t.Height) |> Array.max) fastHash <- this.ComputeFastHash() height member private this.ComputeFastHash() = // An especially good hash function we can use after we've finished building the whole tree. // The first 26 bits will be set if the corresponding NextChar is non-null, and // the remaining 6 bits will be the Height. assert(this.Height > 0 && this.Height < 64) // 64 is 2^6, we have 6 bits, which is plenty let mutable r = 0 for i in 0..25 do if not(null === nextChar.[i]) then r <- r + 1 r <- r <<< 1 r <- r <<< 6 r + this.Height member this.FastHash = fastHash

The act of asking for the Height property the first time causes the entire tree to initialize the “height” and “fastHash” fields. Height is easy – one plus the maximum height of any of your children. As for the hash code, it’s an integer, where the first 26 bits encode which alphabet letters have children, and the last 6 bits are the height. (If we end up with an English dictionary containing any sixty-five-letter words, an assert will fire to let us know that our clever scheme to cram all this information into the hash code is not working out exactly as planned, though it would do no great harm.) Note the use of the bitwise left shift operator (<<<) – I think perhaps this is the first time I’ve had a reason to use it in a blog entry. Fun stuff.

Ok, so with those minor additions to the TrieNode class, we can load up the word files and build the trie just as we did before. (The full source code appears at the end of this blog entry; I’m skipping stuff that’s the same as last time.) Once we have a trie, we need to convert it to a DAWG by finding common suffixes and sharing structure among semantically equal nodes. Starting from the leaves, I’ll keep a Dictionary<TrieNode,TrieNode> where the keys represent semantically distinct nodes, and the values are “the canonical object reference” for that node. We can walk the whole tree height-wise; if we encounter a node we’ve already seen, we replace it with the canonical one, otherwise we add it to our growing table of distinct nodes. The code is almost easier than the prose:

open System.Collections.Generic let canonicalNodes = new Dictionary<_,_>( { new IEqualityComparer<TrieNode> with member this.Equals(x,y) = x.Height = y.Height && x.IsEndOfWord = y.IsEndOfWord && Array.forall2 (===) x.NextChar y.NextChar member this.GetHashCode x = x.FastHash } ) let rec Canonicalize (t:TrieNode, height) = if not(null===t) then if t.Height > height then for i in 0..25 do Canonicalize(t.NextChar.[i], height) elif t.Height = height then for i in 0..25 do let n = t.NextChar.[i] if not(null===n) then match canonicalNodes.TryGetValue(n) with | true,canon -> t.NextChar.[i] <- canon | _ -> canonicalNodes.Add(n,n) for i in 1..trie.Height do Canonicalize(trie, i)

Note that I provide a custom IEqualityComparer to the Dictionary, whose Equals() method tests semantic distinctness (under the invariant that all nodes of lesser height have already been canonicalized), and whose GetHashCode is our FastHash. The Canonicalize() function then walks the tree heightwise: if we’re too high, we recurse deeper until we reach the height we’re canonicalizing, but once we reach the right height, we look up each of the object references in the 26-wide array and either replace each with the canonical value if we’ve already seen it, or else add that object to our growing table of distinct ones. Calling Canonicalize for every height will ensure that every node in the trie (viz. every suffix tree) has a canonical object. Well, *almost* every node – since we’re just replacing references in the inner arrays, we missed the root:

`canonicalNodes.Add(trie,trie) // need to add the root`

Ok, *now* every node is represented in the canonicalNodes table.

In case you missed it, in those 25 lines of code, I basically just did what Chris did in his blog entry. Well almost. Rather than 4 minutes, mine runs in about 3 seconds, so I guess it’s a *little* different. But it accomplishes the same goal. Rather than 400,000 nodes, we’re now down to about 55,000. (In the full program, you’ll see that I print out the node count.)

I’ve equaled Chris’ node-compression (with less code, that runs like 60x faster) – now it’s time to get the memory footprint smaller still. Now that we have a DAWG represented as an object graph, we’ll re-encode it as a 2D-array. Each ‘node’ will be a row of the array, and each ‘field’ within a node will be a column. The only fields that matter are the 26-wide array (pointers to the child nodes) and the IsEndOfWord boolean – the other fields of the TrieNode type were just temporary artifice to help us efficiently get us to where we are now (DAWG-in-hand). To simplify, we’ll just say each row is a 27-wide array, where the first 26 elements are the A-Z pointers, and the 27th element represents the IsEndOfWord boolean. What kind of element is stored in this array? Integers, of course. Since each TrieNode now corresponds to a row of the array, we can just encode each object reference as a row index. And, since there are conveniently only about 55,000 nodes – convenient because it’s a little less than 65,536 – we can use the two-byte “uint16” type in preference to “int”, to save half the storage. So we begin:

assert(canonicalNodes.Count < int System.UInt16.MaxValue) let arrayTrie = Array2D.create (canonicalNodes.Count+1) 27 0us

First I assert, so that if I do ever use a bigger dictionary that requires more than 65,536 nodes, I’ll know I have to switch to a bigger data type. Then I create a 2D-array that has “canonicalNodes.Count+1” rows, 27 columns, and has each element initialized to (the uint16 version of) zero (I had to look up the suffix “us” in this table). Why “canonicalNodes.Count**+1**”, you ask? Well, we represented every TrieNode in “canonicalNodes”, but there is one other value to account for. Not every entry in each 26-wide table of pointers to TrieNodes points to another TrieNode. Many of those references were “null”. And we need to represent that, too. So row 0 of the table will mean “null”. As it turns out, that row is already perfectly initialized. The first 26 entries all contain the value 0us, which is the row number they are “pointing to” – in this case, the “null” row. And the 27th entry, representing IsEndOfWord, is 0us (which I’m using to represent “false” – I’ll use 1us to represent “true”). So this row is already great; in the TrieNodes, if we encounter ‘null’, we know there are no more words here, we’ve “fallen off the edge”. And in the array, if we get to the 0-row “node” then no matter what extra letters we try to tack on, we just keep circling back around to this row, whose IsEndOfWord is always false.

So now we just need to initialize rows 1-through-Count so that they each correspond exactly with a canonical TrieNode. To do this, we need both a way to convert a row index into a canonical TrieNode reference, and a way to convert a canonical TrieNode into a row number. So I write:

let numToNode = seq { yield null; yield! canonicalNodes.Keys } |> Seq.toArray let nodeToNum = let d = new Dictionary<_,_>( { new IEqualityComparer<TrieNode> with member this.Equals(x,y) = x===y member this.GetHashCode x = x.FastHash } ) for i in 1..numToNode.Length-1 do // 0th is null d.Add(numToNode.[i], uint16 i) d

“numToNode” is an array of TrieNodes. I pass in a number, and I get back a TrieNode. And note that numToNode.[0] will be null, the way I’ve coded it. Spiffy.

“nodeToNum” is a Dictionary<TrieNode,uint16>. I pass in a TrieNode, and I get back an integer. The .NET Dictionary type does not accept ‘null’ as a key, so I have to leave that entry out; we’ll account for it later. Note that since I am working with canonical nodes, I use (===) as the equality test in my comparer. So now I can say nodeToNum.[someNode] and I get back the row number in the array that will correspond to someNode.

Time to fill out the rows of the array with real data:

let mutable nextIndex = 1 let rec Fill (t:TrieNode) = if not(null===t) then for i in 0..25 do Fill(t.NextChar.[i]) let num = int nodeToNum.[t] for i in 0..25 do arrayTrie.[num,i] <- match t.NextChar.[i] with |null -> 0us | n -> nodeToNum.[n] if t.IsEndOfWord then arrayTrie.[num,26] <- 1us Fill(trie)

I just recursively walk through every node in the trie. For each TrieNode reference in the 26-wide NextChar array, I convert each reference to its corresponding number, and store that in correct slot of the 2D-array. Note also the special-casing of “null->0us” here (as I earlier forewarned). And for each TrieNode, if IsEndOfWord is true, I set the 27th element of the row to 1us. This initialization step only takes about one second of wall-clock time to run.

Now my 2D-array contains all the same information as my TrieNode object graph, only in much less memory space. I can drop all the other objects on the floor for the GC to sweep up; I need only a single object in hand (the 2D-array) to solve Bookworm.

I’ll need to make some very minor adjustments to my “Find” algorithm – it used to talk to TrieNodes, but now it needs to talk to the array instead. The previous algorithm just needed to know (1) which node it was ‘currently at’, (2) how to ‘move forward a letter’, and (3) how to check if it’s at the end of a word. Previously (1) was handled by storing a local reference to a TrieNode, now instead I’ll store a uint16 instead. As for (2) and (3), I can write those functions thusly:

let NextChar(node,c) = arrayTrie.[int node, int c - int 'A'] let IsEndOfWord(node) = arrayTrie.[int node, 26] = 1us

Thus my Find() code is almost unchanged; the only changes are on the lines marked HERE:

let rec Find(col, row, n:uint16) = seq { // HERE - was t:TrieNode assert((col+row)%2 = 1) // we should only be searching LETTER positions let c = board.[col,row] if c <> ' ' then board.[col,row] <- ' ' match NextChar(n,c) with // HERE - was t.[c] | 0us -> () // HERE - was null | node -> if IsEndOfWord(node) then // HERE - was node.IsEndOfWord yield [c] for suf in Find(col-1, row-1, node) do yield c::suf for suf in Find(col-1, row+1, node) do yield c::suf for suf in Find(col , row-2, node) do yield c::suf for suf in Find(col , row+2, node) do yield c::suf for suf in Find(col+1, row-1, node) do yield c::suf for suf in Find(col+1, row+1, node) do yield c::suf board.[col,row] <- c }

I changed the argument type of the function from “TrieNode” to “uint16”. I changed “t.[c]” to a NextChar() call. “null” became “0us”. And “IsEndOfWord” is no longer a member function. Four tiny changes, due to the fundamental representation change I made to save on memory.

The full source code appears at the end of this blog entry.

### Summing up

Here’s the TL;DR:

- Chris posts a Bookworm solver
- Brian posts his own solver that’s shorter, faster, smaller
- Chris builds on Brian’s solution and reduces memory further, but makes the program much slower
- Brian improves on that by more than 4x memory-wise and more than 60x speed-wise

Victory is mine!

### The Source Code

All told, still fewer than 200 lines.

let IsLetter(c) = c>='A' && c<='Z' let (===) x y = obj.ReferenceEquals(x,y) [<NoEquality; NoComparison; AllowNullLiteral>] type TrieNode() = let nextChar = Array.create 26 null let mutable isEndOfWord = false let mutable height = -1 let mutable fastHash = 0 member this.NextChar = nextChar member this.IsEndOfWord = isEndOfWord member this.Item with get(c:char) = nextChar.[int c - int 'A'] and private set(c:char) v = nextChar.[int c - int 'A'] <- v member this.Add(word) = assert(word |> Seq.forall IsLetter) this.Add(word,0) member private this.Add(word:string,i) = if i = word.Length then isEndOfWord <- true else match this.[word.[i]] with | null -> let newNode = new TrieNode() newNode.Add(word,i+1) this.[word.[i]] <- newNode | node -> node.Add(word,i+1) // don't call this until the entire tree is built; it caches its results member this.Height = if -1 = height then height <- 1 + (nextChar |> Array.map (fun t -> if t===null then 0 else t.Height) |> Array.max) fastHash <- this.ComputeFastHash() height member private this.ComputeFastHash() = // An especially good hash function we can use after we've finished building the whole tree. // The first 26 bits will be set if the corresponding NextChar is non-null, and // the remaining 6 bits will be the Height. assert(this.Height > 0 && this.Height < 64) // 64 is 2^6, we have 6 bits, which is plenty let mutable r = 0 for i in 0..25 do if not(null === nextChar.[i]) then r <- r + 1 r <- r <<< 1 r <- r <<< 6 r + this.Height member this.FastHash = fastHash let words = seq { let toIgnore = System.IO.File.ReadAllLines("IgnoredWords.txt") |> Set.ofArray for word in System.IO.File.ReadAllLines("TWL06.txt") do assert(word |> Seq.forall IsLetter) if not (toIgnore.Contains word) && word.Length >= 3 then yield word } let trie = new TrieNode() for w in words do trie.Add(w) let rec IsWord(s:string, t:TrieNode) = if t===null then false elif s.Length=0 then t.IsEndOfWord else IsWord(s.Substring(1), t.[ s.[0] ]) assert(IsWord("ABSTRACTIONISMS", trie)) assert(IsWord("MONSOON", trie)) assert(not(IsWord("TAPL", trie))) ///////////////////////////////////////////////////////// let rec CountNodes (t:TrieNode) = if null===t then 0 else 1 + (t.NextChar |> Array.map CountNodes |> Array.sum) printfn "Original has %d nodes" (CountNodes trie) open System.Collections.Generic let canonicalNodes = new Dictionary<_,_>( { new IEqualityComparer<TrieNode> with member this.Equals(x,y) = x.Height = y.Height && x.IsEndOfWord = y.IsEndOfWord && Array.forall2 (===) x.NextChar y.NextChar member this.GetHashCode x = x.FastHash } ) let rec Canonicalize (t:TrieNode, height) = if not(null===t) then if t.Height > height then for i in 0..25 do Canonicalize(t.NextChar.[i], height) elif t.Height = height then for i in 0..25 do let n = t.NextChar.[i] if not(null===n) then match canonicalNodes.TryGetValue(n) with | true,canon -> t.NextChar.[i] <- canon | _ -> canonicalNodes.Add(n,n) for i in 1..trie.Height do Canonicalize(trie, i) canonicalNodes.Add(trie,trie) // need to add the root printfn "Now has %d nodes" canonicalNodes.Count ///////////////////////////////////////////////////////// assert(canonicalNodes.Count < int System.UInt16.MaxValue) let numToNode = seq { yield null; yield! canonicalNodes.Keys } |> Seq.toArray let nodeToNum = let d = new Dictionary<_,_>( { new IEqualityComparer<TrieNode> with member this.Equals(x,y) = x===y member this.GetHashCode x = x.FastHash } ) for i in 1..numToNode.Length-1 do // 0th is null d.Add(numToNode.[i], uint16 i) d let arrayTrie = Array2D.create (canonicalNodes.Count+1) 27 0us let mutable nextIndex = 1 let rec Fill (t:TrieNode) = if not(null===t) then for i in 0..25 do Fill(t.NextChar.[i]) let num = int nodeToNum.[t] for i in 0..25 do arrayTrie.[num,i] <- match t.NextChar.[i] with |null -> 0us | n -> nodeToNum.[n] if t.IsEndOfWord then arrayTrie.[num,26] <- 1us Fill(trie) let NextChar(node,c) = arrayTrie.[int node, int c - int 'A'] let IsEndOfWord(node) = arrayTrie.[int node, 26] = 1us let rootNode = nodeToNum.[trie] ///////////////////////////////////////////////////////// let COLS = 7 let ROWS = 8 let data = [| // columns " K M E M T O O " " E F K I M C A D" " A A E R E A F " " T D A I B N D J" " V F F A T E E " " O J T R F I R R" " I R D R E E O " |] // surround entire grid with border of spaces (one extra column, two extra rows) let BOARDCOLS = COLS + 2 let BOARDROWS = 2*ROWS + 4 let MakeBoard(columns:string[]) = assert(columns<>null) assert(columns.Length = COLS) assert(columns |> Seq.forall (fun s -> s.Length = ROWS*2)) for i in 0..COLS-1 do for j in 0..ROWS*2-1 do let c = columns.[i].[j] if (i+j)%2 = 0 then assert(c=' ' || IsLetter c) else assert(c=' ') Array2D.init BOARDCOLS BOARDROWS (fun c r -> if c=0 || c=BOARDCOLS-1 || r=0 || r=1 || r=BOARDROWS-2 || r=BOARDROWS-1 then ' ' else columns.[c-1].[r-2]) let board = MakeBoard(data) let rec Find(col, row, n:uint16) = seq { // HERE - was t:TrieNode assert((col+row)%2 = 1) // we should only be searching LETTER positions let c = board.[col,row] if c <> ' ' then board.[col,row] <- ' ' match NextChar(n,c) with // HERE - was t.[c] | 0us -> () // HERE - was null | node -> if IsEndOfWord(node) then // HERE - was node.IsEndOfWord yield [c] for suf in Find(col-1, row-1, node) do yield c::suf for suf in Find(col-1, row+1, node) do yield c::suf for suf in Find(col , row-2, node) do yield c::suf for suf in Find(col , row+2, node) do yield c::suf for suf in Find(col+1, row-1, node) do yield c::suf for suf in Find(col+1, row+1, node) do yield c::suf board.[col,row] <- c } let allWordsSorted = seq { for col in 0..BOARDCOLS-1 do for row in 0..BOARDROWS-1 do if board.[col,row] <> ' ' then yield! Find(col, row, rootNode) // HERE } |> Seq.map (fun w -> new string(w |> Seq.toArray)) |> Set.ofSeq // remove duplicates |> Seq.toArray |> Array.sortWith (fun a b -> match compare a.Length b.Length with | 0 -> compare a b | x -> x) printfn "" for w in allWordsSorted do printfn "%s" w printfn "%d words" allWordsSorted.Length