Inside F#

Brian's thoughts on F# and .NET

F# web snippets, Mandelbrot, and great things to come…

Posted by Brian on October 17, 2010

Tomas recently posted a terrific new tool called F# Web Snippets.  You can read a bit on his blog, but you can get an immediate feel by reading a clone of this blog entry I’ve hosted elsewhere (on a site where I can host Javascript code).  So click the previous link and check it out.  Here is the code without the nice web-snippet adornments:

open System
open System.Numerics

let maxIteration = 100

let modSquared (c : Complex) = c.Real * c.Real + c.Imaginary * c.Imaginary

type MandelbrotResult = 
    | DidNotEscape
    | Escaped of int
let mandelbrot c =
    let rec mandelbrotInner z iterations =
        if(modSquared z >= 4.0) 
            then Escaped iterations
        elif iterations = maxIteration
            then DidNotEscape
        else mandelbrotInner ((z * z) + c) (iterations + 1)
    mandelbrotInner c 0

let chars = " .:-;!/>)|&IH%*#"

for y in [-1.2..0.05..1.2] do
    for x in [-2.0..0.025..0.9] do
        match mandelbrot(Complex(x, y)) with
        | DidNotEscape -> Console.Write " "
        | Escaped i -> Console.Write chars.[i &&& 15]
    Console.WriteLine()  // end each row so the picture is not one long line

The Javascript-ful site, in addition to just nicely colorizing the F# code as above, also provides hover tooltips in your browser, much like you’d get from Visual Studio.  On the cloned blog entry you can hover over the identifier “maxIteration” and see that it is an “int”, for example.  It’s very cool.

(By the way, yes, the Mandelbrot code I stole mercilessly from Luke’s recent timely blog.)
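
If you want to poke at the escape-time logic without printing the whole picture, a minimal sketch like the following may help. It repeats the definitions from the program above and adds one helper, shade, which is a hypothetical name introduced here for illustration and is not part of the original code. It only shows how the iteration count selects one of the 16 characters.

```fsharp
open System.Numerics

let maxIteration = 100

let modSquared (c : Complex) = c.Real * c.Real + c.Imaginary * c.Imaginary

type MandelbrotResult =
    | DidNotEscape
    | Escaped of int

// Iterate z -> z*z + c until |z|^2 >= 4 or the iteration budget runs out.
let mandelbrot c =
    let rec inner z iterations =
        if modSquared z >= 4.0 then Escaped iterations
        elif iterations = maxIteration then DidNotEscape
        else inner (z * z + c) (iterations + 1)
    inner c 0

let chars = " .:-;!/>)|&IH%*#"

// Hypothetical helper: the low 4 bits of the iteration count index the 16 characters.
let shade result =
    match result with
    | DidNotEscape -> ' '
    | Escaped i -> chars.[i &&& 15]

printfn "%A" (mandelbrot (Complex(0.0, 0.0)))   // the origin never escapes
printfn "%A" (mandelbrot (Complex(1.0, 0.0)))   // escapes after one step
```

Points that escape quickly get the characters near the start of chars, and the index wraps around every 16 iterations.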

Anyway, Tomas’ tool is so cool I had to immediately try it out in a blog post before I went to sleep tonight.  Here’s the output of the program, by the way:


With PDC coming soon, one imagines there may be other exciting F#-related announcements coming soon as well!


Posted in Uncategorized | 1 Comment »

Unit tests + debugger = code understanding

Posted by Brian on October 8, 2010

(A quick reminder to update your feed readers to point at my new blog home.)

One of the best features of Visual Studio is Intellisense, the auto-completion that suggests legal completions when you “dot into” an object, and brings up a list of identifier completions when you press a keyboard shortcut like Ctrl-Space.  This kind of auto-completion is one of those most-addictive productivity features, where once you’ve used it, you can’t imagine how you ever lived without it.  Nowadays I think most every IDE supports this feature in one way or another.

The Intellisense for F# that we shipped as part of VS2010 is pretty great; I use (read: “depend on”) it all the time.  Nevertheless, there are a number of corner cases where the completion lists fall down, and pressing “dot” yields incomplete or incorrect information.  On the F# product team, we refer to these cases as “bugs”.  :)


I’ve recently been spending time fixing these Intellisense bugs.  The associated product code is the F# Language Service; a “language service” in Visual Studio is the code that provides all the logic for editor support of a given programming language: features like syntax coloring, Intellisense, “Go To Definition”, etc.

So I start looking at these bugs and spelunking through the language service, and guess what – most of it is code I have either (1) never looked at or (2) completely forgotten.  Before I can fix anything, I’ll need to understand how it all works!

Fortunately, much of this code was originally authored by my esteemed colleague Jomo Fisher.  Among Jomo’s many admirable developer qualities: he is a TDD-er and avid unit tester.  I had started doing a little TDD and unit testing on my own before joining the F# team, but then I joined F# and saw the foundation Jomo had laid for unit-testing the F# language service, and I started to really learn about unit testing.  The Visual Studio architecture is all about interfaces and services, which means you can mock out all of your VS dependencies and create tests against your VS language service that don’t require Visual Studio to be running.  And that’s what Jomo had already established when I first joined the team – a bunch of NUnit tests against the F# language service he was authoring.

Fast-forward two-and-a-half years, to earlier this week when I wanted to fix an Intellisense bug.  Step one, of course, is to author the tests.  I had a few repros in hand of places where Intellisense fails (uncovered by some reliability tests written by Jack (a tester on F#) and inspired by Dmitry (another developer on F#)), so I authored a few small test cases to put the bugs under test.  I’m now one-third of the way through the TDD mantra “red, green, refactor!”  The next step is harder, though.


Now I need to try to fix the bug.  Only I haven’t the faintest idea where to start, since I don’t know or recall any of this code.  So I fire up the Visual Studio debugger and start stepping through the code.  The debugger is a good tool for code understanding – when coupled with good unit tests, it is fantastic.  Stepping through the code, setting interesting breakpoints, and inspecting call-stacks helped me understand how the various components and functions related to one another.  Debugging individual unit tests (where each test is effectively a specification of a tiny feature) helped me identify which pieces of code were attached to which individual features.  I probably spent about 4 hours meandering through perhaps some 7000 lines of code, learning how it fit together and locating likely places I would need to eventually make some fixes.

So I finally start making changes and try to make one of my newly-created red tests turn green.  Make a change, run that test.  Hurrah, it turns green!  Now, run the other 600 language service tests.  :)  And of course, I’ve broken like 50 of those.  Great, my unit tests are fulfilling one of their purposes – preventing me from regressing existing functionality.  Once again I pop open the debugger, and step through a test I just broke, to understand how what-I-just-changed interacted with this old test.  Here, a good debugger really shines.  I see it going past some code I just changed, and now because of the computations I just changed, I think it’s now going into this “else” branch, whereas previously I suspect it was going down the “if” branch.  Well, I could back out my change, and re-compile, and re-debug, and see if it does take the other branch and if that’s the key difference between red and green for this test.  But hey, I’m in the Visual Studio debugger, so I can poke at the live program state however I like.  So once I reach the “else”, I drag the little yellow arrow (the ‘next statement’ icon in the margin) up into the “if” to “Set Next Statement” as though the if condition were true rather than false, and then press F5 to keep going.  And sure enough, now the test is green again.  So I’ve quickly verified my hypothesis about “what changed” inside the debugger, without having to recompile the program.  Alternatively, I could poke new values into variables in the “locals” window as another way to test hypotheses about changes without having to actually change and recompile the program.  It takes a few minutes to recompile these components and re-install the newly-compiled components, so it really is time-saving (as well as keeping you engaged/focused) to do this hypothesis-testing quickly inside the debugger without needing a recompile.

Ok, so now I think I grok how my change interacted with and broke existing tests, so I’m ready to try a different fix.  I make a change, recompile, and while it’s compiling I speculate about which tests I expect to turn green and which I think will stay red.  Re-run some tests and continue to hone my understanding.  The test I’m looking at, incidentally, is when you type

    [1..System.

you don’t get Intellisense, even though e.g.

    [1..System.Int32.MaxValue]

is a legal eventual completion.  Debugging… ok, I’ve definitely found a bug in a function called “QuickParse”, which parses the current line of text where you pressed dot and tries to find The.Qualified.Identifier that appears just to the left of the dot you’re pressing.  It would typically return a list of dot-separated identifiers, like ["The";"Qualified";"Identifier"], but in the case above, it has something like ["1";"";"System"].  QuickParse doesn’t know about the range operator (..) and so it thinks all those bits are part of a qualified identifier, with e.g. the empty identifier "" as the “namespace between the two dots” of the range operator, rather than understanding “System” as being the start of the current identifier.  Indeed, if you do

    [1 .. System.

with a space after the range operator, then you do get Intellisense again, as QuickParse correctly handles that case.  Ok, so I can fix this.  Eventually after a number of iterations of this kind of work, I have my new tests turning green, and only 4 existing tests red, and all of those involve Intellisense of Obsolete entities.  I debug through those tests, and find the special logic for Obsolete-handling, and just above that code I discover a comment in the code that basically says “here we rely on the fact that we get an empty identifier in certain cases…” aha!  The “bug” in QuickParse is actually a “feature” used by this other code. 
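
A deliberately naive sketch can make this failure mode concrete. The function naiveQuickParse below is hypothetical, invented purely for illustration; the real QuickParse in the F# language service is not shown in this post and is surely more involved. The sketch assumes the parser simply walks left from the dot over anything that looks like part of a dotted name.

```fsharp
// Hypothetical stand-in for QuickParse: NOT the real language service code.
let naiveQuickParse (lineUpToDot : string) =
    // Drop the trailing dot that triggered the completion request.
    let text = lineUpToDot.TrimEnd('.')
    // Treat letters, digits, underscores and dots as "part of a qualified name".
    let isNameChar c = System.Char.IsLetterOrDigit c || c = '_' || c = '.'
    // Walk left from the end while the characters still look like a name.
    let mutable start = text.Length
    while start > 0 && isNameChar text.[start - 1] do
        start <- start - 1
    text.Substring(start).Split('.') |> List.ofArray

printfn "%A" (naiveQuickParse "The.Qualified.Identifier.")
printfn "%A" (naiveQuickParse "[1..System.")    // the range operator leaks in
printfn "%A" (naiveQuickParse "[1 .. System.")  // the space stops the walk
```

Because the walk knows nothing about the range operator, the two dots of `..` are read as name separators with an empty identifier between them, which is the shape of the list described above.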

Now maybe if I had found that code/comment sooner, it would have saved me a couple hours of debugging.  But it would have cost me all the code understanding I gained during the debugging session.  I have a number of Intellisense reliability issues to fix, as well as some new feature work in the area ahead of me, so it is very much worthwhile for me to get a deep understanding of how everything here works, so that I have this all “cached into my brain” for my next few weeks of work.

Anyway, that “feature” of QuickParse and the nearby code is rather subtle and ugly to my eye, so now it’s time to… refactor!


In the course of my investigation, I’ve already found a few cleanups that don’t break any tests, and so I can move the code in the general direction I want it to go, even if I’m not all done fixing things yet.  Our suite of unit tests gives me the confidence to make changes to this “code I’d never seen prior to a couple days ago”.  (My own manual testing inside Visual Studio, getting a code review approved, and having the QA group poking at the product, all also inspire more confidence that regressions will be discovered, but the unit tests provide the most immediate feedback that things are ok.)

I don’t yet have a tidy end to this fixing-an-Intellisense-bug story, as I’ve brought you up to date now (right now I understand the issue, but haven’t fixed the bug and got all the tests green yet).  But that’s ok, because this particular narrative is just a means to an end, a way to describe how unit tests and debugging lead to understanding of code.  So let me wrap up.


When you have good unit tests, everything is better; if you’re already a TDD-er, then saying that is just preaching to the choir.  And when you have a good debugger, using that tool is a great way to learn about an unfamiliar code base (stepping through code, setting breakpoints, looking at stacks, changing values of locals or the next statement to see how things react).  Put those two together, and the effect is even more powerful.  Unit tests + debugger = code understanding!


Hello world!

Posted by Brian on October 7, 2010

Windows Live Spaces is phasing out their blogging support in favor of WordPress, so I am converting my F# blog.  Please update your bookmarks and RSS feeds!


Having fun with F#, WPF, MVVM, Silverlight, and Windows Phone 7

Posted by Brian on September 25, 2010

Like many developers, in my free time, I like to write code.  :)  I have my little side-projects, which are typically both fun to play with as well as a chance to learn new technologies.  My prior side-projects have typically been small (the two linked in the previous sentence were each less than 500 lines of code), but right now I’m working on a side-project that’s a little more ambitious.

Like many .NET developers, I’m excited about the imminent arrival of the Windows Phone.  Here’s a chance to put my existing programming skills and experience to use on a fun new device, and possibly even earn some fun-money selling my app in the marketplace.  So that’s what I’ve been spending my free time on of late – a simple-but-fun little puzzle game for the phone.  I’ve got about 1600 lines of code now, and I estimate it will come in weighing about 2000 lines – more than I can hack in a weekend, but still small enough to finish in my free time over a couple months.

Along the way I’ve been learning lots more about WPF.  I highly recommend the book WPF 4 Unleashed, by Adam Nathan.  I’m only about two-thirds of the way through the book now, but it’s been very good.  I finally deeply understand XAML, and I really now grok and appreciate the WPF architecture and all the terrific-looking UI you can easily create in a small amount of XAML or code.  The book is about WPF in .NET 4.0, but nearly everything I’ve learned and wanted to use for my phone app has applied equally to Silverlight for the phone.

I’ve also been trying to wrap my head around Model-View-ViewModel (MVVM) and apply that architecture, along with WPF data-binding, in my app.  I’m making some progress here, but I think I still don’t have exactly the best factoring, and I don’t yet have enough experience to measure my own success here.

Anyway, this blog post is especially light on content, but I wanted to share what I’ve been up to.  Of course, if you want to get started programming for Windows Phone yourself, you should visit the developer web site and get the free tools.  And if you want to use F#, get the F# phone templates from the Visual Studio Gallery online.  With these tools I found it easy to get started doing game programming for the phone using F# and Silverlight.

Happy coding!


F# for puzzles (Morse code decoder)

Posted by Brian on August 26, 2010

At Microsoft (and around the Seattle area) there is a tradition of puzzle events: PuzzleHunt, Puzzle Safari, PuzzleDay, and so forth.  I always find these events really enjoyable.  There are a wide variety of creative puzzle types that require ingenuity to solve.  Automated tools, such as anagram solvers (where e.g. you type in a jumble of letters, and a list of all reasonable anagrams is output), are occasionally useful, but usually the puzzles are constructed so that humans need to do the majority of the solving and tools are of limited utility.

A hybrid tool, however, that lets humans apply smarts while letting the machine do the grunt-work, can be very useful.  For example, suppose I need to decode a Morse code string like

    let toDecode = "......-...-..---.-----.-..-..-.."

In real life, there would be spacing to delimit letters and words, but in a puzzle event, you’re often on your own.  The nature of Morse Code means there’s about a million bazillion possible ways to decode that string into letters, so it’s non-trivial to just brute force a solution.  But if you insert a human into the loop, you can quickly discard the blind alleys and home in on the right answer.

I wrote a tool for decoding such Morse code strings that is driven by a human, but where the computer does the tedious grunt-work.  The idea is simple; the computer works out all the possibilities for the next 3 letters, and then the human selects which prefixes “look promising” to investigate further.  If it turns into a blind alley, we can backtrack and try again.  Some screenshots along with some prose will explain.

When you start things off, you see all the possibilities for the first letters:

Hm, perhaps I think “SEE” looks like the most likely start.  I can type ‘see’ and now the tool shows

where the blue highlight shows the currently “committed” prefix in both the Morse code and the decodings I’m working with.  Ok, I glance down the list and this looks like a blind alley.  So I press backspace three times and look at the original list from the first screenshot again.  There are a few reasonable looking prefixes starting with “HE”, so I type ‘he’ and see

and it looks like the first word might be ‘HERE’ or ‘HELLO’.  Some further exploring and I’ll quickly find

and I’ll bet that “HELLOWORLD” is the intended decoding.  Shazam!  Doing this all by hand would have taken a lot longer.

Some of you might be thinking that the entire process could be entirely automated – that is, by using English dictionaries, analyses of letter frequencies, etc., the computer could try the most likely paths and not have to brute force it.  You might be right, but the puzzle creators are often very clever to foil such things.  For example, the text might be “THEYEARMMXWAS…” where encoding 2010 as a Roman numeral is likely to foul up an auto-solver.  Or perhaps this might be part of some puzzle entitled “X marks the spot” where the puzzle involved crossing out excess ‘X’s and then “XHELLOXWORLDX” might be the answer, or whatnot.  In general, puzzle creators are good at ensuring that humans will have success where machines alone would fail.

This is an F# blog, so of course I need to show you the F# code for the tool.  It’s a mere 75 lines, with more than a third of those lines devoted to the Morse code table itself.  Which is to say, the code is short and easy – you can hack this up in just a few minutes (I know, because I hacked it up in just a few minutes last night).  So I present the code without further explanation – enjoy!
let morseTable = [
 'A', ".-"
 'B', "-..."
 'C', "-.-."
 'D', "-.."
 'E', "."
 'F', "..-."
 'G', "--."
 'H', "...."
 'I', ".."
 'J', ".---"
 'K', "-.-"
 'L', ".-.."
 'M', "--"
 'N', "-."
 'O', "---"
 'P', ".--."
 'Q', "--.-"
 'R', ".-."
 'S', "..."
 'T', "-"
 'U', "..-"
 'V', "...-"
 'W', ".--"
 'X', "-..-"
 'Y', "-.--"
 'Z', "--.."
 ]
let toMorse s = 
    let d = dict morseTable
    System.String.Join("", [|for c in s do yield d.[c]|])
let rec possiblyNextLetters n (morse:string) =
    match n, morse with
    | 0,_ -> [[' ']]
    | _,"" -> [[' ']]
    | _ -> 
        [for c,m in morseTable do
            if morse.StartsWith(m) then
                let r = possiblyNextLetters (n-1) (morse.Substring(m.Length)) 
                let r2 = [for x in r -> c::x]
                yield! r2]
open System
let mutable committed = ""
let toDecode = "......-...-..---.-----.-..-..-.."
while true do
    let committedMorse = toMorse committed
    let restMorse = toDecode.Substring(committedMorse.Length)
    // show the committed part of the Morse string in blue, the rest normally
    Console.BackgroundColor <- ConsoleColor.Blue 
    Console.Write(" {0}", committedMorse)
    Console.BackgroundColor <- ConsoleColor.Black 
    Console.WriteLine(restMorse)
    let nexts = possiblyNextLetters 3 restMorse |> List.map (fun cs -> System.String(Seq.toArray cs))    
    // list every candidate continuation after the committed prefix
    for n in nexts do
        Console.BackgroundColor <- ConsoleColor.Blue 
        Console.Write(" {0}", committed)
        Console.BackgroundColor <- ConsoleColor.Black 
        Console.WriteLine(n)
    let k = Console.ReadKey()
    if k.Key = ConsoleKey.Backspace && committed.Length > 0 then
        committed <- committed.Substring(0, committed.Length - 1)
    else
        let k = k.KeyChar
        let k = System.Char.ToUpper(k)
        if k >= 'A' && k <= 'Z' then
            if nexts |> Seq.exists (fun s -> s.StartsWith(string k)) then
                committed <- committed + string k
            else
                Console.WriteLine(" Not a legal next char!")
        else
            Console.WriteLine(" Press a letter to commit that letter, or backspace to uncommit one")
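
To get a feel for why brute force is unattractive here, the number of possible decodings can be counted without enumerating them. The sketch below is an illustration only: countDecodings is a hypothetical helper that is not part of the tool above, and it uses a simple right-to-left dynamic program over the same letter table (written more compactly). It also checks that “HELLOWORLD” really encodes to the puzzle string.

```fsharp
// Same letter-to-code table as the tool, written compactly.
let morseTable =
    [ 'A', ".-";   'B', "-..."; 'C', "-.-."; 'D', "-..";  'E', "."
      'F', "..-."; 'G', "--.";  'H', "...."; 'I', "..";   'J', ".---"
      'K', "-.-";  'L', ".-.."; 'M', "--";   'N', "-.";   'O', "---"
      'P', ".--."; 'Q', "--.-"; 'R', ".-.";  'S', "...";  'T', "-"
      'U', "..-";  'V', "...-"; 'W', ".--";  'X', "-..-"; 'Y', "-.--"
      'Z', "--.." ]

let toMorse (s : string) =
    let d = dict morseTable
    System.String.Join("", [| for c in s -> d.[c] |])

// Hypothetical helper: count every way to split the string into letters.
let countDecodings (morse : string) =
    let n = morse.Length
    // ways.[i] = number of decodings of the suffix that starts at index i
    let ways : int64[] = Array.zeroCreate (n + 1)
    ways.[n] <- 1L
    for i in n - 1 .. -1 .. 0 do
        for (_, code) in morseTable do
            if i + code.Length <= n && morse.Substring(i, code.Length) = code then
                ways.[i] <- ways.[i] + ways.[i + code.Length]
    ways.[0]

let toDecode = "......-...-..---.-----.-..-..-.."
printfn "%d possible decodings" (countDecodings toDecode)
printfn "HELLOWORLD matches: %b" (toMorse "HELLOWORLD" = toDecode)
```

Counting is cheap because each position is visited once per table entry; it is listing and judging the candidates that needs the human in the loop.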


Yet another release of F# (August 2010 CTP)

Posted by Brian on August 18, 2010

In case you missed it, today we announced a new release of F#.  If you have VS2010 already, then you probably don’t need this release; this release pretty much just puts the same bits in new CTP packaging.  Briefly:

  • The previous CTP only had the .NET 2.0 versions of the compiler and msbuild tools
  • The previous CTP only installed into the free VS2008 Integrated Shell


  • The new CTP also has the .NET 4.0 versions of the compiler and msbuild tools
  • The new CTP can also install into the free VS2010 Integrated Shell

The MSI tries to be smart – if you have .NET 2.0, it installs those tools; if you have .NET 4.0, it installs those tools; if you have any VS2008, it installs the VS2008 support; if you have the VS2010 Shell, it installs that support.  As always, if you have a prior F# CTP install, uninstall it first, before installing the new CTP.

So basically the new release is of interest to you mostly if you fall into one of these buckets:

  • You have been using the ‘free’ F#-in-VS2008-Shell, and you want to upgrade to ‘free’ 2010 tools so you can take advantage of the pretty new WPF environment and zoomable editor and multi-monitor support and use .NET 4.0 and other “2010-specific stuff”.  The CTP is (and always has been) the moral equivalent of “Visual F# Express”.
  • You have been looking for a .NET 4.0 version of the F# compiler and tools to install on your build server/continuous-integration server, and you didn’t want to install all of Visual Studio on that machine just to get the F# compiler.
  • You’re using F# on Mono/Linux/Mac (grab the ZIP rather than the MSI). (You can already use F# with success on other platforms; I imagine the story here will continue to improve markedly in the coming months, as stuff like this starts to bear fruit.)

So that’s it.  Again, if you have VS2010 Pro or above, you already have all the bits you need on your dev box.  This release just makes those bits more freely available and more conveniently packaged for those without a full VS2010 install.

(There’s a white lie in the previous sentence; there is one new thing in the new CTP, which is an FSharp.Core.dll for Windows Phone 7, but I hope to discuss that in more depth in a future blog post.  I haven’t done _any_ phone programming myself yet, though I am keen to do it, when I find some free time.  If you just can’t wait, then perhaps see Don’s blog for some starter links here.)


Self-assessment – what have YOU learned lately?

Posted by Brian on July 7, 2010

I was recently reading this blog (about a person who will use a different programming language each day of the week), and thought: wow, I have pretty much only used two languages (F# and C#) in the last couple years.  Does that mean I’ve been getting complacent/stagnant in my technical skills, and I need to spend more time sharpening the saw?  But then I did a quick review of the past six months, and I discovered that I have in fact learned a bit – there’s more to technical skills than just programming languages, of course.  Today I shall commemorate some of the new stuff I have learned in 2010, “Achievement”-style (a la XBox achievements or StackOverflow badges).

Achievement unlocked: Inline MSBuild tasks


I wrote my first inline MSBuild task.  MSBuild 4.0 enables you to write “inline tasks” directly in the project file, using a subset of C# and .NET.  At work I needed to do some custom validation that would cause a build error (with a useful diagnostic) if certain constraints were not met.  So I read up on inline MSBuild tasks and created one to do my custom validation, and it works great!  MSBuild is a “programming language” when viewed through the right lens, but it’s one that I (like most who use it) have been learning opportunistically on an (infrequent) as-needed basis.  But MSBuild is pretty powerful, and inline tasks are a useful tool to have in one’s tool cabinet.

Achievement unlocked: DGML graphs


I have used the new DGML support in VS2010 for a few different things already this year.  You saw one already in this blog.  At work, I made a graph to visualize our build system for F# (all the assemblies we have, and their dependencies) to help me visualize how to speed up and parallelize the build.  I “hand authored” some graphs as pictures/figures in a short report I did on a prototyping project at work.  And I also did a link graph of my blog entries (mostly for fun to play around with DGML).  If I need to draw a directed graph, either by hand or programmatically, I know a useful little tool to do it!
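
For the “programmatically” case, a minimal sketch of emitting DGML from F# might look like the following. The helper name toDgml and the sample assembly names are hypothetical, chosen only for illustration; the sketch assumes nothing beyond the basic DirectedGraph/Nodes/Links shape of a DGML file and LINQ to XML.

```fsharp
open System.Xml.Linq

// DGML elements live in this XML namespace.
let dgml = XNamespace.Get("http://schemas.microsoft.com/vs/2009/dgml")

// Hypothetical helper: turn (source, target) pairs into a DirectedGraph element.
let toDgml (edges : (string * string) list) =
    // every name that appears on either end of an edge becomes a node
    let nodeIds =
        edges |> Seq.collect (fun (s, t) -> [s; t]) |> Seq.distinct |> List.ofSeq
    let nodes =
        [ for n in nodeIds -> XElement(dgml + "Node", XAttribute(XName.Get("Id"), n)) ]
    let links =
        [ for (s, t) in edges ->
            XElement(dgml + "Link",
                     XAttribute(XName.Get("Source"), s),
                     XAttribute(XName.Get("Target"), t)) ]
    XElement(dgml + "DirectedGraph",
             XElement(dgml + "Nodes", nodes),
             XElement(dgml + "Links", links))

// Made-up dependency edges, purely as sample input.
let graph = toDgml [ "MyApp", "FSharp.Core"; "FSharp.Core", "mscorlib" ]
printfn "%O" graph
```

Saving the printed XML with a .dgml extension should be enough for VS2010 to open it in the graph viewer.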

Achievement unlocked: VSIX


The new VS2010 Extension Manager makes it easy to install and uninstall VSIX extensions.  I mentioned extensions in a previous blog; if you haven’t already visited the Visual Studio Gallery, you should check it out now!  As I mentioned in another blog post, I authored my own VSIX extension to add Solution Explorer support for creating signature files.  It was lame and buggy, so I didn’t publish it, but I learned all the details of creating and publishing Visual Studio extensions, so I am prepared for when I have a good idea with a good implementation.  :)

Achievement unlocked: Screencasts


I had been thinking about creating screencasts for forever, but only this year did I finally “just do it”.  Hopefully you’ve already seen some of my screencasts: there are currently 4 of them available in the “F# and the VS2010 IDE” section here.  I discovered that the free Microsoft Expression Encoder 3 product makes it straightforward to record screencasts.  (From the technology standpoint, anyway – I still spend many hours on each screencast, designing the content & examples and then ‘shooting’ many ‘takes’ for each segment until I get it right.)  I work on the Visual Studio tooling for F#, so I enjoy making these screencasts as a way to show off the various productivity features in the product.

Achievement unlocked: Silverlight


I had never done Silverlight stuff until this year, when I started small but eventually learned enough Silverlight and WPF to write a fun online game.  I’m amazed how easy Silverlight is; I can carry over all my general desktop programming skills to web apps.

Future Achievements?

There are a few achievements I would like to get later this year, if I can find the right combination of time/motivation/use-case:

  • Pex: check out Pex for Fun to get a sense of what the Pex code analysis tools can do.  I’d like to leverage this more for static analysis and testing
  • Code Contracts: even though I haven’t used them yet, I can’t help but wonder if, looking back, these will turn out to be the most important tool of 2010.  Check out this video to get a sense – the vision here is awesome, and I hope that in practice they are just as cool.
  • WebSharper.  All the benefits of F# static checking applied to AJAX client web apps?  I’m in!  I just need an excuse of an app to write.

I’m sure as time goes by, I’ll find more technologies I want to learn, too.


So there you go, I have learned a number of useful new technologies this year, and aspire to more.  What does your list look like?

(And yes, the section titles are an homage to Achievement Unlocked, a great silly game.)


VS2010 Pro Power Tools extension available

Posted by Brian on June 24, 2010

One of the great new features of VS2010 is the Extension Manager, which makes it very easy to download and install extensions to Visual Studio.  The new Pro Power Tools extension released by Microsoft is an extension I highly recommend you try out.  I’ve created a screencast that demos some features of the extension.  Below are a couple screenshots of a couple of my favorite features.  To install the extension, inside VS just go to “Tools\Extension Manager”, click “Online Gallery”, and type “pro power tools” in the search box, and grab this:


My favorite features:


Column Guides or Guidelines:

Guidelines are vertical rules that run across the editor background; they are especially useful to provide a subtle visual cue about indentation levels in a whitespace-significant language like F#.  I’ve drawn on the screenshot to highlight one of the guidelines:



New ‘Add Reference’ dialog:

Pops up fast and is more pleasant (less frustrating) to use, IMO:


(I should note that a bug in the F# project system may prevent adding certain COM references to F# projects via this dialog.  If you encounter this bug, you can work around it by going to Extension Manager, disabling the ‘Pro Power Tools’ extension, adding the reference you need, and then re-enabling the extension.)


There are lots of other great features in this extension; you can read the summary on the description page of the Visual Studio Gallery.


See also Kirill’s blog, Jason’s blog, and Radames’ blog, as well as the Visual Studio blog for articles about the Add Reference dialog, and a couple about the Document Well.


Dear Proggit: graphs are cool, but I prefer F#, so I graphed the subreddit interconnections with F# and DGML

Posted by Brian on May 26, 2010

Today I saw this and I thought, hey, that’s pretty cool!

So of course I had to code my own version with F#.  First off, the results, when starting from the “music” subreddit:


(To see the full size version, click here.)

Each node contains the name of the subreddit and the number of readers.  The font size is proportional to log(numReaders).  The color is determined by community age, with young ones being red and rainbowing across to old ones being blue.  And of course the graph arrows are the links to other related subreddits.
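
One way such a styling rule could be computed is sketched below. This is an assumption-laden illustration, not the code used for the picture: the function names, the scale factor 3.0, and the choice of a 0-to-240-degree hue sweep are all hypothetical.

```fsharp
// Font size grows with log(numReaders); the scale factor 3.0 is an arbitrary assumption.
let fontSize (numReaders : int) = 3.0 * log (float (max numReaders 2))

// Map age to a hue in degrees: 0 (red) for brand new, 240 (blue) for the oldest.
let hueForAge (oldestMonths : int) (months : int) =
    240.0 * float (min months oldestMonths) / float (max oldestMonths 1)

// Convert a fully saturated hue (0 <= hue < 360) to an "#RRGGBB" string.
let hueToRgb (hue : float) =
    let h = hue / 60.0
    let x = 1.0 - abs (h % 2.0 - 1.0)
    let r, g, b =
        match int h with
        | 0 -> 1.0, x, 0.0
        | 1 -> x, 1.0, 0.0
        | 2 -> 0.0, 1.0, x
        | 3 -> 0.0, x, 1.0
        | 4 -> x, 0.0, 1.0
        | _ -> 1.0, 0.0, x
    let toByte v = int (System.Math.Round(v * 255.0))
    sprintf "#%02X%02X%02X" (toByte r) (toByte g) (toByte b)

let colorForAge oldestMonths months = hueToRgb (hueForAge oldestMonths months)

printfn "%s %s %s" (colorForAge 60 0) (colorForAge 60 30) (colorForAge 60 60)
```

With a logarithmic size, a subreddit with a million readers gets a label only twice as large as one with a thousand, which keeps the big communities from swamping the graph.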

All the info is scraped from the reddit sidebars using some hackish Regexes and XML parsing.

Of course I used F# async for the non-blocking I/O to grab multiple web pages concurrently so that the program runs fast.  I used an agent and .NET 4.0 concurrent collections to manage mutable updates to state without creating data races.
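
The agent idea can be shown in isolation with a much smaller sketch than the real program. Everything here is hypothetical and for illustration only (the message type, the agent name, the made-up item strings): many concurrent workers post to one MailboxProcessor, and because the agent handles one message at a time, the plain HashSet inside it needs no locks.

```fsharp
type SeenMessage =
    | Add of string
    | Count of AsyncReplyChannel<int>

// The HashSet is only ever touched from inside the agent's message loop.
let seenAgent =
    MailboxProcessor.Start(fun inbox ->
        let seen = System.Collections.Generic.HashSet<string>()
        let rec loop () =
            async {
                let! msg = inbox.Receive()
                match msg with
                | Add item -> seen.Add(item) |> ignore
                | Count reply -> reply.Reply(seen.Count)
                return! loop ()
            }
        loop ())

// 100 concurrent workers post only 10 distinct names between them.
[ for i in 1 .. 100 -> async { seenAgent.Post(Add (sprintf "r%d" (i % 10))) } ]
|> Async.Parallel
|> Async.RunSynchronously
|> ignore

// Posted after all the Adds, so it is processed after them.
let distinctCount = seenAgent.PostAndReply(fun ch -> Count ch)
printfn "%d distinct items seen" distinctCount
```

The workers never share mutable state directly; they only send messages, which is the property that avoids data races.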

All told, just about 70 lines of code to scrape reddit for the info and 40 more to generate the DGML that VS2010 then renders into the pretty graph.  I just hacked it together tonight, so the code is maybe not awesome, but it is good enough to share.  Code is below, have fun with it.  F# and DGML are cool!

// reddit uses XHTML, hurrah
open System.Xml.Linq
let XN name = XName.Get(name, "")

// state used by the workers (multi-threaded)
let results = new System.Collections.Concurrent.ConcurrentBag<_>()

// state managed by the agent (which is logically-single-threaded)
let mutable visited = new System.Collections.Generic.HashSet<string>()
let mutable started = new System.Collections.Generic.HashSet<string>()
let allDone = new System.Threading.ManualResetEvent(false)

type Message =
    | EnsureVisited of string
    | FinishedVisiting of string

printfn "Periodically showing number of known remaining links to follow (as progress indicator)..."
let rec agent = MailboxProcessor.Start(fun mbox ->
    let numIters = ref 0
    let rec Loop() =
        async {
            let! msg = mbox.Receive()
            match msg with
            | EnsureVisited url -> 
                if not(visited.Contains(url)) && not(started.Contains(url)) then
                    started.Add(url) |> ignore
                    visit url |> Async.Start 
            |  FinishedVisiting url ->
                started.Remove(url) |> ignore
                visited.Add(url) |> ignore
            incr numIters
            if !numIters % 10 = 0 then
                printf "%d " started.Count 
            if started.Count <> 0 then
                return! Loop()
            else
                allDone.Set() |> ignore
        }
    Loop())

and visit reddit = async {
    use wc = new System.Net.WebClient()
    let! xhtml = wc.AsyncDownloadString(new System.Uri(""+reddit))
    let y = System.Text.RegularExpressions.Regex.Match(xhtml, "<span class=\"age\">a community for (\d+) years?</span>")
    let m = System.Text.RegularExpressions.Regex.Match(xhtml, "<span class=\"age\">a community for (\d+) months?</span>")
    let months = if y.Success then 12*int(y.Groups.[1].Value) elif m.Success then int(m.Groups.[1].Value) else 0
    let getTitleBox (xhtml:string) =
        // extremely quick and dirty way to parse out just the bit I want
        let start = xhtml.IndexOf("<div class=\"titlebox\">")
        let mutable finish = xhtml.IndexOf("</div>", start)
        while(finish <> -1 && try XElement.Parse(xhtml.Substring(start,finish-start+6)); false with e-> true) do
            finish <- xhtml.IndexOf("</div>", finish+1)
        if finish = -1 then failwith "could not parse"
        xhtml.Substring(start, finish-start+6)
    let getAttrVal attrName (e:XElement) =
        let s = e.Attributes(XN attrName)
        if Seq.length s = 1 then Some((Seq.head s).Value) else None
    let xe = XElement.Parse(getTitleBox xhtml)
    let numReaders = xe.Descendants(XN "span") 
                  |> Seq.filter (fun e -> match getAttrVal "class" e with None -> false | Some s -> s="number") |> Seq.head 
    let urls = 
        xe.Descendants(XN "a") |> Seq.choose (getAttrVal "href") |> Seq.choose (fun u -> 
            let m = System.Text.RegularExpressions.Regex.Match(u, @"^(?:http://(?:www.)?\w+)/?$")
            if m.Success then Some(m.Groups.[1].Value.ToLowerInvariant()) else None) |> set
    results.Add(reddit, numReaders.Value, urls, months)
    for u in urls do
        agent.Post(EnsureVisited u)
    agent.Post(FinishedVisiting reddit)

// kick it off with a starting reddit
agent.Post(EnsureVisited "music")
allDone.WaitOne() |> ignore

// make DGML of the results
let sb = new System.Text.StringBuilder()
let add (s:string) = sb.AppendLine(s) |> ignore
add @"<?xml version=""1.0"" encoding=""utf-8""?>"
add @"<DirectedGraph GraphDirection=""BottomToTop"" Layout=""Sugiyama"" xmlns="""">"
add @"  <Nodes>"
for (reddit,numReaders,links,months) in results do
    add <| sprintf "    <Node Id=\"%s\" Label=\"%s&#xA;%s\" NumReaders=\"%d\" Months=\"%d\" />" 
            reddit reddit numReaders (System.Int32.Parse(numReaders, System.Globalization.NumberStyles.AllowThousands)) months
add @"  </Nodes>"
add @"  <Links>"
for (reddit,numReaders,links, months) in results do
    for l in links do
        add <| sprintf "    <Link Source=\"%s\" Target=\"%s\" />" reddit l
add @"  </Links>"
add @"  <Styles>"
add @"    
    <Style TargetType=""Node"" GroupLabel=""NumReaders"" ValueLabel=""Function"">
      <Condition Expression=""NumReaders &gt; 0"" />
      <Setter Property=""FontSize"" Expression=""Math.Max(12,3*Math.Log(NumReaders))"" />
    </Style>
    <Style TargetType=""Node"" GroupLabel=""Months"" ValueLabel=""Function"">
      <Condition Expression=""Months &gt; 12"" />
      <Setter Property=""Background"" Value=""#FF6666FF"" />
    </Style>
    <Style TargetType=""Node"" GroupLabel=""Months"" ValueLabel=""Function"">
      <Condition Expression=""Months = 12"" />
      <Setter Property=""Background"" Value=""#FF66FF66"" />
    </Style>
    <Style TargetType=""Node"" GroupLabel=""Months"" ValueLabel=""Function"">
      <Condition Expression=""Months &gt;= 9"" />
      <Condition Expression=""Months &lt; 12"" />
      <Setter Property=""Background"" Value=""#FFFFFF44"" />
    </Style>
    <Style TargetType=""Node"" GroupLabel=""Months"" ValueLabel=""Function"">
      <Condition Expression=""Months &gt;= 5"" />
      <Condition Expression=""Months &lt; 9"" />
      <Setter Property=""Background"" Value=""#FFDDBB66"" />
    </Style>
    <Style TargetType=""Node"" GroupLabel=""Months"" ValueLabel=""Function"">
      <Condition Expression=""Months &lt; 5"" />
      <Setter Property=""Background"" Value=""#FFFF6666"" />
    </Style>"
add @"  </Styles>"
add @"</DirectedGraph>"
System.IO.File.WriteAllText(@"graph.dgml", sb.ToString())
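
The agent protocol in the code above (EnsureVisited / FinishedVisiting, with the agent owning the 'started' and 'visited' sets) is easier to see without the network in the way.  Here is a minimal sketch of the same idea, assuming a made-up in-memory link graph; 'fakeLinks', 'CrawlMessage' and 'crawl' are names I invented for illustration and are not part of the scraper.  Unlike the code above, the worker is handed the mailbox as an argument rather than referring to the agent recursively.

```fsharp
open System.Collections.Generic
open System.Threading

// Hypothetical link graph standing in for the real pages (an assumption).
let fakeLinks =
    dict [ "a", ["b"; "c"]
           "b", ["a"; "d"]
           "c", ["d"]
           "d", [] ]

type CrawlMessage =
    | EnsureVisited of string
    | FinishedVisiting of string

// Worker: posts every outgoing link first, then reports that it is finished.
// Because the mailbox is FIFO, the agent sees the links before the "finished".
let visit (agent: MailboxProcessor<CrawlMessage>) (node: string) = async {
    for next in fakeLinks.[node] do
        agent.Post(EnsureVisited next)
    agent.Post(FinishedVisiting node) }

let crawl (start: string) =
    // These two sets are only ever touched from inside the agent loop.
    let visited = HashSet<string>()
    let started = HashSet<string>()
    let allDone = new ManualResetEvent(false)
    let agent =
        MailboxProcessor.Start(fun (mbox: MailboxProcessor<CrawlMessage>) ->
            let rec loop () = async {
                let! msg = mbox.Receive()
                match msg with
                | EnsureVisited node ->
                    if not (visited.Contains node) && not (started.Contains node) then
                        started.Add node |> ignore
                        visit mbox node |> Async.Start
                | FinishedVisiting node ->
                    started.Remove node |> ignore
                    visited.Add node |> ignore
                // Nothing in flight means nothing can post new work: stop.
                if started.Count <> 0 then
                    return! loop ()
                else
                    allDone.Set() |> ignore }
            loop ())
    agent.Post(EnsureVisited start)
    allDone.WaitOne() |> ignore
    visited |> Seq.sort |> List.ofSeq

printfn "%A" (crawl "a")
```

The workers run in parallel on the thread pool, but only the agent ever reads or writes the two sets, so no locks are needed.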


F# for Silverlight 4 available

Posted by Brian on May 17, 2010

Today the final Silverlight 4 Tools for Visual Studio 2010 were released (go here for download link).  These tools include the F# runtime (FSharp.Core.dll) for the Silverlight 4 runtime.  For those who may have previously been held up developing with F# for Silverlight 4, today is the day to get unblocked!

To commemorate the occasion, I made a tiny F# ‘hello world’ Silverlight application in the traditional fashion (a C# app with an F# library).  I’ll walk you through the steps.

(Ensure you have already installed the final version of Silverlight 4 tools for VS2010.)

In VS, go to the ‘New Project’ dialog and select the ‘F# Silverlight Library’ template:


Then when it asks to choose a Silverlight version, pick Silverlight 4:


Then, for the purposes of this example, I replaced the code in Module1.fs in the new project with this code:

namespace MyFSharp

type MyType() =
    static member FilterOutZs (strs:seq<string>) =
        seq { for s in strs do
                if not(s.StartsWith("Z")) then
                    yield s }

We’ll see how I’ll use this code shortly.
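
As a quick sanity check before wiring it into an app, the filter can be exercised from an F# script.  This is just a sketch: I dropped the namespace declaration so that it runs as a standalone script, and 'words' and 'message' are names made up for the example.

```fsharp
// Same type as in Module1.fs, minus the namespace so it works in a script.
type MyType() =
    static member FilterOutZs (strs: seq<string>) =
        seq { for s in strs do
                if not (s.StartsWith("Z")) then
                    yield s }

// Illustrative input: the strings starting with "Z" should be dropped.
let words = [ "ZZZ"; "Hello"; "ZZZ"; "from"; "F#"; "ZZZ" ]
let message = MyType.FilterOutZs words |> String.concat " "
printfn "%s" message   // prints "Hello from F#"
```

The sequence expression is lazy, so nothing is filtered until String.concat walks the result.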

Next, right click on the solution in Solution Explorer and ‘Add… New Project’ a ‘C# Silverlight Application’:


A dialog then pops up about hosting the new Silverlight app; I chose to uncheck the ‘Host the Silverlight application in a new web site’ box.  Once again, be sure that ‘Silverlight 4’ is selected as the Silverlight version in the dialog.


Next I added the highlighted bit to the MainPage.xaml in the C# app:


and then in the C# code-behind, MainPage.xaml.cs, I had this handler:

private void TheText_MouseEnter(object sender, MouseEventArgs e)
{
    this.TheText.Text =
        string.Join(" ", MyFSharp.MyType.FilterOutZs(
            new[] { "ZZZ", "Hello", "ZZZ", "from", "F#", "ZZZ" }).ToArray());
}

which calls my F# code.  To make this compile, I need to add a project reference: right-click the C# project, choose ‘Add Reference…’, select the ‘Projects’ tab in the dialog, and select the F# library from the solution.

Right click the C# app in Solution Explorer and select ‘Set as StartUp Project’.  (If I’d created the app first, and then added the library, rather than the other way around, I wouldn’t need this step.)

Now I can press F5 to run it, and I see in my browser:


Not the most enthralling app ever, but it shows that F# is working with Silverlight 4.  Of course you already know how to make more exciting F# Silverlight apps.

Have fun enjoying Silverlight 4 with F#!
