Inside F#

Brian's thoughts on F# and .NET

Monadic parser combinators, part two

Posted by Brian on December 3, 2007

Last time we introduced the concept of monadic parser combinators, but got only a vague feel for what they are and how they work.  Today we will start from the ground up and lay the foundation for a cool parsing library.  We’ll see that just five little functions form a basis for parsing, and that we can build an entire expressive library of parsers and parser-combinators building atop just those five functions.  Along the way we’ll also explore some more aspects of C# 3.0, namely delegates and extension methods.

Representing parsers

To start things off, let’s describe the data type we’ll use to represent parsers.  If we consider a few examples, we can discover an effective representation type.  Consider the expression parser we looked briefly at last time.  We want to be able to take in a string like "40+2" and yield the result 42.  This one example in a vacuum suggests that parsers might be functions that take strings and return integers.  However it’s clear that returning an integer is specific to that particular example.  We might write another parser that will be able to parse "3.1+2.1" and yield 5.2, for example, in which case the return type would be float or double.  Or we might write a parser that, rather than evaluating an expression, just returns a parse tree representing the expression, so that parsing "40+2" yields a tree with a ‘+’ at the root and with two integer leaf nodes.  In each case, the input was a string, but the output depended on what exactly our parser intended to do.  So we might imagine a parser function signature looking something like this:

T Parse<T>(string input);

That is, a parser is a generic function, something that can take in a string and return a value of some arbitrary type.  This is a good first approximation.

Consider another issue.  We intend to build a parser combinator library, which means we intend to build parsers by combining simpler parsers.  For example, the expression parser that parses "40+2" and yields 42 will probably be built atop a simpler parser that just parses natural numbers – that is a parser that can just take in the string "40" and yield the integer 40.  The point is, while the top-level parser is designed to consume the entire string, we want to build it out of smaller component parsers that just eat the portions of the input that they know about.  So, if we feed the string "40+2" to a parser designed to parse just natural numbers, we would like it to return the result 40, but only consume the first two characters of the input.  We can accomplish this by having parsers return not just the result (40), but also return the remaining unparsed input ("+2").  That is, the return type should actually be:

public class ParseResult<T>
{
    public readonly T Result;
    public readonly string RemainingInput;
    public ParseResult(T r, string ri) { this.Result = r; this.RemainingInput = ri; }
}

Great.  One final issue is that a parser can fail.  For example if we hand our expression parser the string ";x%" or some other garbage, we want the result to indicate that the parse failed.  For now, we’ll just return null if the parse fails.  (In the future, we’ll come back and add more elaborate error-handling, so we can create customized error messages and point to the location of the failure in the original input string.)

Ok, so with all these ideas in mind, we can now describe a C# type that represents parsers.  I’ll call the type P, which stands for "Parser" (we’ll be using this type a lot, and so a very short name is good for bloggers like me who want to show you lots of code without worrying that the line lengths will get too long and scroll off the right-hand page margin).  P is generic, and since it is a function type, we represent it in C# as a delegate:

public delegate ParseResult<T> P<T>(string input);

And here are some examples of how we intend to be able to use that type:

P<int> expr = ...;
expr("40+2")  // yields { 42, "" }

P<int> nat = ...;
nat("40+2")   // yields { 40, "+2" }
nat(";x%")    // yields null

P<string> word = ...;
word("This is a sentence.")  // yields { "This", " is a sentence." }

Get it?  A parser of type P<T> takes in a string, and either succeeds (consuming some prefix of the string, and returning a result of type T as well as the remaining unparsed string), or fails (returning null).

(Note that throughout this blog entry, I’ll often have code snippets like the one above that have comments with values in curly braces.  The curly braces are just shorthand; { 40, "+2" } is an easy way for me to write a ParseResult<int> whose Result field is 40 and whose RemainingInput field is "+2".)

C# delegate types

Now that we know the representation type for parsers, it’s almost time to describe those five core functions I alluded to at the beginning of this blog entry.  But first, let’s take a quick aside to look at some C# delegate types that are new to the latest release.  As you may know, the .NET library has included a delegate type for predicates for a while:

public delegate bool Predicate<T>(T x);

An example use of this type would be

Predicate<char> isDigit = char.IsDigit;

The latest library release includes a new set of delegates called just Func, that represent arbitrary function signatures.  The general form is "Func<A1,A2,…,An,Result>" – that is, it’s a generic delegate type where the first few type parameters are the argument types of the function, and the final one is the result type.  So, for example, a function that compares two integers to test if the first is less than the second could be represented using Func like so:

Func<int, int, bool> less = (x, y) => x < y;

That is, a Func<int,int,bool> is a function that takes two integers and returns a boolean.  Here I’ve assigned it a lambda that computes "x less than y", which so that for example less(3,4) yields true.

Note that Func<T, bool> describes the same shape function as Predicate<T>.

These new function delegate types are important to know as we shall occasionally encounter them as we do more with monadic parser combinators.  We’ll see one in just a moment, as it’s time now to look at those five core functions that provide a basis for all of parsing.

A monadic basis for parsing

In linear algebra, a basis is a set of vectors, linear combinations of which span an entire vector space.  In the domain of parsing, we’re about to see that five core functions provide a similar kind of programming basis that spans the entire space of parsing.  That is, we can describe any parser using just combinations of these five core components.  Without further ado, here are the five functions:

public static P<T> Fail<T>();                             // zero
public static P<T> Return<T>(T x);                        // unit
public static P<U> Then<T, U>(P<T> p1, Func<T, P<U>> f);  // bind
public static P<T> Or<T>(P<T> p1, P<T> p2);               // plus
public static P<char> Item();

We’ll describe each one in turn in just a moment.  The first four functions are what make these parsers a monad.  Previously I said that a monad was a specific kind of abstract data type, but didn’t go into much detail.  Monads are a deep topic and very hard to grasp, so I don’t want to spend time talking about them now, other than to say that what all monads have in common is four functions with particular type signatures, and the first four functions here have those specific signatures.  Often times you’ll see these monadic functions represented by the names I have in comments at the right.  However for a parser library, the names I’ve chosen on the left make more sense, so we’ll stick with them.  Like most monads, in addition to supporting the core monad functions, there are some other functions that are specific to the underlying monad representation and the purpose of the monad.  In this case, Item is the only "extra" function we need for the parser monad.

Ok, let’s describe each function now.  Fail returns a parser that always fails, regardless of the input string.  Useful, huh?  So, for example:

P<int> failure = Fail<int>();
failure("anything")  // yields null

That was easy.  Return is almost as simple.  This function results in a parser that does not consume any input, but always succeeds, yielding the result value that was passed to it.  So, for example:

P<int> p = Return(42);
p("whatever")  // yields { 42, "whatever" }

We’re off to a running start; we’ve got a way to create a parser that always fails, and a way to create a parser that ignores the input and always succeeds, yielding a particular value.  Next up is the Then combinator, the most complicated one of the bunch.  Then is used to chain parsers in a sequence.  Let’s take a close look at its signature, which is a little complicated:

public static P<U> Then<T, U>(P<T> p1, Func<T, P<U>> f);

There are two arguments.  The first argument is a parser that yields a result of type T.  The second argument is a function; the function takes a T and returns a new parser that yields a U.  The result is a parser that yields a U.  Let’s give a rough description of the semantics of the resultant parser.  First we run parser p1 on the input.  If it fails, we’re done, and the parse fails.  If it succeeds, we take the resulting T value and pass it to function f, resulting in a new parser that yields a U.  We’ll run that parser on the remaining unparsed input (that’s left over after running p1).  Don’t worry if that description was hard to follow, we’ll see examples shortly that make things clearer.

Next let’s consider the Or function:

public static P<T> Or<T>(P<T> p1, P<T> p2);

This combinator two parsers of the same type, and combines them into a new parser of that same type.  The new parser has simple semantics: first try running parser p1 on the input – if it succeeds, great, that’s the answer, else if it fails, then run parser p2 on the input instead.  This is the way we’ll express alternatives: "try parsing like this, but if that doesn’t work then try this instead".  Again, we’ll see examples shortly.

Finally, we have Item, the only one of the five that does any "real work".  The resultant parser consumes the first character of the input string and returns it.  Or, if the input string is empty, then this parser fails.  So, for example:

P<char> item = Item();
item("foo")  // yields { 'f', "oo" }
item("")     // yields null

That’s it!  With these five components, we’re now prepared to conquer any parsing challenge.  Let’s sum up:

  • Fail() returns a parser that always fails
  • Return() returns a parser that always succeeds with a given value, ignoring the input string
  • Then() is used for chaining parsers in a sequence (first parse this, then do this)
  • Or() returns a parser that tries two alternatives (first parse this, but if that fails, parse this instead)
  • Item() returns a parser that consumes the first character of the input string

Now let’s start authoring some more interesting parsers.

Some small examples

Time to write our first non-trivial parser.  Here it is:

public static P<char> Sat(Predicate<char> pred)
{
    return Then(Item(), c => pred(c) ? Return(c) : Fail<char>());
}
 
Sat(pred) is a character parser that considers the first character of the input.  If that character matches predicate pred, then the parse succeeds, yielding that character.  If the character doesn’t match the predicate (or if we’ve already reached the end of the input string), the parse fails.  Here’s an example of using Sat:
 
P<char> letter = Sat(char.IsLetter);
letter("foo")  // yields { 'f', "oo" }
letter("123")  // yields null
letter("")     // yields null

We used the predicate char.IsLetter to create a parser that parses letters.  If the first character of the input is a letter, the parse succeeds, returning that letter, along with the remaining unparsed input.  If the first character is not a letter, the parse fails.

Let’s trace through these examples.  When we call letter("foo"), here’s what happens.  Look inside the body of Sat.  First, we call Item.  This consumes the first character of the input, the ‘f’.  Next we call Then.  The way Then works is it binds the parse result of its first argument (‘f’) as the parameter to the function that is Then‘s second argument (in this instance, the lambda).  So now we’re inside the lambda expression, with variable c bound to the character value ‘f’.  We test the predicate, and it returns true, so now we call Return(c), which just returns that result (‘f’) without consuming any more input.

Similarly, when we call letter("123"), we call Item to parse off the ‘1’, Then binds that value to c, but this time the predicate returns false, so we call Fail and the parse fails.  And in the final example, when we call letter(""), the original Item call fails, and so the call to Then immediately fails without even considering its second argument.

It’s not important to fully understand the details of logic of tracing through these calls yet.  I haven’t yet shown the bodies of functions like Then, so we can’t trace all the way through yet anyway.  (We’ll see all the code in its entirety at the end of this article.)  For now, you just need to have a sense of what’s going on.  Here’s another example; with each example you’ll get a little more intuitive sense of this.

P<char> letter = Sat(char.IsLetter);
P<char> digit = Sat(char.IsDigit);
P<char> letterOrDigit = Or(letter, digit);
letterOrDigit("foo")  // yields { 'f', "oo" }
letterOrDigit("123")  // yields { '1', "23" }
letterOrDigit(";x%")  // yields null

The example here demonstrates a call to Or.  As a result, letterOrDigit is a parser that will first try letter, and if that succeeds, great, but if it fails, it will then try digit on the same input instead.  Get the idea?  Here’s another example of using Then:

P<int> two = Then(Sat(c => c == 't'), t =>
             Then(Sat(c => c == 'w'), w =>
             Then(Sat(c => c == 'o'), o =>
             Return(2))));
two("two bits")  // yields { 2, " bits" }
two("four")      // yields null

Here, two is a parser that recognizes the string "two" and results in the integer 2.  It does so by using Then to string together a number of parsers in sequence.  Each of the Sat calls recognizes an individual character.  Once we have consumed all three characters, we return the desired result (the integer 2).  By the way, this idiom of stringing together a number of Then calls and finally returning a result is a common one, and in monad-lingo the idiom is known as a comprehension.

Ok, hopefully you’re starting to grok how we can build useful parsers in this framework.  In a moment we’ll build a cute parser that has the behavior suggested here:

englishNumber(@"one hundred forty-six thousand 
                five hundred twenty-two widgets"); // yields { 146522, " widgets" }

But first let’s look at another new feature of C# that we can leverage to improve the syntax of our parser combinators a little.

C# extension methods

Extension methods are a kind of syntax sugar that allows static methods to appear as though they are member methods of the type of the first argument.  For example, if you want to "add a method to the string class", you can do something like this:

public static class StringExtensions
{
    public static string PigLatin(this string s, int n)
    {
        return s.Substring(n) + s.Substring(0, n) + "ay";
    }
}

Note the this keyword before the first parameter of the method – that’s what makes it an extension method.  As a result, you can now write code like this:

"foo".PigLatin(1)  // yields "oofay"
"star".PigLatin(2) // yields "arstay"
      

Cute.  Now that we know what extension methods are, let’s put them to use.  Two of our parser combinators, Then and Or, read better as extension methods.  That is, rather than the code on the left, we can write the code on the right:

Or(p1, p2)              p1.Or(p2)
Then(p, x => rest)      p.Then(x => rest)

just by adding the this keyword to the first parameter of the definition of each of these functions.  This makes the code read better in a few places already; let’s revisit a couple little parsers we wrote, and use the new extension method syntax:

P<char> letterOrDigit = letter.Or(digit);
P<int> two = Sat(c => c == 't').Then(t =>
             Sat(c => c == 'w').Then(w =>
             Sat(c => c == 'o').Then(o =>
             Return(2))));

The code now reads better: "letter or digit", "parse a ‘t’, then parse a ‘w’, …"  While we’re at it, let’s add one more combinator.  In the two parser, you note that we bind the result of each Sat call to a lambda parameter variable, but we don’t ever use those variables (like t, which gets bound to the value ‘t’).  As we’ll see, frequently we won’t need the value produced from parsing the left-hand-side of a Then.  As a result, we’ll add one more function:

public static P<U> Then_<T, U>(this P<T> p1, P<U> p2)
{
    return p1.Then(dummy => p2);
}

Then_ differs from Then in that Then_ discards the value produced by the first parser.  Incidentally, if you’ve still been having trouble grokking Then, you’ll probably find Then_ easier to understand.  The result of p1.Then_(p2) is a parser that first parses the input with p1 and then parses the remainder of the input with p2, yielding the same result as p2.  In contract, Then does the same thing, except along the way it binds the result from p1 to a new lambda variable name.  Put another way, it’s easy to describe Then_ as a combinator that "sequences" two parsers.  Relative to that, Then is the same kind of thing, except that it remembers the first result (whereas Then_ forgets it).

Let’s move on now and bring this all together to create our first non-trivial parser that does something interesting, namely parse numbers spelled out in English.

English number parser

We’ve already seen a parser that parses the word "two" and returns the number 2.  Let’s generalize that parser a little into this function:

public static P<int> Literal(string toParse, int result)
{
    if (toParse == "")
        return Parser.Return(result);
    else
        return Sat(c => c == toParse[0]).Then_(Literal(toParse.Substring(1), result));
}

Now it’s easy to create parsers for other numbers:

P<int> three = Literal("three", 3);
three("three!")  // yields { 3, "!" }
three("blah")    // yields null

The Literal function is recursive.  It takes in a literal string we want to parse, and parses it letter by letter, consuming one character of input and making a recursive call with the rest.  Once we’ve parsed the whole string, we return the value passed in to the original Literal call.  If at any point along the way a Sat call fails, the whole parse fails (yielding a null result for parsing that input string).

Now we can start making parsers for lots of numbers:

public static P<int> OneToNine =
    Literal("one", 1).Or(
    Literal("two", 2).Or(
    Literal("three", 3).Or(
    Literal("four", 4).Or(
    Literal("five", 5).Or(
    Literal("six", 6).Or(
    Literal("seven", 7).Or(
    Literal("eight", 8).Or(
    Literal("nine", 9)))))))));

public static P<int> TenToNineteen = 
    Literal("ten", 10).Or(
    Literal("eleven", 11).Or(
    Literal("twelve", 12).Or(
    Literal("thirteen", 13).Or(
    Literal("fourteen", 14).Or(
    Literal("fifteen", 15).Or(
    Literal("sixteen", 16).Or(
    Literal("seventeen", 17).Or(
    Literal("eighteen", 18).Or(
    Literal("nineteen", 19))))))))));

Why did we split these up into two different parsers?  Because we plan to reuse OneToNine, of course.  Consider how you write numbers in the 20s: twenty, twenty-one, twenty-two, twenty-three, … Though we could write a function that just knows about the 20s, the same pattern generalizes to the 30s, 40s, … to the 90s.  Here’s our next function, along with a sample use:

public static P<int> HigherTens(string tenName, int tenValue)
{
    return Literal(tenName, tenValue).Then(tens =>
           (
               Sat(c => c == '-').Then_(
               OneToNine.Then(ones => 
               Return(tens + ones)))
           ).Or(Return(tens)));
}
P<int> thirties = HigherTens("thirty", 30);
thirties("thirty!")         // yields { 30, "!" }
thirties("thirty-two!")     // yields { 32, "!" }

Do you see how HigherTens works?  First we parse a literal string (e.g. "thirty"), and bind the value (30) to the lambda parameter tens.  Then, either (1) we parse a hyphen, then parse a OneToNine, binding its value to variable ones, and then return the sum of tens and ones, or else if that fails we just (2) return tens.  This function is our first really nice example of using combinators (like Then and Or) to create an interesting new parser out of some simpler parsers.  It’s easy to keep going along these lines to create a parser that goes up to a million or more.  Rather than belabor the point by talking about all the rest of the code, just check out the full source code for it at the end of this blog entry.

Summary

Let’s sum up what we’ve discussed today.  We learned about five functions that form a basis for a monadic parser combinator library:

  • Fail() returns a parser that always fails
  • Return(x) returns a parser that always succeeds with the given value, ignoring the input string
  • Then(p1, f) is used for chaining parsers in a sequence (first parse this, then do this)
  • Or(p1, p2) returns a parser that tries two alternatives (first use p1, but if that fails, use p2 instead)
  • Item() returns a parser that consumes the first character of the input string

We wrote a few more generally useful functions, like

  • Sat(pred) which parses a character only of the character matches a predicate
  • Then_(p1, p2) which runs two parsers in sequence, discarding the first result
  • Literal(str, val) which parses a literal string and returns a given value

and then created a parser that can parse English numbers from one to a million.  Along the way, we also learned some new C# features, including the Func delegate types for representing arbitrary function shapes, and a syntax sugar known as extension methods.

Looking ahead

There’s plenty more to explore with monadic parser combinators.  In my next blog entries, I know I want to cover a few more topics, including

  • a walk through the "plumbing details" of functions like Then and Or
  • write the numeric expression parser evaluator, and explore fun combinators like Many and Chainl1
  • talk about how to enhance the parser representation to deal with error handling

And who knows what else.  I only hope you’ll enjoy reading it as much as I enjoy writing it!

Full source code

No doubt some of you are itching to see the code for functions like Then, so as to gain a better understanding of the "plumbing".  Here is the full code used in today’s blog entry.  Note that the main parser functions are in static class Parser, which means some code says Parser.foo whereas the code snippets in the blog just said foo.  Apart from that, the code should all match up with the blog. Enjoy!

using System;

namespace MPC_Part2
{
    public class ParseResult<T>
    {
        public readonly T Result;
        public readonly string RemainingInput;
        public ParseResult(T r, string ri) { this.Result = r; this.RemainingInput = ri; }
        public override bool Equals(object obj)
        {
            ParseResult<T> other = obj as ParseResult<T>;
            if (other == null)
                return false;
            if (!object.Equals(other.Result, this.Result))
                return false;
            if (!object.Equals(other.RemainingInput, this.RemainingInput))
                return false;
            return true;
        }
        public override int GetHashCode()
        {
            return this.RemainingInput.GetHashCode();
        }
    }

    // representation type for parsers
    public delegate ParseResult<T> P<T>(string input);

    public static class Parser
    {
        // core functions that know the underlying parser representation
        public static P<T> Fail<T>()
        {
            return input => null;
        }
        public static P<T> Return<T>(T x)
        {
            return input => new ParseResult<T>(x, input);
        }
        public static P<U> Then<T, U>(this P<T> p1, Func<T, P<U>> f)
        {
            return input =>
                {
                    ParseResult<T> result1 = p1(input);
                    if (result1 == null)
                        return null;
                    else
                        return f(result1.Result)(result1.RemainingInput);
                };
        }
        public static P<T> Or<T>(this P<T> p1, P<T> p2)
        {
            return input =>
                {
                    ParseResult<T> result1 = p1(input);
                    if (result1 == null)
                        return p2(input);
                    else
                        return result1;
                };
        }
        public static P<char> Item()
        {
            return input =>
                {
                    if (string.IsNullOrEmpty(input))
                        return null;
                    else
                        return new ParseResult<char>(input[0], input.Substring(1));
                };
        }
        // other handy functions
        public static P<U> Then_<T, U>(this P<T> p1, P<U> p2)
        {
            return p1.Then(dummy => p2);
        }
    }

    // example of extension methods
    public static class StringExtensions
    {
        public static string PigLatin(this string s, int n)
        {
            return s.Substring(n) + s.Substring(0, n) + "ay";
        }
    }

    // various running examples from the blog
    class Program
    {
        static void AssertEqual<T>(T x, T y)
        {
            if (!object.Equals(x, y))
                throw new Exception("not equal");
        }
        public static P<char> Item = Parser.Item();
        public static P<char> Sat(Predicate<char> pred)
        {
            return Item.Then(c => pred(c) ? Parser.Return(c) : Parser.Fail<char>());
        }

        public static P<char> Letter = Sat(char.IsLetter);

        public static P<int> Literal(string toParse, int result)
        {
            if (toParse == "")
                return Parser.Return(result);
            else
                return Sat(c => c == toParse[0]).Then_(Literal(toParse.Substring(1), result));
        }

        public static P<int> OneToNine =
            Literal("one", 1).Or(
            Literal("two", 2).Or(
            Literal("three", 3).Or(
            Literal("four", 4).Or(
            Literal("five", 5).Or(
            Literal("six", 6).Or(
            Literal("seven", 7).Or(
            Literal("eight", 8).Or(
            Literal("nine", 9)))))))));

        public static P<int> TenToNineteen = 
            Literal("ten", 10).Or(
            Literal("eleven", 11).Or(
            Literal("twelve", 12).Or(
            Literal("thirteen", 13).Or(
            Literal("fourteen", 14).Or(
            Literal("fifteen", 15).Or(
            Literal("sixteen", 16).Or(
            Literal("seventeen", 17).Or(
            Literal("eighteen", 18).Or(
            Literal("nineteen", 19))))))))));
        
        public static P<int> HigherTens(string tenName, int tenValue)
        {
            return Literal(tenName, tenValue).Then(tens =>
                   (
                       Sat(c => c == '-').Then_(
                       OneToNine.Then(ones => 
                       Parser.Return(tens + ones)))
                   ).Or(Parser.Return(tens)));
        }

        public static P<int> OneToNinetyNine = 
            OneToNine
            .Or(TenToNineteen)
            .Or(HigherTens("twenty", 20))
            .Or(HigherTens("thirty", 30))
            .Or(HigherTens("forty", 40))
            .Or(HigherTens("fifty", 50))
            .Or(HigherTens("sixty", 60))
            .Or(HigherTens("seventy", 70))
            .Or(HigherTens("eighty", 80))
            .Or(HigherTens("ninety", 90));

        public static P<char> Space = Sat(char.IsWhiteSpace);

        public static P<int> OneHundredTo999 =
            OneToNine.Then(hundreds =>
            Space.Then_(
            Literal("hundred", 0).Then_(
            Space.Then_(
            OneToNinetyNine.Then(rest =>
            Parser.Return(hundreds * 100 + rest))))));

        public static P<int> OneTo999 = OneHundredTo999.Or(OneToNinetyNine);

        public static P<int> OneTo999999 =
                OneTo999.Then(thousands =>
                Space.Then_(
                Literal("thousand", 0).Then_(
                Space.Then_(
                OneTo999.Then(rest =>
                Parser.Return(thousands * 1000 + rest))))))
            .Or(OneTo999);

        static void Main(string[] args)
        {
            Console.WriteLine("foo".PigLatin(1));
            Console.WriteLine("star".PigLatin(2));

            Predicate<char> isDigit = char.IsDigit;
            Func<int, int, bool> less = (x, y) => x < y;

            AssertEqual(Item(""), null);
            AssertEqual(Item("foo"), new ParseResult<char>('f', "oo"));
            P<char> letter = Sat(char.IsLetter);
            P<char> digit = Sat(char.IsDigit);
            AssertEqual(letter("foo"), new ParseResult<char>('f', "oo"));
            AssertEqual(digit("foo"), null);
            P<char> letterOrDigit = letter.Or(digit);
            AssertEqual(letterOrDigit("foo"), new ParseResult<char>('f', "oo"));
            AssertEqual(letterOrDigit("123"), new ParseResult<char>('1', "23"));
            AssertEqual(letterOrDigit(";x%"), null);

            P<int> two = Sat(c => c == 't').Then_(
                         Sat(c => c == 'w').Then_(
                         Sat(c => c == 'o').Then_(
                         Parser.Return(2))));
            AssertEqual(two("two bits"), new ParseResult<int>(2, " bits"));
            AssertEqual(two("four"), null);

            P<int> three = Literal("three", 3);
            AssertEqual(three("three!"), new ParseResult<int>(3, "!"));
            P<int> thirties = HigherTens("thirty", 30);
            AssertEqual(thirties("thirty!"), new ParseResult<int>(30, "!"));
            AssertEqual(thirties("thirty-two!"), new ParseResult<int>(32, "!"));
            
            AssertEqual(OneToNinetyNine("twenty!"), new ParseResult<int>(20, "!"));
            AssertEqual(OneToNinetyNine("twenty-five!"), new ParseResult<int>(25, "!"));
            // Note how the next two get the 'wrong' answer.  We'll talk about
            // a combinator called NotFollowedBy, which allows for lookahead, 
            // in a future installment, rather than fix this now.
            AssertEqual(OneToNinetyNine("seventeen!"), new ParseResult<int>(7, "teen!"));
            AssertEqual(OneToNinetyNine("eighty!"), new ParseResult<int>(8, "y!"));

            P<int> englishNumber = OneTo999999;
            AssertEqual(englishNumber("one."), new ParseResult<int>(1, "."));
            AssertEqual(englishNumber("one hundred six."), new ParseResult<int>(106, "."));
            AssertEqual(englishNumber("one hundred forty-six thousand five hundred twenty-two widgets"), new ParseResult<int>(146522, " widgets"));

            Console.WriteLine("done, press a key");
            Console.ReadKey();
        }
    }
}
Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

 
%d bloggers like this: