CodeBetter.Com
CodeBetter.Com
RSS 2.0 via Feedburner
           Do you Twitter? Follow us @CodeBetter

Matthew Podwysocki

Life of a Functional Programmer
  • Async Computation Expressions - Resource and Exception Management

    For the next part of my coverage of Asynchronous Computation Expressions, I'm going to talk about the things you get for free when you use them.  I was reminded of this during a recent chat with Greg Young about how hard it was during asynchronous operations to notify the controlling thread of an exception.  These asynchronous operations are meant to handle such things. 

    Let's get caught up to where we are today with regards to Async Computation Expressions:


    Brief Introduction to Asynchronous Computation Expressions

    Asynchronous operations, which use the Async<'a> type, are basically a way of writing Continuation Passing Style (CPS) programs.  There is a reason I covered that in a previous post here about CPS in regards to Recursion, which is why it's important to learn these concepts.  These asynchronous computations are computations that call a success continuation should the operation succeed and an exception continuation should it fail.  Together, this provides a managed computation expression where you get a few things for "free".

    Listed below are some of the things you get for "free" from the managed computation expressions.  Each of these will be covered in detail in this post:

    • Exception Propagation
    • Cancellation Checking
    • Resource Lifetime Management

    Let's go into each in detail as it makes the story around the asynchronous computation expressions quite intriguing.  I'll hold off on a lot of the details as I'll go into that further in a subsequent post.


    Resource Management

    Simply put, when you use the "use" keyword inside of the asynchronous computation expression, at the end of the scope, the resource will be disposed.  Even should an exception occur, this still happens.  As you may recall, F# has ways of handling resources through the use of the using function, or the use keyword.

    (* val using : 'a -> ('a -> 'b) -> 'b when 'a :> System.IDisposable *)
    using(File.OpenRead("file.txt")) (fun stream -> ())

    let main()
      use stream = File.OpenRead("file.txt")
      () // Resource closed here

    Either I can use the using keyword which then takes a function which takes a single input and returns the result where the input must implement System.IDisposable.  The use keyword is syntactic sugar over the same way of doing this.  Now that we understand this, let's move onto using this inside of an asynchronous computation expression.  As you may note, we can write our computation using the same style.

    let read_file file =
      async {
              use stream = File.OpenRead(file)
              use reader = new StreamReader(stream)
              let text = reader.ReadToEnd()
              return text
            }

    Async.Run (read_file "file.txt")

    As you may notice, our resources will be cleaned up just as it hits the return statement and the stream and reader are no longer needed.  Should an exception occur, the resources still will be cleaned up.  I'll cover that in the next section, what exactly happens there.  But, we also have the ability to bind to asynchronous operations by using the "use!" and "let!" keyword such as the following:

    open Microsoft.FSharp.Control.CommonExtensions

    let read_file file =
      async {
              use! stream = File.OpenReadAsync(file)
              use reader = new StreamReader(stream)
              let! text = reader.ReadToEndAsync()
              return text
            }

    Async.Run (read_file "file.txt")

    Now that we understand the basics, let's move onto exception management and propagation.


    Exception Propagation

    One of the harder topics to deal with during asynchronous operations is exception management.  The asynchronous computation expressions add this behavior for free.  If an exception is thrown during an asynchronous step, the exception will then terminate the entire computation and then clean up any resources that were allocated by using the "use" keyword.  The exception is then propagated via the exception continuation back to the controlling thread.  You can also handle these exceptions by using a try/catch/finally block in F# such as this:

    open Microsoft.FSharp.Control.CommonExtensions
    let read_file file =
      async {
              let! stream = File.OpenReadAsync(file)
              try
                use reader = new StreamReader(stream)
                let! text = reader.ReadToEndAsync()
                return text
              finally
                stream.Close()
            }
    Async.Run (read_file "file.txt")

    But, what happens when you get a failure?  Let's take this code snippet for example:

    let square_even n =
      async {
              do if n % 2 <> 0 then failwith "Not even"
              return n * n
            }

    let result = Async.Run (square_even 3)

    What's going to happen is that we get a FailureException thrown such as the following:

    async_throw

    But what about when we run multiple computations in parallel?  Tasks runnning the Async.Run will report any failure back to the controlling thread.  When you run them in parallel, it is non-deterministic which one will fail first.  Only the first failure will be reported, and an attempt will be made to cancel the rest of the computations.  Any further failures are ignored.

    let task_even n proc =
      async {
              do if n % 2 <> 0 then failwith proc
              return n * n * n 
            }

    let failing_tasks = [ (task_even 3 "1"); (task_even 5 "2")]
    Async.Run(Async.Parallel failing_tasks)

    And the result will look like this:

    async_parallel_exception 

    If you notice, only the first exception was noted.  Had I written this slightly different, the second task could have thrown an exception first and would be noted in the same way.

    But, what if you don't want this exception thrown, but instead handed back to you to deal with in your own way?  By using the Async.Catch combinator, you get a choice whether you get the value of the computation or the exception.  Below is an example of using that concept:

    (* static member Catch : Async<'a> -> Async<Choice<'a, exn>> *)
    let square_even n =
      async {
              do if n % 2 <> 0 then failwith "Not even"
              return n * n
            }

    let result = Async.Run (Async.Catch (square_even 3))
    (* val result : Choice<int, exn> *)

    let print_value (c:Choice<int, exn>) =
      match c with
      | Choice2_1 a -> print_any a
      | Choice2_2 b -> print_any b.Message

    print_value result

    What I've been able to do is catch the exception should it be thrown.  The result is given to me as a choice of either an exception or the real value.  In order to extract the value, I need to go through pattern matching, as the Choice<'a, 'b> is actually a discriminated union.  Depending on the input, this program could print either the result, or "Not even".


    Cancellation Checking

    Cancellation checking behavior in asynchronous computation expressions is quite similar to Thread.Abort in terms of semantics.  What that means is the following:

    • Cancellation is non-deterministic about whether it will succeed
    • The item that caused the cancellation will be notified whether the cancellation succeeded or not.
    • The finally blocks will all be executed.
    • Cancellation may not be caught, but instead will be thrown as an OperationCancelledException should it have been run through the Async.Run.

    Cancellation checks are handled for us in the following ways:

    • At each "let!" "do!" or "use!" bind operation and is checked for the cancellation flag
    • Manually, it can be checked by calling "do! Async.CancelCheck()"
    • If the cancellation flag is set, then the cancellation continuation is called

    Below is a simple call with a check for cancellation:

    let read_file file =
      async {
              use! stream = File.OpenReadAsync(file)
              use reader = new StreamReader(stream)
              let! text = reader.ReadToEndAsync()
              do! Async.CancelCheck()
              return text
            }

    let run_async =
      try
        Async.Run (read_file "file.txt")
      with
        | :? System.OperationCanceledException -> string.Empty

    With this, we could try to cancel the operation using the Async.CancelDefaultGroup method or through an AsyncGroup had you created your function through that.


    Wrapping It Up

    I hope this opened your eyes to the possibilities with asynchronous computation expressions in F#.  They are very interesting and intriguing parts of F#.  Without deep down knowledge of continuations and monads, you can still write very powerful programs in an asynchronous manner.  These are exciting times as Brian McNamara (F# Team Member) has pointed out, we are getting closer and closer to the CTP of F#.  Download the latest bits for F# and try it out. 

    I'm heading off to enjoy some R&R time for a much needed vacation, so this blog will be quiet for a little bit.

  • Aspects of Functional Programming in C# Presentation and Code Redux

    Last month, I posted my Functional C# presentation for the Rockville .NET User Group (RockNUG).  Yesterday, I was able to finally present the material, although I was feeling under the weather and there was a lot of information in just that brief amount of time.  So, I'm going to repost the materials just in case someone missed them.  I'm adding a few things as well which may be of interest.  I'm hoping to expand this for upcoming code camps.

    Here are some resources that will be helpful in covering functional programming aspects:

    Functional Programming


    C# Futures


    Functional Programming Aspects with C#


    Books


    Podcasts


    Video Presentations


    I've continued to add to my Functional C# project as time goes along.  This is intended to be a library of functional programming techniques in C# 3.0 and some demonstrations of moving from imperative style programming to a more functional programming style.  This is an ongoing project and more will be added in time, and I may end up just putting them up not as samples, but as a library.  But in the mean time, there are a lot of interesting topics that are being covered.

    Some of the topics covered in these code projects are:


    My slide deck can be found here and my code snippets once again can be found here.

    As I discussed last night, the DC ALT.NET meeting is on August 28 from 7-9PM with Jeff Schoolcraft on Approaching Ruby.  Details can be found on my previous post.  Hope to see a great crowd, and don't forget to RSVP!

  • Recursing on Recursion - Continuation Passing

    In this continuing series of my back to basics, I've been talking about recursion, and various strategies around using it effectively.  This includes covering the basic types of recursion, whether it be linear, binary, and tail.  Then I took it a step further with topics on list recursion and memoization techniques.  This is an ongoing part of my back to basics series in which I hope is a refresher for many who don't use these things on a daily basis.

    Let's catch up to where we are today:

    This time, we're going to talk about recursion and Continuation Passing Style (CPS).  We'll include both samples in F# as well as C# for these examples.

     

    Continuation Passing Style

    Earlier in the series, I talked about several different ways to approach recursion.  Today we're going to bring CPS into the terminology.  Let's first discuss the first word in the title, Continuation.  Simply put, a continuation represents the remainder of a computation given a point in the computation.  You could almost think that a continuation is GOTO with a parameter that is the value of the function that transferred the control. It is unlike a function call because, while it is possible to return to the original computation it is not always expected or necessary.  Yet, many people get tripped up by this definition.

    Writing a function using CPS takes an explicit continuation function argument which is meant to receive the result of the computation that was performed in the function.  When doing this, the caller is expected to give the function to be invoked during the end of the function.  So, what this means is instead of returning a value from a function, the value is passed to the code that will continue the computation.  Using this programmatic style gives us a few things, which include intermediate values, order of argument processing and tail recursion.  Why is this important?  Besides making you the obvious hit at any party, and an absolute dev maven, it's important if you want to understand another fun topic of Monads as well as asynchronous calls.  But, that's for another day.  I'll go into a scenario below where it's absolutely useful in regards to recursion.  But, in the mean time, let's look at how CPS differs from direct calls.

    Let's first show a quick example of CPS in action using F#:

    #light 

    let square_func x = x * x 
    (* val square_func : int -> int *)

    let square_cps x cont = cont(square_func x) 
    (* val square_cps : int -> (int -> 'a) -> 'a *)

    let result = square_cps 4 (fun x -> x)

     

    What I did in the above syntax is take a standard squaring function, and applied a continuation which handled the result of the computation.  Then in the calling function, I gave it an identity continuation which is simply taking the input and returning it.  This is all well and good, but let's try to apply this more towards our story at hand with recursion.

     

    CPS and Recursion

    In this series, I showed how you could take a normal recursive function and move it from linear to tail recursive.  This time, we'll take that same approach with going towards CPS.  In the past I've been guilty of showing the standard Fibonacci sequence as well as factorial, so let's switch it up and go towards our ImmutableList, or just a List<'a> for those in the F# world.  The immutable list is part of my Functional C# library on MSDN Code Gallery.

    Let's first go with our standard linear recursion and what that looks like when calculating the length of our list.

    F#
    let rec length = function
      | [] -> 0
      | _ :: t -> 1 + length t

     

    C#
    public static int Length<T>(this IImmutableList<T> list)
    {
        if (list.IsEmpty) return 0;
        return 1 + Length(list.Tail);
    }
     

    As you notice, our last calculation wasn't a tail call, instead we were adding 1 to the result of the recursive call.  Now, instead of this, which could stack overflow on rather large data sets, let's optimize it for the tail call.  This usually includes an inner function to do this with an accumulator involved. 

    F#

    let length_tail lst = 
      let rec length_acc l acc =
        match l with
        | [] -> acc
        | _::t -> length_aux t (1 + acc)
      
      length_acc lst 0

     

    C#
    public static int LengthTail<T>(this IImmutableList<T> list)
    {
        Func<IImmutableList<T>, int, int> length_acc = null;
        length_acc = (l, acc) => 
          l.IsEmpty ? acc : length_acc(l.Tail, 1 + acc);

        return length_acc(list, 0);
    }

     

    Now these functions are optimized by the tail call.  I created an inner function which takes the list and an accumulator and recurses over itself by adding 1 to the accumulator.  But, as I've stated before, the C# compiler doesn't optimize for the tail call.  This has been a known issue for some time now, and I don't think it's on the highest priority to fix.  After all, the JITer does do a simple optimization with a tail call on x64 only.  If you're running on an x86, you'll still get stack overflows.  More reason to look at F# when doing recursive algorithms. 

    But, let's take this a step further.  What would it take, to take this above function and turn it into CPS?  Well, to think about turning something like that function, you have to think backwards just a little bit.  You'll see what I mean.

    F#
    let length_cps lst =
      let rec length_cont l cont =
        match l with
        | [] -> cont 0
        | _::t -> length_cont t (fun x -> cont(1 + x))
      
      length_cont lst (fun x -> x)

     

    C#
    public static int LengthCont<T>(this IImmutableList<T> list)
    {
        Func<IImmutableList<T>, Func<int, int>, int> length_cont = null;
        length_cont = (l, cont) => 
          l.IsEmpty ? cont(0) : length_cont(l.Tail, x => cont(1 + x));

        return length_cont(list, x => x);
    }
     

    As you can see from above, it really turned our function inside out.  It's enough to make someone's head hurt sometimes.  Let's give one last example on how it can turn a function inside out, and yes, I lied about not bringing our standard Fibonacci sequence in here as well:

    F#
    let fibonacci_cps n =
      let rec fibonacci_cont a cont =
        if a <= 2 then cont 1
        else
          fibonacci_cont (a - 2) (fun x ->
            fibonacci_cont (a - 1) (fun y -> 
              cont(x + y)))
              
      fibonacci_cont n (fun x -> x)

     

    C#
    static int FibonacciCont(int n)
    {
        Func<int, Func<int, int>, int> fibonacci_cont = null;
        fibonacci_cont =
          (a, cont) => a <= 2 ? cont(1
            : fibonacci_cont(a - 2, x => fibonacci_cont(a - 1, y => cont(x + y)));

        return fibonacci_cont(n, x => x);
    }
     

    This by itself is quite interesting.  But, the problem is still that C# isn't optimized for the tail call, so I see no benefit of writing it this way.  Also, CPS has a tendency to be an order of magnitude slower than standard tail calls and even tail calls for that matter.  With regards to F# though, since it's optimized for the tail call, produces well formed and fast code.  But, there are areas where you should care with regards to using CPS versus standard tail calls.  One such scenario is for parsing unbalanced trees.

     

    Parsing Unbalanced Trees

    When dealing with tree structures, it's a lot harder to get right when using tail recursion as opposed to very structured data.  Therefore, CPS comes to the rescue here.  For these examples, I'm only going to use F# because C# doesn't have the support I need when accomplishing this.  First, let's define a tree structure that we're going to parse and a binary recursive function that calculates the size of our tree.

    type Tree<'a> = 
      | Node of 'a * Tree<'a> * Tree<'a>
      | Leaf of 'a

    let rec tree_size = function
      | Leaf _ -> 1
      | Node(_, left, right) -> tree_size left + tree_size right

     

    The problem with this function is that when I start getting large unbalanced trees, we are at risk of a stack overflow.  So, let's try to move it towards using tail calls in order to fix the issue.

    let tree_size_tail tree =
      let rec size_acc tree acc =
        match tree with
        | Leaf _ -> 1 + acc
        | Node(_, left, right) -> 
            let acc = size_acc left acc
            size_acc right acc
      size_acc tree 0

     

    As you can see, the way this code as written is that the left side is not actually tail recursive at all, and instead only on the right side.  This might be acceptable if the tree were balanced to the right, but if the tree is skewed to the left, then we'll have a problem.  This is where CPS can help to solve this issue.  Let's try now to apply our knowledge from above on how to modify our functions to use CPS with an accumulator, instead.

    let tree_size_cont tree =
      let rec size_acc tree acc cont =
        match tree with
        | Leaf _ -> cont (1 + acc)
        | Node(_, left, right) ->
            size_acc left acc (fun left_size ->
              size_acc right left_size cont)
      
      size_acc tree 0 (fun x -> x)

     

    What we were able to accomplish is the following:

    • Create an inner function which uses an accumulator, our tree and a continuation function as input.
    • Return 1 plus the accumulator if we've reached the leaf of the tree
    • Else, we're going to call the function to get the left tree size recursively until we reach the leaf.  We create a continuation to get the right tree size.
    • Finally, we call the right tree size while passing in the accumulator and our continuation. 

    Using this approach, we don't face the the threat of stack overflows, but as well, we've used only two short lived continuations to compute this item.  This is a pretty advanced technique in most languages and isn't used too much.  But once you understand how they work and their uses, it's very powerful.

     

    Wrapping it Up

    I hope this brief introduction to CPS with regards to recursion was interesting.  If you'd like to know more about continuations in general, Wes Dyer, of the Volta team has a pretty good explanation and the various uses of CPS here.  From here, it's best to move onto monads (aka computation expressions in F#), especially with regards to the asynchronous processing.  As always, I make my code samples available through the MSDN Code Gallery in the Functional C# library.

  • Adding Async Operations to Asynchronous Computation Expressions in F#

    Asynchronous Computation Expressions are an extremely powerful feature in F#.  It's important to not only know how to use them, but also to extend the behavior so that other classes can bind and perform asynchronous behavior. What I want to show in this post is how easy it is to add this behavior to any custom web service that you may create.


    In Review

    We saw earlier in my post about Task Parallel Library and Async Computation Expressions, I briefly mentioned how I added some capabilities back to the WebRequest in the form of GetResponseAsync().  Let's take a look at the code involved for that to make it happen, as well as the code that uses it.

    #light

    open System.IO
    open System.Net
    open Microsoft.FSharp.Control.CommonExtensions

    type System.Net.WebRequest with
      member x.GetResponseAsync() =
        Async.BuildPrimitive(x.BeginGetResponse, x.EndGetResponse)
        
    let download(url:string)
      async { let request = WebRequest.Create(url) 
              use! response = request.GetResponseAsync()
              use stream = response.GetResponseStream() 
              use reader = new StreamReader(stream) 
              let! html = reader.ReadToEndAsync() 
              return (url, html)
            }

    What I'm doing is adding an extension method to the System.Net.WebRequest class to add the GetResponseAsync() function.   In order to create an extension method, property, static method and so on, it's pretty simple.  The function BuildAsyncPrimitive takes in not only the parameters of your function, but the Begin and End functions as well.  This function returns an Async<'a>, in this case being Async<WebResponse> where the result is web response from the WebRequest class.  Underlying this whole piece is the use of continuations which takes in a function for the success and a function for handling failure.  For a better understanding of how this works, I suggest that you look through the F# source code in the controls.fs file.

    If you're curious, the Microsoft.FSharp.Control.CommonExtensions module contains a few extension methods that you can take advantage out of the box.  Some of them include:

    • File - Open (Text, Read, Write)
    • Stream - Read and Write
    • Socket - Accept, Receive and Send
    • SqlCommand - Execute (Scalar, Reader, NonQuery)

    I'll cover extension methods as part of my series on Object Oriented Programming in F# which is coming soon.  Now that we have a basic understanding, let's move onto how to do this with a fresh web service.


    Creating the Web Service

    In this example, I'm going to extend a simple web service to be able to achieve asynchronous behavior during a computation expression.  Let's define a simple MathService web service and then just add an Add function which adds two numbers together.

    [WebService(Namespace = "http://services.codebetter.com/mathservices/")]
    [WebServiceBinding(ConformsTo = WsiProfiles.BasicProfile1_1)]
    public class MathService : WebService
    {

        [WebMethod]
        public int Add(int x, int y)
        {
            return x + y;
        }
    }

    Consuming the Web Service

    Now that we have defined the basic service.  Use the WSDL tool to create a proxy to our web service and compile it to an assembly.  Now that you've done that, we have a MathService.dll that we can now reference.  Let's define now the extension method to our given web service to add the asynchronous behavior.

    #r "MathService.dll"

    open MathServices

    type MathServices.MathService with
      member a.AddAsyncr(x, y) =
        Async.BuildPrimitive(x, y, a.BeginAdd, a.EndAdd)

    What I'm able to do here is add an AddAsyncr method to my MathService which binds the arguments of my function call as well as the BeginAdd and EndAdd methods.  From the Async.BuildPrimitive, this builds me a Async<int> as a result of my addition that I did on the server.  The current implementation can take multiple arguments, so it figures out my x and y parameters to be of type System.Int32.

    Now to consume the service is rather simple.  Let's first define the async computation expression to handle the addition:

    let addNumbers x y =
      async {
              let service = new MathService()
              let! result = service.AddAsyncr(x, y)
              return result
            }

     

    I can then perform both a single operation of the call to the computation expression, or do them in parallel.  Both are defined down below:

    let singleAdd = Async.Run (addNumbers 4 6)
    printfn "Results of single add: %d" singleAdd

    let nums = [1..10]
    let parallelAdd = 
      Async.Run(Async.Parallel 
        [for num in nums -> addNumbers num (num * 2)])
    parallelAdd 
      |> Seq.iteri(fun x y -> printfn "Parallel Add at %d is %d" x y)

    Then my results will look like this after iterating through each of the results:

    async_add


    Clean, concise and to the point.  I like it!


    Wrapping It Up

    With these past couple of posts on asynchronous computation expressions, this gives you an idea on how powerful they are.  And I'm not even talking about the underlying technology with regards to monads, at least not yet anyways.  There are additional things to cover yet with this topic which includes some of the underlying technology, exception handling and so on, so stay tuned.

  • DC ALT.NET - 8/28/2008 - Ruby with Jeff Schoolcraft

    The August meeting for DC ALT.NET will be on August 28th, 2008 from 7-9PM.  Check our site and our mailing list for continuing updates and future meetings.  This month, Jeff Schoolcraft, ASP.NET MVP, will host a conversation on Ruby.  This will include some Ruby demos, a little bit of Ruby on Rails, as well as approaching it from the ASP.NET mindset.  This talk will go very nicely with our talk on ASP.NET MVC next month.

    Approaching Ruby

    IronRubyOne of the main intentions for the ALT.NET group I've found is for constant improvement.  This month is no exception as we're stepping outside of the statically typed world that many live in with C#, VB.NET, F# and so on.  We'll be covering the MRI as well as hopefully get to IronRuby.  You can find out more information on progress of IronRuby the IronRuby site.

    Details

    The details of the event are as follows:

    Date/Time: 8/28/2008 - 7-9PM

    Location:
    Motley Fool
    2000 Duke Street
    Alexandria, VA, 22314
    Map Link

    Hope to see a great crowd there!



  • Task Parallel Library and Async Computation Expressions

    Very recently, I've given a few talks on Asynchronous and Concurrent Programming in F#.  In this talk, I gave a brief overview of the options you have when dealing with concurrency and asynchronous behavior.  During these talks, I was asked a few times about where asynchronous computation expressions (workflows) fit and how it differs from doing things with the Task Parallel Library from the Parallel Extensions for .NET.  There are some differences worth exploring and I'll post some code snippets to compare and contrast the two.


    The Challenge

    Today's challenge is to take a sample from the Task Parallel Library samples under AsyncDownload.  The purpose of this example is to download HTML from a given website and return the HTML in a tuple with the original URL.  We'll take two approaches, one using the Task Parallel Library (TPL) and the other using asynchronous computation expressions.  We'll compare and contrast the approaches used to achieve the end result.


    Using the Task Parallel Library

    Let's take the original sample as written in C# and rewrite it in F#.  Instead of returning an IEnumerable<T>, I'm going to rewrite this while using a sequence expression.  The other change I'm going to make is using a tuple instead of a KeyValuePair<TKey, TValue> for my storage since it doesn't have to be so formal.  This sample uses the System.Net.WebClient to download the strings asynchronously.  This particular class uses events in order to subscribe to the DownloadStringCompleted event and then begin the download below.  Let's take a look at what this code might look like:

    let download (urls:string list) =
      seq { use results = new BlockingCollection<(string * string)>()
            use pagesRemain = new CountdownEvent(1)
            let _ = Task.Create(fun _ ->
              urls |> List.iter(fun url ->
                let wc = new WebClient()
                wc.DownloadStringCompleted.Add(fun args ->
                  if args.Error = null then
                    results.Add(((args.UserState :?> string), args.Result))
                  if pagesRemain.Decrement() then
                    results.CompleteAdding()
                )
          
                pagesRemain.Increment()
                wc.DownloadStringAsync(new Uri(url), url)
              )
        
              if pagesRemain.Decrement() then results.CompleteAdding()
            )
      
            for result in results.GetConsumingEnumerable() do yield result
          }

    let urls = ["http://microsoft.com"; "http://msn.com"]
    let results = download urls
    results |> Seq.iter(fun x -> printfn "%s : %s" (fst x) (snd x))

    What this code accomplishes is the following:

    • Wrap the entire operation in a sequence expression.
    • Create a BlockingCollection of a tuple for storing our results
    • Create a CountdownEvent to track whether we are done or not.
    • Create a Task for the TPL
    • Iterate through each URL given
    • Create a WebClient and add a handler which checks whether it should add to the collection as well as complete adding
    • Decrement the remaining and start the download async behavior
    • Clean up and then iterate through each result tuple

    Due to shared state issues, we have to worry about locking and such while adding to our collection.  Sometimes shared state is nice for quick operations, but I quickly shy away from this approach should I need to scale to the nth degree.  Instead, I'd advocate more of a shared-nothing approach through message passing.  Each situation must be analyzed to see whether a shared state approach works or not.  Functional languages such as F# tend to shy away from this, especially when worried about the "Assembly Language" level approach of dealing with locks, mutexes, semaphores and other goodies.  But, overall, I'm liking the abstraction layer over creating tasks and I think it's getting better to a point where we don't have to think about the concurrency constructs as much as we do today. 


    Using Asynchronous Computation Expressions

    Now, let's take a look at an approach using asynchronous computation expressions.  This time, we'll use a monadic expression, much as we did above with the seqeuence one.  We'll make the Async<'a> class the heart and soul of our operation.  This allows us to represent a program fragment that will be executed at some point in the future.  That fragment being of course the much dreaded word, Monad.  Which, I agree with Simon Peyton Jones, that they should be called "Warm Fuzzy Things" instead of Monad.  We'll get into what that word really means in the future, but in the mean time, let's dig into the code.  Note that we had to add the GetResponseAsync method back to the WebRequest due to it being removed from the latest public bits of F#.  As you can see, it's pretty trivial to extend any type that exposes the asynchronous behavior from the Begin/End pattern. 

    type System.Net.WebRequest with
      member x.GetResponseAsync() =
        Async.BuildPrimitive(x.BeginGetResponse, x.EndGetResponse)
        
    let download(url:string)
      async { let request = WebRequest.Create(url) 
              use! response = request.GetResponseAsync()
              use stream = response.GetResponseStream() 
              use reader = new StreamReader(stream) 
              let! html = reader.ReadToEndAsync() 
              return (url, html)
            }
            
    let siteList = ["http://www.microsoft.com/";"http://msn.com/"]
    let results = 
      Async.Run(Async.Parallel 
        [for site in siteList -> download site]) 
    results |> Seq.iter(fun x -> printfn "%s : %s" (fst x) (snd x))

    What this code is able to accomplish is the following:

    • Create a WebRequest for the given URL.
    • Asynchronously get the the response
    • Get the stream and put it into a reader
    • Asynchronously read the HTML on the page to the end
    • Return a tuple of the URL and the HTML
    • Run each URL from our site list in parallel to return us a list which I can iterate.

    Seems pretty simple, doesn't it?  Now if only concurrency were this easy.  Oh wait, it just is...  What we also get for free is resource lifetime management through the use keyword, binding with continuations, exception management and so on without much effort.  I'll cover more of this in the future.  I just wanted to whet the appetite for what is coming down the pike.


    Wrapping it Up

    This was just a quick primer on the differences between using the TPL tasks and asynchronous computation expressions.  I'll dive deeper into each in the near future and how they tick.  And possibly I can rewrite to combine the two and see how well they can compliment each other.



  • Recursing into Recursion - Memoization

    Lately, I've been heads down on a lot of concurrency items which will hopefully come out soon.  In the mean time, I want to get back to the basics one last time with recursion.  As I posted earlier, I've been talking about recursion in the past couple of posts, and this one will be one last on the topic.  I'll do my best to post the samples in both C# and F#.  As always, you can find my C# samples on MSDN Code Gallery at the Functional C# Samples.

    Let's catch up to where we are today:

    Memoization

    When doing heavy computations through recursion, memoization becomes a pretty important topic.  The idea behind memoization is to speed up your programs by avoiding repetitive calculations for previously processed function inputs.  Let's go ahead and try a simple example using the ever so trite Fibonacci sequence as a quick example.

    F#
    let rec fibonacci n =
      if n <= 2 then 1 
      else fibonacci(n-1) + fibonacci(n-2)

    C#
    static int Fibonacci(int n)
    {
        return n <= 2 ? 1 : Fibonacci(n - 2) + Fibonacci(n - 1);
    }

    But, if you call these above functions time and time again with nearly the same input, they have to go through the same computation again and again, which isn't necessary.  So, instead, let's try through the technique of memoization to speed up this computation.

    F#
    let fibonacci = 
      let t = new Dictionary<_,_>()
      let rec fibCached n = 
        if t.ContainsKey(n) then t.[n]
        elif n <= 2 then 1
        else
          let res = fibCached(n - 2) + fibCached(n - 1)
          t.Add(n, res)
          res
      fun n -> fibCached n

    C#
    public static Func<int, int> Fibonacci()
    {
        var t = new Dictionary<int, int>();
        Func<int, int> fibCached = null;
        fibCached = n =>
        {
            if (t.ContainsKey(n)) return tNo;
            if (n <= 2) return 1;
            var result = fibCached(n - 2) + fibCached(n - 1);
            t.Add(n, result);
            return result;
        };

        return n => fibCached(n);
    }

    As you can see from the above code, we're using a Dictionary<TKey, TValue> as our storage for already computed values.  Then we go ahead and compute the values if we have not seen them before.  We'll see a pretty impressive speed boost when calling with repetitive values.  But, that's not a generic solution, and I'd rather have the ability to take a generic function and pass any function to this and compute.  So, let's go ahead and make this generic.

    F#
    let memoize (f : 'a -> 'b) =
      let t = new System.Collections.Generic.Dictionary<'a, 'b>()
      fun n ->
        if t.ContainsKey(n) then t.[n]
        else
          let res = f n
          t.Add(n, res)
          res
      
    let rec fibonacci= 
      memoize(fun n -> if n <= 2 then 1 else fibonacci(n - 1) + fibonacci(n - 2))

    C#
    public static Func<T, TResult> Memoize<T, TResult>(this Func<T, TResult> func)
    {
        var t = new Dictionary<T, TResult>();
        return n =>
        {
            if (t.ContainsKey(n)) return tNo;
            
            var result = func(n);
            t.Add(n, result);
            return result;
        };
    }

    static int Fibonacci(int n)
    {
        return n <= 2 ? 1 : Fibonacci(n - 2) + Fibonacci(n - 1);
    }

    Func<int, int> fibonacci = Fibonacci; 
    var fibonacciMemoized = fibonacci.Memoize();
    var fibFast1 = fibonacciMemoized(30);

    As you can see, I was able to create a generic memoize function that takes a function as its input, checks whether the item is in the cache and if not, adds it to the cache.  Using the C# version, we created an extension method for our given Func<TArg, TResult>.  This allows us to memoize the function correctly.  Alternatively, I could have written the memoized function this way:

    C#
    static Func<int, int> Fibonacci()
    {
        return Extensions.Memoize((int n) => n <= 2 ? 1 : Fibonacci(n - 2) + Fibonacci(n - 1));
    }

    var fibonacci = Fibonacci;
    var fibFast1 = fibonacci(30);

    To me, this reads a bit more of functional programming and the true power of it.  I basically took the above methods and combined them so that my Fibonacci function now returns the function instead taking an integer and returning an integer.

    Beware of the Bad Memoization

    In order to get memoization to work properly, you must remember not to include the value to the function when declaring the function.  This will cause a fresh memoization table to be created each and every time.  Such functions such as these should be avoided as below:

    F#
    let rec fibonacciIncorrect n =
      memoize (fun n -> if n <= 2 then 1 else fibonacciIncorrect(n - 2) + fibonacciIncorrect(n - 1)) n

    C#
    static int FibonacciIncorrect(int n)
    {
        return Extensions.Memoize((int x) => x <= 2 ? 1 : FibonacciIncorrect(n - 2) + FibonacciIncorrect(n - 1))(n);
    }

    Generic Memoization Service

    Due to the subtlety of problems mentioned above, it may be better to return an object instead of a function.  These can be pretty simple objects with no more functionality than to return values and discard the given collection.  Let's go ahead and do this sample in F# first.  What I'm going to do is define the holder type for the calculated value:

    type ITable<'a, 'b> =
      inherit System.IDisposable
      abstract Item : 'a -> 'b with get

    This defines for me an interface with a Discard method and an Item method which accepts 'a and returns 'b via a get accessor.  Pretty simple holder for my data.  Now, let's define the memoize function which does the work:

    let memoize f =
      let outerTable = new Dictionary<_,_>()
      { new ITable<_,_> with
          member t.Item
            with get(n)
              if outerTable.ContainsKey(n) then outerTable.[n]
              else
                let res = f n
                outerTable.Add(n, res)
                res

          member t.Dispose() =
            outerTable.Clear()
      }

    What you can see from the above is that I'm creating an inner class of type ITable which I defined above.  Then I implement the given methods and properties I defined on the interface for Item and Discard().  During the Item property, I'm checking whether I have the given item.  If I do, I return it from the collection, else I execute the given function and store the result.  During my discard, I give myself the ability to reset the cached values.  Now, let's implement the Fibonacci sequence function once again with our new function that we defined above.

    let rec fibonacci =
      memoize 
        (fun n ->
          if n <= 2 then 1 else fibonacci.[n - 1] + fibonacci.[n - 2])

    Now with this function, I can execute the given methods directly on the class that was returned from the memoize function.  As you can see, I am getting values from the fibonacci object and calling the function once again on itself.  I can then calculate the given numbers while using this code below:

    let result = fibonacci.[10]
    printfn "%i" result

    That's pretty slick, isn't it?  Now, let's try in C#.  It's not going to be as elegant nor to the point.  But, let's define our Table class first.

    public class Table<T, U> : IDisposable
    {
        private readonly Dictionary<T, U> dictionary;
        private readonly Func<T, U> func;

        public Table(Dictionary<T, U> dictionary, Func<T, U> func)
        {
            this.dictionary = dictionary;
            this.func = func;
        }

        public U this[T n]
        {
            get
            {
                if (dictionary.ContainsKey(n))
                    return dictionaryNo;

                var result = func(n);
                dictionary.Add(n, result);
                return result;
            }
        } 

        public void Dispose()
        {
            dictionary.Clear();
        }
    }

    Now that this is defined, then the rest of our methods should come nicely as noted here.

    static Table<T, U> MemoizeTable<T, U>(Func<T, U> func)
    {
        return new Table<T, U>(new Dictionary<T, U>(), func);
    }

    public static Table<int, int> Fibonacci
    {
        get
        {
            return Memoize((int n) => n <= 2 ? 1 : Fibonacci[n - 2] + Fibonacci[n - 1]);
        }
    }
    var fibonacci = Fibonacci;
    var fibFast1 = fibonacci[30];

    Not as elegant in the F# version because C# won't let me create an anonymous inner class as F# does.  If you look around the F# codebase, you'll find them using this plenty to define such things that return IEnumerable<T> for Map and so on.  Not the prettiest solution in C#, as you can well imagine.

    Some Notes on Memoization

    Don't forget that proper memoization requires the functions that you are writing to be pure.  What that means is that given the same input, the same output will always be calculated and returned.  Also note that none of the code I wrote was particularly thread safe, so I would definitely recommend if you were going to worry about concurrency here with our shared mutable state, that you would use the appropriate locks and so on.

    Wrapping It Up

    Now I was able to show you how to speed up some of those recursive algorithms using memoization.  This allows for a bit better efficiency when doing large and pure calculations.  Next time, I'm going to start going back to my concurrency topics as I have a lot yet untouched.  With my recent presentations on the matter, there is plenty to talk about.

  • Reminder - DC ALT.NET Meeting 7/24/2008 - LINQ Deep Dive

    Just as a reminder from the previous post, the July meeting for DC ALT.NET will be on July 24, 2008 from 7PM-9PM.  Check out our site and our mailing list for more information as it becomes available.  This month, K. Scott Allen, of OdeToCode and a co-host of Herding Code, will present a deep dive into LINQ and a code-along so that we can follow along.  The intent is to go as deep as we can with LINQ to find out what works, what doesn't and how to use it effectively.  So, bring your laptops and get ready...

    LINQ

    K. Scott A