Damon Payne: Hand waving software architect

103db signal to noise ratio at < .03% total harmonic distortion
Solution Architect, software developer, geek
Damon Payne at Blogged
2009 Microsoft MVP - Client App Dev
2007 Microsoft MVP - Solution Architecture
 Thursday, April 03, 2008
« Upgrade Complete | Main | Managing Concurrency With Trees[1] »

That’s right, the tracks on my album start at Zero…

I’ve started cutting articles with a lot of code up into smaller pieces since Google reader seems to reject some of the longer posts.

I got a comment via IM on the command/concurrency article (http://www.damonpayne.com/PermaLink,guid,1cb04d95-b824-4f27-823c-53fbf11fb50c.aspx) that I felt deserved an answer.

I read the command design pattern article, liked it, sorta lost me at the end there with the trees though

Fair enough, let me reiterate the goal and show some code.  Using the shotgun/iterative approach and the terminology from the last example, we set eight threads as eligible to run even though we know most of them will be blocking until their prerequisites have completed.  We will assume that creating a ManualResetEvent is not too expensive by itself.    In my tests creating 100,000 objects takes about 2 milliseconds and the same number of ManualResetEvents takes 330 milliseconds, not much overhead given that we are dealing with tasks than could take many minutes or hours to run.  Still, I alluded that there might be a more clean way to set up the code.  It would be nice to have as few threading primitives as possible without sending signals from Thread to Thread all the time.  The F# feature of data being immutable in their scope is often combined with recursive tree-based algorithms to solve problems on many cores.  This approach might use some memory but makes concurrency less painful because one need not worry about another thread modifying the current working data set.  All of this got me to thinking about using a Tree structure for dependent tasks.  In my last example( link) the dependencies are all known at compile time anyway, so it felt wasteful to also include synchronization code to make sure A was done before executing B.

Example Domain

I know that I’m going to use a Command-esque pattern for work.  I’m going to use an Interface and a base class this time.  After a few rounds of experimentation I decided on this as the contract for chunks ‘o work:

    public interface IExecutable

    {

        /// <summary>

        /// Do your work.

        /// </summary>

        void Execute();

 

        /// <summary>

        /// Should return true only when this IExecutable is 100% done.

        /// </summary>

        bool IsComplete { get; }

 

        /// <summary>

        /// We want classes to say what their IMMEDIATE dependency is, no need for

        /// ancestral knowledge because of the "seating" algorithm we'll use.

        /// </summary>

        List<Type> DependencyTypes { get; }

 

        /// <summary>

        /// We want classes to say what their IMMEDIATE dependency is, no need for

        /// ancestral knowledge because of the "seating" algorithm we'll use.

        /// </summary>

        List<IExecutable> DependencyInstances { get; }

 

        /// <summary>

        /// The task is "seated", not scheduled per se but it's order of execution and dependencies are

        /// now known.

        /// </summary>

        bool Home { get; set; }

 

        /// <summary>

        /// A Callback that will be set by any code seating/scheduling this IExcecutable.  The implementation

        /// should call this method with itself as the argument after setting <code>IsComplete</code> to true

        /// </summary>

        CompleteCallback Complete { get; set; }

 

    }

 

    /// <summary>

    /// Callback delegate

    /// </summary>

    /// <param name="task"></param>

    public delegate void CompleteCallback(IExecutable task);

For the purposes of this example I’ve only implemented DependencyTypes but it should be clear how DependencyInstances would be implemented based on an instance of IExecutable.

Sample Tasks

As with my previous examples, I am using tasks that take many seconds to run instead of milliseconds in order to stress the nature of the problem.  The example is a data import application that needs to do things in a certain order:

The order of execution is known at compile time, so we could just create and execute like so:

            RootTask t = new RootTask();

            TaskTree workTree = new TaskTree(t);

            workTree.IsRoot = true;

            //

            OddJob oj = new OddJob();

            RandomTask rt = new RandomTask();

            RedVendorFileDownload redDown = new RedVendorFileDownload();

            RedVendorFileUnzip redZip = new RedVendorFileUnzip();

            RedVendorFileImport redImport = new RedVendorFileImport();

            RedVendorPhotoDownload redPhoto = new RedVendorPhotoDownload();

            GreenVendorFileDownload greenDown = new GreenVendorFileDownload();

            GreenVendorFileImport greenImport = new GreenVendorFileImport();

            GreenVendorFileUnzip greenUnzip = new GreenVendorFileUnzip();

            Console.WriteLine("=========Manual Execution=========");

            t.Execute();

            oj.Execute();

            redDown.Execute();

            redZip.Execute();

            redImport.Execute();

            redPhoto.Execute();

            rt.Execute();

            greenDown.Execute();

            greenUnzip.Execute();

            greenImport.Execute();

All of the tasks merely Sleep() for a certain amount of time and don’t do any real work.  In my example, in-order execution takes 54.76 seconds.

Arbor Day

The first thing needed was an actual Tree class in C#.  I already alluded that myself and others (http://dvanderboom.wordpress.com/2008/04/03/tree-data-structure-source-code-posted/) had built implementations.  I started with my definition of Tree:

    /// A Tree is a collection of trees, one tree/node is called the Root.  A collection of Trees can be empty, otherwise a Tree

    /// consists of a Root and zero or more non-empty subtrees whose nodes are connected by an edge to the root.

Obviously I should use Generics and make a Tree<TNodeType>.  This is fairly easy in C#:

   public class Tree<TNodeType> : IEnumerable<TNodeType>

    {

        public Tree()

        {

            IsRoot = false;

            ChildNodes = new List<Tree<TNodeType>>();

        }

 

        public Tree(TNodeType value) : this()

        {

            Value = value;

        }

 

        /// <summary>

        /// The value of this Tree/Node

        /// </summary>

        public TNodeType Value { get; set; }

 

        /// <summary>

        /// If this Tree/Node is the root...

        /// </summary>

        public bool IsRoot { get; set; }

 

        /// <summary>

        ///

        /// </summary>

        public List<Tree<TNodeType>> ChildNodes { get; set; }

 

 

        public void AddChild(Tree<TNodeType> child)

        {

            ChildNodes.Add(child);

            child.Parent = this;

        }

 

 

        /// <summary>

        ///

        /// </summary>

        public Tree<TNodeType> Parent { get; set; }

There’s nothing much to see here in the 1st part, we’ve just built a Tree data structure.  I will introduce some other features of the tree class as we use them.  There are also two extension methods I will use in the next phase:

        public static bool IsNullOrEmpty(this System.Collections.ICollection collection)

        {

            if (null == collection || 0 == collection.Count)

            { return true; }

            return false;           

        }

 

        public static string FriendlyClassName(this object o)

        {

            string className = o.ToString();

            return className.Substring(className.LastIndexOf(".") + 1);

        }

We can also add all of the tasks to a list and iterate across the various Values in the tree.  The Tree<TNodeType> class IEnumerable<TNodeType> implementation is this (for now):

        public virtual IEnumerator<TNodeType> GetEnumerator()

        {

            return new TreeEnumerator<TNodeType>(this);

        }

 

        System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()

        {

            return GetEnumerator();

        }

For now there’s no need to go into the implementation of TreeEnumerator.  Assume that tasks are returned in the appropriate order.  Simple iteration over a tree containing these elements shockingly runs in the exact same amount of time as the previous method.

In the next article we’ll dig into the creation of the tree structure.