Damon Payne: Hand waving software architect

103db signal to noise ratio at < .03% total harmonic distortion
Solution Architect, software developer, geek
Damon Payne at Blogged
2009 Microsoft MVP - Client App Dev
2007 Microsoft MVP - Solution Architecture
 Thursday, April 24, 2008

My wife is 1 week overdue now, so they are inducing her into labor around 7am tomorrow morning.  I will therefore be mostly off the grid for a brief while, though I'll post pictures when everything seems OK.

I have several semi-complete articles, so I should be posting quite a bit in May.



Thursday, April 24, 2008 10:16:29 AM (Central Standard Time, UTC-06:00)  #    Disclaimer  |  Comments [1]  |  Trackback
 Thursday, April 17, 2008

In the past I've occasionally announced something "coming soon" that never saw the light of day, and because of this I've greatly reduced the number of times that I mention any ideas that are still in the vaporware stage.  This time though I just wanted to get it out there.

I mentioned here that Unit Testing was a big chance to save developers time using concurrency.  I actually downloaded the NUnit source immediately after posting that and started looking at what it would take to get NUnit to run tests concurrently.  It's my current opinion, though I reserve the right to change my position later, that NUnit is an over engineered mess.  It uses some interesting combinations of patterns that ultimately makes an amazingly deep and hard to follow pipeline for test execution and result reporting.  The path for running a test suite goes something like this:

  1. Click "Run tests" button
  2. GUITestRunner
  3. ... passes to ThreadedTestRunnger (so the UI remains responsive)
  4. ... passes to ProxyTestRunner
  5. ... crosses a remoting boundary for some reason
  6. ... RemoteTestRunner
  7. ... passes to SimpleTestRunner
  8. Foreach unit test, execute, collect results...
  9. ... pass the results back across the remoting boundary to display in the GUI

I'm sure I'm missing at least one step in there.  The first trial of making SimpleTestRunner execute the tests in Parallel failed because there seems to be some bizarre hook in the test result capturing scheme that didn't like being threaded.  I will either track down this issue, create my own XTestRunner implementation, or blow an evening making my own unit test scheme that can run in parallel.  If I can then get TestDriven and NCover to hook into that, life would be good indeed.



Thursday, April 17, 2008 8:45:53 AM (Central Standard Time, UTC-06:00)  #    Disclaimer  |  Comments [0]  |  Trackback
 Wednesday, April 16, 2008

I couldn't leave this one alone:

http://www.scienceblog.com/cms/world039s-oldest-living-tree-discovered-sweden-15937.html

The first thing I thought of when I saw this, sadly, was not "wow, what an awesome discovery!" but "Wait, isn't the world only 6,000 years old?".  As much as it would be a tragedy to harm these trees, I think we ought to take a core sample and count the rings on the trees.  What would surely follow is the sound of one thousand "thuds" as the creationists trip over themselves to refute the fact that the age of trees can be determined by counting rings.  Or, as my friend put it "The devil planted that tree to confuse us and test our faith."

Johnny Appleseed was the antichrist.



Wednesday, April 16, 2008 1:38:38 PM (Central Standard Time, UTC-06:00)  #    Disclaimer  |  Comments [0]  |  Trackback

I had the most frustrating dasBlog issue over the past couple of days:

I suddenly found myself unable to post anything on my blog.  This coincidentally happened at the exact same time my hosting provider had me try a web.config change to alleviate some session/viewstate issues I was having.  I would try to make a post, get no errors, and be returned to the front page and see that my post was missing.  Referral logs were strangely missing, and the dasBlog event log was also strangely empty.  The only thing I could get a vague error message from was trying to post a comment.

Apparently, my "Anonymous Asp.Net user" had suddenly hit its disk-space quota.  I don't know if this quota was created recently or not.  It seems odd indeed that dasBlog would cruise along and never report an error when it's out of disk space.  I wonder if this is something dasBlog is doing or something that's happening because of the hosting environment?



Wednesday, April 16, 2008 11:39:55 AM (Central Standard Time, UTC-06:00)  #    Disclaimer  |  Comments [0]  |  Trackback
 Saturday, April 12, 2008

http://www.klipschcorner.com/silverlight/DeepZoomTestPage.html

I have a multi scale image sample here, using code from Scott Hanslemann and others.   I had other photos to include, but no matter what machine I tried to build this on, the Deep Zoom Composer seemed to want to consume 100% CPU and 100% memory once I got up to 10 images or so.  The Palladium speakers are certainly beautiful, and with some print quality images you can count the coils on the transistors in the crossover.  These speakers are so beautiful.



Friday, April 11, 2008 11:06:26 PM (Central Standard Time, UTC-06:00)  #    Disclaimer  |  Comments [0]  |  Trackback
 Tuesday, April 08, 2008

I have been told by more than one person that my Tree Concurrency articles were pretty, but vague and not applicable to what most developers are working on.  I had to consider this carefully.  I think it’s a natural way to express concurrency plus the dependency of one Task upon another.  Still, how many people would use such a scheme?

Long ago I wrote about this notion of Cage Builders (http://www.damonpayne.com/2005/05/10/CageBuilders.aspx) ; my favorite admonishment I heard constantly was “Don’t EVER use threads unless there’s no other way”.  No other way to what, I wonder?  I believe we are definitely past this ridiculous idea, or very nearly so anyway.  This is certainly a consideration for framework developers, and soon to be a common consideration of people developing line of business apps as well.  Concocted whilst I ate lunch today, here are five situations where any software developer could use the help of concurrency in their day to day work.

1 – Unit Testing

Real world projects could potentially have hundreds or thousands of unit tests in a solution.  If your team does not believe in unit tests, well, good for you, you won’t have this problem.  A good unit test strategy, especially for large projects with many people touching code, might be as follows:

1)      Check out code

2)      Run unit tests

3)      Make changes

4)      Create unit tests for your code

5)      Run unit tests to ensure nothing else has been broken by your code

6)      Get latest code that changed while you were working

7)      Run unit tests again

8)      Check in

 Running these tests before check ins and after check outs could become a non trivial chunk of your day.  Assuming one is following the best practice that each unit test must stand alone (set up any state needed for the test, run the test, destroy any created state) a group of Unit Tests is a great thing to be ran in parallel.  A suite of tests that takes a long time to run dis-incents developers from running them, and from adding to them. 

Unit test tools could benefit greatly from running in Parallel.  This would not only be useful on many-core machines.  Even on a single core machine (becoming rare as time goes on) you are very likely to have an application that contains a combination of business logic, disk access, computation logic, and so forth.  If Unit tests are being run in parallel, compute- or disk-bound tests can execute while tests requiring out of process calls block on network I/O.

If you are NOT following the practice of making unit tests stand alone, I happen to have published some libraries to manage in-order concurrent execution of tasks starting here http://www.damonpayne.com/ct.ashx?id=03c5e472-578d-4eee-ac28-5fca6434f617&url=http%3a%2f%2fwww.damonpayne.com%2fdefault.aspx%23a89abf6be-1354-40f4-a4cc-facaa28e6c4f.

 

2 – Batch Routines

While not as embarrassingly parallel as a Ray Tracer (http://msdn2.microsoft.com/en-us/magazine/cc163340.aspx) , many data import, data export, image generation, or data processing processes could take advantage of concurrency.  This may be the age of SOA, the Internet Service Bus, Ajax, distributed systems and “Real Time” in general, but a lot of business is still done and money is made using Batch processes.  One of CarSpot’s most useful features that keep customers around is our ability to send data and photos to AutoTrader.com, Cars.com, and so forth.  This is still mostly accomplished using Batch processing, FTP get and put, and a Zip file or two.   If you can make your batch processes go faster you can run them more often.  Running more often means the appearance of near real-time to your customers.

3 – Opening Files

Have you ever popped up an OpenFileDialog that allows the user to select multiple things to Display, Modify, or otherwise “use”?  You may ultimately be limited by the file system or whatever you are retrieving things from, but why not use concurrency so that things pop faster for the user?  This is especially true if you are opening multiple files at once and then computing something based on the file contents.  Using the Command pattern, which I am obviously obsessed with lately, works well for this.

4 – Application Startup

I don’t know how many times I’ve seen an ASP.Net application that caches a bunch of data on startup, and starts a logging engine, and starts an NHibernate or other ORM repository.  It’s easy to see from watching the Output window in the debugger that these things often involve out of process calls, JIT-ing, code generation, reflection emit, or other operations that take a little time.  Use concurrency and save a noticeable amount time when you are debugging hundreds of times per day.  Package all of these startup tasks as discrete units of work and get more done every day.

5 – Compiling Code

 In the beginning there was UNIX.  On UNIX, there was make.  Make makes your projects; make was a command line build tool.  As a build tool, one of make’s primary jobs was to determine when things were dirty and what needed to be built. You basically told make that these files over here (like *.cpp in such and such a directory) were processed with this tool over here (such as a C++ compiler) to produce such-and-such an output (like an executable or library).  If several object files from several directories needed to be combined into a shared object (DLL), make could determine the dependencies and build first what could be built.  Since make’s first job was dependency checking, this made a parallel-make easier to build and there were parallel and even distributed make implementations long ago.  Compiling hundreds of thousands of lines of C++ in 1998 was painful.

We need ms-build to use concurrency to build my solutions faster.   Ms-build already knows what projects depend on what other projects.  Make it so.

The Sadly Missing #6

Obviously there are more cases than what I’ve listed here.  I plan on sitting in my home theater with a nice chateauneuf-du-pape and thinking back over every application I’ve worked on in my consulting career to dig up some more ideas.  There is one thing that continually bothers me: this whole GUI thread thing.  To some degree, the Next Big Thing in concurrency has got to be something about solving this single-threaded painting issue.  I hope to attack this topic, but not until this winter unless a huge revelation comes to me out of nowhere.



Tuesday, April 08, 2008 8:26:28 PM (Central Standard Time, UTC-06:00)  #    Disclaimer  |  Comments [2]  |  Trackback

Managing Concurrency With Trees[4]

Read the entire saga:

Managing Concurrency With Trees[0]

Managing Concurrency With Trees[1]

Managing Concurrency With Trees[2]

Managing Concurrency With Trees[3]

 

As I indicated at the end of the last article, I might come back to this topic.  I’m back.  Two things made me unable to put this topic down.  #1 was using Reflector and finding various code in Parallel.ForEach written around things like this:

private const int DEFAULT_LOOP_STRIDE = 8;

 

As near as I can read the code, the Parallel extensions don’t really get their swerve on until you’ve got quite a few more tasks than I’m using for examples here.  My uneducated suspicion is that this has to do with building up a queue of work for each thread and implementing the work stealing.

Recall from my last example I can even frustrate the TPL into executing quite a few things sequentially with that specific Enumerator implementation.  This is what I’d like to avoid.

I’ll definitely need to come up with some realistic examples with hundreds or thousands of little tasks, but for now I’ll try to see my original idea through.  But not with Parallel FX.  It’s time for Payneallel FX.

I made the comment to my muse  that I wondered how much it would take to make my own version of Parallel.ForEach: no fancy work stealing and  very specific to my problem.  The PFX team surely has very good reasons for doing what they are doing but I was frustrated at being unable to use my custom Enumerator.  I want to be able to write code like this:

Parallel.ForEach<IExecutable>(workTree, executeTask);// tree based, low # of tasks

… or like this …

            Parallel.ForEach<OddJob>(jobs, doJob); //List based, high # of tasks

…and have it work without hinting.   Thinking I was going to be blowing my whole weekend on this, I started thinking and typing.  Once I decided how to limit the number of threads created and re-use a thread rather than make a new thread for each task, it went much better than I expected.  Around 208 lines of code later, I had something that worked 100% the way I wanted it to!

There are three classes: a static class containing a static ForEach method, a WorkerPool class, and a Worker class.

Paralell.ForEach seems like a pretty good way to write this kind of code, so I imitated this interface.  Here’s the sample Main method:

            int exeCount = 0;

 

            Action<OddJob> doJob = delegate(OddJob j)

            {

                j.Execute();

                Interlocked.Increment(ref exeCount);

                Console.WriteLine("Executed " + exeCount);

            };

 

            List<OddJob> jobs = new List<OddJob>();

            for (int i = 0; i < 20; ++i)

            {

                jobs.Add(new OddJob());

            }

           

            Payneallel.ForEach<OddJob>(jobs, doJob);

This should look just like TPL code.  The actual implementation is presented in its entirety here with some basic description afterwards.

    public static class Payneallel

    {

        public static void ForEach<TSource>(IEnumerable<TSource> source, Action<TSource> body)

        {

            WorkerPool<TSource> pool = new WorkerPool<TSource>();

            foreach (TSource src in source)

            {

                Worker<TSource> worker = pool.GetWorker();

                worker.Arg = src;

                worker.Work = body;

                worker.Go();

            }

            pool.WaitAll();

        }

This couldn’t be simpler.  When the method is called a WorkerPool (thread pool) is created.  These are my threads, not threadpool threads, so I don’t need to worry about running long tasks on them.  We can choose to limit the number of threads created or not without needing to change this method or require anything of the client.  Worker<TSource>.Go() means “GO”, so work begins as soon as possible, this is where I had issues with my Tree<T> and the TPL.  Next, the WorkerPool implementation:

class WorkerPool<T>

        {

            public WorkerPool()

            {

                _workers = new List<Worker<T>>(Environment.ProcessorCount);

                for (int i = 0; i < Environment.ProcessorCount; ++i)

                {

                    Worker<T> worker = new Worker<T>("Payneallel " + i);

                    _workers.Add(worker);

                    worker.Done = new Action(WorkerDone);

                    worker.Go();

                }

                _workerDoneEvent = new ManualResetEvent(false);

            }

 

            private ManualResetEvent _workerDoneEvent;

            private static List<Worker<T>> _workers;

            private object _syncRoot = new object();

 

            /// <summary>

            ///

            /// </summary>

            public void WorkerDone()

            {

                lock (_syncRoot)

                {

                    _workerDoneEvent.Set();

                }

            }

 

            public Worker<T> GetWorker()

            {

                Worker<T> worker = GetFreeWorker();

                while (null == worker)

                {                   

                    _workerDoneEvent.WaitOne();

                    worker = GetFreeWorker();

                }

                _workerDoneEvent.Reset();

                return worker;

            }

 

            private Worker<T> GetFreeWorker()

            {

                foreach (Worker<T> w in _workers)

                {

                    if (!w.Busy)

                    {

                        Console.WriteLine("returning worker from pool");

                        return w;

                    }

                }

                return null;

            }

 

            public void WaitAll()

            {

                while (true)

                {

                    foreach(Worker<T> w in _workers)

                    {

                        w.Finish();

                    }

                    return;                

                }

            }

               

        }

The pool creates one Worker<T> for each physical CPU on the machine.  I’ve come back to wait handles as a means of making things work, but in this case I don’t mind because I’m limiting the number of threads I’m creating.  If Payneallel.ForEach calls GetWorker() but they are all busy, we block until a Worker is available.  WaitAll() makes sure the call to Payneallel.ForEach returns to the calling thread in a synchronous fashion.  The rest of the work is done in the Worker<T> class:

class Worker<T>

        {

            /// <summary>

            /// Set up initial state

            /// </summary>

            /// <param name="name"></param>

            public Worker(string name)

            {

                Name = name;

                Work = null;

                _active = true;

                Busy = false;

                _syncRoot = new object();

               

                _acceptingWorkEvent = new ManualResetEvent(true);

                _gotWorkEvent = new ManualResetEvent(false);

 

                ThreadStart ts = new ThreadStart(DoWork);

                Thread t = new Thread(ts);

                t.Name = Name;

                t.Start();

            }

 

            /// <summary>

            /// set _active to false so the thread will exit

            /// </summary>

            ~Worker()

            {

                _active = false;               

            }

 

            private ManualResetEvent _acceptingWorkEvent;

            private ManualResetEvent _gotWorkEvent;

 

            /// <summary>

            /// The name of the worker, used to Name the thread for easier debugging

            /// </summary>

            public string Name { get; set; }

 

            /// <summary>

            /// The worker is currently doing something

            /// </summary>

            public bool Busy { get; set; }

            private bool _active;

            private object _syncRoot;

 

            /// <summary>

            ///

            /// </summary>

            public Action<T> Work { get; set; }

 

            /// <summary>

            ///

            /// </summary>

            public T Arg { get; set; }

 

            /// <summary>

            /// Call back to the Pool to signify we're done

            /// </summary>

            public Action Done { get; set; }

 

            /// <summary>

            /// Block until done

            /// </summary>

            public void Finish()

            {

                _acceptingWorkEvent.WaitOne();

                _active = false;

            }

 

            /// <summary>

            /// Reset the wait flag so the thread running DoWork will start up again

            /// </summary>

            public void Go()

            {

                _gotWorkEvent.Set();

            }

 

            public void DoWork()

            {

                while (_active)

                {

                    _gotWorkEvent.WaitOne();//Wait for someone to call Go on us

                    if (null != Work && null != Arg) //make sure they set the callback action and the argument to the action

                    {

                        lock (_syncRoot)

                        {

                            Busy = true;

                            _acceptingWorkEvent.Reset();//Not accepting work now..

                        }

                        Work(Arg);//Do the action

                        Work = null;

                        Arg = default(T);

                        lock (_syncRoot)

                        {

                            Busy = false;

                            _acceptingWorkEvent.Set();//We are accepting work

                            _gotWorkEvent.Reset();//Don't have work

                            Done();//Call back to the pool in case someone is waiting on a free worker

                        }

                    }

                }

            }

        }

The Worker<T> has two wait handles to indicate working and available.  I believe the overhead in creating a thread is just enough that it makes sense to do things this way.  Creating a dozen threads is not a big deal (my x64 machine has around 1000 going most of the time) but I would probably avoid creating hundreds or thousands in order to do hundreds or thousands of Units of Work.  In the constructor for Worker<T>, the thread is created and started, however it isn’t doing anything until Go() is called, at which time the Action<T> is executed by the thread.  Note that I’m not taking any care whatsoever to think about thread local storage and the like.  I should probably look at this again to be sure the appropriate granularity of locking occurs.  The PFX team has got my head going 1000km/h and I’m just playing with petty alternate implementations of what they are doing.

I altered my TaskTreeEnumerator back to its original state before I had tweaked it beyond recognition trying to get PFX to play with me. Here is the new/original implementation for use with Paynellel:

        public override bool MoveNext()

        {

            lock (_syncRoot)

            {

                UpdateWork(_tree);

            }

 

            if (_runnableWork.Count > 0)

            {

                return true;

            }

            else

            {

                while (0 == _runnableWork.Count && !_tree.TrueForAll( new Predicate<IExecutable>(Executed) ))

                {

                    Thread.Sleep(200);

                    lock (_syncRoot)

                    {

                        UpdateWork(_tree);

                    }

                    if (_runnableWork.Count > 0)

                    {

                        return true;

                    }

                }

                //use this to block something until a task completes?

                Console.WriteLine("MoveNext->false");

                return false;

            }           

        }

 

        public override IExecutable Current

        {

            get

            {

                _current = _runnableWork.Dequeue();

                _current.Scheduled = true;               

                Console.WriteLine("Current Returning:" + base.Current.FriendlyClassName());

                return base.Current;

            }

        }

Because Paynellel.Foreach begins executing work as soon as it gets a single item, spinning inside MoveNext()  allows us to wait until prerequisite tasks have completed before moving on.

I just listened to the .NET Rocks episode today with Stephen Toub, program manager for the Parallel Computing Platform group at Microsoft.  He indicated there are some low-level synchronization primitives internal to the Task Parallel Library that they intend to make public in future CTPs of the Parallel Extensions framework.  He also indicated that to some degree, the December preview was “rough” internally, and was mainly meant to test the look of the API and get early client feedback.   I’m eagerly awaiting the next CTP.



Tuesday, April 08, 2008 7:32:55 PM (Central Standard Time, UTC-06:00)  #    Disclaimer  |  Comments [0]  |  Trackback

Someone in my fan club has too much time/money on their hands.  I took this screenshot just now:

 



Tuesday, April 08, 2008 10:41:53 AM (Central Standard Time, UTC-06:00)  #    Disclaimer  |  Comments [3]  |  Trackback
 Monday, April 07, 2008

I sat down to continue some actual programming work today since it seemed support and manager-y type things had settled down.  Having gotten into WPF long before Silverlight, I found that to have a similar experience there are a lot of things I'm going to have to implement for the "CarSpot Silverlight Application Framework"

  • Commanding: I have seen references to people creating their own commanding frameworks for Silverlight 2.
  • Pages/Navigation: I wound up creating a Screen Stack for navigating from one Logical Page to another but without Pages and a Navigation Service I have to wire this all together myself. 
  • Security: It seems odd that Client Application Services was left out of Silverlight.  There are any number of ways to hack this in since the Silverlight control is sitting in the browser, but I was surprised to see that there was no "CurrentPrincipal" on the CurrentThread and so forth. I need security for my .aspx apps, why not for my Silverlight apps?  If one is building a line of busniess application, there are surely bits of information or actions that need to be hidden except for certain user roles.

Still, I'll take this over JavaScript any day.



Monday, April 07, 2008 3:30:18 PM (Central Standard Time, UTC-06:00)  #    Disclaimer  |  Comments [0]  |  Trackback

Deep Thoughts from Deeper in .NET 2008

I had a great time at Deeper in .NET 2008 this Saturday, but was it educational? Read on!

Geo 2008

I work with some Geocoding technology on the side but I had not heard of some of the cool Geocode support built in to SQL Server 2008.  There are about 70 functions for things like radius and intersection built right in.  I’ve never wanted to be inescapably locked in to the Microsoft platform so badly before…

Get a Twitter Account

While I can claim to be an earlier than early adopter of the whole blogging thing, I just now got my twitter account.

https://twitter.com/damonpayne

Implement Unisex Compilers

This seemed incredibly funny on Saturday.  If I hear “So, we say ‘Mr. Compiler’ …“ one more time I’m going to crack.   Someone who shall remain nameless casually leaned over to me and says “Just so you know, Damon, the compiler is Male”.  I had to leave the room to avoid being disruptive…

LINQ to Link

You’ll either think this is hilarious, or not…

 

            var updownupdownleftrightleftrightbabaselectstart =

                from rupees in Hyrule

                      select rupees;

            foreach (var rupee in updownupdownleftrightleftrightbabaselectstart)

            {

                //...

            }

It seemed funny on Saturday.

You talking to me?

On some of the LINQ demos shown during the day, the following was present in the database code:

        /**

         * This code was generated by a tool.

         **/

Personally, I don’t like the framework mocking me.

Click Once

Late at night, as the drink-tickets flowed, I got into a discussion about ClickOnce and limited user accounts, as well as what is and is not possible to do with ClickOnce.  I can see that I’m going to have to revisit this topic.

 



Monday, April 07, 2008 9:42:08 AM (Central Standard Time, UTC-06:00)  #    Disclaimer  |  Comments [0]  |  Trackback
 Sunday, April 06, 2008

http://www.cinematical.com/2008/04/06/r-i-p-charlton-heston-dead-at-83/

As a big fan of Sci Fi and movies in general tonight's regularly schedule blogging is cancelled: Charlton Heston is dead at 83 years of age. 

Although I'm not religious I enjoyed Ben Hurr and The 10 commandments.  I enjoyed Planet of the Apes and The Omega Man.  I despised the Fat Master of Douchebaggery (Michael Moore) for taking his schtick to Charlton's doorstep despite his senility and obvious Alzheimer's symptoms.  My wife and I are getting ready to watch "I Am Legend" on Blu-Ray this week.  Will Smith, you are no Charlton Heston.  Rest in peace sir.



Sunday, April 06, 2008 7:45:56 PM (Central Standard Time, UTC-06:00)  #    Disclaimer  |  Comments [0]  |  Trackback
 Friday, April 04, 2008

I went to my first Nerd Lunch in a long time today.   It was good to see some of the Milwaukee peoples from the user group and introcude Mr. Vanderboom to some new people.  Some nerd celebrities are in town too, for Deeper in .NET tomorrow.  I can now say I've met Richard Campbell of .NET Rocks fame.

I'll be at Deeper in .NET tomorrow, unless my wife goes into labor!



Friday, April 04, 2008 2:55:35 PM (Central Standard Time, UTC-06:00)  #    Disclaimer  |  Comments [0]  |  Trackback
 Thursday, April 03, 2008

This article may not make much sense unless you’ve also read:

Managing Concurrency With Trees[0]

Managing Concurrency With Trees[1]

Managing Concurrency With Trees[2]

As I stated in my initial goal, a Tree structure is a natural way to express a number of tasks that need to happen in a certain order, and yet many of these things could be going on at the same time.  I had only read articles on the Task Parallel Library up to this point, but recalled both a video from channel 9 and a DotNetRocks Podcast with Joe Duffy where he stated that the TPL likes to work on collections that implement IEnumerable<T>.   This got me a little bit excited; I assume very few people out there are interested in the ranting of an MVP from Wisconsin with an Orage blog, but if I could shoehorn my ideas into the TPL that would be something different.

My initial goal, then, was to use the TaskTreeScheduler notion of flattening and queuing eligible tasks and combine it with some sort of clever IEnumerable<IExecutable> implementation such that I could pull in the TPL and write code like this (using the same workTree from the last article):

            Action<IExecutable> executeTask = delegate(IExecutable exeTask)

            {

                exeTask.Execute();

                Console.WriteLine("PFX DONE " + exeTask.FriendlyClassName());

            };

           

            Parallel.ForEach<IExecutable>(workTree, executeTask);

This saves me from having to write most of the threading code.  It wasn’t meant to be, though, read on.

To give you an idea on how I’d like this to work, let’s take a simpler example with a linear list of instances of the OddJob IExecutable implementation:

            Action<OddJob> doJob = delegate(OddJob j)

            {

                j.Execute();

            };

 

            List<OddJob> jobs = new List<OddJob>();

            for (int i = 0; i < 100; ++i)

            {

                jobs.Add(new OddJob());

            }

 

            Parallel.ForEach<OddJob>(jobs, doJob);

You’ll notice that the TPL has made quite a few worker threads for me and set about executing quite a few of my OddJobs at once.  By default the library will build at least one thread per physical core on your machine. 

The first thing I discovered is that the TPL does not instantly execute each piece of work it gets out of your IEnumerable<T> collection.  For example, here is the first iteration of the MoveNext() implementation for TaskTreeEnumerator:

        public override bool MoveNext()

        {

            lock (_syncRoot)

            {

                UpdateWork(_tree);

            }

 

            if (_runnableWork.Count > 0)

            {

                return true;

            }

            else

            {

                while (!_tree.TrueForAll(new Predicate<IExecutable>(Executed)) && _runnableWork.Count == 0)

                {

                    lock (_syncRoot)

                    {

                        UpdateWork(_tree);

                    }

                }

                //use this to block something until a task completes?

                return false;

            }           

        }

An enumerator should return false when you are past the last object in your collection.  I wanted to keep this from ever happening and basically block whatever traffic cop is consuming the enumerator until more work could be added to the Queue of _runnableWork.  With the code as shown above the first task (RootTask) is never executed; we are therefore stuck in an infintie while loop.  Current is requested and returned, but no joy.

Through experimentation I found some additional interesting behavior.  The TPL will allow MoveNext() to return false several times before it actually gives up and believes the collection is empty.  While attampting to decipher this behavior, I thought I could restructure the calling code to take multiple iterations over the tree, which would be FAR from idea but still let me achieve some consolation prize:

            Action<IExecutable> executeTask = delegate(IExecutable exeTask)

            {

                exeTask.Execute();

                Console.WriteLine("PFX DONE " + exeTask.FriendlyClassName());

            };

 

 

            Predicate<IExecutable> allDone = new Predicate<IExecutable>(Executed);

            while (!workTree.TrueForAll(allDone))

            {

                Console.WriteLine("=========TPL Iteration=========");

                Parallel.ForEach<IExecutable>(workTree, executeTask);

            }

 

This, however, met with failure of another sort.  Here’s the entire enumerator implementation, complete with the various debug statements:

public class TaskTreeEnumerator : TreeEnumerator<IExecutable>

    {

        public TaskTreeEnumerator(Tree<IExecutable> tree) : base(tree)

        {

            if (!tree.Value.IsComplete)

            {

                _runnableWork.Enqueue(tree.Value);

                tree.Value.Scheduled = true;

                Console.WriteLine("Setting " + tree.Value.FriendlyClassName() + " as runnable");

            }           

        }

 

        public override bool MoveNext()

        {

            lock (_syncRoot)

            {

                UpdateWork(_tree);

            }

 

            if (_runnableWork.Count > 0)

            {

                return true;

            }

            else

            {

                lock (_syncRoot)

                {

                    UpdateWork(_tree);

                }

                //use this to block something until a task completes?

                Console.WriteLine("MoveNext->false");

                return false;

            }           

        }

 

        public override IExecutable Current

        {

            get

            {

                _current = _runnableWork.Dequeue();

                _current.Scheduled = true;               

                Console.WriteLine("Current Returning:" + base.Current.FriendlyClassName());

                return base.Current;

            }

        }

 

        public bool Executed(IExecutable target)

        {

            return target.IsComplete;

        }

 

        public override void Reset()

        {

            Console.WriteLine("Reset!!");

        }

       

 

        private void UpdateWork(Tree<IExecutable> tree)

        {

            if (!tree.Value.IsComplete

                && !_runnableWork.Contains(tree.Value)

                && !tree.Value.Scheduled

                && null != tree.Parent

                && tree.Parent.Value.IsComplete)

            {               

                _runnableWork.Enqueue(tree.Value);

                Console.WriteLine("Setting " + tree.Value.FriendlyClassName() + " as runnable");

            }

            else if (tree.Value.IsComplete)

            {

                foreach (Tree<IExecutable> subTree in tree.ChildNodes)

                {

                    UpdateWork(subTree);

                }

            }

        }

    }

You can tell by the number of “ifs” in UpdateWork that I was getting frantic.  Given the code above, check out the output:

See anything odd in there?  Despite having four Worker Threads created, the tasks are all ran sequentially.  Not very concurrent of the TPL, is it?  Either I have made some kind of colossal mistake or the code just can’t handle this situation, which would be disappointing.  If the TPL would start executing Tasks AS they are yielded from the Enumerator I’m satisfied this would work.  There’s probably a very good reason why they are doing it this way.    There is a ParallelEnumerable class in the December CTP of the Parallel Extensions but that appears to be for Parallel LINQ and may not resolve the situation.

If I get time, I plan on taking a peek with Reflector to see if I can uncover a method that will work for what I want to do.

{Edit: I couldn’t help myself.  Using reflector I found the code in ForEachWorker<> where it seems to try to get a chunk of at least 8 tasks to run before actually starting the work.  It’s hard to decipher some Generics code via Reflector but I’d seriously like to see the PFX team change how this works.  Granted, it’s easy to write the threading code if you have, say, four long-running tasks.}

And today, look what drifted into my RSS reader from the Parallel FX team blog:

http://blogs.msdn.com/pfxteam/archive/2008/04/03/8354128.aspx

“…Parallel Extensions relies on threads, so you can continue to use existing synchronization primitives with Parallel Extensions just as you would with threads you spin up manually or with threads from the ThreadPool; we're also introducing some new synchronization and coordination primitives in upcoming releases of Parallel Extensions, so stay tuned for those …” (emphasis added)

If I wanted to, I could interpret “coordination primitives” as being the sort of construct I’m trying to create here.  I can’t wait for the next CTP.  For now, though, this wraps up my current thoughts on this subject.



Thursday, April 03, 2008 9:33:03 PM (Central Standard Time, UTC-06:00)  #    Disclaimer  |  Comments [0]  |  Trackback

You may want to be familiar with these articles before continuing:

Managing Concurrency With Trees[0]

Managing Concurrency With Trees[1]

Client Code

Now that I’ve built and populated my Tree<IExecutable>, I need to get something to go to work on it.  The client code is very simple:

            Console.WriteLine("=========Task Tree Scheduler=========");

            TaskTreeScheduler scheduler = new TaskTreeScheduler();

            scheduler.HardwareThreadsOnly = false;

            scheduler.Run(workTree);

The last line of code shown here will block until all the work contained in the tree is done.

The TaskTreeScheduler Class

So, how does the work actually get done?

    public class TaskTreeScheduler

    {

        public TaskTreeScheduler()

        {

            _maxThreads = Environment.ProcessorCount;

            _eligible = new Queue<IExecutable>();

            _syncRoot = new object();

        }

 

        private Tree<IExecutable> _work;

        private int _maxThreads;

        private int _runningThreads;

        private object _syncRoot;

        private Queue<IExecutable> _eligible;

As you can already see, we use the Tree structure to describe dependencies, but we will be slowly creating a flattened view into the tree using a Queue<IExecutable>.  The next two methods show how we get some threads started:

 

        private void AddEligible(IExecutable task)

        {

            lock (_syncRoot)

            {

                _eligible.Enqueue(task);

                Console.WriteLine("{0} tasks elibible to run", _eligible.Count);

            }

        }

 

        public void Run(Tree<IExecutable> workTree)

        {

            _work = workTree;

            AddEligible(_work.Value);

            ThreadStart ts = new ThreadStart(RunTrafficCop);

            Thread cop = new Thread(ts);

            cop.Name = "TrafficCop";

            cop.Start();

            cop.Join();

        }

When we pass a Tree<IExecutable> to Run, a “traffic cop” thread gets started.  The traffic cop could be the UI thread, however I wanted to allow the UI to block or go about it’s business, so Run uses Thread.Join().  The first item in the Queue is the Value of the Root of the tree, and we print some simple debug information to help visualize what goes on as work happens.  The entire RunTrafficCop method is fairly short:

        public void RunTrafficCop()

        {

            while (true)

            {

                int work = 0;

                lock (_syncRoot)

                {

                    work = _eligible.Count;

                }

                bool allComplete = _work.TrueForAll(new Predicate<IExecutable>(Complete));

                while (work > 0 || !allComplete)

                {

                    IExecutable task = null;

                    if (_eligible.Count > 0)

                    {

                        Console.WriteLine("***Waiting for work***");

                        task = _eligible.Dequeue();                       

                    }

                    else

                    {

                        Thread.Sleep(2);//Traffic cop sleeps

                        allComplete = _work.TrueForAll(new Predicate<IExecutable>(Complete));

                        continue;

                    }

                    StartThread(task);

 

                    lock (_syncRoot)

                    {

                        work = _eligible.Count;

                    }

                    allComplete = _work.TrueForAll(new Predicate<IExecutable>(Complete));

                }

                return;

            }

        }

We could do with one less while() loop here, but here’s what is going on.  The root task will be in the queue of eligible work.  As the traffic cop iterates, it will either Dequue() the next item that can be executed or enter a very brief wait state and check the work in the tree using the Predicate<IExecutable> to check the IsComplete flag of each instance.  If work is found, we call StartThread:

        public void StartThread(IExecutable task)

        {

            Interlocked.Increment(ref _runningThreads);

            PrintThreadUsage();

            task.Complete = new CompleteCallback(OnTaskComplete);

            ThreadStart ts = new ThreadStart(task.Execute);

            Thread t = new Thread(ts);

            t.Name = task.FriendlyClassName();

            t.Start();

        }

So here we’ve started a thread for an eligible-to-run IExecutable instance. This is where the light plumbin of the IExecutable contract comes into play and we see the expected behavior of implementing classes:

        public override void Execute()

        {

            base.Execute();

            Console.WriteLine("OddJob::Execute");

            Spin(2500);

            IsComplete = true;

            Complete(this);

        }

Spin(int) exists on the base class and merely uses Thread.Sleep to simulate some work being done for 2500 milliseconds.  When the work is done we set IsComplete to true and execute the callback delegate supplied by the TaskTreeScheduler.  The remaining interesting code lives in TaskTreeScheduler.OnTaskComplete:

        public void OnTaskComplete(IExecutable task)

        {

            StopThread();

            Tree<IExecutable> childTree = _work.FindTreeByValue(task);

            lock (_syncRoot)

            {

                foreach (Tree<IExecutable> tree in childTree.ChildNodes)

                {

                    _eligible.Enqueue(tree.Value);

                }

            }

        }

OnTaskComplete finds the Tree whose Value is the task that just finished executing.  Now that the Parent is done executing, any children are eligible to run, and when they complete their children will be eligible to run, and so forth.  Running the program now gives me output like this:

I realize there are a few implementation details I left out.  I will post the Solution in a .zip file eventually.  The static structure of the solution is as follows:

The next post wraps up my initial thoughts in this realm although I’m sure I’ll come back to this.  I have alluded to some experimentation with the Task Parallel library in addition to my custom code, the next part deals with the TPL.



Thursday, April 03, 2008 8:15:39 PM (Central Standard Time, UTC-06:00)  #    Disclaimer  |  Comments [0]  |  Trackback

You may want to be familiar with Managing Concurrency With Trees[0] before reading this article.

We can easily see from the previous article that RootTask has no depencies, and that after RootTask executes RandomTask, OddJob, RedVendorFileDownload, and GreenVendorFileDownload are all eligible to run.   We can express this in code like so in the IExecutable implementation constructors:

        public RedVendorFileDownload()

        {//Depends on nothing but root task           

        }

        public RedVendorFileUnzip()

        {

            DependencyTypes.Add(typeof(RedVendorFileDownload));

        }

        public RedVendorFileImport()

        {

            DependencyTypes.Add(typeof(RedVendorFileUnzip));

        }

… and so forth.  There are other good ways to code this, for example creating a mechanims that would simple register the child types as Tree nodes rather than expressing the immediate prerequisite like this.  I needed to refresh my mind (reaching back to college) on Tree mechanics, so why not add all of my IExecutable commands to a list and write an algorithm to seat them:

            RootTask t = new RootTask();

            TaskTree workTree = new TaskTree(t);

            workTree.IsRoot = true;

            //

            OddJob oj = new OddJob();

            RandomTask rt = new RandomTask();

            RedVendorFileDownload redDown = new RedVendorFileDownload();

            RedVendorFileUnzip redZip = new RedVendorFileUnzip();

            RedVendorFileImport redImport = new RedVendorFileImport();

            RedVendorPhotoDownload redPhoto = new RedVendorPhotoDownload();

            GreenVendorFileDownload greenDown = new GreenVendorFileDownload();

            GreenVendorFileImport greenImport = new GreenVendorFileImport();

            GreenVendorFileUnzip greenUnzip = new GreenVendorFileUnzip();

 

            List<IExecutable> exe = new List<IExecutable>();                       

            exe.Add(redDown);

            exe.Add(oj);

            exe.Add(redImport);

            exe.Add(redZip);

            exe.Add(rt);

            exe.Add(redPhoto);

            exe.Add(greenDown);

            exe.Add(greenImport);

            exe.Add(greenUnzip);

 

            RandomzeAndPrint(exe);

 

            BuildTree(workTree, exe);

            Console.WriteLine("==========Work Tree:==========");

            Console.WriteLine(workTree.ToString());

RandomizeAndPrint() merely sorts the array by a random comparator so I could be sure I wasn’t cheating.  BuildTree is slightly more interesting:

        private static void BuildTree(TaskTree workTree, List<IExecutable> exe)

        {

            //We do this, which may take multiple iterations,

            while (!exe.TrueForAll(new Predicate<IExecutable>(IsHome)))

            {

                foreach (IExecutable i in exe)

                {

                    if (i.Home) { continue; }

                    //Tasks without dependencies can belong to the root node

                    if (i.DependencyInstances.IsNullOrEmpty() && i.DependencyTypes.IsNullOrEmpty())

                    {

                        i.Home = true;

                        Console.WriteLine("Adding {0} as child of {1}", i.FriendlyClassName(), workTree.Value.FriendlyClassName());

                        workTree.AddChild(new Tree<IExecutable>(i));

                    }

                    else

                    {

                        workTree.AddToTree(i);

                    }

                }

            }

        }

The iteration is needed in case we have not yet added the parent of the current IExecutable.  We could also do this using recursion and passing a smaller List<IExecutable> on each successive call to BuildTree.  This method depends on the TrueForAll implementation of Tree<TNodeType>, which is as follows:

        public bool TrueForAll(Predicate<TNodeType> pred)

        {

            bool allGood = TrueForAll(pred, this);

            return allGood;

        }

 

        protected bool TrueForAll(Predicate<TNodeType> pred, Tree<TNodeType> tree)

        {

            bool trueForAll = true;

            //Check root of current tree

            if (!pred.Invoke(tree.Value))

            {

                return false;

            }

            foreach (Tree<TNodeType> subTree in tree.ChildNodes)

            {

                //Can only be true if everything is true

                trueForAll = TrueForAll(pred, subTree);

                if (!trueForAll)

                {

                    return false;

                }

            }

 

            return trueForAll;

        }

The ToString() implementation on Tree<TNodeType> is overriden to print out each node value “beneath” it’s parent in an easy to read Console format.  In order to make some of the next steps more readable, I’ve created a class TaskTree which extends Tree<IExecutable>.

  Running my test program up to this point shows the randomly ordered List<IExecutable> being turned into a Tree<IExecutable> by the BuildTree method.

So, how do we put this together and actually run the IExecutable tasks in the order shown here while taking advantage of multi-core computing?  The next article will contain the actual Scheduler algorithm and some performance metrics.

 



Thursday, April 03, 2008 8:12:09 PM (Central Standard Time, UTC-06:00)  #    Disclaimer  |  Comments [0]  |  Trackback