Damon Payne: Hand waving Silverlight Architect

103db signal to noise ratio at < .03% total harmonic distortion
Solution Architect, software developer, geek
2009 Microsoft MVP - Client App Dev
2007 Microsoft MVP - Solution Architecture
 Tuesday, April 08, 2008

I have been told by more than one person that my Tree Concurrency articles were pretty, but vague and not applicable to what most developers are working on.  I had to consider this carefully.  I think it’s a natural way to express concurrency plus the dependency of one Task upon another.  Still, how many people would use such a scheme?

Long ago I wrote about this notion of Cage Builders (http://www.damonpayne.com/2005/05/10/CageBuilders.aspx).  My favorite admonishment, which I heard constantly, was “Don’t EVER use threads unless there’s no other way”.  No other way to do what, I wonder?  I believe we are past this ridiculous idea, or very nearly so.  Concurrency is already a consideration for framework developers, and it will soon be a common one for people developing line-of-business apps as well.  Concocted whilst I ate lunch today, here are five situations where any software developer could use the help of concurrency in their day-to-day work.

1 – Unit Testing

Real-world projects can have hundreds or thousands of unit tests in a solution.  If your team does not believe in unit tests, well, good for you: you won’t have this problem.  A good unit test strategy, especially for large projects with many people touching code, might be as follows:

1) Check out code
2) Run unit tests
3) Make changes
4) Create unit tests for your code
5) Run unit tests to ensure nothing else has been broken by your code
6) Get latest code that changed while you were working
7) Run unit tests again
8) Check in

Running these tests before check-ins and after check-outs could become a nontrivial chunk of your day.  Assuming one is following the best practice that each unit test must stand alone (set up any state needed for the test, run the test, destroy any created state), a group of unit tests is a great thing to run in parallel.  A suite of tests that takes a long time to run discourages developers from running it, and from adding to it.

Unit test tools could benefit greatly from running tests in parallel, and not only on many-core machines.  Even on a single-core machine (becoming rare as time goes on) you are very likely to have an application that contains a combination of business logic, disk access, computation logic, and so forth.  If unit tests are being run in parallel, compute- or disk-bound tests can execute while tests requiring out-of-process calls block on network I/O.
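As a rough illustration of the idea, here is a minimal Python sketch that runs stand-alone checks on a thread pool.  The check functions and the `run_parallel` helper are made up for this example and do not come from any real test framework.  Threads are used because they suit tests that block on I/O; in CPython, compute-heavy tests would more likely need a process pool to see a real speedup.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def check_totals():
    # Compute-side check that builds and discards its own state.
    order = [19.99, 5.01]
    assert round(sum(order), 2) == 25.00


def check_remote_lookup():
    # Stand-in for an out-of-process call: this thread blocks, others keep going.
    time.sleep(0.2)


def check_known_failure():
    # Deliberately failing check, to show how failures are reported.
    assert "abc".upper() == "abc", "upper() changes case"


def run_one(check):
    """Run one check and return (name, passed, message)."""
    try:
        check()
        return check.__name__, True, ""
    except AssertionError as exc:
        return check.__name__, False, str(exc)


def run_parallel(checks, workers=4):
    # map() yields results in input order, so the report stays stable.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(run_one, checks))


results = run_parallel([check_totals, check_remote_lookup, check_known_failure])
for name, passed, message in results:
    print(f"{'PASS' if passed else 'FAIL'} {name} {message}".rstrip())
```

This only works because each check owns its state; checks that share state would need the ordering discussed next.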

If you are NOT following the practice of making unit tests stand alone, I happen to have published some libraries to manage in-order concurrent execution of tasks, starting here: http://www.damonpayne.com/ct.ashx?id=03c5e472-578d-4eee-ac28-5fca6434f617&url=http%3a%2f%2fwww.damonpayne.com%2fdefault.aspx%23a89abf6be-1354-40f4-a4cc-facaa28e6c4f

 

2 – Batch Routines

While not as embarrassingly parallel as a ray tracer (http://msdn2.microsoft.com/en-us/magazine/cc163340.aspx), many data import, data export, image generation, and data processing jobs could take advantage of concurrency.  This may be the age of SOA, the Internet Service Bus, Ajax, distributed systems and “Real Time” in general, but a lot of business is still done, and money made, using batch processes.  One of CarSpot’s most useful features, and one that keeps customers around, is our ability to send data and photos to AutoTrader.com, Cars.com, and so forth.  This is still mostly accomplished using batch processing, FTP get and put, and a Zip file or two.  If you can make your batch processes go faster, you can run them more often.  Running more often means the appearance of near real-time to your customers.
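To make that concrete, here is a small Python sketch of a batch whose records are processed concurrently.  The record layout and the `export_listing` and `run_batch` names are invented for illustration and are not CarSpot’s actual code.  Failures are collected per record so that one bad listing does not abort the whole run.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def export_listing(listing):
    """Hypothetical per-record step: format one vehicle for a partner feed."""
    if listing["price"] <= 0:
        raise ValueError(f"listing {listing['id']} has no price")
    return f"{listing['id']},{listing['make']},{listing['price']}"


def run_batch(listings, workers=8):
    """Export every listing; collect failures instead of aborting the batch."""
    exported, failed = {}, {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {pool.submit(export_listing, item): item["id"] for item in listings}
        # as_completed() hands back each future as soon as its work finishes.
        for future in as_completed(futures):
            listing_id = futures[future]
            try:
                exported[listing_id] = future.result()
            except ValueError as exc:
                failed[listing_id] = str(exc)
    return exported, failed


listings = [
    {"id": 1, "make": "Saab", "price": 8500},
    {"id": 2, "make": "Volvo", "price": 0},
    {"id": 3, "make": "Audi", "price": 12900},
]
exported, failed = run_batch(listings)
print(len(exported), "exported;", len(failed), "failed")
```

In a real batch the per-record step would be the slow part (an FTP put, an image resize), which is where the overlap pays off.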

3 – Opening Files

Have you ever popped up an OpenFileDialog that allows the user to select multiple things to Display, Modify, or otherwise “use”?  You may ultimately be limited by the file system or whatever you are retrieving things from, but why not use concurrency so that things pop faster for the user?  This is especially true if you are opening multiple files at once and then computing something based on the file contents.  Using the Command pattern, which I am obviously obsessed with lately, works well for this.
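Here is one way this might look in Python, offered as a sketch rather than a recipe.  `LoadFileCommand` and `open_selected` are hypothetical names; each command wraps one selected file, reads it, and computes a size and checksum as a stand-in for “computing something based on the file contents”.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


class LoadFileCommand:
    """One unit of work: read a file and compute something from its contents."""

    def __init__(self, path):
        self.path = Path(path)

    def execute(self):
        data = self.path.read_bytes()
        return {
            "name": self.path.name,
            "bytes": len(data),
            "sha256": hashlib.sha256(data).hexdigest(),
        }


def open_selected(paths, workers=4):
    """Execute one command per selected file; results follow selection order."""
    commands = [LoadFileCommand(p) for p in paths]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(command.execute) for command in commands]
        # Collecting in submission order keeps the results aligned with the
        # order in which the user selected the files.
        return [future.result() for future in futures]
```

Calling `open_selected(paths)` with the file names returned by the dialog gives back one result per file.  In a GUI application the results would still have to be handed back to the UI thread before anything is displayed.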

4 – Application Startup

I don’t know how many times I’ve seen an ASP.NET application that caches a bunch of data on startup, starts a logging engine, and starts an NHibernate or other ORM repository.  It’s easy to see from watching the Output window in the debugger that these things often involve out-of-process calls, JIT-ing, code generation, reflection emit, or other operations that take a little time.  Use concurrency and save a noticeable amount of time when you are debugging hundreds of times per day.  Package all of these startup tasks as discrete units of work and get more done every day.
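A minimal Python sketch of packaging startup work as discrete units follows.  The three task functions are placeholders (each `sleep` stands in for real initialization), and the sketch assumes the tasks do not depend on one another.

```python
import time
from concurrent.futures import ThreadPoolExecutor


# Hypothetical startup steps; each sleep stands in for real initialization.
def warm_cache():
    time.sleep(0.1)
    return {"states": ["WI", "IL", "MN"]}


def start_logging():
    time.sleep(0.1)
    return "logger ready"


def init_orm():
    time.sleep(0.1)
    return "session factory ready"


def start_application():
    """Run independent startup tasks together and wait for all of them."""
    tasks = {"cache": warm_cache, "logging": start_logging, "orm": init_orm}
    with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
        futures = {name: pool.submit(task) for name, task in tasks.items()}
        # result() blocks until that task finishes and re-raises its exception.
        return {name: future.result() for name, future in futures.items()}


services = start_application()
print(sorted(services))
```

With the tasks overlapping, startup should take roughly as long as the slowest task rather than the sum of all three.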

5 – Compiling Code

 In the beginning there was UNIX.  On UNIX, there was make.  Make makes your projects; make was a command line build tool.  As a build tool, one of make’s primary jobs was to determine when things were dirty and what needed to be built. You basically told make that these files over here (like *.cpp in such and such a directory) were processed with this tool over here (such as a C++ compiler) to produce such-and-such an output (like an executable or library).  If several object files from several directories needed to be combined into a shared object (DLL), make could determine the dependencies and build first what could be built.  Since make’s first job was dependency checking, this made a parallel-make easier to build and there were parallel and even distributed make implementations long ago.  Compiling hundreds of thousands of lines of C++ in 1998 was painful.

We need MSBuild to use concurrency to build our solutions faster.  MSBuild already knows which projects depend on which other projects.  Make it so.
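The dependency-driven scheduling described above can be sketched in a few lines of Python using the standard library’s `graphlib.TopologicalSorter`.  This is an illustration of the general technique, not how make or MSBuild is implemented; the project names and the `fake_build` stand-in are invented for the example.

```python
import threading
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from graphlib import TopologicalSorter  # Python 3.9+


def build_all(dependencies, build, workers=4):
    """Build every project, starting each one once its dependencies are done."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises CycleError on circular references
    running = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while sorter.is_active():
            # Everything whose dependencies are finished can start right now.
            for project in sorter.get_ready():
                running[pool.submit(build, project)] = project
            finished, _ = wait(list(running), return_when=FIRST_COMPLETED)
            for future in finished:
                future.result()  # surface a failed build
                sorter.done(running.pop(future))


# Hypothetical solution: each project maps to the projects it depends on.
SOLUTION = {
    "Core": set(),
    "Utils": set(),
    "Data": {"Core"},
    "Web": {"Core", "Data", "Utils"},
    "Tests": {"Web"},
}

build_order = []
order_lock = threading.Lock()


def fake_build(project):
    # Stand-in for invoking a compiler; records the order in which builds ran.
    with order_lock:
        build_order.append(project)


build_all(SOLUTION, fake_build)
print(build_order)
```

Here `Core` and `Utils` can build at the same time, while `Data`, `Web`, and `Tests` each wait for the projects they depend on.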

The Sadly Missing #6

Obviously there are more cases than the ones I’ve listed here.  I plan on sitting in my home theater with a nice Châteauneuf-du-Pape and thinking back over every application I’ve worked on in my consulting career to dig up some more ideas.  There is one thing that continually bothers me: this whole GUI thread thing.  To some degree, the Next Big Thing in concurrency has got to be something about solving this single-threaded painting issue.  I hope to attack this topic, but not until this winter unless a huge revelation comes to me out of nowhere.



Friday, April 11, 2008 6:55:15 AM (Central Standard Time, UTC-06:00)
A quick comment about your #5: Most build tools do use concurrency with a dependency tree like you are talking about (my build tool of choice is SCons, which was the example I gave when you first brought up all this stuff whenever ago). But even "make" has -j, which does allow you to use multiple processes to perform a build, even if it uses those processes inefficiently. As a rule of thumb, a lot of people these days always type make -j N+1, where N is the number of CPUs/cores you have.
Friday, April 11, 2008 10:50:30 PM (Central Standard Time, UTC-06:00)
Dave,
When you say "most", I think you mean "most build tools have the capability", rather than that most build tools do this by default. Make doesn't do this unless you tell it to, and the tools we .NET people are using do not seem to do this even when we're telling them to. The ideal is that the IDE would invoke multi-core builds for us, to save us time. It is a little sad that something that's been around for years and years is just now making it into the mainstream.