Archive for the ‘.net’ Category

Blog has moved! You should check this post out on kevinpelgrims.com

Now that we’ve covered the Task Parallel Library, it’s time to move on.

What is PLINQ?

PLINQ stands for Parallel LINQ and is simply the parallel version of LINQ to Objects. Just like LINQ, you can use it on any IEnumerable<T>, and queries still use deferred execution. Using PLINQ is even easier than using the Task Parallel Library!

Regular for loop and LINQ compared to PLINQ (with time in seconds)

How do we use PLINQ?

You can even make existing LINQ queries parallel simply by adding the AsParallel() method. That’s how easy it is! This makes it easy to use the power of parallelization, while enjoying the readability of LINQ. Isn’t that great?

var employees = GetEmployees();

// Regular LINQ: keep only the employees who list C# as a skill
var query = employees.Where(e => e.Skills.Contains("C#"));

// Extension method style PLINQ
var queryParallel1 = employees.AsParallel()
                              .Where(e => e.Skills.Contains("C#"));

// Query expression style PLINQ
var queryParallel2 = from e in employees.AsParallel()
                     where e.Skills.Contains("C#")
                     select e;
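
Deferred execution is easy to see for yourself. The following is a minimal sketch, not part of the original sample: the class DeferredExecutionDemo and its IsEven helper are made up for illustration, and a range of integers stands in for the employee list.

```csharp
using System;
using System.Linq;
using System.Threading;

public static class DeferredExecutionDemo
{
    // Counts how often the filter runs, so we can see when the query executes.
    public static int FilterCalls;

    public static bool IsEven(int n)
    {
        // Interlocked keeps the counter correct when PLINQ calls this from several threads.
        Interlocked.Increment(ref FilterCalls);
        return n % 2 == 0;
    }

    public static void Main()
    {
        // Building the query does not run it.
        var query = Enumerable.Range(1, 100)
                              .AsParallel()
                              .Where(n => IsEven(n));
        Console.WriteLine("Calls after building the query: " + FilterCalls);

        // ToArray() enumerates the query, which runs the filter on every element.
        int[] evens = query.ToArray();
        Console.WriteLine("Calls after ToArray(): " + FilterCalls);
        Console.WriteLine("Even numbers found: " + evens.Length);
    }
}
```

The first line reports 0 calls, because nothing has been enumerated yet. Only after ToArray() has the filter been run for all 100 elements.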

Important fact: PLINQ uses all the processors of your system by default, with a maximum of 64 in .NET 4 (newer versions raise this limit). In some cases you might want to limit this, to give a machine some more power to take care of other tasks. Everybody deserves some CPU time! So don’t be greedy and use WithDegreeOfParallelism() on heavy queries. The following example runs at most 3 concurrent tasks, and therefore uses at most 3 processors, even if there are 16 available.

var queryDegree = employees.AsParallel()
                           .WithDegreeOfParallelism(3)
                           .Where(e => e.Skills.Contains("C#"));
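
To check that the limit is really respected, you can count how many work items run at the same time. This is only a sketch under assumptions: DegreeDemo and PeakConcurrency are hypothetical names, and Thread.Sleep stands in for real work.

```csharp
using System;
using System.Linq;
using System.Threading;

public static class DegreeDemo
{
    private static int _running; // work items executing right now
    private static int _peak;    // highest value of _running seen so far

    // Runs 'items' dummy work items with the given degree of parallelism
    // and returns the highest number that were observed running at once.
    public static int PeakConcurrency(int degree, int items)
    {
        _running = 0;
        _peak = 0;

        ParallelEnumerable.Range(0, items)
            .WithDegreeOfParallelism(degree)
            .ForAll(i =>
            {
                int now = Interlocked.Increment(ref _running);

                // Raise _peak to 'now' if 'now' is larger; retry if another thread got in first.
                int seen;
                while (now > (seen = Volatile.Read(ref _peak)) &&
                       Interlocked.CompareExchange(ref _peak, now, seen) != seen)
                {
                }

                Thread.Sleep(5); // stand-in for real work
                Interlocked.Decrement(ref _running);
            });

        return _peak;
    }

    public static void Main()
    {
        Console.WriteLine("Peak concurrency with a limit of 3: " + PeakConcurrency(3, 60));
    }
}
```

The reported peak never goes above 3, no matter how many cores the machine has. It can be lower than 3 if the machine has fewer cores or is busy.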

By default, PLINQ doesn’t preserve the order of the input in its output, because order preservation costs extra time. You can enable it, again in a very simple way, by calling AsOrdered() directly after AsParallel(). It’s good to know that OrderBy() will also give you ordered output.

var employees = GetEmployeesOrderedByName();

// AsOrdered() has to be called on the result of AsParallel();
// calling it after another operator such as Where() throws an InvalidOperationException
var queryOrdered = employees.AsParallel()
                            .AsOrdered()
                            .Where(e => e.Skills.Contains("C#"));
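
Here is a small, self-contained sketch of order preservation. OrderedDemo and DoubleInOrder are hypothetical names and a range of integers replaces the employee list, so that the output order is easy to verify.

```csharp
using System;
using System.Linq;

public static class OrderedDemo
{
    // Doubles the numbers 0..count-1 in parallel while keeping the source order.
    public static int[] DoubleInOrder(int count)
    {
        return Enumerable.Range(0, count)
                         .AsParallel()
                         .AsOrdered()        // called on the result of AsParallel()
                         .Select(n => n * 2)
                         .ToArray();
    }

    public static void Main()
    {
        int[] doubled = DoubleInOrder(1000);

        // Compare against the same calculation done sequentially.
        bool inSourceOrder = doubled.SequenceEqual(Enumerable.Range(0, 1000).Select(n => n * 2));
        Console.WriteLine("In source order: " + inSourceOrder);
    }
}
```

Without the AsOrdered() call the same elements come back, but their order is not guaranteed.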

We want more!

PLINQ has a lot more to offer than what we talked about here, so be sure to use Google and MSDN if you want to know more. Check out this “old” (2007) yet interesting article on PLINQ from MSDN magazine. An important read is Understanding Speedup in PLINQ on MSDN, which explains a bit more of how PLINQ works and why it sometimes defaults to sequential mode anyway.


Blog has moved! You should check this post out on kevinpelgrims.com

I have talked about parallel programming in .NET before, very briefly: Parallel programming in .NET – Introduction. This follow-up post is long overdue 🙂

What is the TPL?

The Task Parallel Library is a set of APIs present in the System.Threading and System.Threading.Tasks namespaces. The point of these APIs is to make parallel programming easier to read and code. The library exposes the Parallel.For and Parallel.ForEach methods to enable parallel execution of loops and takes care of spawning and terminating threads, as well as scaling to multiple processors.

How do we use the TPL?

The following code uses the sequential and the parallel approach to go over a for-loop with some heavy calculations. I use the Stopwatch class to compare the results in a command window.

// Requires: using System; using System.Diagnostics;

// Sequential
var watch = Stopwatch.StartNew();
for (int i = 0; i < 20000; i++)
{
    SomeHeavyCalculations(i);
}
watch.Stop();
// TotalSeconds is the full elapsed time; Elapsed.Seconds is only the seconds component (0-59)
Console.WriteLine("Sequential Time: " + watch.Elapsed.TotalSeconds);

// Parallel
watch.Restart();
System.Threading.Tasks.Parallel.For(0, 20000, i =>
{
    SomeHeavyCalculations(i);
});
watch.Stop();
Console.WriteLine("Parallel Time: " + watch.Elapsed.TotalSeconds);

The result of running this on my laptop (with multiple cores) looks like this:

Result of comparison sequential - parallel

As you can see, the parallel for-loop runs A LOT faster than the sequential version. By using all the available processing power, we can speed up loops significantly!

Below is a screenshot of the Task Manager, taken while the sequential and the parallel loops were running. At first (where the red arrow is pointing) only one core is used heavily. When the parallel code kicks in, all cores peak.

Task manager during comparison sequential - parallel

So, looking at the above code, implementing all this parallelism doesn’t seem to be that hard. The TPL makes it pretty easy to make use of all the processors in a machine.
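
Parallel.ForEach works the same way for collections. The sketch below is an illustration only: ForEachDemo and WordLengths are hypothetical names. Because the loop body runs on several threads at once, the results go into a ConcurrentDictionary instead of a regular Dictionary.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class ForEachDemo
{
    // Computes the length of every word in parallel.
    public static IDictionary<string, int> WordLengths(IEnumerable<string> words)
    {
        // Thread-safe collection: several iterations may write at the same time.
        var lengths = new ConcurrentDictionary<string, int>();

        Parallel.ForEach(words, word =>
        {
            lengths[word] = word.Length;
        });

        // Parallel.ForEach only returns after all iterations have finished.
        return lengths;
    }

    public static void Main()
    {
        var lengths = WordLengths(new[] { "task", "parallel", "library" });
        foreach (var pair in lengths)
        {
            Console.WriteLine(pair.Key + ": " + pair.Value);
        }
    }
}
```

The order in which the pairs are printed can differ between runs, since the iterations do not finish in a fixed order.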

Creating and running tasks

It’s possible to run a task implicitly by using the Parallel.Invoke method.

Parallel.Invoke(() => DoSomething());
Parallel.Invoke(() => DoSomething(), () => DoSomethingElse());

All you need to do is pass in a delegate, and using lambda expressions makes this easy. You can call a named method or have some inline code. If you want to start more tasks concurrently, you can just pass more delegates to the same Parallel.Invoke call.
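
A runnable version could look like the sketch below. InvokeDemo, Sum and RunBoth are hypothetical names that take the place of DoSomething() and DoSomethingElse(). Each delegate writes to its own array slot, so the two never touch the same data.

```csharp
using System;
using System.Threading.Tasks;

public static class InvokeDemo
{
    // Adds up all integers from 'from' to 'to' (inclusive).
    public static int Sum(int from, int to)
    {
        int total = 0;
        for (int i = from; i <= to; i++)
        {
            total += i;
        }
        return total;
    }

    // Runs two independent calculations, possibly at the same time.
    public static int[] RunBoth()
    {
        var results = new int[2];

        // Parallel.Invoke returns only after both delegates have finished.
        Parallel.Invoke(
            () => results[0] = Sum(1, 1000),
            () => results[1] = Sum(1001, 2000));

        return results;
    }

    public static void Main()
    {
        int[] results = RunBoth();
        Console.WriteLine("First half: " + results[0]);
        Console.WriteLine("Second half: " + results[1]);
    }
}
```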

If you want more control over what’s happening, you’ll need to use a Task object, though. The task object has some interesting methods and properties that we can use to control the flow of our parallel code.

It is possible to use new Task() to create a new task object, but it’s a best practice to use the task factory. (Note that you can’t use the task factory if you want to separate the creation and the scheduling of the task.)

// Create a task and start it
var task1 = new Task(() => Console.WriteLine("Task1 says hi!"));
task1.Start();

// Create and start a task in one go using the task factory
var task2 = Task.Factory.StartNew(() => Console.WriteLine("Task2 says hi!"));

You can also get results from a task, by accessing the Result property. If you access it before the task is completed, the thread will be blocked until the result is available.

Task<int> taskreturn = Task.Factory.StartNew(() =>
  {
    int calc = 3 + 3;
    return calc;
  });
int result = taskreturn.Result;

To be continued..

You can chain tasks by using the Task.ContinueWith method. It’s also possible to access the result of the preceding task in the next one, using the Result property.

// Regular continuation
// (PrintInt is assumed to be a helper that takes an int and returns a string)
Task<int> task1 = Task.Factory.StartNew(() => 5);
Task<string> task2 = task1.ContinueWith(x => PrintInt(x.Result));

// Chained continuation
Task<string> task3 = Task.Factory.StartNew(() => 5)
                     .ContinueWith(x => PrintInt(x.Result));

The task factory methods ContinueWhenAll() and ContinueWhenAny() make it possible to continue from multiple tasks. Both take an array of tasks to wait on and the action to run afterwards: ContinueWhenAll() runs it when all of the tasks have finished, ContinueWhenAny() as soon as one of them has. More about those functions can be found on MSDN.
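
As a rough sketch of both methods (ContinuationDemo, SumOfParts and FirstOfParts are hypothetical names, and the tasks just return constants):

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

public static class ContinuationDemo
{
    private static Task<int>[] StartParts()
    {
        return new[]
        {
            Task.Factory.StartNew(() => 1),
            Task.Factory.StartNew(() => 2),
            Task.Factory.StartNew(() => 3)
        };
    }

    // Continues once ALL tasks have finished and adds up their results.
    public static int SumOfParts()
    {
        Task<int>[] parts = StartParts();
        Task<int> total = Task.Factory.ContinueWhenAll<int, int>(
            parts, finished => finished.Sum(t => t.Result));
        return total.Result; // blocks until the continuation is done
    }

    // Continues as soon as ANY task has finished and returns that task's result.
    public static int FirstOfParts()
    {
        Task<int>[] parts = StartParts();
        Task<int> first = Task.Factory.ContinueWhenAny<int, int>(
            parts, winner => winner.Result);
        return first.Result;
    }

    public static void Main()
    {
        Console.WriteLine("Sum of all parts: " + SumOfParts());
        Console.WriteLine("First part to finish: " + FirstOfParts());
    }
}
```

The sum is always 6. Which task finishes first depends on scheduling, so the second value can be 1, 2 or 3.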

The force is strong with this one

We only looked at a few functions of the TPL and I think it’s clear this is a very powerful library. When working on applications that need a lot of processing power, parallel programming in .NET can make it a lot easier to improve performance.

Resources

Of course there is a lot more to TPL than covered in this small introduction, so go ahead and explore!


Blog has moved! You should check this post out on kevinpelgrims.com

Yesterday I attended a session on unit testing by Roy Osherove in Copenhagen. As I am trying to learn more about unit testing and TDD by applying it in a pet project, it was very interesting to see what a veteran like Roy had to say about the subject of unit testing. Also very interesting was his approach in this session, as he tried to teach us about good habits by showing us bad (real world) examples.

He also pointed out that anyone interested in writing unit tests and working test-driven should do test reviews. They can be used as a learning tool, for example by reviewing the tests of some open source projects. They can also be used internally, almost as a replacement for code reviews, because reviewing tests takes a lot less time and should give you a good idea of what the code is supposed to do (when working test-driven).

I took some notes during the session that I would like to share – and keep here for my own reference 😉 I wrote down most of his tips, so to the unit testing experts out there some of it might seem really basic. But I thought it was interesting to have it all written down.

Three important words

The basic, yet very important requirements for tests:

  • Readable
  • Maintainable
  • Trustworthy

Unit test VS integration test

Unit tests are used for testing stuff in memory. The tests don’t change and they’re static. They don’t depend on other things.

Integration tests would be used when there is a dependency on the filesystem, a database, a SharePoint server, etc.

Unit tests and integration tests each get their own test project!

Basics

  • Avoid test logic: too complicated
    • Ifs, switches, for loops, ..
  • No multiple asserts
    • This can be okay when you’re asserting using the same object
  • Avoid “magic numbers”
    • Using the number 42 somewhere raises the question whether it is important that the number is equal to 42; a good idea would be to use a variable with a descriptive name
  • Don’t assert on calculations or concatenations
    • Assert(“user,password”, Bleh()) is better than Assert(user + “,” + password, Bleh())
  • Don’t change or remove tests!
  • DateTime.Now (or friends like Random) –> NOT okay! These values change every time
  • Test only publics
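
To make a few of these tips concrete, here is my own sketch (not an example from the session). Greeter, GreeterTests and AssertEqual are hypothetical names, and AssertEqual only stands in for the assert of a real test framework. The tests contain no logic, use one assert each, compare against a literal instead of a calculation, and pass in a fixed time instead of reading DateTime.Now.

```csharp
using System;

// Hypothetical class under test: the current time is passed in
// instead of being read from DateTime.Now inside the method.
public static class Greeter
{
    public const int NoonHour = 12; // descriptive name instead of a magic number

    public static string Greet(DateTime now)
    {
        return now.Hour < NoonHour ? "Good morning" : "Good afternoon";
    }
}

public static class GreeterTests
{
    public static void Greet_BeforeNoon_ReturnsGoodMorning()
    {
        var nineInTheMorning = new DateTime(2012, 1, 1, 9, 0, 0);

        string greeting = Greeter.Greet(nineInTheMorning);

        AssertEqual("Good morning", greeting);
    }

    public static void Greet_AfterNoon_ReturnsGoodAfternoon()
    {
        var threeInTheAfternoon = new DateTime(2012, 1, 1, 15, 0, 0);

        string greeting = Greeter.Greet(threeInTheAfternoon);

        AssertEqual("Good afternoon", greeting);
    }

    // Minimal stand-in for a test framework's assert method.
    private static void AssertEqual(string expected, string actual)
    {
        if (expected != actual)
        {
            throw new Exception("Expected '" + expected + "' but got '" + actual + "'");
        }
    }

    public static void Main()
    {
        Greet_BeforeNoon_ReturnsGoodMorning();
        Greet_AfterNoon_ReturnsGoodAfternoon();
        Console.WriteLine("All tests passed");
    }
}
```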

Reuse

  • Factory methods (usually in the same class as the tests using them)
    • make_xx
  • Configure initial state
    • init_xx
  • Common tests in common methods
    • verify_xx

Tests are isolated

  • Don’t call other tests in a test
  • No shared state, have cleanup code for shared objects

Mock != Stub (in short)

  • Mock = used for asserts
  • Stub = used to help the test
  • Fake = can be both
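
The difference is easier to see in code. This is my own hand-rolled sketch, with every type name invented for the occasion: the user store is a stub because it only feeds the test a canned answer, and the logger is a mock because the test asserts against it.

```csharp
using System;
using System.Collections.Generic;

public interface IUserStore { bool IsValid(string user, string password); }
public interface ILogger { void Write(string message); }

// Hypothetical class under test.
public class LoginService
{
    private readonly IUserStore _users;
    private readonly ILogger _logger;

    public LoginService(IUserStore users, ILogger logger)
    {
        _users = users;
        _logger = logger;
    }

    public bool Login(string user, string password)
    {
        bool ok = _users.IsValid(user, password);
        if (!ok)
        {
            _logger.Write("Login failed for " + user);
        }
        return ok;
    }
}

// Stub: helps the test by always giving the same answer. Never asserted on.
public class AlwaysInvalidUserStore : IUserStore
{
    public bool IsValid(string user, string password) { return false; }
}

// Mock: records what happened so the test can assert on it.
public class RecordingLogger : ILogger
{
    public readonly List<string> Messages = new List<string>();
    public void Write(string message) { Messages.Add(message); }
}

public static class LoginServiceTests
{
    public static void Login_InvalidUser_LogsFailure()
    {
        var stubStore = new AlwaysInvalidUserStore();
        var mockLogger = new RecordingLogger();
        var service = new LoginService(stubStore, mockLogger);

        service.Login("kevin", "wrong-password");

        // Stand-in for a framework assert; the assert is against the mock.
        if (mockLogger.Messages.Count != 1)
        {
            throw new Exception("Expected exactly one log message");
        }
    }

    public static void Main()
    {
        Login_InvalidUser_LogsFailure();
        Console.WriteLine("Test passed");
    }
}
```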

Tip

If you need to test things related to a database, that would be an integration test, and it’s a good idea to use the TransactionScope class in .NET so you can roll back everything when the test is done.
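
The pattern could be sketched like this. RollbackDemo and RunInsideRolledBackTransaction are hypothetical names, and the actual database calls are left out: in a real test, connections opened inside the scope would enlist in the ambient transaction. The trick is simply never to call scope.Complete().

```csharp
using System;
using System.Transactions;

public static class RollbackDemo
{
    // Runs 'databaseWork' inside a transaction that is always rolled back.
    // Returns true if an ambient transaction was available during the work.
    public static bool RunInsideRolledBackTransaction(Action databaseWork)
    {
        bool hadAmbientTransaction;

        using (var scope = new TransactionScope())
        {
            hadAmbientTransaction = Transaction.Current != null;

            databaseWork();

            // scope.Complete() is deliberately NOT called, so disposing
            // the scope rolls the transaction back.
        }

        return hadAmbientTransaction;
    }

    public static void Main()
    {
        bool ranInTransaction = RunInsideRolledBackTransaction(() =>
        {
            // Placeholder: the inserts, updates and asserts of the
            // integration test would go here.
        });

        Console.WriteLine("Ran inside a transaction: " + ranInTransaction);
    }
}
```

In a test class this usually ends up in the setup and cleanup methods: create the scope before each test and dispose it afterwards.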
