Archive for the ‘parallel’ Category

Blog has moved! You should check this post out on kevinpelgrims.com

Now that we’ve covered the Task Parallel Library, it’s time to move on.

What is PLINQ?

PLINQ stands for Parallel LINQ and is the parallel version of LINQ to Objects. Just like LINQ, it works on any IEnumerable<T>, and queries use deferred execution: nothing runs until the results are enumerated. Using PLINQ is even easier than using the Task Parallel Library!
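To make the deferred execution part concrete, here is a minimal sketch. The DeferredDemo class and its counter are made up for this illustration; only AsParallel(), Select() and ToList() are actual PLINQ calls.

```csharp
using System;
using System.Linq;
using System.Threading;

public static class DeferredDemo
{
    // Counts how many elements the query has actually processed.
    private static int evaluated;

    // Returns the counter value before and after the query is enumerated.
    public static int[] Run()
    {
        evaluated = 0;

        // Defining the query does not execute it.
        var query = Enumerable.Range(1, 100)
                              .AsParallel()
                              .Select(n =>
                              {
                                  Interlocked.Increment(ref evaluated);
                                  return n * 2;
                              });

        int before = evaluated;       // nothing has run yet

        var results = query.ToList(); // enumeration triggers the parallel work

        int after = evaluated;        // every element has been processed
        return new[] { before, after };
    }

    public static void Main()
    {
        int[] counts = Run();
        Console.WriteLine("Before enumeration: " + counts[0]); // 0
        Console.WriteLine("After enumeration: " + counts[1]);  // 100
    }
}
```

The counter is updated with Interlocked.Increment because the lambda may run on several threads at once.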

Regular for loop and LINQ compared to PLINQ (with time in seconds)

How do we use PLINQ?

You can make an existing LINQ query parallel simply by adding a call to AsParallel(). That’s how easy it is! You get the power of parallelization while keeping the readability of LINQ. Isn’t that great?

var employees = GetEmployees();

// Regular LINQ: employees that have C# as a skill
var query = employees.Where(e => e.Skills.Contains("C#"));

// Extension method style PLINQ
var queryParallel1 = employees.AsParallel()
                              .Where(e => e.Skills.Contains("C#"));

// Query expression style PLINQ (same filter as above)
var queryParallel2 = from e in employees.AsParallel()
                     where e.Skills.Contains("C#")
                     select e;

Important fact: by default, PLINQ uses all the processors of your system, up to a maximum of 64 in .NET 4.0. In some cases you might want to limit this, to leave the machine some power for other tasks. Everybody deserves some CPU time! So don’t be greedy and use WithDegreeOfParallelism() on heavy queries. The following example uses at most 3 processors, even if there are 16 available.

// Run the filter on at most 3 processors
var queryDegree = employees.AsParallel()
                           .WithDegreeOfParallelism(3)
                           .Where(e => e.Skills.Contains("C#"));

By default, PLINQ does not preserve the order of the input in its output, because order preservation costs extra time. You can enable it in a very simple way, though, by calling AsOrdered() directly after AsParallel(). It’s good to know that an OrderBy() in the query will also give you ordered output.

var employees = GetEmployeesOrderedByName();

// AsOrdered() has to be applied to the result of AsParallel(),
// before the other query operators
var queryOrdered = employees.AsParallel()
                            .AsOrdered()
                            .Where(e => e.Skills.Contains("C#"));
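For a version you can run without an employee list, here is a small sketch that squares a range of numbers in parallel and keeps them in source order. OrderedDemo and SquaresInOrder are hypothetical names used only for this example.

```csharp
using System;
using System.Linq;

public static class OrderedDemo
{
    // Squares the numbers 1..count in parallel while keeping the source order.
    public static int[] SquaresInOrder(int count)
    {
        return Enumerable.Range(1, count)
                         .AsParallel()
                         .AsOrdered()          // applied directly to the result of AsParallel()
                         .Select(n => n * n)
                         .ToArray();
    }

    public static void Main()
    {
        // Prints: 1, 4, 9, 16, 25, 36, 49, 64, 81, 100
        Console.WriteLine(string.Join(", ", SquaresInOrder(10)));
    }
}
```

If you leave out AsOrdered(), the same squares come back, but their order is no longer guaranteed.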

We want more!

PLINQ has a lot more to offer than what we talked about here, so be sure to use Google and MSDN if you want to know more. Check out this “old” (2007) yet interesting article on PLINQ from MSDN magazine. An important read is Understanding Speedup in PLINQ on MSDN, which explains a bit more of how PLINQ works and why it sometimes defaults to sequential mode anyway.

I have talked about parallel programming in .NET before, very briefly: Parallel programming in .NET – Introduction. This follow-up post is long overdue 🙂

What is the TPL?

The Task Parallel Library is a set of APIs present in the System.Threading and System.Threading.Tasks namespaces. The point of these APIs is to make parallel programming easier to read and code. The library exposes the Parallel.For and Parallel.ForEach methods to enable parallel execution of loops and takes care of spawning and terminating threads, as well as scaling to multiple processors.

How do we use the TPL?

The following code runs a for-loop with some heavy calculations, first sequentially and then in parallel. I use the Stopwatch class to compare the results in a command window.

// Requires: using System.Diagnostics;

// Sequential
var watch = Stopwatch.StartNew();
for (int i = 0; i < 20000; i++)
{
    SomeHeavyCalculations(i);
}
watch.Stop();
// TotalSeconds is the full elapsed time; Seconds is only the 0-59 seconds component
Console.WriteLine("Sequential Time: " + watch.Elapsed.TotalSeconds);

// Parallel
watch = Stopwatch.StartNew();
System.Threading.Tasks.Parallel.For(0, 20000, i =>
{
    SomeHeavyCalculations(i);
});
watch.Stop();
Console.WriteLine("Parallel Time: " + watch.Elapsed.TotalSeconds);

The result of running this on my laptop (with multiple cores) looks like this:

Result of comparison sequential - parallel

As you can see, the parallel for-loop runs a lot faster than the sequential version. By using all the available processing power, we can speed up loops significantly!

Below is a screenshot of the task manager, taken while the sequential and the parallel loop were running. At first (where the red arrow is pointing) only 1 core is used heavily. When the parallel code kicks in, all cores peak.

Task manager during comparison sequential - parallel

So, looking at the above code, implementing all this parallelism doesn’t seem to be that hard. The TPL makes it pretty easy to make use of all the processors in a machine.
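Parallel.ForEach works the same way for collections. Here is a minimal sketch, assuming a made-up calculation (SumOfSquaresUpTo) as a stand-in for the heavy work. ForEachDemo and ProcessAll are also just names for this example.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading.Tasks;

public static class ForEachDemo
{
    // Hypothetical stand-in for an expensive per-item calculation.
    public static long SumOfSquaresUpTo(int n)
    {
        long sum = 0;
        for (int i = 1; i <= n; i++)
        {
            sum += (long)i * i;
        }
        return sum;
    }

    // Processes every input in parallel and collects the results per input value.
    public static ConcurrentDictionary<int, long> ProcessAll(int[] inputs)
    {
        var results = new ConcurrentDictionary<int, long>();
        Parallel.ForEach(inputs, n =>
        {
            // ConcurrentDictionary is safe to write to from multiple threads
            results[n] = SumOfSquaresUpTo(n);
        });
        return results;
    }

    public static void Main()
    {
        var results = ProcessAll(new[] { 1000, 2000, 3000 });
        foreach (var pair in results.OrderBy(p => p.Key))
        {
            Console.WriteLine(pair.Key + " -> " + pair.Value);
        }
    }
}
```

The loop body runs on several threads at once, which is why the results go into a ConcurrentDictionary instead of a regular Dictionary.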

Creating and running tasks

It’s possible to run a task implicitly by using the Parallel.Invoke method.

Parallel.Invoke(() => DoSomething());
Parallel.Invoke(() => DoSomething(), () => DoSomethingElse());

All you need to do is pass in a delegate, and lambda expressions make this easy. You can call a named method or write some inline code. If you want to run more tasks concurrently, you can just pass more delegates to the same Parallel.Invoke call.
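As a runnable sketch of both styles, the example below passes one named method and one inline lambda to the same call. InvokeDemo, its counter and RunBoth are hypothetical names for this illustration.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class InvokeDemo
{
    // Number of delegates that have finished.
    private static int completed;

    // Named method, passed to Parallel.Invoke as a method group.
    private static void DoSomething()
    {
        Interlocked.Increment(ref completed);
    }

    // Runs both delegates and returns how many of them completed.
    public static int RunBoth()
    {
        completed = 0;
        Parallel.Invoke(
            DoSomething,                                  // named method
            () => Interlocked.Increment(ref completed));  // inline code
        // Parallel.Invoke only returns after all delegates have finished
        return completed;
    }

    public static void Main()
    {
        Console.WriteLine("Delegates completed: " + RunBoth()); // 2
    }
}
```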

If you want more control over what’s happening, you’ll need to use a Task object, though. The task object has some interesting methods and properties that we can use to control the flow of our parallel code.

It is possible to use new Task() to create a new task object, but it’s a best practice to use the task factory. (Note that you can’t use the task factory if you want to separate the creation and the scheduling of the task.)

// Create a task and start it separately
var task1 = new Task(() => Console.WriteLine("Task1 says hi!"));
task1.Start();

// Create and start a task in one go using the task factory
var task2 = Task.Factory.StartNew(() => Console.WriteLine("Task2 says hi!"));

You can also get results from a task, by accessing the Result property. If you access it before the task is completed, the thread will be blocked until the result is available.

Task<int> taskreturn = Task.Factory.StartNew(() =>
  {
    int calc = 3 + 3;
    return calc;
  });
int result = taskreturn.Result;

To be continued..

You can chain tasks by using the Task.ContinueWith method. It’s also possible to access the result of the preceding task in the next one, using the Result property.

// Regular continuation (PrintInt takes an int and returns a string)
Task<int> task1 = Task.Factory.StartNew(() => 5);
Task<string> task2 = task1.ContinueWith(x => PrintInt(x.Result));

// Chained continuation
Task<string> task3 = Task.Factory.StartNew(() => 5)
                     .ContinueWith(x => PrintInt(x.Result));

The task factory methods ContinueWhenAll() and ContinueWhenAny() make it possible to continue from multiple tasks. They take an array of tasks to wait on and the action to run when all of them (or any one of them) have finished. More about those methods can be found on MSDN.
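A minimal sketch of both methods, assuming three trivial tasks that each return a number. ContinuationDemo and its method names are made up for this example.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

public static class ContinuationDemo
{
    private static Task<int>[] StartParts()
    {
        return new[]
        {
            Task.Factory.StartNew(() => 1),
            Task.Factory.StartNew(() => 2),
            Task.Factory.StartNew(() => 3)
        };
    }

    // Sums the results once every task in the array has finished.
    public static int SumWhenAllDone()
    {
        Task<int> total = Task.Factory.ContinueWhenAll(StartParts(), done =>
        {
            return done.Sum(t => t.Result);
        });
        return total.Result;
    }

    // Returns the result of whichever task finishes first.
    public static int FirstFinished()
    {
        Task<int> first = Task.Factory.ContinueWhenAny(StartParts(), winner =>
        {
            return winner.Result;
        });
        return first.Result;
    }

    public static void Main()
    {
        Console.WriteLine("Sum of all results: " + SumWhenAllDone()); // 6
        // Which task finishes first can differ from run to run
        Console.WriteLine("First finished: " + FirstFinished());
    }
}
```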

The force is strong with this one

We only looked at a few features of the TPL, and I think it’s clear that this is a very powerful library. When working on applications that need a lot of processing power, parallel programming in .NET makes it a lot easier to improve performance.

Resources

Of course there is a lot more to TPL than covered in this small introduction, so go ahead and explore!

What is parallel programming?

Sequential programming assumes a set of instructions that are executed sequentially by a single processor. The point of parallel programming is to program systems that consist of multiple processors and therefore have multiple simultaneous instruction streams.

Why do we need it?

Chip clock speeds are no longer increasing the way they used to, so most future improvements in computer speed will have to come from parallelism. In other words, more and more processors are being added to new computers. These chips enable software to do multiple tasks at the same time, with each processor working on its own task. So one way to make applications run faster is to use parallel programming.
This is not the same as multithreading, though. Multithreading can be done on a single-core CPU. In that case, two threads can never execute at the same time. The operating system (which takes care of scheduling the threads) divides the processor’s time between all running threads. When too many threads are executing at once, your system slows down because there is not enough processor time for all of them to run at full speed.

Concurrent vs Parallel

To be clear, parallel programming is not the same as concurrent programming. As they say, a picture says more than a thousand words:

Comparison between concurrent and parallel

Well, maybe it could use some explanation.
Concurrent applications tend to create a thread that handles a whole series of tasks. Most of the time these concurrent applications create threads because they need an isolated process for a concurrent event.
Parallel applications divide a process into small tasks that are executed on separate threads. Because the tasks are small, the threads can be divided evenly over the processors, resulting in very efficient use of a multi-core CPU.

Parallel programming in .NET 4.0

In .NET Framework 4.0, parallel programming support is included in the form of the Task Parallel Library (TPL) and Parallel LINQ (PLINQ). These are built on top of the thread pool that already existed in previous versions of the .NET Framework. Parallel code is pretty easy to implement, but also very easy to misuse or overuse. Caution is required!

Overview of parallelism in .NET (MSDN)

The following posts will give an introduction to parallel programming in C# 4.0 and provide some guidelines and best practices.
