Programming

# Parallel Programming with Microsoft Visual Studio 2010 : Task Parallelism - Sort Examples

8/18/2011 3:49:00 PM
Sorting a collection is the one way to demonstrate task parallelism. The examples in this article sort an integer collection. As you are probably aware, there are various sorting algorithms; this section uses three different ones, to compare them and identify the quickest of the three—a race of sort algorithms!

The sort algorithms to compare are bubble sort, insertion sort, and pivot sort. In this example, the potential dependency is the integer collection, which is sorted simultaneously by different sort algorithms. This is a problem because the integer collection is being sorted in place. There are various techniques for resolving a dependency. One solution is to make copies of the shared data and provide each task with a private copy, which is the approach chosen here; each task (sort algorithm) gets a copy of the original integer collection as a parameter.

1. Bubble Sort

Bubble sort is a commonly used sort algorithm. However, this does not mean it is the quickest sort algorithm; in fact, it’s probably one of the slowest sort algorithms. You’ll see just how slow a bubble sort is when compared to other sort algorithms. Popularity does not always equate to quality.

A bubble sort performs binary comparisons. It typically begins at the start of the collection and compares the value of the first two elements. If the second element is less than the first, it swaps the element’s position in the collection (assuming an ascending sort). It then performs the same comparison and swap operation on the second and third elements, and so on until the end of the collection. The sort is then repeated from the beginning until the collection is fully sorted. Here is the sample code.

`public static void BubbleSort(List<int> sortedList) {    int count = sortedList.Count;    int temporary;    for (int iteration = 0; iteration < count; ++iteration) {        for (int index = 0; index + 1 < count; ++index) {            if (sortedList[index] > sortedList[index + 1]) {                temporary = sortedList[index];                sortedList[index] = sortedList[index + 1];                sortedList[index + 1] = temporary;            }        }    }}`

#### 2. Insertion Sort

An insertion sort iterates over the collection several times. Each iteration places a selected item in the correct location in the sort sequence. For example, the initial iteration would scan the collection and place the first element in the correct position for the sorted sequence. The second iteration would scan the list again and place the second element in the correct position. This continues until each element of the list is positioned correctly. Insertion sorts can use two lists: an unsorted list (input) and a sorted list (output); however, this example uses only one list, performing the sort in place. The sort repositions each element by using an insert, reposition, and delete operation. Here’s the code.

`public static void InsertionSort(List<int> sortedList, CancellationToken token) {    int count = sortedList.Count;    int currentLocation, currentValue, insertionLocation;    sortedList.Insert(0, 0);    for (int location = 1; location < count + 1; ++location) {        currentLocation = location;        insertionLocation = location - 1;        currentValue = sortedList[currentLocation];        while (sortedList[insertionLocation] > currentValue) {            sortedList[currentLocation] = sortedList[insertionLocation];            --currentLocation;            --insertionLocation;        }        sortedList[currentLocation] = currentValue;    }    sortedList.Remove(0);}					  `

#### 3. Pivot Sort

Pivot sorts are fun! A pivot sort is commonly known as a quick sort. The algorithm first chooses a pivot value, dividing the collection into two collections. The first collection contains the elements that are less than the pivot value, while the second collection contains values greater than the pivot. You then perform a pivot sort on the two sub-collections by using new pivot values. You continue to divide and sort collections recursively until the sub-collections each contain one element. Finally, you assemble the sorted sub-collections to create the sorted list. Because this example sorts in ascending order, it always places lesser values in the left sub-collection and greater values in the right sub-collection. Here’s the sample code.

`public static void PivotSort(List<int> integerList, int start, int end, int pivot){    if (start < end)    {        pivot = integerList[end];        int location = start;        int bound = end;        while (location < bound)        {            if (integerList[location] < pivot)            {                ++location;            }            else            {                integerList[bound] = integerList[location];                integerList[location] = integerList[bound - 1];                --bound;            }        }        integerList[bound] = pivot;        PivotSort(integerList, start, bound - 1, pivot);        PivotSort(integerList, bound + 1, end, pivot);    }					  `

#### 4. Using the Barrier Class

For a horse race to be fair, the horses need to start at the same time. Horse racing uses a starter gate; the .NET Framework offers the Barrier class. The Barrier class is in the System.Threading namespace. It is introduced in the .NET Framework 4. You use the Barrier classes to create logical gates or phases in your application. When you initialize a Barrier, you can set a maximum number of elements. Until the maximum number is reached, adding an element to the Barrier will block the current task. When the Barrier reaches capacity, it “spills.” At that point, the waiting tasks execute. So when the Barrier is full (all the horses are at the gate), it releases all tasks and the race begins.

This diagram demonstrates a barrier with a capacity of three.

You could also think of a Barrier as a bucket. When the bucket is full, it will tip over. Of course, at that point, everything in the bucket is released.

Here are the helpful instance methods of the Barrier class:

• Barrier constructor(int participantCount) This creates an instance and sets Barrier capacity.

• void SignalAndWait() This signals that a task has reached a Barrier.

• long AddParticipants(int participantCount) This increases the capacity of the Barrier.

• void RemoveParticipants(int participantCount) This reduces the capacity of the Barrier.

The sorting example uses a Barrier to mark the start of the sorting phase. The maximum for the Barrier is set to three (bubble, insertion, and pivot sort). Using the Barrier guarantees that the three sort routines start at the same time, which makes for a fair race.

Each sort algorithm is started in a separate code region. Each region follows the same general pattern:

1. Duplicate list

2. Create task for sort algorithm

Here is the Insertion Sort Region, which is one of the three sort regions.

`// Insertion SortRegionList<int> insertionList = integerList.ToList();Task taskInsertionSort = new Task(() => {    sortBarrier.SignalAndWait();    using (new SortResults("Insertion Sort")){        SortAlgorithms.InsertionSort(insertionList);    }});taskInsertionSort.Start();`

You also need some code to display the duration of each sort algorithm to find out which sort algorithm is the quickest. This code waits for the sorting tasks to complete, then iterates and displays the results.

`Task.WaitAll(new Task[] { taskBubbleSort, taskInsertionSort, taskPivotSort });foreach (string result in SortResults.results){    Console.WriteLine(result);}					  `

When you run this, you’ll get results similar to the ones below, which show the result of sorting 10,000 integers with durations in milliseconds. And the winner is...pivot sort! The race was not even close.

Where does the code calculate the duration? First, a Stopwatch class, which is defined in the System.Diagnostics namespace, is used to track the duration of each algorithm. In this program, the SortResults class is a wrapper for the Stopwatch class and calculates the duration of a sort algorithm.

Here is a general description of the SortResults class.

1. In the SortResults class, I create an instance of the Stopwatch class as a member field.

2. The constructor calls the Start method of the Stopwatch instance and sets the name of the sort algorithm.

3. The Dispose method calls the Stop method of the Stopwatch. The result is then formatted and added to a results collection. The results collection is later iterated to display the results.

Here is the code for the class.

`class SortResults : IDisposable{    // Instance Members    public  SortResults(string name){         sortName = name;         _stopwatch.Start();    }    private Stopwatch _stopwatch = new Stopwatch();    private string sortName;    public void Dispose(){        _stopwatch.Stop();        results.Add(string.Format("{0} : {1:N0}ms", sortName,            _stopwatch.ElapsedMilliseconds));    }    // Classwise members (static)    // The rest of the class...}`

Each sort region uses the SortClass via a using statement (see the sort regions discussion, earlier in this section). Here is a snippet of code from the bubble sort region.

`using (new SortResults("Bubble Sort")){  // Perform sort}`

What is being done in the using statement and subsequent block? SortResults objects are disposable and can be instantiated in a using statement. In our example, an anonymous SortResult object is created. As shown previously, the SortResults object creates a Stopwatch and starts it in the constructor. The closing brace of the using statement is important. When execution reaches the closing brace, it calls the SortResults.Dispose method, which stops the Stopwatch and records the results.

#### 5. Refactoring the Pivot Sort

The pivot sort algorithm performed better than the bubble and insertion sort algorithms. It is the champion. Magnifique! However, even the pivot sort could be improved. One challenge of parallel programming is identifying potential opportunities for parallelism. Of course, not every opportunity for parallelism should be exploited. Sometimes the cost of parallelism offsets the benefits.

The pivot sort works by pivoting around a value. It creates less than and greater than collections in which the pivot value is the delineating factor between the two. It then sorts each sub-collection along a new pivot value, and so on. In the pivot sort example used previously, the sort is performed sequentially. Sorting the sub-collections in parallel could improve performance, but possibly not in every circumstance. For example, if one of the sub-collections were significantly larger than the other, sorting the collections in parallel would provide only a limited advantage. Sorting the sub-collections in parallel only makes sense when the left and right collections are relatively close to the same size.

In the following example, the code for the pivot sort has been refactored to sort sub-collections in parallel. In this example, the sub-collections must be of similar size and must contain more than 50 items to sort in parallel; otherwise, the algorithm chooses a sequential sort. These changes improve the pivot sort by about 20 percent for a collection of 10,000 integers.

These are the steps to pivot sort the sub-collections.

1. If a collection contains fewer than 50 items, don’t perform a parallel sort.

2. Check whether the collections have similar sizes.

3. If the sizes are similar, sort the sub-collections in parallel.

4. When the sizes differ significantly, sort the sub-collections sequentially.

Here is the refactored sample code.

`if (sortedList.Count >50) {double delta = ((double)left.Count) / right.Count;if ((delta > .75) && (delta < 1.333)) {var taskLeft = Task.Factory.StartNew(() => PivotSort(left));                var taskRight = Task.Factory.StartNew(() => PivotSort(right));                rleft = taskLeft.Result;                rright = taskRight.Result;}else{            rleft = PivotSort(left);            rright = PivotSort(right);}}					  `

 Top 10