Welcome to

Magenic Technologies Community Blog

Sign in | Join | Help

Parallel For Loops with Thread Local State

A recent MSDN post contained the following code:

string[] entropy_array = { "e", "r", "t", "y", "u", "ı", "o", "p",
                     "ğ", "ü", "a", "s", "d", "f", "g", "h", "j", "k", "l",
                     "ş", "i", "z", "c", "v", "b", "n", "m", "ç", "ö", " "};

Dictionary<string, int> entropy = new Dictionary<string, int>();
     

Parallel.For(0, entropy_array.Length, i =>
{
    for (int j = 0; j < entropy_array.Length; j++)
    {
        entropy.Add(entropy_array[i] + entropy_array[j], 0);
    }
});

The author of the post, who wanted to add all two letter combinations from the letters found in the entropy_array array into the entropy dictionary, wondered why the number of elements in the dictionary varied from run to run.

The problem with the code is that the introduction of Parallel.For() brings multiple threads into the picture, and, since Dictionary<T1, T2>.Add() isn't thread safe, correctness can't be guaranteed.

Huseyin Yildiz proposed a solution to this problem. His solution involved setting up a dictionary for each thread created by Parallel.For(), letting each thread do its work, and placing the intermediate results in the thread-local dictionary. Once the thread ended, the results from the thread-local dictionary could be added to the main dictionary.

This scenario is entirely possible using one of the overloads to Parallel.For(). I'll start by posting the solution, built as a simple console application, and will spend the rest of this blog post explaining the code:

class Program
{
    static void Main(string[] args)
    {
        Dictionary<string, int> Entropy = new Dictionary<string, int>();
        string[] EntropyArray = { "e", "r", "t", "y", "u", "ı", "o", "p",
                     "ğ", "ü", "a", "s", "d", "f", "g", "h", "j", "k", "l",
                     "ş", "i", "z", "c", "v", "b", "n", "m", "ç", "ö", " "};

        Parallel.For(
            0,
            EntropyArray.Length,
            () => new Dictionary<string, int>(),
            (i, state) =>
            {
                for (int j = 0; j < EntropyArray.Length; j++)
                {
                    state.ThreadLocalState.Add(EntropyArray[i] + EntropyArray[j], 0);
                }
            },
            (SubList) =>
            {
                lock (EntropyArray)
                {
                    foreach (KeyValuePair<string, int> Pair in SubList)
                        Entropy.Add(Pair.Key, Pair.Value);
                }
            }
            );
    }
}

The overload of Parallel.For() employed here takes five parameters:

  • int fromExclusive: The zero-based index used as the start of the loop.
  • int toExclusive: The zero-based index used as the stopping point of the loop. Once the loop indexer gets to this value, processing wil stop.
  • Func<Tlocal> threadLocalSelector: A function that returns the object to be used as the thread-local object for the thread executing its piece of the total loop.
  • Action<int,ParallelState<Tlocal>> body: The action to be taken by each loop iteration.
  • Action<Tlocal> threadLocalCleanup: A function specifying cleanup code to be executed once the thread's portion of the processing has finished.

The fromExclusive parameter in my example is simple:

    0,

The toExclusive parameter in my example is also simple:

    EntropyArray.Length,

The threadLocalSelector parameter is a function body that creates a new string-to-int dictionary to be used as the thread-local object for the thread:

    () => new Dictionary<string, int>(),

The body parameter performs the processing, placing the results in the thread-local dictionary:

    (i, state) =>
    {
        for (int j = 0; j < EntropyArray.Length; j++)
        {
            state.ThreadLocalState.Add(EntropyArray[i] + EntropyArray[j], 0);
        }
    },

The first parameter passed to the body is the loop indexer, and the second parameter is an object of type ParallelState, which has a property called ThreadLocalState containing the thread-local object.

Finally, the threadLocalCleanup parameter specifies the code run when the thread has finished its execution:

    (SubList) =>
    {
        lock (EntropyArray)
        {
            foreach (KeyValuePair<string, int> Pair in SubList)
                Entropy.Add(Pair.Key, Pair.Value);
        }
    }

This code, which is called with the thread-local object as a parameter, adds the items in the thread-local list to the main list. Because the Add() method is not thread safe, the code block is locked.

The caveat to all of this is that this code, with its copying and object allocations, may not be as performant as a sequential algorithm, given the small input set. Given a larger input set, however, the performance gains will most likely outweigh the additional overhead.

Published Monday, March 31, 2008 8:14 PM by jefff

Comments

No Comments

Anonymous comments are disabled