Getting Started with OpenMP-II
Managing Shared and Private Data
Nearly every useful loop reads or writes memory, and it is the programmer's job to tell the compiler which pieces of memory should be shared among the threads and which should be kept private. When memory is identified as shared, all threads access the exact same memory location. When memory is identified as private, a separate copy of the variable is made for each thread, and only that thread accesses it. When the loop ends, these private copies are destroyed. By default, all variables are shared except for the loop variable of the parallelized loop, which is private. Memory can be declared as private in the following two ways.
* Declare the variable inside the loop (more precisely, inside the parallel OpenMP construct) without the static keyword.
* Specify the private clause on an OpenMP directive.
The following loop fails to function correctly because the variable temp is shared. It needs to be private.
// WRONG. Fails due to shared memory.
// Variable temp is shared among all threads, so while one thread
// is reading variable temp another thread might be writing to it
#pragma omp parallel for
for (i=0; i < 100; i++)
{
    temp = array[i];
    array[i] = do_something(temp);
}
The following two examples both declare the variable temp as private memory, which fixes the problem.
// This works. The variable temp is now private
#pragma omp parallel for
for (i=0; i < 100; i++)
{
    // variables declared within a parallel construct
    // are, by definition, private
    int temp;
    temp = array[i];
    array[i] = do_something(temp);
}
// This also works. The variable temp is declared private
#pragma omp parallel for private(temp)
for (i=0; i < 100; i++)
{
    temp = array[i];
    array[i] = do_something(temp);
}
Every time you instruct OpenMP to parallelize a loop, you should carefully examine all memory references, including those made by called functions. Variables declared within a parallel construct are private unless they are declared with the static keyword; static variables are not allocated on the stack, so all threads share a single copy.
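The warning about called functions and static variables can be made concrete with a small sketch. The names scale_unsafe, scale_safe, and transform are hypothetical and exist only for this illustration; they stand in for whatever do_something does in the examples above.

```c
// Hypothetical helper: NOT safe to call from a parallel loop.
// The static local occupies a single memory location, so every
// thread that calls this function reads and writes the same scratch.
int scale_unsafe(int x)
{
    static int scratch;      // shared: one copy for the whole program
    scratch = x * 2;
    return scratch + 1;      // another thread may have overwritten scratch
}

// Safe variant: an automatic local lives on the calling thread's stack,
// so each thread works on its own copy.
int scale_safe(int x)
{
    int scratch = x * 2;     // private to the calling thread
    return scratch + 1;
}

// Replaces every element array[i] with 2 * array[i] + 1.
void transform(int *array, int n)
{
    int i;                   // loop variable: private by default
    #pragma omp parallel for
    for (i = 0; i < n; i++) {
        int temp = array[i]; // declared inside the construct: private
        array[i] = scale_safe(temp);
    }
}
```

Calling scale_unsafe from a single thread gives the right answer, which is why this kind of bug tends to surface only after a loop is parallelized. With GCC or Clang, the pragma takes effect when the file is compiled with -fopenmp; without that flag the pragma is ignored and the loop runs serially.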
Reductions
Loops that accumulate a value are fairly common, and OpenMP has a specific clause to accommodate them. Consider the following loop that calculates the sum of an array of integers.
sum = 0;
for (i=0; i < 100; i++) {
    // sum needs to be shared to generate the correct result,
    // but private to avoid race conditions from parallel execution
    sum += array[i];
}
The variable sum in the previous loop must be shared to generate the correct result, but it must also be private so that multiple threads can update it without a race condition. To resolve this conflict, OpenMP provides the reduction clause, which efficiently combines the per-thread partial results for one or more variables in a loop. The following loop uses the reduction clause to generate the correct results.
sum = 0;
#pragma omp parallel for reduction(+:sum)
for (i=0; i < 100; i++) {
    sum += array[i];
}
Under the hood, OpenMP provides a private copy of the variable sum for each thread, and when the loop ends, it adds the private values together and places the result in the one global copy of the variable.
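To make the "under the hood" description concrete, the sketch below places the reduction clause next to a hand-written equivalent. The function names sum_reduction and sum_manual are hypothetical, and the manual version only approximates what an OpenMP runtime does; real implementations are free to combine the partial sums more efficiently.

```c
// Sum an array using the reduction clause.
long sum_reduction(const int *array, int n)
{
    long sum = 0;
    int i;
    #pragma omp parallel for reduction(+:sum)
    for (i = 0; i < n; i++)
        sum += array[i];
    return sum;
}

// Roughly what the clause arranges: each thread accumulates into its
// own private partial sum, and the partial sums are then added to the
// shared variable one thread at a time.
long sum_manual(const int *array, int n)
{
    long sum = 0;               // shared result
    #pragma omp parallel
    {
        long local = 0;         // private copy, starts at the identity for +
        int i;                  // declared in the region, so private
        #pragma omp for
        for (i = 0; i < n; i++)
            local += array[i];

        #pragma omp critical    // only one thread updates sum at a time
        sum += local;
    }
    return sum;
}
```

Both functions return the same value for the same input. The reduction clause is usually the better choice because it is shorter and leaves the combining strategy to the implementation.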
The following table lists the possible reductions, along with the initial values (which are also the mathematical identity values) for the temporary private variables.
Operator                  Initial value of the private copy
+  (addition)             0
-  (subtraction)          0
*  (multiplication)       1
&  (bitwise AND)          ~0
|  (bitwise OR)           0
^  (bitwise XOR)          0
&& (logical AND)          1
|| (logical OR)           0
Multiple reductions in a loop are possible: list several comma-separated variables in one reduction clause, and specify several reduction clauses on a given parallel construct. The only requirements are:
1. each reduction variable can be listed in just one reduction clause,
2. reduction variables cannot be declared constant, and
3. reduction variables cannot be declared private in the parallel construct.
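As an illustration of these rules, the following sketch computes three results in a single pass. The function name stats and its output parameters are hypothetical. Two variables share the + operator in one clause, and a second clause handles the product; each variable appears in exactly one clause, none is const, and none is also listed as private.

```c
// Compute the sum, the sum of squares, and the product of an array
// in a single parallel loop with multiple reductions.
void stats(const int *array, int n,
           long *sum_out, long *sumsq_out, long *prod_out)
{
    long sum = 0;     // identity for +
    long sumsq = 0;   // identity for +
    long prod = 1;    // identity for *
    int i;

    #pragma omp parallel for reduction(+:sum, sumsq) reduction(*:prod)
    for (i = 0; i < n; i++) {
        sum   += array[i];
        sumsq += (long)array[i] * array[i];
        prod  *= array[i];
    }

    *sum_out = sum;
    *sumsq_out = sumsq;
    *prod_out = prod;
}
```

For the input {1, 2, 3, 4} this yields a sum of 10, a sum of squares of 30, and a product of 24, regardless of how many threads run the loop.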