I am struggling to use Parallel.For in place of the for loop in the code below. Since the CoefficientVector array is rather big, it makes sense to me to reset the values of the array elements instead of creating a new array for each iteration.
I am trying to replace the outer loop with Parallel.For. Assuming each partition of the parallel loop is run by a separate thread and will have its own copy of the CoefficientVector class, it makes sense(?) to me to have one CoefficientVector instance per thread and reset its vector elements rather than recreating the array. However, I find it hard to do this optimisation(?) with Parallel.For. Could anyone help, please?
using System;

class Program
{
    static void Main(string[] args)
    {
        System.Diagnostics.Stopwatch timer = new System.Diagnostics.Stopwatch();
        timer.Start();
        int numCalpoints = 5000;
        int vecSize = 10000;
        CalcPoint[] calcpoints = new CalcPoint[numCalpoints];
        CoefficientVector coeff = new CoefficientVector();
        coeff.vectors = new Vector[vecSize];
        //not sure how to correctly use Parallel.For here
        //Parallel.For(0, numCalpoints, =>){
        for (int i = 0; i < numCalpoints; i++)
        {
            //coeff.vectors = new Vector[vecSize];
            coeff.ResetVectors();
            //doing some operation on every element of the matrix
            //(bounded by the array length so the index stays in range)
            for (int n = 0; n < coeff.vectors.Length; n++)
            {
                coeff.vectors[n].x += n;
                coeff.vectors[n].y += n;
                coeff.vectors[n].z += n;
            }
            //CalcPoint is a struct, so write through the array element;
            //assigning to a local copy would discard the result
            calcpoints[i].result = coeff.GetResults();
        }
        Console.Write(timer.Elapsed);
        Console.Read();
    }
}
class CoefficientVector
{
    public Vector[] vectors;

    public void ResetVectors()
    {
        for (int i = 0; i < vectors.Length; i++)
        {
            vectors[i].x = vectors[i].y = vectors[i].z = 0;
        }
    }

    public double GetResults()
    {
        double result = 0;
        for (int i = 0; i < vectors.Length; i++)
        {
            result += vectors[i].x * vectors[i].y * vectors[i].z;
        }
        return result;
    }
}

struct Vector
{
    public double x;
    public double y;
    public double z;
}

struct CalcPoint
{
    public double result;
}
The Parallel.For method currently has 12 overloads. Besides the variations with int, long, ParallelOptions and ParallelLoopState arguments, you will notice several that take an additional generic type argument TLocal, like this:
public static ParallelLoopResult For<TLocal>(
    int fromInclusive,
    int toExclusive,
    Func<TLocal> localInit,
    Func<int, ParallelLoopState, TLocal, TLocal> body,
    Action<TLocal> localFinally
)
Executes a for loop with thread-local data in which iterations may run in parallel, and the state of the loop can be monitored and manipulated.
In other words, TLocal allows you to allocate, use and release some thread-local state, i.e. exactly what you need (TLocal will be your CoefficientVector instance per thread).
So you can remove the coeff local variable and use the aforementioned overload like this:
CalcPoint[] calcpoints = new CalcPoint[numCalpoints];
Parallel.For(0, numCalpoints,
    () => new CoefficientVector { vectors = new Vector[vecSize] }, // localInit
    (i, loopState, coeff) => // body
    {
        coeff.ResetVectors();
        //doing some operation on the matrix
        for (int n = 0; n < coeff.vectors.Length; n++)
        {
            coeff.vectors[n].x += n;
            coeff.vectors[n].y += n;
            coeff.vectors[n].z += n;
        }
        calcpoints[i].result = coeff.GetResults();
        return coeff; // required by the body Func signature
    },
    coeff => { } // required by the overload, do nothing in this case
);
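The empty localFinally works here because every iteration writes its result straight into its own calcpoints slot, so there is nothing left to merge. For contrast, here is a minimal sketch of a case where localFinally does real work: each worker keeps a private partial sum and folds it into a shared total once at the end. The names LocalStateDemo and SumOfSquares are hypothetical, chosen only for illustration, and are not part of the code above.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical demo class, not part of the original question or answer.
static class LocalStateDemo
{
    // Sums the squares of 0..count-1 using one private partial sum per worker task.
    public static long SumOfSquares(int count)
    {
        long total = 0;
        Parallel.For<long>(0, count,
            () => 0L,                                          // localInit: starting value of each worker's partial sum
            (i, loopState, partial) => partial + (long)i * i,  // body: updates only the worker's own partial sum
            partial => Interlocked.Add(ref total, partial));   // localFinally: merge each partial sum atomically
        return total;
    }

    static void Main()
    {
        Console.WriteLine(SumOfSquares(1000)); // prints 332833500
    }
}
```

The body never touches shared state, so the only synchronisation is the single Interlocked.Add per worker in localFinally, which is the same reason reusing one CoefficientVector per worker is safe.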