Load Balancing Parallel Loops on Message-passing Systems
Carino, R.L., & Banicescu, I. (2002). Load Balancing Parallel Loops on Message-passing Systems. In S.G. Akl & T. Gonzales (Eds.), Proceedings of the 14th IASTED International Conference on Parallel and Distributed Computing and Systems (pp. 362-367). IASTED.
Abstract
In large scientific applications with parallel loops
that execute on message-passing systems,
the strategy of partitioning data among processors
and having each processor perform the computations
on its own data generally results in load imbalance, due to
nonuniform data distribution, algorithmic variance,
processor heterogeneity, or
unpredictable phenomena such as network latency and
operating system interference. Runtime data redistribution
is therefore necessary to balance processor loads.
Recently, loop scheduling algorithms based on
a probabilistic analysis of parallel loop iterates with
variable running times have been developed. Assuming
a centralized workqueue, these algorithms
execute the iterates in variable-size chunks,
where the sizes are adaptively determined
at runtime such that the chunks complete before
the optimal time with high probability.
This study presents experiments with these loop scheduling
algorithms on a computationally intensive parallel
application that profiles a quadrature routine. The
experimental results
strongly suggest the benefits of using a distributed
workqueue in conjunction with
these loop scheduling algorithms for load balancing
parallel loops on message-passing systems.
The aggregate processor time of a straightforward
parallelization is reduced by as much as 75%
when this approach is used. This
result highlights the potential for a
successful integration of this approach into other applications,
especially those with parallel loops characterized by irregular
behavior that cannot be predicted before runtime.
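The centralized-workqueue scheme sketched in the abstract can be illustrated in simplified form. The following is not the authors' implementation: the shrinking-chunk rule (`remaining // (2 * n_workers)`), the function name `schedule_loop`, and the use of threads in place of message-passing processors are all assumptions made for illustration; the paper's algorithms instead derive chunk sizes from a probabilistic analysis of iterate running times.

```python
# Illustrative sketch of a centralized workqueue handing out loop
# iterates in shrinking, variable-size chunks. Threads stand in for
# processors; a real message-passing version would use MPI ranks.
import threading

def schedule_loop(n_iters, n_workers, work_fn):
    lock = threading.Lock()          # guards the shared queue index
    state = {"next": 0}              # next unassigned loop iterate
    results = [None] * n_iters

    def worker():
        while True:
            with lock:
                remaining = n_iters - state["next"]
                if remaining == 0:
                    return
                # Shrinking chunks: large early to cut scheduling
                # overhead, small near the end to balance finish times.
                chunk = max(1, remaining // (2 * n_workers))
                start = state["next"]
                state["next"] = start + chunk
            # Execute the assigned chunk outside the lock.
            for i in range(start, start + chunk):
                results[i] = work_fn(i)

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Because every chunk is claimed under the lock, each iterate runs exactly once regardless of how unevenly the per-iterate running times are distributed among workers.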