Dynamic Loop Scheduling on Processor Groups
Carino, R.L., Banicescu, I., Rauber, T., & Runger, G. (2004). Dynamic Loop Scheduling on Processor Groups. Proc. ISCA 17th International Conference on Parallel and Distributed Computing Systems. ISCA. 78-84.
Abstract
Parallel loops are major sources of concurrency in scientific and
engineering applications. The efficient execution of such loops
requires dynamic loop scheduling algorithms in cases of nonuniform
or unpredictable iteration times, or non-homogeneous or variably-loaded
processors. For dynamic loop scheduling based
on the foreman-worker parallel strategy, a well-known
performance degradation factor is the bottleneck arising when
a large number of workers simultaneously request tasks from
the foreman.
This paper describes a dynamic loop scheduling
algorithm which addresses the bottleneck by using processor groups.
The processors are initially
divided into a few groups, where each group executes
the foreman-worker strategy on a specified
portion of the iteration space. Each foreman maintains for its group,
the ratio of the cost of remaining iterations to the number of
workers, and periodically reports the ratio to a manager.
Upon detecting large differences in the reported
ratios, the manager initiates the transfer of workers between groups
to balance the ratios.
Preliminary tests of an MPI implementation of this processor group
strategy
on a general-purpose non-homogeneous Linux cluster
indicate that the strategy achieves
performance increase
over the straight-forward foreman-worker strategy.