![]()
On Multiprocessor System Scheduling
Abstract
In the most general case, a job submitted. for execution ean be
represented by an unknown directed acyclic graph (dag, i.e., the
data dependency relation graph of a program) revealed as execution
proceeds. It is known that, for distributed memory parallel machine
models, there is no scheduling algorithm which can complete execution
of every unknown dag within a constant factor of optimal execution
time (assuming the dag is known) [6]. On the other hand, for shared
memory parallel machine models, using parallel profiles to represent
parallel jobs, it is shown that the
Dynamic Equi-partition Scheduling Policy (DEQ) [27] [33] guarantees
a mean response time within a constant factor of the optimum which
assumes complete information of job parallel profiles) [5].
As an effort to move those positive theoretical results for
shared memory parallel models one step closer towards reality,
we extend those results to all the unknown dags. We show the
DEQ achieves a mean response time, for any unknown dag, within
four times the optimum (assuming the dag is known). Therefore,
for the same class of unknown dags, we have a scheduling strategy
which achieves a near optimal solution in both the completion time
and the mean response time for shared memory parallel models but
no scheduling policy can guarantee a solution within a constant
factor of the optimum in completion time (nor mean response time)
for distributed memory models.
We also study the issue of effectiveness of private eaches for
processors. Since time for all processors to access the shared
memory simultaneously is usually much longer than the time for a
processor to aceess its own private cache, scheduling with private
caches falls into the distributed memory model so that the lower
bound [6] applies. However, in this paper we are able to show the
effectiveness of private caches by proving that a version of DEQ
achieves a mean response time with five times the optimal mean
response time in the cache clock time for a large class of parallel
jobs well accepted in the parallel scheduling community. This
shows an improvement of systemn performance by using private
caches over that of purely shared memory. We also notice that
the same idea can be applied to improve system performance of
distributed memory parallel machine for special classes of jobs
classified as computation intensive (informally time complexity
being sufficiently larger than the space complexity).