TELKOM
NIKA Indonesia
n
Journal of
Electrical En
gineering
Vol.12, No.6, Jun
e
201
4, pp. 4858 ~ 4
8
7
5
DOI: 10.115
9
1
/telkomni
ka.
v
12i6.584
7
4868
Re
cei
v
ed
Jan
uary 25, 201
4
;
Revi
sed Ma
rch 2
0
, 2014;
Acce
pted April 3, 2014
A Heuristic Greedy Algorithm for Scheduling Out-Tre
e
Task Graphs
Jian Jun Zha
ng*, Wei We
n Hu, Mei Ni Yang
Coll
eg
e of Scie
nce, Nava
l Uni
v
ersit
y
of En
gi
neer
ing,
No. 717, Jief
an
g Avenu
e, W
uhan C
i
t
y
, Hu
be
i
Provinc
e
, P.
R. Chin
a, +
86-138
71
162
29
7
*Corres
p
o
n
id
n
g
author, e-ma
i
l
:
w
a
hh
09
12@
163.com
A
b
st
r
a
ct
T
he sched
ul
ing
of Out-T
r
ee task graphs is
on
e of the
critical
factors in i
m
pl
e
m
e
n
tin
g
the co
mp
iler
s
of par
all
e
l l
a
n
gua
ges
an
d i
m
pr
ovin
g th
e
perfor
m
a
n
ce
o
f
paral
lel
co
mputin
g. Altho
u
gh th
ere
are
a few
hetero
g
e
neity
base
d
al
gorith
m
s in th
e liter
a
t
ure suit
ab
le fo
r schedu
lin
g Out-T
r
ee task graphs, they us
u
a
ll
y
requ
ire sig
n
ific
antly hi
gh sch
edu
lin
g
costs and
may
not d
e
liver
goo
d qu
ality sche
dul
es
w
i
th low
e
r costs.
T
h
is pa
per
pre
s
ents a h
eur
istic gre
edy sch
e
duli
ng
al
gor
ith
m
for Out-T
r
ee
task grap
hs w
i
th an
ob
jective
t
o
simulta
neo
usly
meet h
i
gh
p
e
rformanc
e an
d fast
schedu
ling ti
me, w
h
ich is base
d
on list an
d ta
sk
dup
licati
on, tri
e
s to fi
nd th
e
best p
o
int
bet
w
een b
a
la
nci
n
g lo
ads
an
d s
horten
i
ng
the
sched
ule
le
ngt
h a
n
d
improves
the
sched
ule
p
e
rformanc
e w
i
th
out i
n
cre
a
sin
g
the ti
me c
o
mp
lexity
of th
e a
l
gor
ith
m
.
T
h
e
compar
ative s
t
udy show
s that t
he propos
ed al
gorith
m
surpass
e
s pre
v
ious a
ppro
a
c
hes in ter
m
s of
sched
ule l
e
n
g
th ratio, spee
du
p and effici
enc
y metrics.
Ke
y
w
ords
:
parallel distributed com
put
ing,
heterogen
eous com
p
uting system
, ta
sk scheduling, Out-Tr
ee
task graph, tas
k
dupl
icatio
n
Copy
right
©
2014 In
stitu
t
e o
f
Ad
van
ced
En
g
i
n
eerin
g and
Scien
ce. All
rig
h
t
s reser
ve
d
.
1. Introduc
tion
The effe
ctive task
sche
duli
ng of ap
plicat
ions
plays
a
signifi
cant rol
e
in the p
e
rfo
r
man
c
e
of parall
e
l di
stributed
com
p
uting. Tasks
must
be
sche
duled
and a
s
sign
ed to p
r
o
c
e
s
sors in a
way
that minimi
ze
s the
total ex
ecutio
n time
or th
e
sc
hedu
le len
g
th of th
e di
stribute
d
appli
c
ation.
O
n
e
of the mo
st i
m
porta
nt issu
es fo
r hi
gh
-p
erfo
rm
an
ce
computing
wit
h
Hetero
gen
eou
s
Comp
uting
Systems
(HCS) is the m
a
pping
st
rate
g
i
es they a
d
o
p
t. In genera
l
, static task sched
uling f
o
r
HCS
s is
NP
-co
m
plete p
r
oblem [1]. Beca
use of
its key import
ance on p
e
rforman
c
e, th
e
hetero
gen
eity ba
sed
sche
d
u
ling al
go
rith
ms h
a
s
be
en
extensively
studied
and
va
riou
s h
e
u
r
isti
cs
were
pro
p
o
s
e
d
in
the lite
r
at
ure
[1-9]. In
t
h
is
pap
er,
we
co
nsi
d
e
r
the
sched
uling
of
Out-Tre
e
ta
sk
grap
hs i
n
HCSs. Ma
ny parallel
pro
g
ra
ms exhibit
in
the Out-T
r
e
e
stru
ctu
r
e, and this type
of
parall
e
l program paradigm
arise
s
in ma
ny app
licatio
n area
s. The
sch
eduli
ng
probl
em of O
u
t-
Tree ta
sk g
r
a
phs pl
ays a very importa
nt role in
imple
m
enting the compile
rs of p
a
rallel la
ngu
a
ges
and imp
r
ovin
g the perfo
rm
ance of parall
e
l comp
uting.
There a
r
e
several
cla
ssi
cal h
e
u
r
istic
sched
uling
al
gorithm
s in
HCS
s [2-5], whi
c
h i
s
suitabl
e for
sched
uling
Out-T
r
ee ta
sk g
r
ap
hs
. T
he Ta
sk Du
plicatio
n Sch
edulin
g (H_T
DS)
algorith
m
[3] sched
ules t
a
sks
acco
rdi
ng to their
le
v
e
l
s a
nd a
s
sign
s e
a
ch t
a
s
k
t
o
a
suitable
pro
c
e
s
sor
wh
ile gua
rante
e
i
ng the sho
r
te
r sche
dul
e l
e
ngth and
re
asonabl
e time complexity, but it
need
s
too
many
processors wh
en sche
duling
O
u
t-Tree ta
sk gra
p
h
s
. Th
e Hete
ro
gen
eou
s
Earlie
st-Fini
s
h-Time
(HEF
T) alg
o
rithm
[4], which
i
s
b
a
se
d on
list
sche
duling,
schedul
es th
e t
a
sk
ac
cor
d
ing
t
o
i
t
s
Rank
u
valu
e which i
s
b
a
s
ed
on
its
averag
e exe
c
uti
on
co
st, and
use
s
an i
n
sertion
based st
rate
gy that con
s
i
ders the po
ssible in
se
rt
io
ns of a task
on the mo
st suitabl
e pro
c
essor
whi
c
h mini
mi
ze
s its e
a
rlie
st finish time. It ignore
s
the
load b
a
lan
c
e
betwe
en p
r
o
c
essors a
nd th
e
eco
nomi
z
atio
n on pro
c
e
ssors,
which ea
sily lead
s to an und
esi
r
abl
e sched
ule le
ngth and n
e
e
d
s
too many pro
c
e
s
sors wh
en
sch
eduli
ng O
u
t-Tree task
grap
hs.
In this paper,
based on the
parall
e
l com
p
uting
model i
n
HCS
s and the cha
r
a
c
teri
stics of
the Out-Tre
e
gra
p
h
s
, we
prop
ose a
ne
w alg
o
rith
m,
calle
d the
He
uristi
c G
r
e
e
d
y
Algorithm, f
o
r
Sched
uling
O
u
t-Tree ta
sk
grap
hs (HGA
S_OT), th
e
motivation be
hind
whi
c
h i
s
to delive
r
g
o
od
quality of
schedul
es with
lower cost
s. Com
b
ing
the
strate
gie
s
of li
st sch
e
duling
an
d t
a
sk
dupli
c
ation, a
nd takin
g
the
load bal
an
ce
s into con
s
id
eration, it sch
edule
s
the ta
sk
acco
rdin
g
to
Evaluation Warning : The document was created with Spire.PDF for Python.
TELKOM
NIKA
ISSN:
2302-4
046
A Heuri
s
tic G
r
eed
y Algo
rithm
for Sched
uli
ng Out-Tre
e
Task Graph
s (Jian
Jun Z
hang
)
4869
the prio
rity which i
s
the
m
a
ximal value
of all its
ea
rli
e
st completio
n
times
wh
en
sched
uling it
to
all pro
c
e
s
so
rs respe
c
tively, and trie
s to
sched
ule
e
a
ch leaf task o
n
its mo
st sui
t
able processor
while g
uarant
eeing the
sho
r
ter sch
edul
e length an
d le
ss n
u
mbe
r
of use
d
pro
c
e
ssors.
2. The Propo
sed Algori
t
h
m
2.1. Prelimin
aries.
A sche
dulin
g system mo
de
l con
s
ist
s
of an
appli
c
atio
n, a target co
mputing environment,
and a
pe
rformance
crite
r
i
on for
sche
d
u
ling. An a
p
p
lication i
s
rep
r
esented
by
a directe
d
a
cyclic
grap
h,
G
= (
V
,
E
), where
V is the
set o
f
v
tasks
and
E
is the
set o
f
e
edge
s b
e
twee
n the ta
sks.
Data
is a
v
×
v
matrix
of communi
cation
data, whe
r
e
data
i
,
j
is the amount of da
ta required to
b
e
transmitted from task
n
i
to
n
j
. In a given
task g
r
ap
h, a task witho
u
t any pare
n
t is called an
en
tr
y
tas
k
a
nd a ta
sk
without an
y child is call
ed an
ex
it task
.
Out-T
r
ee ta
sk grap
h is o
ne
of the basi
c
stru
ctu
r
e
s
for
parall
e
l proc
e
ssi
ng, an exa
m
ple of
whi
c
h is sho
w
n in Figu
re
1. We assum
e
that t
he target comp
utin
g environm
en
t consi
s
ts of a set
Q
of
n
heterogen
eou
s p
r
oce
s
sors
con
necte
d in a f
u
lly conn
ecte
d topology in
which all int
e
r-
pro
c
e
s
sor
co
mmuni
cation
s are a
s
sume
d to perfo
rm without conte
n
tion
and co
mputation ca
n
be
overlap
ped
with commu
nication. Task e
x
ecution
s
of
a given appli
c
ation a
r
e a
s
sume
d to be
non-
pree
mptive.
W
is a
m
×
n
comp
utation
co
st matrix in whi
c
h ea
ch
w
ij
(also den
oted by
w
(
n
i
,
P
j
))
gives the e
s
timated execut
ion time to complete task
n
i
on processor
P
j
. Before sched
uling, the
tasks a
r
e lab
e
led with the
averag
e execution co
sts.
Figure 1. An example Out
-
Tree T
a
sk Graph with 1
3
Tasks
The
data tra
n
sfer rates b
e
twee
n p
r
o
c
essors are
stored
in
matri
x
B
of
si
ze
n
×
n
. T
he
comm
uni
cati
on start-up co
sts of
p
r
o
c
e
s
sors are
given
i
n
a
n
-d
ime
n
s
i
ona
l ve
c
t
or
L
. The
comm
uni
cati
on cost of th
e edge
(
i
,
j
),
whi
c
h is fo
r t
r
an
sferring
d
a
ta from ta
sk
n
i
(sch
edule
d
on
P
l
) to
n
j
(schedul
ed on
P
k
), is defin
ed by
c
ij
=
L
l
+(
Data
ij
/
B
lk
). Before
sche
duling, ave
r
a
ge
comm
uni
cati
on cost
s a
r
e
use
d
to lab
e
l
the edg
es. T
he average
communi
catio
n
co
st of a
n
edge
(
i
,
j
) i
s
defin
e
d
by
ij
ij
Da
t
a
cL
B
, where
B
is the
avera
g
e
tran
sfer rat
e
amo
ng the
pro
c
e
s
sors
and
L
is the averag
e co
mmu
nicatio
n
start
-
up time.
Before prese
n
ting our al
go
rithm, it is nece
s
sary to de
fine a few parameters for O
u
t-Tree
task g
r
ap
hs.
Defini
tion 1.
Parameter
ct
(
n
i
,
P
j
)
d
eno
tes the exe
c
ution time of the task
n
i
and its
ancesto
rs
on pro
c
e
s
sor
P
j
,
which
can
be recursiv
ely denot
ed
as
ct
(
n
i
,
P
j
)=
ct
(
pr
e
d
(
n
i
),
P
j
)+
w
(
n
i
,
P
j
),
starting from
the
only entry ta
sk, where
pr
ed
(
n
i
) is the
o
n
ly immediat
ely
pred
ecesso
r task of
task
n
i
and th
us
ct
(
pre
d
(
n
i
),
P
j
) i
s
the time
when all
data
need
ed by
n
i
has
arrive
d at pro
c
e
s
sor
P
j
; espec
ially, for the root task
n
1
,
ct
(
n
1
,
P
j
)=
w
(
n
1
,
P
j
). In o
r
de
r to c
o
mp
ute
ct
(
n
i
,
P
j
), the immediately p
r
ede
ce
sso
r
task of
n
i
mu
st have bee
n schedul
ed.
Defini
tion 2.
Parameter
la
c
t
(
n
i
) and
ec
t
(
n
i
,
P
q
) den
o
t
es the latest
allowabl
e co
mpletion
time and th
e
earlie
st
com
p
letion time
of the task
n
i
re
spe
c
tively, whi
c
h
can
b
e
cal
c
ul
ated
by
lact
(
n
i
)=
max{
ct
(
n
i
,
P
j
)|
j
PQ
} an
d
ec
t
(
n
i
,
P
q
)=
min{
ct
(
n
i
,
P
j
)|
j
PQ
}, wh
ere
P
q
gives t
h
e
Evaluation Warning : The document was created with Spire.PDF for Python.
ISSN: 23
02-4
046
TELKOM
NI
KA
Vol. 12, No. 6, June 20
14: 4868 – 4
875
4870
corre
s
p
ondin
g
pro
c
e
s
sor
o
n
whi
c
h ta
sk
n
i
com
p
lete
s its execution i
n
its ea
rlie
st compl
e
tion ti
me,
ect
(
n
i
,
P
q
). Ad
ditionally,
SL
denote
s
the current sche
du
le l
ength at e
a
ch
sched
ulin
g step.
After all tasks in a Out-Tre
e
task g
r
ap
h are sch
edul
e
d
, the sch
edu
le length (i.e., overall
compl
e
tion time) will b
e
the actual fini
sh
time of
the exit task. If the
r
e are multipl
e
exit tasks a
nd
the conve
n
tio
n
of insertin
g a pse
udo exit
task is
n
o
t applied, the schedul
e length
(whi
ch is al
so
called
m
a
kespan
) i
s
def
ine
d
as t
he max
i
mum of
all t
he act
ual f
i
ni
sh
t
i
mes of
t
he ex
it
t
a
sks.
Th
e
obje
c
t
i
v
e
f
unc
t
i
on of
t
he t
a
sk s
c
he
dulin
g algorit
h
m
is t
o
det
ermi
ne t
he as
sig
n
me
nt
of
t
a
sks of
a
given appli
c
at
ion to pro
c
e
s
sors
su
ch tha
t
its sche
dule
length is mini
mized.
2.2. Descrip
tion of the
Algorithm
Our alg
o
rith
m has two
major p
h
a
s
e
s
: a task pri
o
ritizin
g
pha
se for
com
p
uting the
prio
rities of
all the l
eaf ta
sks a
n
d
a
proce
s
sor se
le
ction
pha
se
for
sel
e
cting
the
tasks in
the
orde
r
of their pri
o
rities a
nd
scheduli
ng
each sele
cted
t
a
sk a
nd it
s
ancesto
r ta
sks on
its
“b
est”
pro
c
e
s
sor, which mini
mize
s the task’s fi
nish time.
Task pri
o
riti
zi
ng pha
se: In
this pha
se,
our al
gorith
m
requi
re
s the
priority of e
a
ch le
a
f
tas
k
to be
s
e
t with the
lact
attributes. T
he task li
st
l
_
tas
k
i
s
gene
rated by sorti
ng the ta
sks
in
decrea
s
in
g o
r
der
of their
lact
attribute
s
.
Tie-b
r
ea
kin
g
i
s
d
one
ra
ndo
mly. There
ca
n be
alternati
v
e
polices fo
r tie-brea
king. Si
nce t
he
alternative polices increa
se th
e
time compl
e
xity, we prefer a
rand
om sele
ction strategy.
Processo
r
se
lection
pha
se
: In this p
h
a
s
e, o
u
r
algorithm firs
t
ass
i
gns the firs
t
tas
k
in
l
_
task
an
d all
its an
ce
stors
to the pro
c
e
s
sor
whi
c
h
gu
arante
e
s it
s e
a
rlie
st co
mpl
e
tion time. Th
en,
it deploys the
followin
g
g
r
e
edy strategy that, for sub
s
eque
nt tasks
in
l
_
task
, whil
e
gua
ra
nteei
ng
not to increa
se the
cu
rren
t
SL
, the alg
o
rithm a
s
sign
s the le
af no
de an
d its a
n
c
e
s
tor n
ode
s to
use
d
p
r
o
c
e
s
sor by
avoidin
g
du
plicatio
n
of task. On
th
e othe
r h
and,
if it is
ne
ce
ssary to
in
crea
se
the current
SL
, while
gu
aranteein
g
the i
n
crea
se i
n
th
esch
e
dule l
e
ngth i
s
a
s
le
ss a
s
p
o
ssibl
e
, the
algorith
m
sch
edule
s
the le
af node an
d all its ancest
o
r nod
es to
a new p
r
o
c
e
s
sor o
r
a u
s
ed
pro
c
e
s
sor. Fi
gure 2
sho
w
s the steps inv
o
lved in the p
r
opo
se
d algo
rithm in detail.
Algorithm 1
:
Begin
Comp
ute pa
rameter
lact
s
of
t
he leaf
t
a
sks,
s
o
rt
leaf
t
a
s
ks
n
1
,
n
2
,…,
n
m
in desce
nding
orde
r by
lact
s, for convenie
n
ce (sa
k
e
)
, st
ill denote the
m
as
n
1
,
n
2
,…,
n
m
, and put them into
l
_
task
in turn;
w
h
ile
(
l
_task
Φ
)
do
Remove the l
eaf task
n
i
from
l
_
task
;
SL=
m
ax
{
SL
,
ect
(
n
i
,
P
q
)};
if
there exist
certai
n
P
j
su
ch that
ct
(
n
i
,
P
j
)<
=
SL
, where
j
PQ
Sched
ule the
leaf task
n
i
a
n
d
its ancesto
rs witho
u
t dupl
ication to pro
c
e
s
sor
P
j
;
else
Schedul
e the leaf task
n
i
and its a
n
c
e
s
tors witho
u
t duplication
to processo
r
P
q
;
Remove proc
ess
o
r
P
q
from
Q
;
End
w
hil
e
End
Figure 2. Overall step
s of the HGAS_
O
T
Algorithm
2.3. The Time Complexity
and Effecti
v
it
y
of the Algorithm
In this sectio
n, we will an
alyze the time
compl
e
xity
and the effectivity of
the HGAS_OT
algorith
m
.
First of all, in
the task prio
ri
tizing ph
ase
the HGAS_
O
T
algorith
m
traverses
all the tasks
of the DAG to comp
ute the
lact
attribut
es of the leaf tasks and
sorts the leaf tasks in the n
on-
increa
sing o
r
der of thei
r
lact
values, the
worst ca
se ti
me com
p
lexity of this step
woul
d be on t
he
orde
r of O
(
vp
)+
O(
vlog
v
), where
v
i
s
the
numbe
r of ta
sks in th
e ta
sk g
r
aph
and
p
is the
num
b
e
r
of pro
c
e
s
so
rs req
u
ire
d
. Th
e second
ph
ase
is
ca
rri
ed
out to
sea
r
ch whethe
r th
e ne
w le
af n
ode
and its an
ce
stor no
de
s co
uld be
sch
e
d
u
led to th
e suitable p
r
o
c
e
s
sor, all th
e
use
d
p
r
o
c
e
s
sors
may be exam
ined, with the
worst ca
se ti
me com
p
lexity of O(
v
2
p
).
Evaluation Warning : The document was created with Spire.PDF for Python.
TELKOM
NIKA
ISSN:
2302-4
046
A Heuri
s
tic G
r
eed
y Algo
rithm
for Sched
uli
ng Out-Tre
e
Task Graph
s (Jian
Jun Z
hang
)
4871
Con
s
e
quently
, the overall time com
p
lexity of the HGAS_OT algo
rithm is
O
(
v
2
p
).
On the other
hand, Given
an Out-T
r
ee t
a
sk
gra
ph an
d a HCS, the HGAS_OT a
l
gorithm
prod
uces an
sched
ule, the
sched
ule le
ngth of
whi
c
h is th
e valu
es of
SL
wh
en the
algo
rithm
terminates
, This
is
jus
t
the fac
t
we can see
from the d
e
scriptio
n of the HGAS_
O
T
algorithm.
3. Performan
ce and Comparison
3.1. An Illustrativ
e
Exam
ple
To illustrate t
he effectiveness of
the proposed algori
thm, in this section we first give the
sched
uling p
r
oce
s
s for th
e
example O
u
t-Tree ta
sk g
r
a
ph in Fig
u
re
1. Without lo
ss of ge
nerality,
s
u
pp
os
e
n
=8
and ge
nerate
the computat
ion co
st matr
i
x
randomly, a
s
is exp
r
esse
d in (1).
5
7
67
55
6
6
32
3
3
32
4
3
32
2
3
3
3
3
4
7898
9
8
7
9
3
3
3
433
43
6
7
766
6
7
7
34
3
3
3
3
5
4
7
6
97
8
7
68
32
34
32
2
3
34
3
3
4
3
4
3
575
65
6
5
7
78
667
87
7
35
6
3
4
5
4
4
(1)
Let us
sho
w
how th
e HG
AS_OT algo
rithm wo
rks i
n
detail. Firstly, in task p
r
i
o
ritizin
g
pha
se,
lact
value
s
for all l
eaf nod
es
are com
puted i
n
a top-do
wn
fashio
n, start
i
ng from th
e
only
entry task
n
1
. The co
rresp
ondin
g
re
sult
s are sh
own in Table
1. So the list
l
_
task
is
n
11
,
n
9
,
n
10
,
n
12
,
n
8
,
n
13
,
n
6
and
n
7
.
Table 1. The
Latest Allowa
ble Com
p
letio
n
Times of th
e Leaf No
de
s
T
a
sk
n
6
n
7
n
8
n
9
n
10
n
11
n
12
n
13
lact
17
14 18 22
21
25
21
18
Then in pro
c
e
s
sor sele
ction phase the algo
rithm
sch
edule
s
l
eaf node
n
11
and its
ancesto
rs to
processo
r
P
1
, and we have
ct
(
n
11
,
P
1
)=2
0
,
SL
=2
0; for leaf task
n
9
, beca
u
se
ct
(
n
9
,
P
1
)
=
20
+3=2
3>m
a
x{2
0
,
ect
(
n
9
,
P
6
)},
the algo
rithm
assi
gn
s
n
9
a
nd it
s an
ce
st
ors t
o
p
r
o
c
e
s
sor
P
6
, and we
have
ct
(
n
9
,
P
6
)=1
7
; a
s
to
node
n
10
, from
ct
(
n
10
,
P
1
)=20
+3
=2
3>m
a
x{20,18
} an
d
ct
(
n
10
,
P
6
)
=
17+
3=
2
0
≤
max{
20,18}, the
HGAS algo
rithm sched
ule
s
n
10
and all
its ance
s
tors to
p
r
oc
es
so
r
P
6
, and lets
ct
(
n
10
,
P
6
)=20.
Adopting a
b
o
ve strate
gy, the HGAS
_OT alg
o
rith
m seq
uentia
lly carrie
s o
u
t the
sched
uling
proce
s
s of th
e
sub
s
e
que
nt
l
eaf no
de
s: it
sched
ule
s
ta
sk
n
12
an
d all
its a
n
cesto
r
s to
p
r
oc
es
so
r
P
3
and lets
ct
(
n
12
,
P
3
)=18, sch
edule
s
task
n
8
and all its a
n
ce
stors to p
r
ocesso
r
P
2
and
lets
ct
(
n
8
,
P
2
)=15, sched
ule
s
task
n
13
an
d all its ancestors to processor
P
5
and let
s
ct
(
n
13
,
P
5
)=1
5
,
and, sche
dul
es task
n
7
to
p
r
oc
ess
o
r
P
2
(noticing
all its ance
s
tors have a
l
ready be
en
in
p
r
oc
es
so
r
P
2
) an
d lets
ct
(
n
7
,
P
2
)=
ct
(
n
8
,
P
2
)+
w
(
n
7
,
P
2
)=19. At la
st, we
have
l
_
ta
s
k
=
Φ
, an
d
the
algorith
m
terminates.
Th
e p
r
ocesso
r allo
cation
s and
sch
e
d
u
ling time
s
prod
uced
by ou
r
algorith
m
are
sho
w
n in Fig
u
re 3.
On the
othe
r han
d, for th
e HEF
T
alg
o
r
ithm, in it
s t
a
sk p
r
io
ritizin
g
ph
ase, the
up
ward
ran
k
value,
Rank
u
,
for all n
ode
s, which is based on m
ean co
mputat
ion and mea
n
communi
cati
on
co
sts, a
r
e
co
mputed. T
he
corre
s
p
ondin
g
results
are
sho
w
n
in
Tab
l
e
2
.
So
th
e sc
h
e
d
u
lin
g or
de
r
is
n
1
,
n
2
,
n
4
,
n
3
,
n
5
,
n
8
,
n
11
,
n
12
,
n
6
,
n
10
,
n
13
,
n
9
and
n
7
.
Evaluation Warning : The document was created with Spire.PDF for Python.
ISSN: 23
02-4
046
TELKOM
NI
KA
Vol. 12, No. 6, June 20
14: 4868 – 4
875
4872
Figure 3. The
Scheduli
ng
Re
sult Prod
u
c
ed by the HGAS_OT Alg
o
rithm
Table 2. The
Up
ward Ra
nk Values of All the Nod
e
s
T
a
sk
n
1
n
2
n
3
n
4
n
5
n
6
n
7
n
8
n
9
n
10
n
11
n
12
n
13
Rank
u
42.625
36.75
21.125
26.875
20.25
12.5 6.5
15.25
7.75 10.375
13.75
13
8.25
Then in
pro
c
essor
sel
e
cti
on pha
se, th
e HE
FT al
go
rithm seque
n
t
ially carri
es
out the
sched
uling
proce
s
s: it
sch
edule
s
ta
sk
n
1
to its be
st p
r
ocesso
r
P
1
, sched
ule
s
ta
sk
n
2
to its bes
t
p
r
oc
es
so
r
P
1
,
sched
ule
s
ta
sk
n
4
to its
be
st processo
r
P
1
, s
c
h
edules task
n
3
to
its
b
e
s
t pr
oc
esso
r
P
2
which en
able
s
minima
l execution time, sch
edul
e
s
task
n
5
to
its best processor
P
3
whi
c
h
enabl
es mini
mal exe
c
utio
n time,
sched
ules task
n
8
to
its
b
e
s
t
pr
oc
es
so
r
P
2
, schedul
es task
n
11
to
its
b
e
s
t pr
oc
es
so
r
P
1
, schedul
es ta
sk
n
12
to its best
pro
c
e
s
sor
P
3
, schedul
es ta
sk
n
6
t
o
it
s be
st
p
r
oc
es
so
r
P
4
whi
c
h e
nable
s
minim
a
l ex
ecutio
n time, sched
ule
s
task
n
10
t
o
it
s be
st
pro
c
e
s
so
r
P
1
,
sched
ule
s
task
n
13
to
its
b
e
s
t pr
oc
es
so
r
P
5
whi
c
h
enabl
es mi
ni
mal executio
n time, sche
dule
s
t
a
sk
n
9
to its best
pro
c
e
s
sor
P
6
whi
c
h
enabl
es
mini
mal exe
c
utio
n time, and
a
t
last, sche
du
les
tas
k
n
7
t
o
it
s be
st
p
r
o
c
es
sor
P
8
wh
ich e
nabl
es
minimal exe
c
ution tim
e
. The
pro
c
e
s
sor
allocation
s an
d sched
uling
times produ
ced by
the HEFT algo
rithm are sho
w
n in
Figure 4.
Figure 4. The
Scheduli
ng
Re
sult Prod
u
c
ed by the HEFT Algorith
m
3.2. Compari
s
on Metrics
Except the
b
a
si
c met
r
ics, i
.
e., the sche
d
u
le
le
ngth, n
u
mber of u
s
e
d
pro
c
e
s
so
rs a
nd time
compl
e
xity, th
e comp
ari
s
o
n
s
of the algori
t
hms are ba
sed on the foll
owin
g metri
c
s:
(1) S
c
he
dule
Length
Rati
o (SLR): Th
e main
pe
rfo
r
man
c
e m
e
a
s
ure of a scheduli
ng
algorith
m
on a grap
h is the
sch
edule le
n
g
th (
ma
k
e
sp
an
) of its outpu
t sche
dule. Since a large set
of task
gra
p
h
s
with
differe
nt prope
rties i
s
u
s
e
d
, it i
s
n
e
ce
ssary
to n
o
rmali
z
e
the
sched
ule l
e
n
g
th
to a low bou
n
d
, which is ca
lled the Sche
dule Le
ngth
Ratio (SL
R
).
The SLR valu
e of an algorit
hm
on a gra
ph is
defined by:
iM
I
N
ij
j
nC
P
ma
ke
sp
a
n
SL
R
wn
P
P
Q
mi
n
(
,
)
|
.
(2)
The d
enomi
n
ator i
s
the
su
mmation of t
he mi
nim
u
m
comp
utation
co
sts of ta
sks on th
e
CP
MIN
. (For
an un
sche
du
led DA
G, if the comp
utation cost
of e
a
ch
nod
e
n
i
is
set with t
he
Evaluation Warning : The document was created with Spire.PDF for Python.
TELKOM
NIKA
ISSN:
2302-4
046
A Heuri
s
tic G
r
eed
y Algo
rithm
for Sched
uli
ng Out-Tre
e
Task Graph
s (Jian
Jun Z
hang
)
4873
minimum val
ue, then th
e
critical p
a
th
will be
ba
sed
on minim
u
m
com
putation
co
sts,
whi
c
h
is
r
e
pr
es
e
n
t
ed
a
s
CP
MIN
.
) Th
e SLR of a g
r
aph (usi
ng a
n
algo
rithm)
can
not be le
ss than o
ne si
nce
the de
nomin
a
t
or i
s
the
lower b
oun
d. Th
e task
sche
d
u
ling
algo
rith
m that give
s t
he lo
we
st SL
R of
a grap
h is the
best algo
rith
m with re
spe
c
t to perform
ance.
(2) Spee
dup:
The
spee
d
up valu
e for a give
n g
r
aph i
s
co
mp
uted by
dividing th
e
seq
uential ex
ecutio
n time (i.e., cumulati
ve comp
ut
ation co
sts
of the tas
ks i
n
the graph
) by the
parall
e
l execution time (i.e., the
m
a
ke
spa
n
of the output sched
ule). The seq
uential execu
t
ion
time is
comp
uted by a
ssi
g
n
ing all ta
sks to a sin
g
le p
r
ocesso
r that
minimizes th
e cu
mulative
of
the comp
utation co
sts.
i
ij
j
nV
w
n
PPQ
S
p
ee
du
p
m
a
ke
span
min
(
,
)
|
.
(3)
If the sum of the computa
t
ion co
sts is
maxi
mize
d, it result
s in a highe
r sp
eed
up, but
end
s up with
the same
ran
k
ing of the scheduli
ng algo
rithms.
(3) Efficien
cy: The
ratio
of the
spee
dup
value to th
e n
u
mbe
r
of p
r
o
c
e
s
sors u
s
e
d
,
whi
c
h
is anoth
e
r co
mpari
s
o
n
metric u
s
ually u
s
ed for appli
c
a
t
ion grap
hs of
real wo
rld p
r
oblem
s.
3.3. Perform
a
nce an
d Co
mparison
The exam
pl
e Out
-
Tree t
a
sk g
r
a
ph
sh
own
in
Figu
re 1
ca
n
be
use
d
to
co
m
pare
the
HGAS algo
rit
h
m with the
H_T
D
S and
HEFT alg
o
rit
h
m. The H_T
D
S algo
rithm
initially gene
rates
a set of clu
s
ters
simila
r to linear
clu
s
ters. Then dul
pli
c
ation i
s
ca
rri
ed out until system re
sou
r
ce
s
are
exhau
ste
d
. The
p
r
oce
s
sor allo
catio
n
s
and
the
sche
duling
tim
e
s
produ
ced
by the
H_T
D
S
algorith
m
s a
r
e sho
w
n in Fi
gure 5. Ta
ble
3
gives the compa
r
ison re
sult.
Figure 5. The
Scheduli
ng
Re
sult Prod
u
c
ed by the H_TDS Algo
rithm
Table 3.
Com
pari
s
on
with other Algo
rith
ms for DA
G in Figure 1
Schedule
length
Number of
used
processors
Ti
me
complex
i
ty
Schedule length
rati
o
Speedup
Efficiency
HGAS_
OT
20
6
O
(
v
2
p
)
1.053
2.90
0.483
H_TDS
21
8
O
(
v
2
p
)
1.105
2.76
0.345
HEFT
23
7
O
(
v
2
p
)
1.211
2.52
0.360
As
sho
w
n
in
Figu
re
s 3
-
5
and
Table
3,
the
HGAS_
OT al
gorith
m
depl
oys
an
effective
strategy,
whi
c
h
effectively balan
ce
s th
e wo
rklo
a
d
s,
sho
r
ten
s
the
sched
ule le
n
g
th, economi
z
e
s
the pro
c
e
s
sors, and so imp
r
oves the
sch
edule p
e
rfo
r
mance. It only used six p
r
oce
s
sors an
d
its
sched
ule len
g
th is o
n
ly 20. De
spite
its rea
s
o
nabl
e time co
mp
lexity, the H_TDS al
gorit
hm
exce
ssively
use
s
ta
sk d
uplication
s
a
nd ign
o
res t
he e
c
on
omi
z
ation on
pro
c
e
s
sors, wh
ose
numbe
r of
u
s
ed
processors is
8 a
n
d
sched
ul
e le
ngth is 21.
The
HEFT a
l
gorithm i
s
n
o
t
dupli
c
ation b
a
se
d and n
e
g
l
ects the b
a
la
nce of the
wo
rklo
ad
s, the full utilization
of the comp
uting
cap
a
city of heterog
ene
ou
s pro
c
e
s
sors and the redu
ction of the sche
dule le
ngt
h, it uses
sev
en
pro
c
e
s
sors a
nd its
sched
u
l
e length i
s
,
23, more
tha
n
that of the
HGAS_O
T
al
gorithm. Ove
r
all,
the co
mpa
r
ison with
othe
r algo
rithms f
o
r
DAG
in Fi
gure
1 sho
w
s
that
the propo
sed algo
ri
thm
Evaluation Warning : The document was created with Spire.PDF for Python.
ISSN: 23
02-4
046
TELKOM
NI
KA
Vol. 12, No. 6, June 20
14: 4868 – 4
875
4874
outperfo
rm
s the other t
w
o
approa
che
s
i
n
terms
of schedul
e length
ratio, spe
edu
p and efficie
n
cy
metrics.
F
o
r
e
x
te
ns
ive
c
o
mpa
r
is
on
, w
e
p
r
es
ent t
he compa
r
ative evaluation of the HGAS_OT,
H_T
D
S an
d
HEFT al
gorit
hm. We
ado
pt the Ou
t-T
r
ee ta
sk gra
phs
ran
doml
y
generated
by
cla
ssi
cal met
hod
s [4] as the wo
rkl
oad
s for testing these alg
o
rithm
s
in the sam
e
HCS. Firstly 20
to 200
node
s are
gen
erate
d
, co
rre
sp
on
ding m
a
trices
W
,
B
an
d
L
are also
ra
nd
omly
gen
erat
ed
[4]. Visual C i
s
u
s
ed a
s
the
simulatio
n
progra
m
and th
e simul
a
tion i
s
pe
rform
ed
by the person
a
l
comp
uter
with Wind
ows 7,
2G RAM an
d 2.53G
Hz
CPU. The co
m
pari
s
on of the
numbe
r of used
pro
c
e
s
sors, the sche
dule l
ength an
d the
efficiency are sho
w
n in Fi
gure
s
6
-
8, re
spe
c
tively.
Figure 6. Co
mpari
s
o
n
of the Num
b
e
r
of Use
d
Processo
rs
Figure 7. Co
mpari
s
o
n
of the Sche
dule
Length
s
Figure 8. Co
mpari
s
o
n
of the Standa
rd
Efficiencie
s
As sh
own in
Figures 6
-
8, t
he HGAS alg
o
rithm ha
s th
e ch
ara
c
teri
st
ic of less n
u
m
ber
o
f
use
d
p
r
o
c
e
s
sors.
The
sch
edule
len
g
ths produ
ced
by
the
HGAS al
gorithm
are v
e
ry
close to
that
of the H_T
D
S algorithm
and are obvi
ously le
ss th
an that of the HEFT algo
rithm. Overall
,
as
comp
ared to
other t
w
o alg
o
rithm
s
, the
HGAS_O
T
al
gorithm
effect
ively improve
s
the
sched
ul
ing
efficien
cy of the O
u
t-Tree t
a
sk g
r
ap
hs in
HCSs
, a
nd t
he mo
re th
e
numbe
r of
no
des in the
ta
sk
grap
h is, the
more p
r
omi
n
ence the sup
e
rio
r
ity in
the major p
e
rfo
r
mances
over other comp
a
r
ed
algorith
m
s i
s
.
4. Conclusio
n
Effective task
sched
uling
is im
porta
nt to a
c
hieve
high p
e
rfo
r
m
ance in
HCS
s
. In thi
s
pape
r, we prese
n
t
a ne
w gree
dy sched
uling algo
rith
m
for
sch
edu
ling
O
u
t-T
r
ee
task gra
p
h
s
in
HCS
s, calle
d
the HGAS_OT algo
rithm
.
It sched
ule
s
tasks a
c
co
rding to a ne
w prio
rity wh
en
sched
uling e
a
ch l
eaf nod
e to co
rresp
ondin
g
pr
oce
s
sor, an
d m
e
rge
s
the li
st
and d
uplication
based
strate
gy to a
s
sign
each le
af ta
sk to
the
mo
st suita
b
le
processor
while
g
uara
n
teein
g
t
h
e
0
20
40
60
80
100
120
140
10
40
80
120
160
200
Number of used processors
Number of tasks
H_TDS
HEFT
HGAS_OT
0
50
100
150
200
250
300
10
40
80
120
160
200
Schedule Length
Number of tasks
H_TDS
HEFT
HGAS_OT
0
0.1
0.2
0.3
0.4
0.5
0.6
10
20
40
60
80
100
120
140
160
180
200
Efficiency
Number of tasks
H_TDS
HEFT
HGA
S
_OT
Evaluation Warning : The document was created with Spire.PDF for Python.
TELKOM
NIKA
ISSN:
2302-4
046
A Heuri
s
tic G
r
eed
y Algo
rithm
for Sched
uli
ng Out-Tre
e
Task Graph
s (Jian
Jun Z
hang
)
4875
sho
r
ter
sched
ule length an
d less numb
e
r
of used
p
r
o
c
e
s
sors. Experime
n
tal re
sults validate the
HGAS_O
T
algorithm outp
e
rform
s
the
H_T
D
S
and
HEFT algo
rit
h
m whe
n
schedul
e length
,
th
e
numbe
r of used pro
c
e
s
so
rs, sched
ule le
ngt
h ratio an
d
efficiency are con
c
e
r
ne
d.
One
plan
ned
future
resea
r
ch
i
s
to
ana
lytically investigate the
tra
de-off
betwe
en the
quality of schedul
es of the algo
rithm
s
, i.e., average
ma
k
e
sp
an
values, an
d the numbe
r of
pro
c
e
s
sor av
ailable. Thi
s
extensio
n m
a
y come
up
with some b
ound
s on th
e deg
rad
a
tio
n
of
m
a
kesp
an
gi
ven that the numbe
r of pro
c
e
s
sors a
v
ailable may
not be sufficient. It is also
plann
ed to extent the algori
t
hm for more
gene
ral
targ
e
t
computing e
n
vironm
ents
by consi
d
e
r
ing
the link conte
n
tion.
Ackn
o
w
l
e
dg
ements
This
wo
rk i
s
sup
porte
d pa
rtially by the
Nation
al Natural S
c
ien
c
e
Found
ation o
f
Chin
a
unde
r Grant No. 7117
119
8 to Yexin Song, the Nat
u
ral Scie
nce Found
ation o
f
Naval Unive
r
sity
of Enginee
ri
ng und
er G
r
ant No. HG
DYDJJ
131
51
to Jianjun
Zhang a
nd
unde
r Grant
No
.
HG
DQ
NJJ1
3
153 to Meini
Yang, re
spe
c
tively.
Referen
ces
[1]
F
oad L
o
tfifar, Hadi S
h
a
h
riar
Shah
hos
ein
i
.
Co
mpl
e
xity T
a
sk Sched
uli
ng
Algorit
h
m
for Hetero
gen
eo
u
s
Co
mp
uting Sy
stems
. T
h
ird Asia Intern
atio
n
a
l Co
nfere
n
ce
on Mo
del
lin
g
& Simulati
on.
Bali. 2
009; 5
:
596-
601.
[2]
Jianj
un Z
h
ang,
Yexin So
ng,
Den
gbi
n Hu
an
g. T
a
sk schedulin
g al
gor
ithm
for F
o
rk-Join task grag
hs i
n
hetero
g
e
neo
us
enviro
n
ment.
Co
mp
uter Engi
neer
ing & D
e
si
gn
. 201
0; 31(3)
: 486-49
0. (In Chin
ese)
[3]
Samanth
a
Ra
na
w
e
era, D
h
a
rma P Agra
w
a
l.
A T
a
sk D
uplic
atio
n Bas
ed Sch
e
d
u
li
ng
Algorit
hm f
o
r
Hetero
gen
eo
u
s
Systems
. Pr
ocee
din
g
s of t
he 1
4
th Intern
ation
a
l P
a
ral
l
el
and
Distrib
ute
d
Process
i
n
g
S
y
mp
osi
u
m. F
l
orid
a. 200
0; 44
5-45
0.
[4]
H T
opcuogl
u, S Hariri, MY
W
u
. Performance-
effective
and l
o
w
-
com
p
le
xit
y
task s
c
hed
uli
ng fo
r
hetero
g
e
neo
us
computin
g.
IEEE Trans. Parallel an
d Distributed System
s
. 2002; 13(
3): 260-2
74.
[5]
T
Hagras, J Janec
ek.
An a
ppro
a
ch to co
mp
ile-ti
m
e tas
k
schedu
lin
g i
n
hetero
g
e
n
e
o
u
s co
mputi
n
g
system
s
. Proc
eed
ings of th
e 33rd Intern
ation
a
l Co
nfer
ence o
n
Para
llel Proc
essi
ng
W
o
rkshops.
Can
ada. 2
004;
182-1
89.
[6]
Sang C
h
e
o
l
Kim, Sungg
u
Lee, Ja
eg
yo
o
n
Hahm. Pus
h
-Pull: D
e
term
inistic Se
arch-
B
ased DAG
Sched
uli
ng for
Hetero
gen
eo
u
s
Cluster.
IEEE Trans. Parallel and Distributed System
s
. 2
007; 1
8
(11
)
:
148
9-15
02.
[7]
F
a
tma A Omar
a, Mon
a
M Ar
a
f
a. Genetic A
l
g
o
rithms for T
a
sk Sche
dul
in
g P
r
obl
em.
Jour
na
l of P
a
ral
l
e
l
and D
i
stribute
d
Computi
n
g
. 20
10; 70(1): 1
3
-2
2.
[8]
Jiad
ong Y
a
n
g
, Hua
Xu, Pe
ifa
Jia.
T
a
sk Sche
duli
ng for H
e
te
roge
neo
us Co
mp
utin
g bas
ed
on Bayes
i
a
n
Optim
i
z
a
t
i
on Algorithm
. Intern
ation
a
l C
onfer
ence
on C
o
mp
utation
a
l Inte
lli
genc
e an
d Sec
u
rit
y
. Be
iji
n
g
.
200
9; 1: 112-1
17.
[9]
B Demiroz,
H
R
T
opcuogl
u.
Static task sched
uli
ng
w
i
th
a un
ified
ob
je
ctive on tim
e
and r
e
so
urce
doma
i
ns.
T
he Co
mp
uter Jour
nal
. 20
06; 49(
6
)
: 731–7
43.
Evaluation Warning : The document was created with Spire.PDF for Python.