Int
ern
at
i
onal
Journ
al of Ele
ctrical
an
d
Co
mput
er
En
gin
eer
ing
(IJ
E
C
E)
Vo
l.
11
,
No.
1
,
Febr
uar
y
2021
, pp.
375
~
382
IS
S
N: 20
88
-
8708
,
DOI: 10
.11
591/
ijece
.
v11
i
1
.
pp
375
-
382
375
Journ
al h
om
e
page
:
http:
//
ij
ece.i
aesc
or
e.c
om
Strag
gler handli
ng approache
s i
n mapr
ed
uce fra
mework:
a comp
arativ
e st
ud
y
An
w
ar
H. K
atrawi
1
, Rosn
i
Ab
d
ull
ah
2
, M
ohammed
An
ba
r
3
,
Ibra
him
A
lS
h
ou
rb
aj
i
4
,
A
m
mar
Ka
m
al
A
ba
si
5
1,3
Nati
ona
l
Adva
nce
d
IPv6 C
ente
r
(Nav6), Unive
r
siti
Sains
Malays
ia
,
Ma
lay
si
a
2
,5
School
of
Co
m
pute
r
Scie
n
ce
s
,
Univer
si
ti Sain
s Mal
a
y
sia
,
Ma
l
a
y
si
a
4
Depa
rtment
of
Com
pute
r
and
E
ngine
er
ing
,
Jaz
a
n
Univer
sit
y
,
Sa
udi
Arabi
a
Art
ic
le
In
f
o
ABSTR
A
CT
Art
ic
le
history:
Re
cei
ved
Ma
r 22,
2020
Re
vised
Ma
y
21, 2
020
Accepte
d Aug
5
, 2
020
The
pro
li
fer
at
io
n
of
informat
ion
technolog
y
pro
duce
s
a
huge
amount
of
d
ata
ca
l
le
d
big
d
at
a
tha
t
c
annot
b
e
proc
essed
b
y
tr
adi
ti
on
al
d
at
ab
a
se
s
y
stems
.
The
se
Vari
ous
t
y
pes
of
dat
a
come
from
d
iffe
ren
t
source
s.
How
eve
r
,
straggl
ers
are
a
m
aj
or
bott
le
n
ec
k
in
big
dat
a
proc
essing
,
and
hence
the
ea
r
l
y
det
e
ct
ion
and
ac
cur
ate
ide
n
ti
f
ic
a
ti
on
of
str
a
ggle
rs
ca
n
hav
e
important
impact
s
on
the
p
erf
orm
anc
e
of
bi
g
dat
a
proc
essing
.
Thi
s
work
aim
s
to
asses
s
five
stragglers
ide
nti
f
icati
on
m
et
hods:
Hadoop
nat
ive
sche
du
l
er,
LAT
E
S
che
dule
r
,
Ma
ntri
,
MonTool
,
and
Dol
l
y
.
The
p
erf
orm
an
ce
of
th
ese
te
chn
ique
s
was
eva
lu
at
ed
b
ase
d
on
thre
e
b
enc
h
m
ark
ed
m
et
hods:
Sort,
Grep
and
W
ordCount
.
The
result
s
s
how
tha
t
th
e
LATE
S
che
du
l
er
per
form
s
the
best
and
it
would
be
eff
ic
ie
n
t
to
obta
in
bet
t
er
res
ult
s
for
straggl
er
s
ide
nti
f
icati
on
.
Ke
yw
or
d
s
:
Bi
g
data
Hado
op
Ma
pRed
uce
Sp
a
rk
Strag
gler
This
is an
open
acc
ess arti
cl
e
un
der
the
CC
B
Y
-
SA
l
ic
ense
.
Corres
pond
in
g
Aut
h
or
:
Anwar H
.
Katr
awi,
Nati
on
al
A
dv
a
nced I
Pv6 Ce
nt
er (Nav6
),
Un
i
ver
sit
i Sai
ns M
al
ay
sia
,
11800 U
SM, P
enang,
Mal
ay
sia
.
Em
a
il
:
akatraw
i@st
ud
e
nt.
us
m
.m
y
1.
INTROD
U
CTION
W
it
h
the
exce
ssive
gro
wth
in
inf
or
m
at
ion
and
data,
their
analy
sis
becom
es
a
chall
eng
e
an
d
m
or
e
com
plex
due
to
the
i
ncr
ease
d
vo
l
um
e
of
str
uctu
red
a
nd
un
structu
re
d
data
that
are
pr
oduc
ed
by
t
he
inte
rn
et
of
t
h
ings
(
I
oT),
s
ocial
m
edia,
m
ultim
edia
et
c.
Applic
at
ion
suc
h
as
Ma
pRe
duce
is
a
fa
ult
tolera
nt,
scal
a
ble
a
nd
si
m
ple
fr
am
e
work
f
or
data
proces
sin
g
th
at
ena
bl
es
it
s
us
ers
to
pr
oce
ss
these
m
assiv
e
am
ounts
of
data
eff
ect
ively
[
1
,
2].
Ma
pRe
du
ce
is
a
sig
nif
ic
ant
m
od
el
of
pre
par
in
g
a
nd
ge
ner
at
in
g
a
set
of
en
orm
ou
s
inf
or
m
at
ion
.
T
his
is
beca
us
e;
it
giv
es
sim
ple
util
iz
at
ion
en
vir
on
m
ent,
off
er
so
l
utio
n
to
ad
ho
c
a
nd
to
m
isses
li
ke
Data
s
or
ti
ng
,
We
b
i
ndexi
ng
am
on
g
seve
ral
oth
e
rs.
Ma
pRed
uc
e
is
util
iz
ed
in
Bi
g
I
nform
at
ion
Applic
at
ion
s
in
b
ig
ge
r
Com
pan
ie
s s
uch as
Y
ahoo
a
nd
Goo
gl
e a
m
on
g se
veral
o
the
rs.
The
Ma
pRed
uc
e
is
un
li
ste
d
as
a
sect
ion
of
one
str
uct
ure
or
t
he
oth
e
r
.
The
reas
on
f
or
c
reati
ng
strag
glers
is
the
div
e
rsity
in
acce
ssibil
it
y
in
the
CP
U,
I/
O
disco
rd
or
netw
ork
tr
af
fic.
Wh
e
n
t
he
m
ap
a
nd
reduce
are
c
om
ple
te
d,
that
is
wh
e
n
the
Ma
pRed
uce
F
ram
ewo
rk
is
a
ccom
plished
[
3,
4].
In
Ma
pR
edu
ce
Fr
am
ewo
r
k
t
he
job
is
not
acc
om
pli
sh
ed
ti
ll
ve
ry
reduce
a
nd
m
ap
un
der
ta
ki
ng
s
are
com
plete
d.
More
ov
e
r,
t
he
qu
a
ntit
y
of
t
he
stra
gg
le
rs
weak
e
ns
with
the
wide
-
ra
nge
of
the
ti
m
e
occupati
on
[5
-
8
]
.
In
a
heter
og
e
ne
ous
e
nv
i
ronm
ent,
s
om
e
com
pu
te
no
des
a
re
faster
tha
n
t
he
ot
her
.
Sl
ow
e
r
c
om
pu
te
no
de
s
are
cal
le
d
strag
gle
rs
no
de
a
nd
t
he
se
fast
no
des
will
finish
t
heir
ta
sk
s
ea
rly
a
nd
wait
f
or
t
he
strag
glers
t
o
f
inish.
Evaluation Warning : The document was created with Spire.PDF for Python.
IS
S
N
:
2088
-
8708
In
t J
Elec
&
C
om
p
En
g,
V
ol. 11,
No.
1,
Febr
uar
y
2021 :
37
5
-
382
376
So
m
et
i
m
es
nodes
fail
due
to
hard
war
e
or
s
oft
war
e
fail
ur
e
s.
Ther
e
f
or
e,
it
is
sign
ific
a
nt
to
detect
strag
gl
ers
in
an
ea
rly
stage.t
o
a
vo
i
d p
er
for
m
ance d
isc
a
rio
n
in
the
syst
e
m
s
Nowa
days
org
anizat
ion
s
with
a
la
rg
e
am
ou
nt
of
data
ha
ve
com
plexity
in
processi
ng
and
a
naly
zi
ng
us
in
g
t
rad
it
io
na
l
databas
e
m
anag
em
ent
syst
e
m
s.
By
design
i
ng
Ma
pRe
duce,
G
oogle
ha
s
m
ade
m
il
l
i
on
s
of
us
ers
a
rou
nd
t
he
w
or
l
d
fin
d
con
te
nt
f
ro
m
m
i
ll
ion
s
of
pa
ges
withi
n
a
hundre
dth
of
a
second;
therefore
,
the b
ul
k
proces
sing
pr
ob
le
m
h
as b
ecom
e a b
i
g
chall
en
ge
an
d
their analy
sis
techn
ol
og
ie
s a
re ch
a
ng
i
ng
r
a
pid
ly
.
On
t
he
oth
er
s
ide,
stra
ggle
rs
are
well
rec
og
nized
as
a
m
ajo
r
bott
le
nec
k
i
n
bi
g
data
pro
cessi
ng
a
nd
th
ey
can
hav
e
sig
nifica
nt
im
pacts
on
big
data
proce
ssing.
T
his
w
ork
ai
m
s
to
ev
al
uate
five
stra
gg
le
rs
ide
ntifi
cat
ion
m
et
ho
ds
:
Ha
doop
native
s
ch
edu
le
r,
l
ongest
ap
prox
im
at
e
t
i
m
e
to
end
(L
ATE
)
Sc
he
du
l
er,
Ma
ntri,
M
onT
oo
l,
and
D
olly
Th
e
perform
ance
of
t
hese
te
c
hn
i
qu
e
s
was
assessed
base
d
on
th
ree
be
nch
m
ark
e
d
m
et
hods
:
So
rt
, Gre
p
a
nd
Wor
dCoun
t.
The
rem
ai
nd
er
of
this
pa
per
is
structur
e
d
a
s
fo
ll
ows:
The
second
sect
io
n
de
picts
Ma
pRedu
ce
a
nd
struggles
.
I
n
th
e
third
se
gm
ent,
five
stra
gg
le
r
s
ide
ntific
at
ion
ap
proac
hes
a
re
presente
d.
I
n
the
f
ort
h
se
gem
ent,
exp
e
rm
ental
resu
lt
s
are
present
ed
This
sect
io
n
is
fo
ll
owe
d b
y t
he
co
nclu
sio
n
a
nd futu
re
w
ork of t
his st
udy.
2.
MAPRE
DUC
E FR
AM
E
W
ORK A
N
D
S
TRAGGL
ER
S
Ma
pRed
uce
is
the
m
at
ching
inf
or
m
at
ion
pr
e
par
i
ng
pe
r
f
ect
m
od
el
s
propose
d
for
c
onside
rab
le
inf
or
m
at
ion
pr
epar
at
io
n
on
bunc
h
-
base
d
fig
ur
i
ng
desi
gns
[
9].
T
his
syst
e
m
is
us
e
d
insi
de
in
facil
it
at
ing
data
m
ining
,
sea
rch
app
li
cat
ion
s
a
nd
m
achine
le
arn
i
ng
at
ser
ver
centers.
T
he
phil
osophy
nee
ds
to
deal
with
broa
d
scal
e
web
sear
ch
ap
plica
ti
on
s
.
It
was
at
first
su
ggest
ed
by
Goo
gle
to
deal
with
tre
m
endou
s
scal
e
we
b
search
app
li
cat
io
ns
.
The
fo
c
us
is
li
censing
pr
ogram
m
ers
in
e
xtracti
ng
from
pro
blem
s
su
ch
as
pa
rall
el
iz
at
i
on,
bookin
g,
al
loc
at
ing
,
th
us
al
lowi
ng
them
to
fo
cu
s
on
de
ve
lop
in
g
ap
plic
at
ion
s.
I
n
m
od
ern
or
gan
iz
at
i
on
s
a
nd
enter
pr
ise
s,
t
he
re
are
f
our
f
act
or
s,
incl
uding
processi
ng,
storing,
vis
ua
li
zat
ion
,
an
d
analy
zi
ng
la
r
ge
data.
Ma
pRed
uce
c
an
r
un
the
a
ppli
cat
ion
s
on
a
par
al
le
l
cl
ust
er
of
ha
rdwa
r
e
autom
at
ic
all
y.
In
a
ddit
ion
;
it
can
process
terabyt
es an
d petabyt
es of
data m
or
e ra
pid
ly
.
Re
centl
y,
it
ha
s
gai
ned
po
pula
rity
in
a
w
ide
ra
nge
of
a
pp
li
cat
io
ns
due
to
it
s
a
bili
ty
to
pro
vid
e
a
highly
eff
ect
ive
an
d
ef
fici
ent
fr
am
ework
for
the
par
al
le
l
execu
ti
on
of
t
he
ap
plica
ti
ons,
data
al
locat
i
on
i
n
distrib
uted
dat
abase
syst
em
s
,
an
d
fa
ult
to
le
ran
ce
netw
ork
c
omm
un
ic
at
ion
s.
As
il
lus
trat
ed
in
Fi
gure
1,
par
al
le
l
m
ap
assignm
ents
are run
as o
ne
in
pu
t
data
as
a
gat
he
rin
g
of
<
key, val
ue
>
set
s w
hi
ch
is
that
is di
vi
ded
into
fixe
d
produce
a
nd
siz
e
blo
ck
s
tra
ns
it
iona
l
ou
t
pu
t.
T
he
m
od
el
of
Ma
pR
edu
ce
Pro
gr
a
m
m
ing
com
pr
ise
s
of
inf
or
m
at
ion
pr
epar
i
ng capacit
ie
s:
m
ap
an
d re
du
ce
.
Figure
1
.
Ma
pR
edu
ce
f
ram
e
work
Evaluation Warning : The document was created with Spire.PDF for Python.
Int
J
Elec
&
C
om
p
En
g
IS
S
N:
20
88
-
8708
Strag
gler ha
ndli
ng
appro
ach
e
s in
M
apRe
du
c
e fra
mework
: a
compar
ative
s
tud
y (
A
nwa
r
H. Katr
awi)
377
In
the
Ma
p
-
ph
ase, w
he
n
t
he
us
er
re
quest
s
t
o
perform
a
jo
b,
t
he
ta
s
ks
a
re
sent
t
o
t
he
Ma
p
m
achines
t
o
run.
T
he
Com
bin
e
r
reduces
the
a
m
ou
nt
of
data
transm
is
sion
in
the
net
work
in
the
R
edu
ce
ph
a
se.
So
rt
or
Me
rg
in
g
pa
rt
is
a
par
t
of
the
Re
du
ce
-
phase
.
The
tim
e
is
us
ed
to
integ
rat
e
Ma
p
ou
t
pu
ts
fr
om
diff
ere
nt
nodes
and
this
inte
grat
ion
is
con
si
der
e
d
as
a
Re
du
ce
ti
m
e.
The
Re
du
ce
-
pa
rt
is
the
la
st
s
t
ep
to
r
un
the
j
ob
i
n
a
Ma
pRedu
ce way.
T
he
ef
fec
t
of
each
pa
rt
of
this
process
on
runtim
e
is
di
ff
e
ren
t
an
d
to
e
stim
at
e
the
end
tim
e
of
each
j
ob,
a
ppr
opriat
e
weigh
ts
(t
he
eff
ec
t
of
each
par
t
on
e
xecu
ti
on
proces
s
is
ob
ta
ined
f
r
om
the
rati
o
of
the tim
e o
f
eac
h part t
o
the
tot
al
tim
e) s
hould be
us
e
d for ea
ch part
.
F
or
m
or
e
d
et
ai
ls see
[9
]
.
Strag
glers
ar
e
the
end
ea
vor
s
that
set
asid
e
longe
r
effor
t
in
exec
ution
oth
e
r
than
com
par
at
ive
assignm
ents.
Be
hind
the
a
ssign
m
ent,
th
ere
are
va
rio
us
purpose
t
o
set
asi
de
lo
ng
e
r
e
ffor
t,
s
uch
as,
i
m
per
fect
m
ac
hin
es
,
pr
opor
t
ion
of
inf
orm
at
ion
to
proc
ess,
f
ram
ewo
r
k
blo
c
kag
e
,
he
te
rogen
ei
ty
a
m
on
g
equ
i
pm
ent,
an
d
co
ntenti
on
f
or
t
he
cu
rr
e
nt
resou
rces
[
10
,
11
]
.
I
n
a
ny
case,
it
is
no
t
sign
ific
a
nt
f
or
an
assignm
ent
to
be
slo
we
r
al
l
arou
nd
it
s
e
xe
cution.
I
n
a
ddit
ion
,
i
n
the
e
ve
nt
that
one
ta
sk
r
unni
ng
slo
w
on
a
giv
e
n
m
achine,
it
isn'
t
signi
ficant
f
or
th
e
com
plete
ly
fu
t
ur
e
a
nd
prese
nt
assignm
ent
to
r
un
slo
wer
on
that
par
ti
cula
r
m
ac
hin
e.
In
deali
ng
with Ma
ntri, i
t i
s essen
ti
al
to keep
in
m
ind t
hr
ee
prim
ary
m
echan
ism
s:
-
In
t
he
eve
nt
it
is
found
that
t
he
ex
pecte
d
re
m
ai
nin
g
ti
m
e
e
xceeds
the
nor
m
al
ru
ntim
e,
t
hen
t
he
pr
oces
s
will
b
e
restarte
d up to t
hr
ic
e.
-
In
the
eve
nt
that
the
resource
m
easur
em
e
nt
decr
eases
unde
sirably
,
the
n
sche
du
li
ng
of
a
sp
ec
ulati
ve
duplica
te
ta
kes
place.
T
he
e
xpect
ed
rem
ai
ni
ng
ti
m
e
(tre
m
)
and
t
he
norm
al
runtim
e
(tnew
)
are
est
im
at
ed
as
il
lustrate
d
i
n t
he follo
wing
al
gorithm
s.
-
te
rm
= (
te
la
ps
ed *
d/dread
)
+
twra
pup
-
tnew
=
pr
ocess
Ra
te
∗
locat
ionF
act
or
∗
d+sc
he
dL
ag.
Ma
ny
reasons
fo
r
s
uc
h
stra
gg
le
rs
to
occ
ur
inclu
ding
lo
ad
im
balance,
scheduli
ng
in
eff
ic
ie
ncies,
data
local
it
y,
com
m
un
ic
at
ion
ov
e
r
he
ads
ha
rdwar
e
hete
r
og
e
neity
[
12,
13
]
.
The
re
ha
ve
al
s
o
been
effor
ts
lookin
g
to
ad
dress
on
e
or
m
or
e
of
t
hese
c
oncern
s
to
m
it
iga
te
the
pro
blem
[14
-
16]
.
Se
sip
it
e
al
l
of
these
pr
i
or
effor
ts
are
im
po
rta
nt
an
d
us
e
f
ul
in
over
com
i
ng
this
pro
ble
m
,
we
belie
ve
that
a
r
ig
orous
set
of
a
naly
ti
ca
l
too
ls
is
need
ed
in
or
der
to
bette
r
unde
rstan
d
the
conseq
ue
nces
of
stra
gg
le
rs
on
the
perform
a
nce
slowd
own
in
bi
g
data
[17
,
18]
.
3.
S
TRU
GGLE
RS
I
DE
NTIFI
CA
TI
ON
TE
CHNIQ
UES
V
ari
ou
s
m
et
ho
ds
a
re
pr
opos
e
d
for
stra
gg
le
rs
ide
ntific
at
ion
.
I
n
thi
s
sect
ion
,
fiv
e
struggler
s
identific
at
ion
m
et
ho
ds
i
nclu
ding
Hado
op
na
ti
ve
sche
dule
r
,
LA
TEsc
hedu
le
r,
Ma
ntri
,
M
onTo
ol,
a
nd
D
olly
to
assess thei
r
s
utabilt
uy for
stra
gg
le
rs
ide
ntific
at
ion
.
3.1.
H
adoop
na
ti
ve
sc
hedul
er
Hado
op
native
sche
dule
r
al
l
oc
at
es
a
p
r
ogres
sion
sc
or
e
so
m
ewh
e
re
i
n
t
he
range
of
0
a
nd
1
to
each
ta
sk
.
F
or
le
sse
n
assig
nm
ents,
to
exec
ute
is
isolat
ed
into
3
phases
,
ev
er
y
on
e
of
t
hat
record
s
f
or
1/
3
of
the
sco
re.
In
t
he
case
of
a
m
ap,
the
progr
ession
c
ount
a
ddresses
t
he
pro
portio
n
of
in
pu
t
in
f
or
m
at
ion
[
1
9
].
The
sta
ges
a
re:
-
At the
first sta
ge (co
py stage
)
, th
e a
ssig
nm
e
nt g
et
s
the m
ap
outp
ut.
-
In the sec
ond s
ta
ge
(s
ort
ar
range)
, th
e
m
ap
outp
ut is ar
ra
nged by key.
-
The decl
ine
sta
ge,
t
he
c
us
tom
er
-
c
har
act
e
rized ca
pacit
ie
s are
conn
ect
e
d
t
o t
he
r
undo
wn of
guide yie
ld
.
Fo
r
e
ver
y
ta
s
k,
it
gras
ps
up
per
m
inu
te
an
d
t
he
assi
gn
m
ent
headway
s
cor
e
is
fou
nd
to
be
s
hort
of
ty
pical
fo
r
the
gro
up
le
ss
0.2,
t
he
un
it
is
ste
pp
e
d
as
a
s
trag
gler.
I
n
a
ddit
ion
,
at
eve
r
y
sta
ge,
th
e
sc
or
e
is
the
capaci
ty
of
data
pr
e
par
e
d.
At
that
point,
Hado
op
fi
gure
s
the
norm
al
procedu
re
value
to
each
cl
assifi
cat
ion
of
assig
nm
ents
fo
r
a
desc
ription
of
a
far
t
hes
t
po
int
for
the
or
et
ic
al
com
pl
et
ion
.
Tas
k
bel
o
w
the
fa
rthest
po
int
are
rec
ognized
com
par
at
ively
le
ssen
an
d
th
e
ti
es
between
them
are
con
trolle
d
by
data
local
e.
I
n
any
case,
the
sch
ed
uler
e
ns
ures
t
hat
just
a
sin
gle
the
oret
ic
al
du
plica
t
e
of
eve
ry
one
assig
nm
ent
is
run
ning
at
w
ha
te
ver
po
i
nt,
howe
ve
r
re
sche
dule
s
scor
e
s
wh
il
e
consi
der
i
ng
th
ei
r
aut
hen
ti
c
pro
gr
essi
on.
A
few
of
t
he
pivotal
su
s
picions t
hat
break
in virt
ua
li
zed,
hete
ro
ge
neous
bu
nch
es
are
-
-
In
dec
rease act
ivit
y, the s
or
t,
cop
y,
and lesse
n ph
a
ses e
ver
y
on
e
pr
oceed
s
a
rou
nd 1
/
3 of
t
he
total
tim
e.
-
Nodes
e
xp
ect
e
d
by
Ha
doop
can
exec
ute
th
e
j
ob
at
a
harshly
equ
i
valen
t
rate.
In
ad
di
ti
on
,
these
a
re
ho
m
og
e
neous
, while
unde
rtak
ing
s
propel at
a
r
el
ia
ble
rate al
l arou
nd tim
e.
Task
in
a
si
m
il
ar
char
act
erizat
ion
perform
s
relat
ive
pro
portion
of
work.
This
pr
es
um
ptio
n
par
ti
cula
rly
fa
il
s
fo
r
re
du
ce
unde
rtakin
gs
in
ha
ving
tre
m
end
ous
m
odific
at
ion
in
ke
ys
appor
ti
on
e
d
to
a
n
exp
li
ci
t re
du
ce
r.
Evaluation Warning : The document was created with Spire.PDF for Python.
IS
S
N
:
2088
-
8708
In
t J
Elec
&
C
om
p
En
g,
V
ol. 11,
No.
1,
Febr
uar
y
2021 :
37
5
-
382
378
3.2.
LATE
sc
heduler
LATE
co
ns
ide
rs
a
no
t
her
f
ra
m
ewo
r
k
t
hat
e
stim
at
es
the
finish
i
ng
tim
e
fo
r
assig
nm
ents
in
m
at
ching
cl
ass
to
a
ntici
pate
pr
e
dicti
on
to
t
he
st
raggl
ers
t
hat
ha
ve
pote
ntial
.
I
n
ad
diti
on
,
L
ATE
addresses
in
er
rors
i
n
the H
a
doop
sc
heduler
ach
ie
ve
d
in
a
var
ie
d
c
onditi
on [
20
].
-
LATE
use
s
a
c
ap
on
the
m
eas
ur
e
of
t
heoreti
cal
assig
nm
ent
s
that
ca
n
kee
p
r
unning
at
on
ce,
to
deal
wit
h
the m
ann
er i
n,
wh
ic
h
that t
he
or
et
ic
al
assig
nm
ents co
st
res
ources.
-
LATE
r
em
ai
ns
edu
cat
e
d
in
re
gards
to
direct
nodes
in
t
he
f
ram
ewo
rk
a
nd
nev
e
r
r
un
s
t
he
or
et
ic
al
cop
ie
s
on th
os
e
nodes
.
-
Assignm
ents
with
headway
rates
undern
eat
h
25
perce
nt
ed
ge
of
s
umm
arized
un
der
ta
kings
a
re
recog
nized
as
s
trag
glers.
-
Score
f
or
the
pro
gr
ess
is
f
ou
nd
out
us
in
g
t
he
Hado
op
sc
heduler
local
ly
.
T
he
r
at
e
of
Adva
ncem
ent
is
determ
ined
as
pro
gr
essi
on sc
or
e/
T
w
her
e
by
T r
e
pr
ese
nts t
he
runnin
g
ti
m
e
of the as
sig
nme
nt.
-
Com
pleti
on
ti
m
e is est
i
m
at
e
d by the
f
or
m
ula (
1
-
P
rogr
e
ss
Score)/P
rogr
e
s
s Rat
e.
LATE
appreci
at
es f
ollo
wing
favor
a
ble circ
um
s
ta
nces:
-
Likewise,
by
m
on
it
or
ing
sur
veyed
le
ft
ti
m
e
rathe
r
tha
n
he
adw
ay
rate,
L
ATE
s
pec
ulati
vely
exec
ute
just
assignm
ents,
w
hich
a
dvanc
e
job react
io
n
ti
m
e, as c
ontradict
ed
to
an
y
slo
w
er task
s.
-
LATE
conside
rs nodes
heter
ogeneit
y w
hile c
hoos
i
ng where
to h
y
po
t
hesize
assignm
ents
.
-
It
is
str
ong
to
heter
og
e
neity
of
no
de
c
onsid
erin
g
that
that
re
-
im
pelli
ng
only
slo
ws
t
he
assignm
ents
a
nd
on
ly
a
fe
w
am
ounts
of un
der
t
akin
gs
.
Howe
ver, L
AT
E Sc
heduler
ha
s the
fo
ll
owin
g wea
kn
e
ss
[21]:
-
Ti
m
e
req
ui
red
by
L
ATE
Sc
heduler
is
f
ound
to
be
highe
r,
usual
ly
on
e
m
inu
te
,
in
sta
r
ti
ng
e
valuati
on
,
pr
i
or
t
o
a
n
act
i
vity
b
ei
ng ste
pped
as a
stra
ggle
r.
-
In
the
fi
nal
act
ivit
y,
t
i
m
e
fo
r
a
ta
sk
is
dete
rm
ined
us
ing
t
he
determ
ined
ou
t
-
pro
gr
essi
on
rate
m
idd
le
value
of the ou
t
-
hea
dw
ay
r
at
e
b
esi
de
the p
r
e
sent p
r
ogressi
on r
at
e, the
en
din
g
ti
m
e
f
or
ese
en
is i
nclined
to
m
ixed
up.
-
En
or
m
ou
s
unde
rtakin
g
will
ten
d
to
ta
ke
to
a
gr
eat
er
degree
of
ris
k
than
the
rests
to
pr
oces
s,
al
ong
the
s
e
li
nes
it
is possi
ble to be la
belle
d
as
a
po
s
sibil
it
y t
o
be
c
onj
ec
ture
d br
in
ging
about s
quan
deri
ng
res
ources.
If
t
her
e
is
no
ex
planati
on
of
the
te
m
perat
e
natu
re
of
the
ac
knowl
edg
e
d
strag
gle
rs
sea
rch
e
d,
the stra
gg
le
r
as
su
ra
nce m
ay
b
e wro
ng. T
his
essenti
al
ly
p
r
om
pts
m
uch
r
es
pons
e
tim
e.
3.3.
M
antri
Ma
ntri
is
s
upe
rior
c
on
ce
rn
i
ng
ou
tl
ie
rs
sinc
e
it
us
es
c
on
ti
nuous
pro
gr
es
s
ion
re
ports;
M
antri
fo
ll
ows
up
a
nd
disti
nguis
hes
t
he
st
ragglers
f
ro
m
the
sta
rt.
Ea
r
ly
op
e
rati
ons
per
m
it
the
ass
et
s
us
e
d
by
c
om
ing
assignm
ents
and
acce
le
rate
t
he
ge
ner
al
ly
the
em
plo
ym
ent
[
21
]
.
Ma
nt
ri
is
giv
en
a
n
op
portu
nity
to
up
gra
de
ov
e
r
previ
ou
s
work
that
just
duplica
te
s
the
slou
c
hes
by
t
he
act
ing
de
pe
ndently
upon
th
e
ef
f
ect
s
a
nd
the
asset
and op
portu
nity
co
st
of
acti
viti
es. I
t uti
li
zes
the accom
pan
y
ing
m
et
ho
ds:
-
Re
sta
rting
e
xc
eption t
asks
a
war
e
of a
dv
a
nt
age im
per
at
ives and
wor
k un
e
venness
att
rib
ut
es,
-
Ensuri
ng yi
el
d of assi
gnm
ents subor
din
at
e
upon a
cost
b
e
ne
fit analy
sis,
a
nd
-
Netw
ork
ca
reful p
os
it
io
n of
a
ssign
m
ents.
Ma
ntri
pe
rceiv
es
focuse
s
at
wh
ic
h
assi
gn
m
ents
are
unfit
to
ga
in
gro
und
at
the
cust
om
a
ry
rate
an
d
execu
te
s
f
ocused
on
res
ults.
Ma
ntri
s
po
ts
unde
rtakin
gs
subor
din
at
e
in
the
a
rea
of
t
heir
in
f
orm
at
ion
so
urces
,Th
e
c
ontr
olli
ng
ga
uges
that
per
cei
ve
Acc
ordin
g
to
Ma
ntri
previ
ous
a
nom
aly
reli
ef
outl
ie
r,
pla
ns
a
re
eff
ect
s
t
o
m
ind
f
uln
es
s
an
d
asset
discer
nm
ent.
Re
m
ark
able
exe
rcises
are
require
d
for
var
i
ous
r
easo
ns
.
Key Me
cha
nis
m
in Mantri:
-
A
the
or
et
ic
al
c
op
y
is
sc
he
du
l
ed
j
us
t
if
the
pro
portio
n
of
as
set
ob
li
ge
d
is
dim
inished
al
ong
these
li
nes
.
At m
os
t t
hr
ee c
ou
l
d be th
ree c
op
ie
s
of sim
il
a
r
er
ra
nd
s
(
ta
s
ks).
-
If
t
he
antic
ipa
te
d
resi
du
al
ti
m
e
fo
r
a
n
e
rrand
is
m
uch
higher
t
han
th
e
ordina
ry
r
unni
ng
tim
e
of
the assig
nm
ent after
restarti
ng
, th
e a
ssig
nm
e
nt is starte
d
a
ga
in unti
l t
he hi
gh
e
st val
ue
is a
chieve
d.
The
a
ppr
ox
im
at
ion
of trem
an
d
t ca
n be
wr
it
te
n
as:
=
(
∗
)
+
(1)
wh
e
re
d=
e
ntir
e inform
at
ion
t
o
process
, f
ea
r
=data o
ff
ic
ia
ll
y pr
e
par
e
d
a
nd
twr
a
pup=
tim
e
to recal
l res
ults
=
∗
∗
+
ℎ
(2)
Evaluation Warning : The document was created with Spire.PDF for Python.
Int
J
Elec
&
C
om
p
En
g
IS
S
N:
20
88
-
8708
Strag
gler ha
ndli
ng
appro
ach
e
s in
M
apRe
du
c
e fra
mework
: a
compar
ative
s
tud
y (
A
nwa
r
H. Katr
awi)
379
3.4.
M
on
t
ool
It
assem
bles
i
nfor
m
at
ion
on
the
err
a
nds
by
su
ccee
ding
fr
am
ewo
r
k
cal
ls
and
in
vestigat
ing
t
hem
.
In
addit
io
n,
MonTo
ol d
isc
ov
ers
the straggle
rs,
stra
gg
le
rs
eff
ect
s and their
r
easo
ns
u
si
ng
the d
at
a. Th
e
da
e
m
on
run
ning
of
Mo
ntool
at
m
ast
ers
giv
es
a
struct
ur
e
purs
ue
cal
l
from
all
wo
r
ke
rs
cente
r
poin
t.
More
ov
e
r,
it
ta
kes
the
sta
te
m
a
chines
for
va
rio
u
s
en
dea
vors
in
fi
gures
and
run
ning
li
ken
ess
sta
te
m
achines;
score
.
Mon
t
oo
l
sifti
ng
ap
proach
e
nt
er
rest
m
od
e
for
2
seco
nds
and
in
the
in
returnin
g
act
ive
m
od
e,
it
s
et
up
the
fr
am
ewo
r
k
cal
ls
gather
e
d
in
2
sec
onds
[
22
]
.
The
m
ast
er
receives
the
sent
inf
or
m
at
ion
at
that
po
i
nt.
The
n
the
Mo
nt
oo
l
gathe
rs
i
nfo
rm
ation
c
on
cern
i
ng
the
a
s
sign
m
ent
by
s
ucceedi
ng
the
fr
am
ewo
r
k
cal
ls
an
d
lookin
g
at
t
he
m
.
In
ad
diti
on
,
it
pro
duces
fr
am
ewo
r
k
cal
l
sta
te
m
achin
es
f
or
al
l
f
ra
m
ewo
r
k
cal
ls
ever
y
10
m
seconds b
y gatheri
ng
plate
/org
a
nize rea
dings and c
ompo
s
e fr
am
ewor
k
cal
ls for guid
e and
dim
inish f
orm
s
m
ade b
y
Hado
op. S
im
ilit
ud
e
(S
im
i
la
riti
es)
Scor
e
for t
w
o proced
ures is
det
erm
ined
as:
=
1
−
(
1
∑
(
1
−
2
)
=
1
1
⁄
)
(3)
wh
e
re
Ni
1
f
ur
t
her
m
or
e,
N
i
2
sta
nd
s
f
or
the
qu
a
ntit
y
of
progress
st
at
e
fo
r
ith
pair
sta
te
fo
r
firs
t
and
seco
nd sep
a
rat
el
y.
The
Th
res
hold
fo
r
str
ag
gler
i
s
arr
a
ng
e
d
to
85
per
ce
nt
to
s
uch
a
n
exte
nt
that
on
t
he
off
chan
ce
one
proce
dure
m
akes
<8
5p
e
rc
e
ntagefram
ewo
r
k
cal
ls,
it
is
patte
rn
e
d
as
a
str
agg
le
r
w
hile
the
hypotheti
cal
copy
is l
aun
c
he
d.
Con
fi
ne
me
nts:
-
Associ
at
ing
f
r
a
m
ewo
r
k
cal
ls
can'
t
be
accom
pl
ished
with
ou
t
in
form
at
io
n
ab
ou
t
the
ke
ys.
In
ad
diti
on
,
the case
of k
ey
s is consist
entl
y un
a
vaila
ble
i
n Hado
op.
-
It
ack
nowled
ge
s
al
l
the
m
aps
or
reduces
as
s
ign
m
ents
work
upon
com
par
a
ble
est
i
m
at
ed
re
m
ai
nin
g
ta
sks
and
get
to
data
in
a
com
par
at
i
ve
patte
r
n.
I
n
a
ny
case,
this
s
upposit
ion
le
ss
ens
ta
sk
s
as
da
ta
siz
e
per
us
e
d
by lessen
assig
nm
ents
m
igh
t be
par
ti
cula
r
f
or eac
h
under
ta
king
3.5.
D
OLL
Y
DO
L
LY
overs
ees
this
syst
em
in
m
anag
in
g
the
st
raggle
rs
in
a
proact
ive
m
ann
er
.T
he
pri
m
ary
chall
enge
of
cl
on
i
ng
was
c
om
ing
up
with
i
nterm
ediat
e
inform
ation
tra
nsfer
e
ff
ect
ive
s
uch
a
s
m
ai
ntain
in
g
a
strat
egic
dista
nce
from
var
iou
s
assig
nm
ents
dow
ns
tream
in
the
act
ivit
y
from
battl
in
g
f
or
the
eq
ui
valent
up
st
ream
yi
el
d
.
Cl
onin
g
of
li
tt
le
occupati
ons
can
be
acc
ompli
sh
e
d
with
a
couple
of
a
dd
it
ion
al
as
set
s
in
view
of
the
ov
e
r
wh
e
l
m
ing
ta
il
c
irculat
ion
of
oc
cu
pation
est
im
at
e
s
[
23
,
24]
;
m
os
t
of
the
em
plo
ym
ents
are
li
t
tle
an
d
can
be
cl
one
d
with
nea
rly
nothing
ov
e
r
head.
As
restrict
ed
to
ho
l
d
up
an
d
end
ea
vorin
g
to
f
or
esee
stra
ggle
rs
,
then
it
ta
kes
th
eor
et
ic
al
i
m
plem
entat
ion
to
it
s
ou
t
rag
e
ous
a
nd
disp
at
c
hes
var
i
ou
s
cl
on
es
of
al
l
unde
rtak
ing
s
of
t
he
releva
nt
oc
cup
at
io
n
a
nd
si
m
ply
uti
li
ze
the
con
se
que
nce
of
the
firs
t
cl
on
e
to
fini
sh
[
25
]
.
I
nter
m
ediary
inf
or
m
at
ion
acce
ss
with
D
ol
ly
char
act
erize
s
it
s
m
e
tho
dol
og
ie
s
f
or
m
oderati
ng
dis
pu
t
e
w
hile
getti
ng
to
interm
ediary
i
nfor
m
at
ion
fro
m
diff
ere
nt
m
ap
proc
e
dures
com
pleti
ng
at
the
sa
m
e
time.
Table
1
prov
i
des
com
par
ison bet
ween st
ra
gg
le
r
s ide
ntific
at
ion m
e
tho
ds
us
ed
in this
w
ork.
-
Con
te
ntion
a
voida
nce
cl
onin
g
(C
AC):
Her
e
wh
e
n
a
n
up
st
r
ea
m
assignm
e
nt
cl
one
com
plete
s,
the
yi
el
d
is
directed
to
pre
ci
sel
y a si
ng
le
ta
sk
on the
do
wn
st
ream
o
rd
e
r
f
or clo
ne
-
cl
one c
ollec
ti
on
.
Ѱ (
n,
c
, d)
=
Prob
a
bili
ty
[
n u
pst
ream
err
ands
of c clo
nes wit
h
> =
d cl
ones
per coll
ect
ion.]
wh
e
re,
p =l
ikeli
hood of a
n
e
rrand st
ra
gg
li
ng
as:
Ψ
(
,
,
)
=
∑
(
⁄
−
=
0
)
(
1
−
)
−
(4)
Do
ll
y ch
a
racte
rized likel
ih
oo
d for
occupati
on stra
ggli
ng w
i
th CAC
a
s:
P
=
1
−
∑
[
Ψ
(
,
,
)
−
Ψ
(
,
,
−
1
)
(
1
−
)
]
=
1
(5)
-
Con
te
ntion
cl
on
i
ng
(CC):
Af
te
r
the
c
om
pleti
on
of
the
upstream
cl
on
e
assi
gn
m
ent,
al
l
dow
ns
tre
a
m
err
a
nds
read
t
he
cl
on
e
s’
up
stream
ou
tp
ut,
wh
ic
h
eases
the
issue
of
confli
ct
.
Do
l
ly
char
act
erized
li
kelihood f
or task st
raggli
ng
with CC as:
P
=
1
−
∑
[
Ψ
(
,
,
1
)
(
1
−
)
]
=
1
(6)
-
Ever
y
dow
ns
tr
ea
m
cl
on
e
wai
ts
fo
r
a
n
id
eal
per
i
od
(ώ)
to
s
ee
wh
et
her
it
c
an
cat
ch
a
n
el
it
e
duplica
te
of
interm
ediary
i
nfor
m
at
ion
.
I
n
the
even
t
that
the
down
st
rea
m
cl
on
es
do
not
get
it
s
restrict
ive
duplica
te
Evaluation Warning : The document was created with Spire.PDF for Python.
IS
S
N
:
2088
-
8708
In
t J
Elec
&
C
om
p
En
g,
V
ol. 11,
No.
1,
Febr
uar
y
2021 :
37
5
-
382
380
even
s
ubseq
ue
nt
to
hangin
g
ti
gh
t
for
ώ,
it
peruses
dis
p
ute
to
on
e
of
th
e
com
plete
d
up
stream
cl
on
es
yi
el
d.
Th
e
idea
l t
i
m
e o
f
ώ c
on
siders
ty
pical
varie
ti
es am
on
gst
upstream
clon
es.
Table
1
.
Me
t
hods
c
om
par
iso
n of
stra
gg
le
r
s
id
entifi
cat
ion
Co
m
p
a
riso
n
Had
o
o
p
nativ
e sch
ed
u
ler
LAT
E
Sch
ed
u
ler
Mantri
Mon
T
o
o
l
Do
lly
Metr
ic
f
o
r
Sp
ecul
a
tiv
e
Execu
tio
n
W
eig
h
t equ
iv
alen
ce fo
r
Red
u
ce a
n
d
M
ap
j
o
b
s
Ti
m
e
taken
to
co
m
p
let
e task
Bas
es Iden
tif
icatio
n
an
d
corre
ctiv
e
m
ea
su
re
Sy
ste
m
Call
Un
so
p
h
isticated
Clo
n
in
g
Cap
on
Nu
m
b
e
r
o
f
Sp
ecul
ativ
e E
x
ecut
io
n
Neg
ativ
e
Neg
ativ
e
Neg
ativ
e
Neg
ativ
e
Af
f
ir
m
ativ
e
Data P
rocess
in
g
Techn
iq
u
e
MapR
ed
u
ce
MapR
ed
u
ce
MapR
ed
u
ce /
Dryad
MapR
ed
u
ce
MapR
ed
u
ce/
Dryad
Heterog
en
eity
a
m
o
n
g
Netwo
rk No
d
es
Neg
ativ
e
Af
f
ir
m
ativ
e
Af
f
ir
m
ativ
e
Af
f
ir
m
ativ
e
Af
f
ir
m
ativ
e
Priority
W
ise
Sch
ed
u
lin
g
Af
f
ir
m
ativ
e
Neg
ativ
e
Neg
ativ
e
Neg
ativ
e
Neg
ativ
e
Ov
erhead
Increased
Mediu
m
Mediu
m
Increased
Increased
In
T
a
ble
1,
t
he
m
et
ric
fo
r
sp
eculat
ive
e
xe
cution
for
the
five
stra
gg
le
r
detect
ion
te
ch
niques
we
re
fou
nd
t
o
be
dif
fer
e
nt
f
or
eac
h
on
e
the
te
ch
ni
qu
e
s.
Howe
ver,
this
was
not
the
case
for
the
Ca
p
on
nu
m
ber
of
sp
ec
ulati
ve
ex
ecuti
ons.
All
the
te
chn
i
qu
e
s
did
no
t
ex
hi
bit
any
c
har
ac
te
risti
c
fo
r
this
cat
ego
ry.
T
he
dat
a
processi
ng
te
c
hn
i
qu
e
was
f
ound
to
be
si
m
il
ar
fo
r
H
adoo
p
native
scheduler,
L
ATE
an
d
Mo
ntool.
The
he
te
roge
ne
it
y
a
m
on
g
ne
twork
nodes
wer
e
fou
nd
t
o
be
sim
il
ar
for
al
l
te
chn
iq
ue
s
excep
t
the
Hado
op
native sc
he
du
l
er tech
nique.
T
his was al
s
o
f
ound t
o be the
s
a
m
e case for
pri
or
it
y wise sc
he
du
li
ng.
4.
E
X
PER
MEN
TAL
RES
UL
TS A
ND DIS
CUSSIO
N
To
assess
t
he
process
of
di
agnosin
g
stra
gg
le
r
ta
sk
s
a
nd
as
sig
ning
them
to
oth
e
r
nodes
i.e
.
sp
ec
ulati
ve
e
xe
cution
f
or
th
e
Ha
doop
native
sc
he
du
le
r,
LATE
sc
hedul
er,
Ma
ntri,
MonTo
ol,
a
nd
Do
ll
y
,
three
be
nc
hm
a
rk
e
d
m
et
ho
ds
are
us
e
d
w
hich
include
S
or
t,
Gr
e
p
an
d
Wo
r
dCou
nt.
W
e
m
anu
al
ly
slow
e
d
dow
n
8
virt
ual
m
achi
nes
(
VMs)
i
n
a
cl
us
te
r
of
90
with
bac
kgr
ou
nd
processe
s
to
sim
ulate
stra
gg
le
rs.
–
fa
ulty
nodes
.
The
ot
her
m
ac
hin
es
wer
e
al
lo
cat
ed
betwee
n
1
an
d
10
VMs
per
ho
st.
As
ou
r
wor
klo
a
d,
we
evaluated
the
work
with
a
"s
or
t"
t
ask
in
a
total
da
ta
set
of
40
G
B
and
each
of
128
MB
for
ea
ch
ho
st.
Eac
h
job
has
57
5
ta
s
ks
on
the
m
ap
and
510
re
du
ce
d
ta
s
ks
(it
s
houl
d
b
e
note
d
that
Ha
doop
le
aves
som
e
fr
ee
cap
aci
ty
fo
r
s
pec
ulati
on
a
nd
fail
ed
m
issi
on
s).
T
he
strag
gl
ers
wer
e
ge
nerat
ed
by
run
ning
f
our
CPU
-
in
te
ns
ive
proces
ses
(ti
ghte
ne
d
loop
s
m
anipu
la
ti
ng
900
KB
ar
ray
s)
a
nd
f
our
disk
-
i
ntensi
ve
processes
(FDS
ta
sk
s
for
ge
ne
rati
ng
hu
ge
fil
es
in
a
loo
p).
Th
e
aver
a
ge
res
ults
of
5
e
xp
e
rim
e
nts
for
the
us
e
d
strug
geles
m
et
hods
are
sho
wn
in
F
ig
ure
2
(see
in
Appe
nd
i
x)
.
Ba
sed
on
t
he
obta
ined
re
su
lt
s,
the
L
ATE
S
cheduler
m
et
ho
ds
out
perfor
m
ed
Ha
doop
native
sche
du
le
r
, Ma
ntri,
M
onTo
ol,
and
D
olly
.
As
it
ca
n
be
s
een
in
the
F
ig
ur
e
2,
the
res
ul
ts
at
ta
ined
by
the
L
ATE
Sc
heduler
by
Grep,
is
sm
al
le
r
than
the
res
ults
achieve
d
by
W
or
dCoun
t
a
nd
Sort.
T
his
can
be
e
xpla
ined
by
co
ns
i
de
rin
g
the
wor
klo
a
d.
Wor
dCoun
t
a
nd
S
or
t
wr
it
e
hu
ge
qua
ntit
ie
s
of
data
ac
ro
s
s
the
net
wor
k
an
d
to
t
he
com
pute
r.
I
n
the
oth
e
r
sid
e
,
Gr
e
p
pro
vid
es
each
r
ed
ucer
just
a
li
m
it
ed
am
ou
nt
of
byte
s
i.e,
a
c
ount
for
each
w
ord
.
T
her
e
fore,
it
co
uld
be
con
cl
ud
e
d
that
LATE
Sche
dule
r
pe
rfor
m
ed
the
best
co
m
par
ed
to
othe
r
m
et
ho
ds
w
hen
S
or
t,
Gr
e
p
an
d
Wor
dCoun
t a
r
e u
se
d wit
h res
pecti
vely
60%,
50% a
nd
80
%
.
5.
CONCL
US
I
O
N
AND
F
UT
U
RE W
ORK
W
it
h
the
exce
s
sive
gr
ow
t
h
in
the
inf
orm
ation
a
nd
data
t
hat
pro
du
ce
d
by
t
he
inte
rn
et
of
t
h
in
gs
(IoT),
so
ci
al
m
edia,
m
ul
tim
edia
an
d
et
c.
ap
plica
ti
on
s
,
their
a
nal
ysi
s
b
ecam
e
a
chall
enge
an
d
m
or
e
co
m
plex
du
e
to
the inc
reasin
g vo
l
um
e o
f
str
uc
ture
d
a
nd
unst
ru
ct
ur
e
d data.
Ma
pRed
uce is
a fau
lt
tole
ran
t,
scala
ble and
sim
ple
fr
am
ewo
r
k
for
data
proc
essing
t
hat
ena
ble
s
it
s
us
ers
t
o
proces
s
these
m
assive
am
ou
nt
s
of
data
e
ff
ec
ti
ve
ly
.
This
work
a
i
m
ed
to
eva
luate
five
strag
glers
i
de
ntific
at
ion
m
eth
ods:
Hado
op
native
sc
he
du
le
r,
LATE
Sc
he
dule
r
,
Ma
ntri,
MonTo
ol
,
an
d
Do
ll
y
us
in
g
S
or
t,
G
re
p
an
d
Wor
dCoun
t
be
n
c
hm
ark
e
d
m
e
thods
.
Accor
ding
to
exp
e
rim
ental
resu
lt
s,
L
ATE
sche
du
le
r
outp
erfor
m
ed
the
oth
e
r
m
et
ho
ds
us
e
d
in
this
work.
We
ca
n
c
oncl
ude
that
LA
TE
Sche
du
le
r
wou
ld
be
ef
fici
ent
t
o
ob
ta
in
be
tt
er
resu
lt
s
for
st
ra
gg
le
rs
ide
ntific
at
ion
.
Fo
r
t
he
f
utur
e
work,
m
achine
le
arn
i
ng
m
et
ho
ds
ca
n
enab
le
us
to
know
the
sto
ry
beh
i
nd
t
he
data
,
f
or
insta
nce,
de
ep
le
ar
ning
a
ppr
oach
ca
n
be
us
e
d
to
ide
nt
ify
the
pro
per
nodes
for
r
unni
ng
the
strag
gl
er
ta
sk
s
and
al
s
o
it
can
prov
i
de
us
m
or
e
in
form
at
io
n
ab
ou
t
the
num
ber
of
fail
ur
e
s
in
diff
e
re
nt
phases
an
d
c
orr
el
at
ion
betwee
n diff
e
r
ent f
eat
ur
es
t
o ob
ta
i
n m
or
e ac
cur
at
e
res
ults
.
Evaluation Warning : The document was created with Spire.PDF for Python.
Int
J
Elec
&
C
om
p
En
g
IS
S
N:
20
88
-
8708
Strag
gler ha
ndli
ng
appro
ach
e
s in
M
apRe
du
c
e fra
mework
: a
compar
ative
s
tud
y (
A
nwa
r
H. Katr
awi)
381
APPE
ND
I
X
(a)
(b)
(c)
Figure
2. A
verage tas
k
e
xecu
t
ion
ti
m
e u
sing
(a)
Sort,
(b) Gr
ep,
(c)
Word
C
ount
REFERE
NCE
S
[1]
G.
Anantha
nar
a
y
an
an
,
e
t
al.,
“
Scar
l
et
t
:
copi
ng
with
skewed
cont
e
nt
popula
rity
in
m
apr
educ
e
cl
ust
ers
,
”
Proceedi
ng
s
of
th
e
six
th conf
e
renc
e
on
Compu
te
r syste
ms
(
EuroSys
'11
)
,
pp.
28
7
-
300,
2011
.
[2]
I.
A.
T.
Hashem
,
et
al.,
“
MapRe
duce
sche
dul
ing
al
gorit
hm
s:
a
rev
ie
w,
”
The
Jou
rnal
of
Superc
omputing
,
vol.
76
,
pp.
4915
-
4945
,
20
18
.
[3]
G.
Anantha
n
aray
an
,
et
al.
,
“
Eff
e
ct
iv
e
Stragg
le
r
Miti
gation:
At
tack
of
the
C
lone
s
,
”
Proceedi
ngs
o
f
th
e
10th
USEN
IX
conf
ere
n
ce on
N
et
worked
Syst
ems Design
and
Im
ple
menta
ti
on
,
pp
.
185
-
198
,
2013
.
[4]
P.
D'
Ancona
and
M.
Rei
ss
ig,
“
Ne
w
tre
nds
in
th
e
t
heor
y
of
nonl
inear
wea
kl
y
h
y
per
boli
c
equa
t
ions
o
f
sec
ond
orde
r
,
”
Nonli
near
Anal
y
sis: Theory,
M
ethods
&
Appl
ic
at
ions
,
vol
.
30
,
no
.
4,
pp.
2507
-
251
5,
1997
.
[5]
R.
Gandhi,
e
t
al
.
,
“
Yoda
:
a
highly
availa
b
le
lay
er
-
7
lo
ad
ba
la
nc
er
,
”
Pro
ce
ed
ings
of
the
El
e
ve
nth
Europea
n
Confe
renc
e
on
C
omputer
Syste
ms
(
EuroSys
'16
)
,
pp.
1
-
16
,
2016
.
[6]
L.
Zha
o
,
e
t
al.,
“
Eff
ective
Stragg
le
r
Mi
ti
ga
ti
on
w
it
h
Cross
-
Lay
e
r
Inte
rfe
r
ence
-
Aw
are
Opt
imization
,”
i
n
2019
IEEE
25th
Int
ernati
on
al
Conf
ere
nce o
n
Parallel
and
D
istribut
ed System
s (
ICPA
DS
)
,
pp.
177
-
182
,
201
9.
[7]
J.
Alwidia
n
and
A.
AlAhm
ad,
“
Hadoop
MapReduc
e
Job
Sched
uli
ng
Algorit
hm
s
Surve
y
and
Us
e
Cases
,”
Mod
ern
Appl
ie
d
Sc
ie
n
ce
,
vol.
13
,
no.
7,
p
p.
38
-
48
,
2019
.
[8]
A.
Ghods
i,
et
al
.
,
“
Choosy
:
m
ax
-
m
in
fai
r
sharing
for
dat
acent
er
jo
bs
with
constra
int
s
,
”
Proceedi
ng
s
of
the
8th
ACM
European
Conf
e
re
nce
on
Compu
te
r Sy
st
ems
(
EuroSys
'13
)
,
pp.
36
5
-
378,
2013
.
[9]
S.
N.
Khez
r
and
N.
J.
Na
vimipour,
“
MapReduc
e
and
I
ts
Applic
a
ti
ons,
Chal
l
enge
s,
a
nd
Archi
tectur
e
:
a
Com
pre
hensiv
e
Review
and
Dire
ctions
for
Future
Rese
arc
h
,
”
Journal
of
Gr
id
Computing
,
vol.
15
,
no
.
3
,
pp.
295
-
321
,
20
17.
Evaluation Warning : The document was created with Spire.PDF for Python.
IS
S
N
:
2088
-
8708
In
t J
Elec
&
C
om
p
En
g,
V
ol. 11,
No.
1,
Febr
uar
y
2021 :
37
5
-
382
382
[10]
“
OSDI
(9th
Use
nix
S
y
m
posiu
m
on
Opera
ti
ng
S
y
stems
Design
and
Im
plem
ent
at
ion)
[
adv
ert
isement
]
,
”
IE
EE
Sec
urit
y
&
Priv
acy
Magazin
e
,
vo
l.
8
,
no
.
4
,
pp
.
c4
-
c4,
2010
.
[11]
M.
Za
har
ia,
e
t
al.,
“
Im
proving
MapReduce
Perfor
m
anc
e
in
He
te
ro
gene
ous
Envi
ron
m
ent
s
,”
8th
USE
NIX
Symposium
on
Operating
S
y
stems Design
an
d
Imple
menta
ti
o
n
,
pp
.
29
-
42
,
20
08
.
[12]
J.
Re
y
,
e
t
al.
,
“
E
ffic
i
ent
Protot
y
p
ing
of
Fault
To
lerant
Map
-
Redu
c
e
Applicati
ons
with
Docke
r
-
Hado
op,
”
2015
I
EEE
Inte
rnational
Co
nfe
renc
e
on
Clo
ud
Engi
n
ee
ring
,
pp.
369
-
376
,
20
15.
[13]
U.
Kum
ar
and
J.
Kum
ar,
“
A
Com
pre
hensive
Revi
ew
of
Straggler
Handling
Algorit
hm
s
for
MapReduc
e
Fram
ework,
”
Int
.
J.
Grid
Distr
ib.
Com
put.
,
vol
.
7
,
no.
4
,
pp
.
139
–
1
48,
2014
.
[14]
Y.
Chen,
e
t
a
l
.
,
“
Design
insi
ghts
for
MapR
educ
e
from
div
erse
produc
ti
on
workloads,
”
T
ec
hni
ca
l
Repor
t
UCB/EE
CS
-
2012
-
17,
E
ECS De
p
.
,
Univer
sit
y
of C
al
ifornia, Ber
k
el
e
y
,
2012
.
[15]
H.
Zhou,
Y.
Li,
H.
Yang,
J.
Ji
a,
and
W
.
Li,
“
BigRoot
s:
An
Eff
ec
t
ive
Approa
ch
for
Root
-
Ca
use
Anal
y
sis
of
Straggl
ers in Bi
g
Data S
y
st
em,”
I
EE
E
Acce
ss
,
vol
.
6
,
pp
.
41966
–
4
1977,
2018
.
[16]
M.
F.
Aktas,
e
t
al
.
,
“
Eff
e
ct
i
ve
st
rag
gle
r
m
it
igatio
n:
W
hic
h
cl
ones
should
atta
ck
a
nd
when
?
”
ACM
SIGMETRICS
Pe
rform
ance
Evaluati
on
Revie
w
,
vol.
45
,
no.
2,
p
p.
12
-
14
,
2017
.
[17]
A.
K.
Abasi,
e
t
al
.
,
“
A
Te
xt
Feat
ur
e
Selecti
on
Te
chn
ique
bas
ed
on
Bina
r
y
Multi
-
Verse
Op
ti
m
iz
er
for
Te
x
t
Cluste
ring
,
”
20
19
IEE
E
Jorda
n
Inte
rnational
Joi
nt
Confe
re
nce
on
El
e
ct
ric
al
Engi
ne
ering
and
Information
Technol
ogy
(
JEEIT)
,
Amm
an,
Jordan,
pp
.
1
-
6,
2
019.
[18]
A.
H.
Katr
awi,
et
al
.
,
“
Ea
rl
ier
stage
for
straggl
e
de
tecti
on
and
handl
ing
using
combined
CP
U
te
st
and
LATE
m
et
hodo
l
og
y
,
”
In
t
ernat
io
nal
J
ournal
of
El
e
ct
r
ic
a
l
and
Comput
er
Eng
i
nee
ring
(
IJE
C
E
)
,
vol.
10
,
no
.
5,
pp.
4910
-
4917
,
2020
.
[19]
F.
Nza
n
y
w
a
y
ing
om
a
and
Y.
Yang,
“
Ta
sk
sche
d
uli
ng
and
vir
tual
resourc
e
op
tim
ising
in
Hadoop
YA
RN
-
base
d
cl
oud
computing
envi
ronm
ent
,”
I
nte
rnational
Jou
rnal
of
C
loud
Co
mputing
,
vo
l.
7
,
no.
2
,
p
p.
83
-
10
2
,
2018
.
[20]
M.
Us
ama,
et
al.
,
“
Job
sche
dule
r
s
for
Big
dat
a
pr
oce
ss
ing
in
Had
oop
envi
ronm
ent
:
Te
st
ing
real
-
life
sche
dul
e
with
benc
hm
ark
prog
rams
,”
Digi
tal
C
omm
unic
ati
ons
and
Net
works
,
v
ol.
3
,
no
.
4
,
pp
.
2
60
-
273,
2017
.
[21]
M.
N.
A
khta
r
,
e
t
al.
,
“
Map
-
Red
uce
bas
ed
ti
pp
in
g
point
sch
edul
e
r
for
par
a
llel
im
age
pro
ce
ss
ing
,”
Ex
p
ert
Syst
ems
wit
h
Ap
p
licati
on
s
,
vol.
139
,
2020
.
[22]
J.
L
in
and
M.
L
ee
,
“
Perform
anc
e
eva
lu
at
ion
of
job
sche
dule
rs
on
Hadoop
YA
RN
,”
Concurrency
an
d
Computati
on:
Pract
i
ce
and
E
x
perie
nc
e
(
CCPE)
,
vol.
28
,
no
.
9
,
pp.
2711
-
2728
,
201
6.
[23]
M.
Hanif
and
C.
Le
e
,
“
Jargon
of
Hadoop
MapReduc
e
sc
h
edul
ing
t
ec
hniq
ues:
a
scie
nt
ific
ca
t
egor
i
za
t
ion
,”
The
Knowle
dg
e Engi
ne
ering
R
evie
w
,
vol
.
34
,
201
9.
[24]
Q.
Chen,
et
al
.
,
“
Li
bra
:
Li
ghtw
ei
ght
d
ata
skew
m
it
iga
t
ion
in
m
apr
educ
e
,
”
I
E
EE
Tr
ansacti
on
s
on
Parall
e
l
&
Distribute
d
S
yst
ems
,
vol
.
26
,
no
.
9,
pp.
2520
-
253
3,
2015
.
[25]
G.
Li
u
,
et
al
.
,
“
Spparti
ti
on
er:
A
novel
par
t
it
ion
m
et
hod
to
han
dle
in
te
rm
ediat
e
data
skew
in
s
par
k
stre
aming,
”
Fut
ure
Gen
eration Compute
r Sy
stems
,
vol
.
86
,
p
p.
1054
-
1063
,
2
01
8.
Evaluation Warning : The document was created with Spire.PDF for Python.