International Journal of Electrical and Computer Engineering (IJECE)
Vol. 10, No. 5, October 2020, pp. 4910~4917
ISSN: 2088-8708, DOI: 10.11591/ijece.v10i5.pp4910-4917
Journal homepage: http://ijece.iaescore.com/index.php/IJECE
Earlier stage for straggler detection and handling using combined CPU test and LATE methodology
Anwar H. Katrawi1, Rosni Abdullah2, Mohammed Anbar3, Ammar Kamal Abasi4
1,3 National Advanced IPv6 Center (Nav6), Universiti Sains Malaysia, Malaysia
2,4 School of Computer Sciences, Universiti Sains Malaysia, Malaysia
Article Info
Article history: Received Oct 15, 2019; Revised Mar 17, 2020; Accepted Mar 30, 2020

ABSTRACT
Using MapReduce in Hadoop helps lower the execution time and power consumption for large-scale data. However, job processing can be delayed when tasks are assigned to bad or congested machines; such tasks are called "straggler tasks". They increase execution time and power consumption, which raises costs and leads to poor performance of computing systems. This research proposes a hybrid MapReduce framework referred to as the combinatory late-machine (CLM) framework. Implementation of this framework will facilitate early and timely detection and identification of stragglers, thereby facilitating prompt, appropriate, and effective actions.
Keywords: Big data, Combinatory late-machine, Hadoop, MapReduce, Straggler
Copyright © 2020 Institute of Advanced Engineering and Science. All rights reserved.
Corresponding Author:
Anwar H. Katrawi,
National Advanced IPv6 Center (Nav6),
Universiti Sains Malaysia,
11800 USM, Penang, Malaysia.
Email: akatrawi@student.usm.my
1. INTRODUCTION
A significant amount of data (big data) is stored and transferred online by thousands of companies, organizations, and individuals. This largely unstructured data is difficult to analyze and process using conventional database management tools, which creates new challenges in the analysis and storage of data [1]. Recently, there has been increasing interest in key areas such as real-time data extraction, which reveals an urgent need to handle bulk data under strict performance constraints. Consequently, the adaptation of huge data to implementations on distributed computing platforms is necessary. One way of doing this is to adopt and implement the popular programming model known as MapReduce [2]. Its success lies in its simplicity, scalability, efficiency, and extensibility, which have pushed IT industry leaders such as Google, Yahoo, Facebook, and Amazon to adopt MapReduce extensively as a powerful and reliable tool for big data processing.
Four factors matter when handling large data in modern organizations and enterprises: processing, storage, visualization, and analysis. MapReduce can run applications on a parallel cluster of hardware automatically. In addition, it can process terabytes and petabytes of data more rapidly [3, 4]. Recently, it has gained popularity in a wide range of applications due to its ability to provide a highly effective and efficient framework for the parallel execution of applications, data allocation in distributed database systems, and fault-tolerant network communications [5]. For instance, Google runs more than 10,000 distinct programs using MapReduce, including graph processing [6], text processing, machine learning, and statistical machine translation. Moreover, the well-known open-source Hadoop software framework for distributed storage and processing of big data sets (Figure 1) uses MapReduce as a central tool to split the data, process it, and make it not only manageable but also available for users' consumption or further processing. Also, the Hadoop MapReduce environment provides fault-tolerant solutions in case of hardware failures or software errors during the execution of tasks [7].
Figure 1. Hadoop framework
According to the work presented in [8], Hadoop MapReduce can tolerate several types of faults, which are as follows:
a. Node failure: A node in a MapReduce cluster may fail at any time. In this case, the JobTracker removes this slave node from the list of available nodes and re-executes its tasks on other nodes. A node is declared failing if at least one task launched on it has failed. At that time, the JobTracker checks whether the node in question should be blacklisted. If a slave node is "blacklisted", the JobTracker will no longer assign it map or reduce tasks. The node can be removed from this list if its behavior returns to normal and it commits no faults during a certain time interval.
b. Software failure: A task may stop because of an error or exception in the mapping or reduction program. In this case, the JobTracker orders the re-execution of the failed task for a limited number of attempts (four by default), beyond which the task and the job it belongs to are considered faulty.
c. Stopped task: As an example, the process running a task may stop unexpectedly due to a transient bug in the underlying virtual machine. In this case, the JobTracker shown in Figure 2 is notified and orders the task to be re-executed as described above.
d. Blocked task: A mapping or reduction task is considered faulty if, after some time, it remains blocked without any progress; in this case, the JobTracker orders the process running this task to be killed.
e. Delayed tasks: When some tasks unexpectedly take longer to execute than the average execution time, these tasks are called stragglers.
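The bounded re-execution described in item b can be illustrated with a short sketch. It is not Hadoop's JobTracker code: the function name `run_with_retries` and the use of a plain Python callable to stand in for a map or reduce task are assumptions made for illustration; only the limit of four attempts comes from the text.

```python
from typing import Callable, Tuple, TypeVar

T = TypeVar("T")

# Attempt limit taken from item b ("four by default").
DEFAULT_MAX_ATTEMPTS = 4


def run_with_retries(task: Callable[[], T],
                     max_attempts: int = DEFAULT_MAX_ATTEMPTS) -> Tuple[T, int]:
    """Run `task`, re-executing it after a failure up to `max_attempts` times.

    Returns the task result and the number of the attempt that succeeded.
    Raises RuntimeError once the attempt limit is exhausted, which mirrors
    the point at which the task (and its job) is considered faulty.
    """
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return task(), attempt
        except Exception as exc:  # any error or exception raised by the task
            last_error = exc
    raise RuntimeError(f"task failed after {max_attempts} attempts") from last_error
```

A task that fails transiently succeeds on a later attempt, while a task that always fails surfaces as a single error after the fourth attempt.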
Figure 2. Job tracker
Several studies have been devoted to improving the tolerance of systems to faults. For instance, Microsoft reports that when CPU and core component errors occur across one million customers' computers, Hadoop does not have the capacity to deal with these types of potential errors. This also includes other types of errors related to the task itself (when one or more tasks take longer or stop before the work outcomes are realized).
Problem statement. Delayed tasks are called stragglers and play a key role in increasing the execution time and energy consumption of big data processing. In the MapReduce framework, stragglers are tasks that take longer to execute than other tasks. There are various techniques for detecting and handling stragglers, such as Dolly, the Hadoop native scheduler, MonTool, LATE, and Mantri. Regardless of the techniques already in place, straggler detection remains problematic in the field of data analytics.
The proposed solution. We propose an algorithm that facilitates the calculation of the straggler tolerance threshold using a CPU test and the LATE methodology. Our approach considers crucial issues such as QoS, with a major focus on timing constraints, the progress of task execution, and the usage of cluster resources. Even though task selection does not appear to be a big problem in the initial stages, we have been able to show that it is a big issue that requires close attention. Therefore, we recommend identifying the tasks that lead to the longest response times. We also recommend doing this as early as possible so that there are no later surprises. Taking these points into account, our LATE methodology is based on an estimate of the time left, with the goal of early detection of tasks that are running slowly. In summary, the methodology is based on making decisions early, using finishing times rather than progress rates, not assigning speculative tasks to slow nodes, and optimizing resource utilization.
Significance of the research. In the current study, a method is proposed to address the straggler problem. We propose an algorithm that enables earlier calculation of the straggler tolerance threshold using a CPU test and the Longest Approximate Time to End (LATE) methodology. Our approach focuses on timing constraints, the progress of task execution, and the usage of cluster resources. Our major contribution is that, unlike other studies that assume it is hard to correlate task execution with node status, we show that such a correlation is possible and feasible, and that stragglers can be detected using the LATE methodology and a CPU test. Our methodology is simple and easy to accommodate, with low overhead.
Organization of the research. The rest of this paper is organized as follows: section 2 covers the literature review, section 3 covers the methodology, section 4 covers the results and discussion, section 5 covers the proposed solution/recommendation, and section 6 covers the conclusion.
2. LITERATURE REVIEW
Hadoop and MapReduce are among the most commonly used frameworks for executing tasks across several nodes for optimal performance. Even though the frameworks have become popular, they still face several challenges in the effectiveness of task execution [8]. Specifically, achieving predictable execution has become problematic because of stragglers. Due to stragglers, task execution takes longer to complete than originally anticipated [9]. Such delays are undesirable because they result in reduced service performance and can also violate QoS (Quality of Service) requirements concerning the time taken to complete tasks. For service providers, tasks that take more time to complete lead to reduced availability of systems and cause jobs to consume more time. Stragglers have become common, especially in cloud data centers [10]. Therefore, it is vital to detect and mitigate them promptly. Additionally, centers with large computing infrastructure can also experience delays that lead to ineffective job execution. Large data centers also have a high intake of service creation, which makes them vulnerable to stragglers. Stragglers have several root causes, including resource contention, hardware heterogeneity, background network traffic, and operating-system-level causes [11]. Considerable effort has been made to study stragglers. Over the years, the size of computing infrastructure and of the jobs executed has continued to grow, which has dramatically increased the impact of stragglers. Stragglers are known to extend job execution substantially, which negatively affects the "Consumer Service Level Agreement" and QoS performance requirements.
In a study conducted by Bortnikov, Frank, Hillel, and Rao, the authors propose two ways of dealing with stragglers, namely tolerance and avoidance [12]. However, avoiding stragglers is difficult and impractical to pursue. Therefore, straggler tolerance is the approach adopted by most stakeholders. In straggler tolerance, the execution progress of a task is monitored using a progress score ranging from 0 to 1, where 0 represents the start and 1 represents completion. Currently, the approaches used for straggler detection can be described as either offline or online analytics [13]. Nonetheless, it is worth noting that online detection can occur too late in the execution cycle of a task. Therefore, stragglers cannot be prevented from running slower even after speculative copies are launched. Offline approaches, on the other hand, are normally applied to avoid stragglers. This approach is seen as less feasible and is thus uncommon. However, better results can be achieved by combining online and offline approaches. When used together, they can significantly improve the effectiveness of straggler detection.
2.1. Related works
Many techniques have been developed for straggler detection. Ouyang et al. propose an algorithm based on the progress score of task execution that enables dynamic threshold detection for straggler tasks [14]. This strategy has improved performance significantly by reducing job execution time by 44 percent.
In a study conducted by Zaharia et al., the authors propose a new straggler detection method that takes into account both the progress score and the elapsed time, with the objective of improving the progress score strategy [15]. The authors determine that the progress score strategy cannot compare how fast tasks run when they start at different times, hence the need for improvement. Dean and Ghemawat adopted a technique called Speculative Execution, in which they launch copies of the straggler on alternative nodes with the aim of improving performance [16]. Google acknowledges that Speculative Execution improves job execution by 44 percent. However, this technique can reduce the overall throughput due to the duplication of tasks. Therefore, some Hadoop administrators prefer not to use the Speculative Execution option [17, 18]. Yanfei et al. proposed another technique for straggler handling, which consists of ending the delayed tasks and reassigning them to another node without stragglers. However, Guo, Rao, Jiang, and Zhou disagree with this technique and argue that it results in wastage of resources and therefore increases energy consumption [19].
In a similar study, Zhou, Li, Yang, Jia, and Li propose a technique known as "BigRoots", which incorporates both system features and framework features for the analysis of root causes of stragglers, especially in big data systems [20]. The authors establish that "BigRoots" is effective at identifying the root causes of stragglers, which can significantly help in optimizing performance. An examination by Phan attempted to come up with an "energy efficient straggler mitigation" technique for big data applications, especially in the cloud environment [21-23]. The framework employed by the author takes into account how heterogeneity of resources affects the performance and energy consumption of big data applications. In other studies, by Harlap et al. and Kim, W., the authors sought to solve the straggler problem for parallel ML [24, 25]. The authors combined a more "flexible synchronization model" with experiments involving real and synthetic straggler behaviors to achieve near-ideal run times across all the straggler patterns they tested. Similarly, Yadwadkar et al. came up with a framework called "Wrangler", which could predict when stragglers were going to occur and aid in making scheduling decisions [26]. The formulations employed by the authors captured the shared structure in their data so as to improve generalization performance.
3. METHODOLOGY
As noted earlier, we estimate the time remaining for each task based on the progress score derived from Hadoop. In practice, this heuristic works well. However, there are cases in which it can backfire: the heuristic can produce incorrect estimates and suggest that a task launched later will finish earlier. To demonstrate this, we assume that the progress score of a task grows by five percent per second during the first phase, up to a total score of fifty percent, and that the rate then drops to one percent per second in the second phase. The task is therefore expected to take ten seconds in the first phase and fifty seconds in the second phase, for a total of sixty seconds. Suppose two copies of the same task are launched: the first, denoted T1, starts at time 0, while the second, denoted T2, starts ten seconds later. The progress rate is checked at twenty seconds. By then, T1 will have finished the first phase and will be a fifth of the way through the second phase, so it will have a progress score of sixty percent. Its progress rate will be 60%/20s = 3%/s. T2, on the other hand, will have just finished the first phase, so its score will be 50%. Its rate will therefore be 50%/10s = 5%/s. The estimated time remaining for T1 will be (100% - 60%)/(3%/s) = 13.3s. For T2, the estimated time left will be (100% - 50%)/(5%/s) = 10s. The heuristic therefore indicates that T1 will take longer to finish than T2. In reality, however, T2 will finish after T1.
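The arithmetic of this example can be reproduced with a small script. It is only an illustration of the two-phase task described above; the function names (`progress_score`, `estimated_time_left`, `compare_at`) are hypothetical and are not part of Hadoop or of the proposed framework.

```python
TOTAL_RUNTIME = 60.0  # seconds: 10 s in phase one plus 50 s in phase two


def progress_score(elapsed: float) -> float:
    """Progress score (percent) of the example task after `elapsed` seconds."""
    if elapsed <= 10.0:
        return 5.0 * elapsed                      # phase one: 5%/s up to 50%
    return min(100.0, 50.0 + (elapsed - 10.0))    # phase two: 1%/s


def estimated_time_left(score: float, elapsed: float) -> float:
    """Heuristic estimate: remaining progress divided by average progress rate."""
    rate = score / elapsed                        # percent per second so far
    return (100.0 - score) / rate


def compare_at(check_time: float, start_times: dict) -> dict:
    """Compare the heuristic estimate with the true time left for each copy."""
    report = {}
    for name, start in start_times.items():
        elapsed = check_time - start
        score = progress_score(elapsed)
        report[name] = {
            "score": score,
            "estimated_left": estimated_time_left(score, elapsed),
            "actual_left": TOTAL_RUNTIME - elapsed,
        }
    return report


if __name__ == "__main__":
    for name, row in compare_at(20.0, {"T1": 0.0, "T2": 10.0}).items():
        print(name, row)
```

At the twenty-second check the estimate ranks T1 (about 13.3 s left) behind T2 (10 s left), while the actual remaining times are 40 s for T1 and 50 s for T2, which is the reversal described in the text.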
We also determined the criteria that could be used to identify stragglers. We took the estimated finish time of every task to be given by (1).
\[
EF = t_k + \frac{1 - PS(task)}{PS(task)} \, (t_k - t_0) \tag{1}
\]
Here, EF represents the estimated finish time, PS(task) is the progress score of a task, t0 is the starting time of the task, and tk is the timestamp at which PS(task) was recorded.
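As a quick check of (1), the estimated finish time can be computed directly. The sketch below assumes that the progress score is expressed as a fraction in (0, 1] rather than as a percentage; the function name `estimated_finish` is hypothetical.

```python
def estimated_finish(t0: float, tk: float, ps: float) -> float:
    """Estimated finish time EF from equation (1).

    t0: start time of the task
    tk: timestamp at which the progress score was recorded
    ps: progress score at tk, as a fraction in (0, 1] (assumed scale)
    """
    if not 0.0 < ps <= 1.0:
        raise ValueError("progress score must be in (0, 1]")
    if tk < t0:
        raise ValueError("tk must not precede t0")
    # (1 - ps) / ps * (tk - t0) is the estimated time left at tk.
    return tk + (1.0 - ps) / ps * (tk - t0)
```

With the values from the example above, `estimated_finish(0, 20, 0.6)` gives roughly 33.3 s for T1 and `estimated_finish(10, 20, 0.5)` gives 30 s for T2, which matches the estimated times left of 13.3 s and 10 s measured from the 20 s check.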
Our proposed LATE methodology takes into account the scope of the data and the speed at which it is processed. By taking these into account, we can determine the pattern of straggler detection and correlate it with the system attributes that are normally hypothesized to cause stragglers. These attributes include resource utilization (memory, CPU, disk), hardware faults, and unhandled requests, among others. When we combine these runtime features with dynamic information, we can come up with better prediction and learning models that can discriminate stragglers promptly and lower the number of failures occurring due to late timing.
4. RESULTS AND DISCUSSION
This research has identified the need to improve the effectiveness of MapReduce processes in order to reduce costs and maximize the utilization of resources. Performance improvement can be achieved by eliminating the inefficiencies brought about by stragglers. Stragglers are associated with poor user code and uneven aggregation of workloads. Poor user code is the product of ineffectively designed looping conditions, while uneven aggregation of workloads results from extreme co-allocation of workloads due to inefficient scheduling. In MapReduce tasks with a massive number of write and read queries, it is common for file requests to be overloaded, which leads to inefficiency in handling the requests. Notably, once the threshold of the master node is surpassed, stragglers set in, which implies that the waiting queue for requests becomes long. If the requests continue to increase, the master node becomes overloaded, which further slows down the handling of the requests. This experiment focused on investigating how the occurrence of stragglers is affected by contention for resources. An observation of the occurrence of stragglers over a period of 20 days yielded the results presented in Figure 3.
Figure 3. Graphical presentation of the findings
In the analysis, millions of tasks processed on five hundred servers in a cloud datacenter [27] were investigated. The inclusion criteria involved:
a. Servers whose tasks had DoS-Index values greater than or equal to 10
b. Servers whose CPU utilization was greater than or equal to 80%
c. Latency from file processing greater than 400 ms, which translates to slow handling of write and read requests
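The three inclusion criteria above can be expressed as simple threshold checks. The sketch below is illustrative only: the record type `ServerSample` and its field names are hypothetical, and because the text does not state whether the criteria are combined with "and" or "or", the function only reports which criteria a sample meets.

```python
from dataclasses import dataclass
from typing import List

# Thresholds taken from criteria a, b, and c.
DOS_INDEX_MIN = 10.0         # a: DoS-Index >= 10
CPU_UTILIZATION_MIN = 80.0   # b: CPU utilization >= 80%
FILE_LATENCY_MS_MIN = 400.0  # c: file-processing latency > 400 ms


@dataclass
class ServerSample:
    """One observation of a server (hypothetical record layout)."""
    name: str
    dos_index: float
    cpu_utilization: float   # percent
    file_latency_ms: float   # milliseconds


def matched_criteria(sample: ServerSample) -> List[str]:
    """Return the labels ("a", "b", "c") of the criteria the sample meets."""
    matched = []
    if sample.dos_index >= DOS_INDEX_MIN:
        matched.append("a")
    if sample.cpu_utilization >= CPU_UTILIZATION_MIN:
        matched.append("b")
    if sample.file_latency_ms > FILE_LATENCY_MS_MIN:
        matched.append("c")
    return matched
```

Note that criteria a and b are inclusive at their thresholds, whereas criterion c is strict, following the wording of the list.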
The investigation showed that 42% of stragglers are brought about by overloading of disks, while 59% of stragglers exist under high server CPU conditions. The findings also revealed that slow handling of requests was responsible for the occurrence of 34.3% of stragglers. From these findings, it is evident that the existence of stragglers is significantly caused by high utilization of resources. It was also observed that stragglers can be caused by other factors, such as the conditions of the network.
5. RECOMMENDATION/PROPOSED SOLUTION
In this paper, we propose a combined strategy called COMBINATORY LATE-MACHINE that tests the CPU of each machine to determine its vulnerability to stragglers. To achieve this, the CPU and the RAM are tested to identify machines that have a CPU usage of less than 85. Machines that meet this condition are dropped, and those that exceed 85 are selected and placed in a performance chart, starting with those that have the highest performance. If for any reason the CPU/memory performance of a machine falls below a certain threshold, its jobs are automatically redirected to another machine with higher CPU/memory performance. This stage is run once and helps create a list of performances for all the machines.
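The machine-test stage described above amounts to a filter followed by a sort. The sketch below is one possible reading, not the authors' implementation: it assumes each machine has a single numeric test result on the same scale as the value 85 used in the text (the text gives no unit), and the name `rank_machines` is hypothetical. Machines exactly at 85 are left out because the text only says that those exceeding 85 are selected.

```python
from typing import Dict, List

# Cut-off value taken from the text; its unit is not specified there.
SCORE_THRESHOLD = 85.0


def rank_machines(scores: Dict[str, float],
                  threshold: float = SCORE_THRESHOLD) -> List[str]:
    """Return machines whose test result exceeds `threshold`, best first.

    `scores` maps a machine name to its CPU test result (assumed scale).
    Machines at or below the threshold are dropped from the chart.
    """
    selected = [name for name, score in scores.items() if score > threshold]
    return sorted(selected, key=lambda name: scores[name], reverse=True)
```

For example, `rank_machines({"a": 90, "b": 70, "c": 95, "d": 85})` keeps only machines `c` and `a`, in that order, so that redirected jobs can be offered to the top of the list first.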
After this step, straggler detection and handling are initiated at the same time using the LATE algorithm. The algorithm has several benefits that make it suitable for heterogeneous jobs, since it re-executes only the slowest tasks. The primary advantage of the proposed methodology is the increased probability of detecting straggler machines and straggler tasks with the same algorithm at earlier stages. Our LATE scheduler is designed to include all the features needed for it to function well in a realistic environment. The major insight behind the LATE algorithm is that the tasks estimated to finish farthest into the future are the ones to execute speculatively, because this offers the greatest opportunity to improve the response time.
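The selection rule behind LATE can be sketched as picking the running task with the longest estimated time left. This is a simplified illustration under stated assumptions, not the full scheduler: progress scores are taken as fractions between 0 and 1, the names `time_left` and `pick_slowest_task` are hypothetical, and safeguards such as limits on the number of speculative copies or the exclusion of slow nodes are omitted.

```python
from typing import Dict, Optional, Tuple


def time_left(ps: float, t0: float, now: float) -> float:
    """Estimated time left: remaining progress divided by progress rate."""
    return (1.0 - ps) / ps * (now - t0)


def pick_slowest_task(tasks: Dict[str, Tuple[float, float]],
                      now: float) -> Optional[str]:
    """Return the id of the running task with the longest estimated time left.

    `tasks` maps a task id to (progress score, start time). Tasks with a
    score of 0 (no estimate possible yet) or 1 (already finished) are skipped.
    Returns None when no task qualifies.
    """
    candidates = {
        task_id: time_left(ps, t0, now)
        for task_id, (ps, t0) in tasks.items()
        if 0.0 < ps < 1.0
    }
    if not candidates:
        return None
    return max(candidates, key=candidates.get)
```

In a combined setup, the task returned here would be the one re-executed on the best machine from the performance list built in the first stage.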
The framework of the proposed combinatory late-machine strategy is illustrated in Figure 4.
Figure 4. Proposed combinatory late-machine framework
5.1. Limitations and future research
Our observation did not focus on the conditions of memory capacity or on the overlap between conditions. Further investigation is therefore warranted.
6. CONCLUSION
In this paper, we proposed the Combinatory Late-Machine (CLM) strategy to identify, at an earlier stage, stragglers among both nodes and tasks, as well as the optimal node for re-execution of slow tasks. The overall execution time is improved significantly using this technique compared to traditional job schedulers. In the future, the CLM strategy will be tested with Hadoop to evaluate the efficiency of this technique. The findings provide new insights into early straggler detection. However, the current strategy has its shortcomings, and exhaustive analytics are needed to establish the relationship between stragglers and the contention of resources. The proposed algorithm could be used in research as well as in industry to improve the time and cost of big data processing.
ACKNOWLEDGMENT
The authors would like to thank all the research participants for their time, effort, and contribution to the research.
REFERENCES
[1] E. A. Mohammed, et al., "Applications of the MapReduce programming framework to clinical big data analysis: current landscape and future trends," BioData Mining, vol. 7, no. 1, pp. 22-44, 2014.
[2] S. Valvåg, et al., "Cogset: a high performance MapReduce engine," Concurrency and Computation: Practice and Experience, vol. 25, no. 1, pp. 2-23, 2012.
[3] Sumalatha, S., and Subramanyam, R. B. V., "Distributed mining of high utility time interval sequential patterns using mapreduce approach," Expert Systems with Applications, vol. 141, 112967, Mar. 2020.
[4] Meddah, I. H., and Belkadi, K., "Parallel Distributed Patterns Mining Using Hadoop MapReduce Framework," International Journal of Grid and High Performance Computing, vol. 9, no. 2, pp. 70-85, 2017.
[5] S. Khezr and N. J. Navimipour, "MapReduce and Its Applications, Challenges, and Architecture: a Comprehensive Review and Directions for Future Research," Journal of Grid Computing, vol. 15, no. 3, pp. 295-321, 2017.
[6] Prameela De'vi. Chillakuru, T. Kumanan, CH. Sarada Devi, "Content based Retrieval Management Systems in Web Engineering," International Journal of Recent Technology and Engineering (IJRTE), vol. 8, no. 2S11, pp. 81-93, Sep. 2019.
[7] A. Y. Pigul, "Comparative Study Parallel Join Algorithms for MapReduce environment," Proceedings of the Institute for System Programming of RAS, vol. 23, pp. 285-306, 2012.
[8] I. Hashem, et al., "MapReduce scheduling algorithms: a review," The Journal of Supercomputing, 2018.
[9] K. Mitsuzuka, et al., "Proxy Responses by FPGA-Based Switch for MapReduce Stragglers," IEICE Transactions on Information and Systems, vol. 101, no. 9, pp. 2258-2268, 2018.
[10] J. Rogoff, "Stragglers," Sewanee Review, vol. 124, no. 3, pp. 397-397, 2016.
[11] M. F. Aktas, et al., "Straggler Mitigation by Delayed Relaunch of Tasks," ACM SIGMETRICS Performance Evaluation Review, vol. 45, no. 2, pp. 248-248, 2018.
[12] E. Bortnikov, et al., "Predicting execution bottlenecks in map-reduce clusters," Proceedings of the 4th USENIX conference on Hot Topics in Cloud Computing, pp. 1-18, 2012.
[13] A. K. Abasi, et al., "Link-based multi-verse optimizer for text documents clustering," Applied Soft Computing, vol. 87, 2019.
[14] X. Ouyang, et al., "Straggler Detection in Parallel Computing Systems through Dynamic Threshold Calculation," 2016 IEEE 30th International Conference on Advanced Information Networking and Applications (AINA), pp. 414-421, 2016.
[15] J. Xie, et al., "Improving MapReduce performance through data placement in heterogeneous Hadoop clusters," 2010 IEEE International Symposium on Parallel & Distributed Processing, Workshops and Phd Forum (IPDPSW), pp. 1-9, 2010.
[16] J. Dean and S. Ghemawat, "MapReduce: simplified data processing on large clusters," Communications of the ACM, vol. 51, no. 1, p. 107, 2008.
[17] H. Wu, et al., "A Heuristic Speculative Execution Strategy in Heterogeneous Distributed Environments," 2014 Sixth International Symposium on Parallel Architectures, Algorithms and Programming, pp. 268-273, 2014.
[18] A. K. Abasi, et al., "A Text Feature Selection Technique based on Binary Multi-Verse Optimizer for Text Clustering," 2019 IEEE Jordan International Joint Conference on Electrical Engineering and Information Technology (JEEIT), Amman, Jordan, pp. 1-6, 2019.
[19]
Y.
Guo,
et
al.
,
“
FlexSlot
:
Moving
Hadoop
Into
the
Clou
d
with
Flexi
ble
Slot
Mana
gement
,
”
S
C14:
Inte
rnatio
nal
Confe
renc
e
for
High
Pe
rform
an
ce
Comput
ing, N
et
working
,
Stora
ge
and
Analysis
,
2014.
[20] H. Zhou, et al., “BigRoots: An Effective Approach for Root-Cause Analysis of Stragglers in Big Data System,” IEEE Access, vol. 6, pp. 41966-41977, 2018.
[21] T. Phan, et al., “A New Framework for Evaluating Straggler Detection Mechanisms in MapReduce,” ACM Transactions on Modeling and Performance Evaluation of Computing Systems, vol. 4, no. 3, pp. 1-23, 2019.
[22] Chaowei Yang, Qunying Huang, Zhenlong Li, Kai Liu & Fei Hu, “Big Data and cloud computing: innovation opportunities and challenges,” International Journal of Digital Earth, vol. 10, no. 1, pp. 13-53, 2017. DOI: 10.1080/17538947.2016.1239771
[23] L. Greeshma, Pradeepini Gera, “Big Data Analytics with Apache Hadoop MapReduce Framework,” Indian Journal of Science and Technology, vol. 9, no. 26, July 2016.
[24] A. Harlap, et al., “Addressing the straggler problem for iterative convergent parallel ML,” Proceedings of the Seventh ACM Symposium on Cloud Computing - SoCC '16, pp. 98-111, 2016.
[25] W. Kim, Y. Kim, and K. Shim, “Parallel computation of k-nearest neighbor joins using MapReduce,” In Proceedings of the IEEE International Conference on Big Data, pp. 696-705, 2016.
[26] N. Yadwadkar, et al., “Multi-Task Learning for Straggler Avoiding Predictive Job Scheduling,” Journal of Machine Learning Research, vol. 17, pp. 1-37, 2016. [Online]. Available: http://jmlr.org/papers/volume17/15-149/15-149.pdf.
[27] Gigas the cloud computing company. [Online]. Available: https://gigas.com/en/cloud-datacenter.
BIOGRAPHIES OF AUTHORS
Anwar H. Katrawi received his B.Sc. in computer science from Al Mustansiriya University, Iraq, and his M.Sc. in computer information systems from the Arab Academy for Management, Banking and Financial Sciences, Jordan. He is currently a PhD candidate in the School of Computer Sciences (NAV6) at Universiti Sains Malaysia. His research interests include big data, data warehouses, machine learning, and data analytics.
Rosni Abdullah is a professor in parallel computing and one of the national pioneers in that domain. She was appointed Dean of the School of Computer Sciences at Universiti Sains Malaysia (USM) in June 2004, after having served as its Deputy Dean (Research) since 1999. She has also been the Head of the Parallel and Distributed Processing Research Group at the School since its inception in 1994.
Mohammed F.R. Anbar received his bachelor's degree in Computer System Engineering from Al-Azhar University, Palestine, and his M.Sc. in Information Technology from Universiti Utara Malaysia (UUM), Malaysia. He obtained his PhD in Advanced Internet Security and Monitoring from Universiti Sains Malaysia (USM). He is currently a senior lecturer at the National Advanced IPv6 Centre (NAv6), Universiti Sains Malaysia.
Ammar Kamal Abasi received his B.Sc. in computer information systems from Jordan University of Science and Technology, and his M.Sc. in international business from the University of Jordan. He is currently a PhD candidate in the School of Computer Sciences at Universiti Sains Malaysia. His research interests include evolutionary algorithms, nature-inspired computation, and their applications to optimization problems.