TELKOMNIKA, Vol.14, No.3, September 2016, pp. 1123~1127
ISSN: 1693-6930, accredited A by DIKTI, Decree No: 58/DIKTI/Kep/2013
DOI: 10.12928/TELKOMNIKA.v14i3.3771
Received January 10, 2016; Revised April 20, 2016; Accepted May 8, 2016

MapReduce Integrated Multi-algorithm for HPC Running State Analysis

ShuRen Liu*, ChaoMin Feng, HongWu Luo, Ling Wen
Northwest Branch of PetroChina Research Institute of Petroleum Exploration and Development, Lanzhou 730020, China
*Corresponding author, e-mail: 1648937949@qq.com

Abstract
High-performance computer clusters are major seismic processing platforms in the oil industry, and they fail frequently. In this study, the K-means and Naive Bayes algorithms were implemented as MapReduce programs and run on Hadoop. The accumulated running status data of a high-performance computer cluster were first clustered by K-means, and the clustering results were then used to train Naive Bayes. Finally, the test data were discriminated against the resulting knowledge base to identify equipment failure. Experiments indicate that K-means returned good results, the Naive Bayes algorithm had a high rate of discrimination, and the multi-algorithm implemented in MapReduce achieved an intelligent prediction mechanism.

Keywords: high-performance clusters (hpc), hadoop, mapreduce, k-means, naive bayes

Copyright © 2016 Universitas Ahmad Dahlan. All rights reserved.

1. Introduction
Seismic processing technology is one of the primary means of oil and gas exploration and development. At present, high-performance computer clusters are the major seismic processing platforms in the oil industry. However, cluster sizes are expanding as the amount of data to be processed increases; meanwhile, various software applications are being used interchangeably, which leads to frequent cluster failures. Stability has therefore become increasingly important. Here, an intelligent prediction mechanism is introduced that builds a knowledge base from historical data and uses data mining techniques to detect hidden faults in the cluster before the maintained node crashes. This method is intended to minimize the impact of node failures on oil and gas exploration projects.
Hadoop is an open-source cloud computing framework [1] that uses MapReduce [2] for the parallel computation of big data. Owing to its high reliability, data processing capacity, flexibility, and scalability, it has gradually become popular in computer research and is widely used by search engines, machine learning, and so on [3-5]. However, to our knowledge, Hadoop has not yet been used to monitor the running conditions of high-performance clusters. Together, Hadoop and MapReduce make intelligent prediction mechanisms possible for the analysis of high-performance cluster running states. Related work has improved the MapReduce parallelization of K-means [6] and Bayesian [7] algorithms, but the two have not been used together. In this paper, a multi-algorithm approach is applied to HPC running state analysis.

2. Research Method
In the architecture, the cluster as a whole is described by the state quantities of each Linux system, which together characterize the cluster state. A state data analysis platform was built based on the characteristics of the Hadoop platform and the system status data of the high-performance cluster. The platform comprises three parts (Figure 1):
a) A state collection module that collects the running status data of the high-performance cluster.
b) A state data storage module that uses HBase to efficiently store the large volume of dynamic, time-series historical status data.
c) A data analysis module, the core content of this article, that includes two MapReduce-based algorithms: K-means and Naive Bayes.
Linux commands, embedded in a Java program, are used to collect the cluster running state data. After the acquisition is complete, the HBase API is called to store the data. Running status characteristics are divided into three categories (health, general, and fault), and these categories are refined as the knowledge base expands.

2.1. Implementation of the K-means Algorithm in MapReduce
The K-means [8] algorithm uses distance as the similarity measure and outputs the k cluster centers (Figure 2). The steps of the implementation process are described below.
a) K initial cluster centers are selected from the data set.
b) For every data point, the distance to each center is measured, and the point is assigned to the class of the nearest center (the minimum distance).
c) The center of every class is recalculated.
Steps b) and c) are repeated until the center-change threshold is met. Given an appropriately designed threshold, the main function drives the iteration: it calls the Map and Reduce functions repeatedly until the threshold is met.
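
The iteration described above can be sketched in a few lines of Python. This is a minimal single-process illustration, not the Java/Hadoop implementation used in the paper: the function names (`map_phase`, `shuffle`, `reduce_phase`, `kmeans`), the Euclidean distance, and the `threshold` and `max_iterations` defaults are assumptions made for the example, and the in-memory `shuffle` stands in for the grouping that Hadoop performs between Map and Reduce.

```python
import math
from collections import defaultdict


def nearest_center(point, centers):
    """Return the index of the center closest to point (Euclidean distance)."""
    return min(range(len(centers)), key=lambda i: math.dist(point, centers[i]))


def map_phase(points, centers):
    """Map: emit a (center_index, point) pair for every data point."""
    for point in points:
        yield nearest_center(point, centers), point


def shuffle(pairs):
    """Group mapped values by key (done by the framework in a real job)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups, centers):
    """Reduce: the new center of each class is the mean of its points."""
    new_centers = list(centers)  # a center with no points keeps its position
    for idx, members in groups.items():
        new_centers[idx] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return new_centers


def kmeans(points, centers, threshold=1e-6, max_iterations=100):
    """Driver: repeat Map and Reduce until the largest center shift is small."""
    for _ in range(max_iterations):
        new_centers = reduce_phase(shuffle(map_phase(points, centers)), centers)
        shift = max(math.dist(old, new) for old, new in zip(centers, new_centers))
        centers = new_centers
        if shift < threshold:  # center-change threshold is met
            break
    return centers
```

For example, `kmeans([(0, 0), (0, 2), (10, 10), (10, 12)], [(0, 0), (10, 10)])` stops after two passes, once the centers no longer move. In a real Hadoop job the driver would submit one MapReduce job per iteration and pass the current centers to the mappers, for instance through a shared file.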

[Figure 1 diagram labels: High-Performance Cluster for Oil Seismic Processing (Node 1 … Node N); Collection; Clusters Running Status Data; Hadoop File System; MapReduce Analysis Systems; K-Means Clustering Algorithm (Clustering Process); Bayesian Discrimination Method (Classification Process); Knowledge Base (Optimization/use of Knowledge); Failure Warning]
Figure 1. The architecture of the MapReduce Integrated Multi-algorithm

[Figure 2 flowchart labels: Get data from HBase table <D1, D2, …, Dn>; Initialize cluster centers; Map 1, Map 2, …, Map n emitting <C1, D1>, <C2, D2>, …, <Cn, Dn>; grouped input <C1, list>, <C2, list>, …, <Cn, list>; Reduce 1, Reduce 2, …, Reduce n; The new cluster centers; Center change threshold is met? (yes: End; no: Replace the original cluster centers)]
Figure 2. The Implementation Process of the K-means Algorithm in MapReduce

2.2. Implementation of the Naive Bayes Algorithm in MapReduce
Figure 3 shows the implementation process for the Naive Bayes algorithm [9-10] in MapReduce, which is set up as follows:
a) Let $X = \{a_1, a_2, \ldots, a_m\}$ be an item to be classified, where each $a_j$ is a characteristic property of $X$.
b) Let $C = \{y_1, y_2, \ldots, y_n\}$ be the set of categories, where each $y_i$ is a category.
c) Calculate $P(y_1 \mid X), P(y_2 \mid X), \ldots, P(y_n \mid X)$.
d) If $P(y_k \mid X) = \max\{P(y_1 \mid X), P(y_2 \mid X), \ldots, P(y_n \mid X)\}$, then $X \in y_k$.
The key is how to calculate each conditional probability in step c). This is done from a set of items whose classification is already known, called the training set. The conditional probability estimates of each characteristic property in each category are obtained by counting, as listed in Equation 1:
$$P(a_1 \mid y_1), P(a_2 \mid y_1), \ldots, P(a_m \mid y_1); \; \ldots; \; P(a_1 \mid y_n), P(a_2 \mid y_n), \ldots, P(a_m \mid y_n) \tag{1}$$
If the characteristic properties are conditionally independent, the posterior probabilities can be calculated using Bayes' theorem:
$$P(y_i \mid X) = \frac{P(X \mid y_i)\,P(y_i)}{P(X)}. \tag{2}$$
Because the denominator is a constant for all categories, we only need to maximize the numerator. Each attribute is conditionally independent, therefore:
$$P(X \mid y_i)\,P(y_i) = P(a_1 \mid y_i)\,P(a_2 \mid y_i) \cdots P(a_m \mid y_i)\,P(y_i) = P(y_i) \prod_{j=1}^{m} P(a_j \mid y_i). \tag{3}$$
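
The counting behind Equations 1 to 3 can be illustrated with a small Python sketch. It is an assumption-laden illustration rather than the paper's implementation: the attribute values are treated as discrete labels (the real status data would first need to be discretized), no smoothing is applied, and the names `map_phase`, `reduce_phase`, `train`, `score`, and `classify` are hypothetical. The Map step emits a count of 1 for every category and every (category, attribute, value) combination, and the Reduce step sums them.

```python
from collections import Counter


def map_phase(samples):
    """Map: for each (attributes, category) sample emit count keys with value 1."""
    for attributes, category in samples:
        yield ("class", category), 1
        for j, value in enumerate(attributes):
            yield ("attr", category, j, value), 1


def reduce_phase(pairs):
    """Reduce: sum the values emitted for each key."""
    counts = Counter()
    for key, value in pairs:
        counts[key] += value
    return counts


def train(samples):
    """Return the counts from which P(y_i) and P(a_j | y_i) are estimated."""
    return reduce_phase(map_phase(samples)), len(samples)


def score(attributes, category, counts, total):
    """Right-hand side of Equation 3: P(y_i) * prod_j P(a_j | y_i)."""
    class_count = counts[("class", category)]
    result = class_count / total  # P(y_i)
    for j, value in enumerate(attributes):
        # P(a_j | y_i) estimated as a relative frequency within the category
        result *= counts[("attr", category, j, value)] / class_count
    return result


def classify(attributes, counts, total):
    """Pick the category y_k with the largest score (step d)."""
    categories = [key[1] for key in counts if key[0] == "class"]
    return max(categories, key=lambda c: score(attributes, c, counts, total))
```

With four toy samples such as `(("high", "hot"), "fault")` and `(("low", "cool"), "health")`, `train` returns the counts and `classify(("high", "hot"), counts, total)` selects the category whose product in Equation 3 is largest. Because unseen attribute values give a zero product here, a practical version would likely add smoothing.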

[Figure 3 diagram labels: Preparation Stage (sample 1, sample 2, sample 3, …; Input Split; Key:value, …, Key:value); Classifier training Stage (Map Task: Calculate P(yi) for Each Category; Reduce Task: Calculate P(x|y)P(yi) for Each Category); Classification Stage (The maximum P(x|y)P(yi) for X Category)]
Figure 3. The implementation process for the Naive Bayes algorithm in MapReduce

3. Results and Analysis
A fully distributed Hadoop platform was built on five BCL460c blades, comprising one namenode and four data nodes. Each node had a 10-core CPU, 64 GB RAM, and a 600 GB hard drive. The operating system was RedHat 5.8, and we used jdk1.7.0_25, Hadoop version 2.5.0, and HBase version 0.98.1. The monitored object was a high-performance cluster with 512 nodes.

3.1. K-means Results
K-means was run both stand-alone (a Matlab program on an Intel i5 machine with 128 GB of memory) and on Hadoop, using 10,000 running status data items with 20 property items; the results are shown in Figure 4. Figure 4 shows that Hadoop runs faster than the stand-alone mode as the number of iterations increases.

[Figure 4 plot: x-axis Iterations (1 to 6); y-axis Time (0 to 350,000); series: MatLab Single Node (ms), Cluster (ms)]
Figure 4. The Relationship between the Iteration and the Running Time for Stand-Alone K-means (blue) and Hadoop (red)
Figure 5 shows the Hadoop results on 10,000 running status data items with 10, 20, and 30 properties; the absolute distance values of the three cluster centers were compared. The more attribute items there are, the better the clustering results.

Figure 5. Results for Different Attributes

The experiments indicate that the MapReduce program runs faster than the stand-alone program when the number of iterations is large, and that the more iterations and attribute items there are, the better the K-means clustering results.

3.2. Naive Bayes Classifier Results
Figure 6 shows the process flow: the collected running status data of the high-performance computer cluster were first clustered by K-means, and the clustering results were then used to train Naive Bayes. Finally, the test data were discriminated against the knowledge base to identify equipment failure.
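
The two-stage flow described above (cluster first, then train a classifier on the cluster labels) can be sketched end to end in Python. This is only an illustrative sketch under stated assumptions: it runs in a single process rather than on MapReduce, it models each attribute within a category with a Gaussian likelihood (the paper does not say which likelihood model was used), and the names `assign`, `kmeans_labels`, `train_gaussian_nb`, and `classify` are hypothetical.

```python
import math


def assign(points, centers):
    """Label every point with the index of its nearest center."""
    return [min(range(len(centers)), key=lambda i: math.dist(p, centers[i]))
            for p in points]


def kmeans_labels(points, centers, iterations=10):
    """Plain K-means; returns the final centers and a cluster label per point."""
    centers = list(centers)  # do not modify the caller's list
    for _ in range(iterations):
        labels = assign(points, centers)
        for i in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:  # an empty cluster keeps its previous center
                centers[i] = tuple(sum(d) / len(members) for d in zip(*members))
    return centers, assign(points, centers)


def train_gaussian_nb(points, labels):
    """Per category: prior P(y), and the mean and variance of each attribute."""
    model = {}
    for lab in set(labels):
        members = [p for p, l in zip(points, labels) if l == lab]
        means = [sum(d) / len(members) for d in zip(*members)]
        # small constant avoids a zero variance (an assumption of this sketch)
        variances = [sum((v - m) ** 2 for v in d) / len(members) + 1e-9
                     for d, m in zip(zip(*members), means)]
        model[lab] = (len(members) / len(points), means, variances)
    return model


def classify(point, model):
    """Return the category maximizing log P(y) + sum_j log P(a_j | y)."""
    def log_score(lab):
        prior, means, variances = model[lab]
        total = math.log(prior)
        for v, m, var in zip(point, means, variances):
            total += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        return total
    return max(model, key=log_score)
```

A typical use is `centers, labels = kmeans_labels(points, initial_centers)`, then `model = train_gaussian_nb(points, labels)`, then `classify(test_point, model)` for each test record. The returned category is a cluster index; mapping it to health, general, or fault would rely on the knowledge base.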

[Figure 6 diagram labels: K-means; Cluster center; Training; Category 1 feature, Category 2 feature, Category 3 feature; Data; discrimination by Naive Bayesian; "The data are most similar to Category 1, therefore they belong to Category 1"]
Figure 6. Process Flow of the HPC Running State
Figure 7 shows an example of determining a fault. Running data collected from a computer on June 9, 2015 were given the classification 311, which belongs to the fault category. In actuality, this computer experienced a hard drive failure. Therefore, this classification/discrimination was appropriate.

[Figure 7 content: record for Mac1, 201506090201237000, with values 30.3, 29.8, 29.31, 10.0, 5.2, 80.1, 0, 2.1, 3203.1, 3678, 20, 0.31, 92; actual condition: Hard disk failure; Classification = 311 (which belongs to fault)]
Figure 7. An Example for Determining A Fault
Figure 8 is a cross plot of the discrimination results for different numbers of attributes. K-means was run on Hadoop 500 times, each run with a maximum of 10 iterations, on 10,000 status data items with 5 to 30 attributes. Treating each cluster center as a sample, six knowledge bases were generated from the 500 samples. Each knowledge base was used to train Naive Bayes on MapReduce, and the classification of 100 test data items was compared with that of a single node trained on the 10,000 raw data items.

[Plot labels: x-axis Property Items (10, 20, 30); y-axis Cluster center position (0 to 350); series: Health, General, Fault]
Figure 8. Discrimination Results for Different Attributes.

The experiment that used the K-means intermediate data as training data performed better than the traditional method on a single node. As the number of property items increased, the discrimination success rate increased. Owing to the limited number of samples (at most 10,000), the limited number of attribute items (at most 30), and a possible correlation between the properties of the items, the success rate was below 80%; nevertheless, the method is practical for monitoring high-performance clusters.

4. Conclusion
To enhance the stability of high-performance clusters in oil and gas exploration, Hadoop was used to analyze the running state of a high-performance cluster. Equipment failure analysis was achieved with the K-means and Naive Bayes algorithms implemented in MapReduce. Experiments indicate that K-means returned good results, the Naive Bayes algorithm had a high rate of discrimination (though below 80%), and the multi-algorithm implemented in MapReduce achieved an intelligent prediction mechanism.

References
[1] Addair TG, Dodge DA, Walter WR, et al. Large-scale seismic signal analysis with Hadoop. Computers & Geosciences. 2014; 66(2): 145-154.
[2] Dean J, Ghemawat S. MapReduce: Simplified Data Processing on Large Clusters. In Proceedings of Operating Systems Design and Implementation. 2004; 51(1): 107-113.
[3] Londhe S, Mahajan S. Effective and Efficient Way of Reduce Dependency on Dataset with the Help of Mapreduce on Big Data. Telkomnika Indonesian Journal of Electrical Engineering. 2015; 15(1).
[4] Jayalath C, Stephen J, Eugster P. From the Cloud to the Atmosphere: Running MapReduce across Data Centers. IEEE Transactions on Computers. 2014; 63(1): 74-87.
[5] Liu Y, Wei W, Zhang Y. Checkpoint and Replication Oriented Fault Tolerant Mechanism for MapReduce Framework. Telkomnika Indonesian Journal of Electrical Engineering. 2014; 12(2).
[6] Aljarah I, Ludwig SA. Parallel Glowworm Swarm Optimization Clustering Algorithm based on MapReduce. IEEE Symposium Series on Computational Intelligence. 2014: 1-8.
[7] Villa S, Rossetti M. Learning Continuous Time Bayesian Network Classifiers Using MapReduce. Journal of Statistical Software. 2014; 62(3): 1-25.
[8] Kanungo T, Mount DM, Netanyahu NS, et al. An Efficient k-Means Clustering Algorithm: Analysis and Implementation. IEEE Transactions on Pattern Analysis & Machine Intelligence. 2002; 24(7): 881-892.
[9] Ahmed SE. Bayesian Networks and Decision Graphs. Technometrics. 2002; 50(1): 362.
[10] Körding KP, Wolpert DM. Bayesian integration in sensorimotor learning. Nature. 2004; 427(6971): 244-247.