International Journal of Electrical and Computer Engineering (IJECE)
Vol. 15, No. 1, February 2025, pp. 1162~1174
ISSN: 2088-8708, DOI: 10.11591/ijece.v15i1.pp1162-1174

Journal homepage: http://ijece.iaescore.com

Evaluating machine learning models for predictive analytics of liver disease detection using healthcare big data

Osama Mohareb Khaled1, Ahmed Zakareia Elsherif1,2, Ahmed Salama1, Mostafa Herajy1, Elsayed Elsedimy3
1Department of Mathematics and Computer Science, Faculty of Science, Port Said University, Port Said, Egypt
2Department of Basic Sciences, Higher Institute of Administrative Sciences, El-Menzala, Egypt
3Department of Information Technology Management, Faculty of Management Technology and Information Systems, Port Said University, Port Said, Egypt

Article Info

Article history:
Received Mar 18, 2024
Revised Sep 16, 2024
Accepted Oct 1, 2024

ABSTRACT
Liver diseases rank among the most prevalent health issues globally, causing significant morbidity and mortality. Early detection of liver diseases allows for timely intervention, which can prevent the progression of such diseases to more severe stages such as cirrhosis or liver cancer. To this end, many machine learning models have previously been developed to predict liver diseases early among potential patients. However, each model has its own accuracy and performance limitations. In this paper, we present a comprehensive comparison of three different machine learning models that can be employed to enhance the prediction and management of liver diseases. We utilize a big dataset of 32,000 records to evaluate the performance of each model. First, we implement a preprocessing technique to rectify missing or corrupt data in liver disease datasets, ensuring data integrity. Afterwards, we compare the performance of three machine learning models: k-nearest neighbors (KNN), Gaussian naive Bayes (Gaussian NB), and random forest (RF). We conclude that the RF algorithm demonstrates superior performance in our evaluation, excelling in both predictive accuracy and the ability to classify patients accurately regarding the presence of liver disease. Our results show that RF outperforms the other models on several performance metrics, including accuracy: 97.3%, precision: 97%, recall: 96%, and F1-score: 95%.

Keywords:
Big data
Gaussian naive Bayes
K-nearest neighbors
Liver disease patient dataset
Random forest

This is an open access article under the CC BY-SA license.

Corresponding Author:
Ahmed Zakareia Elsherif
Department of Mathematics and Computer Science, Faculty of Science, Port Said University
El Zohour District, Port Said Governorate, 8560001, Egypt
Email: ahmad.elsherif@sci.psu.edu.eg

1. INTRODUCTION
In abnormal liver function (also called liver disease), the liver's effectiveness is severely diminished if only 25% of it is still working while the other 75% is damaged [1], [2]. Predicting liver disease at an early stage allows for timely intervention, which can prevent the disease from progressing to more severe stages. Early treatment can halt or slow down the disease, improving patient outcomes. To this end, artificial intelligence approaches, particularly machine learning models, offer promising solutions to many classification and prediction problems, including liver disease [3]–[10].
Many approaches have been introduced to predict and classify liver diseases using machine learning [11]–[19]. Choudhary et al. [20] proposed a machine learning model for liver disease prediction. Their study helps improve liver disease diagnosis by validating patient parameters and genome expression, analyzing computer algorithms, and offering ways to increase efficiency. The authors employed the Scikit-learn, NumPy, Pandas, and Seaborn libraries to create machine learning models that achieve an accuracy of 70.5% using logistic regression and 65% using the vector machine approach. In addition, Veeranki and Varshney [21] introduced an intelligent approach to predict liver illness. They proposed a novel bioinformatics model that was applied to patient genetic data to discover safeguards against liver disease. Their results showed an accuracy of 69% for the random forest (RF) method, 67% for the k-nearest neighbors (KNN) method, 74% for the support vector machine (SVM) method, and 68% for the multilayer perceptron (MLP) method. Priya et al. [22] utilized machine learning methods for liver disease prediction and conducted a performance analysis. In order to better forecast the outcomes of liver patients in India, the authors built a feature model and compared it to others. They utilized particle swarm optimization (PSO) and min-max techniques for feature selection. After PSO, the algorithm achieved the best performance compared to the J48 algorithm, which achieves an accuracy of 95.04%. Similarly, Alam et al. [23] suggested a new model for medical data classification using feature ranking. This work introduces the use of ranker algorithms and RF classifiers for feature-ranking-based medical data categorization to make reliable disease predictions. The results show that feature ranking and selection contribute to their model's superior performance. Recently, Amin et al. [24] proposed the prediction of chronic liver disease patients using integrated projection-based statistical feature extraction with machine learning algorithms. This model classifies liver patients using data that has already been preprocessed and various machine learning techniques. Predictions of liver disorders are made with an accuracy of 88.10%, a precision of 85.33%, a recall of 92.30%, an F1 score of 88.68%, and an area under the curve (AUC) of 88.20%, where the AUC measures the entire two-dimensional area underneath the receiver operating characteristic (ROC) curve.
Despite all these efforts to classify liver disease, there is still no known approach that consistently produces the best prediction. Moreover, the majority of related work done so far uses a relatively small dataset for model training and testing, which eventually affects the overall prediction accuracy of the model. For instance, in [10], [22], [25]–[29] a dataset with only around 583 disease cases is employed for model training and testing.
In this paper, we compare three machine learning models, namely KNN, Gaussian naive Bayes (Gaussian NB), and RF, for classifying and predicting liver disease. We utilized a substantial dataset comprising 32,000 disease cases, representing a significant big data challenge and offering a comprehensive analysis. Additionally, we leveraged machine learning techniques for data preprocessing and feature extraction. Our comparison study concluded that the KNN model showcased impressive performance with 95% accuracy, 94% precision, and 93% scores in both recall and F1-score, highlighting its reliability and high potential for liver disease prediction tasks. Furthermore, the Gaussian NB model, despite its lower overall accuracy of 55.7% and precision of 39%, demonstrated a remarkable recall rate of 96%, underscoring its potential in identifying the presence of disease. The standout, random forest, achieved an exceptional 97.3% accuracy, with near-perfect precision and recall rates of 97% and 96%, respectively, along with a 95% F1-score and AUC, indicating its superior predictive capability and robustness.
This paper is organized as follows: Section 2 details the methods and stages we followed to achieve optimal performance while comparing the three different machine learning models. Section 3 presents the experimental results obtained using our implementation, along with a comparison of our findings with related experiments from the literature. The paper concludes in Section 4 with a summary of the key conclusions and suggestions for future research directions.

2. METHOD
In this section, we outline the methodology employed in this paper, encompassing three crucial stages: preprocessing, feature extraction, and liver disease prediction. The preprocessing stage involves cleaning and normalizing the data to ensure its accuracy and consistency. During the feature extraction stage, we identify and extract relevant features from the dataset to enhance the model's performance. Finally, in the liver disease prediction stage, we test each model with the prepared data to compare the three different machine learning approaches and determine the most effective one. We use the generated statistics to evaluate performance metrics, including accuracy, precision, recall, and F1-score, to select the best predictive model. Figure 1 provides a summary of the methodology adopted in this paper.
The first step involves addressing missing and corrupted data in the liver patient dataset. Data preprocessing is a critical step in machine learning, as the quality of the input data can significantly impact the model's performance [30]. In this paper, we use a recent preprocessing technique to handle missing and corrupted data, and employ methods such as imputation and data cleaning to ensure the dataset is suitable for modeling, as shown in Figure 1.

Figure 1. Our workflow for comparing the three different machine learning models using the same dataset. The workflow steps involve preprocessing, feature extraction, as well as model training and prediction. After applying the same workflow to the three machine learning models, we utilized the model outputs to compare the performance of the three models

Linear transformation techniques such as normalization, standardization, and feature scaling are used to scale and shift data points. In this context, the min-max scaling method subtracts the smallest value from each data element and divides the result by the difference between the largest and the smallest data elements, as in (1):

$$x' = \frac{x - \min(x)}{\max(x) - \min(x)} \tag{1}$$
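
As a minimal sketch of the min-max scaling in (1), the following pure-Python function rescales a list of values to the range [0, 1]. The function name `min_max_scale` and the bilirubin readings are illustrative assumptions and are not taken from the paper's implementation.

```python
def min_max_scale(values):
    """Rescale a list of numbers to [0, 1] following equation (1)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: the denominator would be zero, so return zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical total bilirubin readings (illustrative numbers only).
bilirubin = [0.5, 1.0, 2.5, 4.5]
scaled = min_max_scale(bilirubin)
print(scaled)  # smallest value maps to 0.0, largest to 1.0
```

The guard for a constant column is a design choice of this sketch; other conventions (for example, leaving such a column unchanged) are equally possible.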

Moreover, one-hot encoding turns categorical variables into binary vectors, using a single binary value (1 or 0) to represent each category [31]. In the same context, the Z-score scaling method is used to rescale numbers to a mean of 0 and a standard deviation of one [32]. This can be achieved by subtracting the mean and dividing the result by the standard deviation, as in (2):

$$z = \frac{x - \operatorname{mean}(x)}{\operatorname{std}(x)} \tag{2}$$
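
The two transformations described above can be sketched in a few lines of standard-library Python. This is only an illustration under stated assumptions: the helper names `z_score` and `one_hot`, the use of the population standard deviation, and the sample ages and gender labels are all hypothetical.

```python
import statistics


def z_score(values):
    """Standardize values to mean 0 and standard deviation 1, as in equation (2)."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population standard deviation (an assumption)
    if std == 0:
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


def one_hot(labels):
    """Encode each categorical label as a binary vector with a single 1."""
    categories = sorted(set(labels))
    return [[1 if label == c else 0 for c in categories] for label in labels]


# Hypothetical patient ages and gender labels (illustrative only).
ages = [30, 40, 50, 60, 70]
z = z_score(ages)
encoded = one_hot(["Male", "Female", "Male"])
print(z)
print(encoded)  # columns are ordered alphabetically: Female, Male
```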

Finally, the sigmoid function maps any real-valued number to the range (0, 1). This mathematical function is integral to the preprocessing steps in machine learning, helping to transform data into a suitable format for training models and improving the performance of algorithms. Understanding these functions and when to apply them is crucial for effective data preprocessing. The sigmoid function is commonly used in logistic regression to model binary classification problems, and it is applied in the preprocessing phase as in (3):

$$\sigma(x) = \frac{1}{1 + e^{-x}} \tag{3}$$
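
A short sketch of the sigmoid function follows. The branch on the sign of `x` is an implementation detail assumed here to avoid overflow for large negative inputs; it is mathematically equivalent to 1 / (1 + e^(-x)).

```python
import math


def sigmoid(x):
    """Logistic function 1 / (1 + e^(-x)), written to avoid overflow."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    # For negative x, use the equivalent form e^x / (1 + e^x).
    z = math.exp(x)
    return z / (1.0 + z)


print(sigmoid(0))    # midpoint of the curve
print(sigmoid(4.0))  # close to 1
print(sigmoid(-4.0)) # close to 0
```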

The second step of our framework involves reducing irrelevant features in the dataset. Feature selection is important to improve the model's efficiency and accuracy. Irrelevant or redundant features can introduce noise and complexity, making it more challenging for the model to learn meaningful patterns. We use techniques such as feature importance, correlation analysis, or dimensionality reduction methods to select the most informative features for our model. Here, principal component analysis (PCA) is used to reduce the dimensionality of a dataset consisting of a large number of interrelated variables, using the covariance matrix as in (4) [33]:

$$C_{a,b} = \frac{\sum_{i=1}^{M} (x_{i,a} - \bar{x}_a)(x_{i,b} - \bar{x}_b)}{M - 1} \tag{4}$$

where $C_{a,b}$ represents the covariance of features a and b, $x_{i,a}$ and $x_{i,b}$ are the values of features a and b in the i-th training sample, $\bar{x}_a$ denotes the sample mean of feature a, $\bar{x}_b$ is the sample mean of feature b, and M represents the total number of samples.
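
To make the covariance computation concrete, the sketch below builds the sample covariance matrix for a small table of samples, dividing by M - 1 as described above. The function name and the toy data are assumptions for illustration; PCA would then take the eigenvectors of this matrix as principal directions, a step not shown here.

```python
def covariance_matrix(rows):
    """Sample covariance matrix of a list of samples (one row per sample)."""
    m = len(rows)      # number of samples (M)
    n = len(rows[0])   # number of features
    means = [sum(row[j] for row in rows) / m for j in range(n)]
    cov = [[0.0] * n for _ in range(n)]
    for a in range(n):
        for b in range(n):
            cov[a][b] = sum(
                (row[a] - means[a]) * (row[b] - means[b]) for row in rows
            ) / (m - 1)
    return cov


# Toy data: the second feature is exactly twice the first.
samples = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
C = covariance_matrix(samples)
print(C)
```

Because the second feature is a multiple of the first, the two columns are perfectly correlated, which is the kind of redundancy PCA is meant to collapse into a single component.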

Finally, in the third step of our method, machine learning models are applied to big data on liver disease. The models are used to predict and classify patients as either having or not having liver disease, as shown in Figure 1. The models are trained using the preprocessed dataset and the selected relevant features. The specific models used are Gaussian NB, KNN, and the RF algorithm, each applied according to the presented methodology.

First, in the training phase, the KNN algorithm simply memorizes the dataset. Assume we have a dataset with m data points in an n-dimensional feature space, each associated with a label, and we want to predict the label for a new data point $x$. The distance between $x$ and a training point $x_i$ can then be computed using the Euclidean distance as in (5) [34]:

$$d(x, x_i) = \sqrt{\sum_{j=1}^{n} (x_j - x_{i,j})^2} \tag{5}$$

Equation (5) measures how far the new point $x$ lies from the training point $x_i$. It sums the squared differences between the two points over all feature dimensions j from 1 to n and then takes the square root of that sum. A smaller value means the two points are closer in the feature space, and KNN uses these distances to rank the training points as candidate neighbors of the new point.

Algorithm 1 presents the KNN procedure, which relies on the simple principle of determining the category of a specific query point based on the most frequent category among the 'k' nearest points in the dataset. The first step in the algorithm involves calculating the distances between the query point and each point in the dataset, typically using Euclidean distance. Next, the data points are sorted based on their distance from the query point, and the first 'k' points are selected as the nearest neighbors. The most frequent category among these neighbors is then determined and considered as the prediction or classification for the query point. This method is effective in various classification scenarios but requires intensive computations, especially with larger datasets.

Algorithm 1. The KNN algorithm
Input: A set of m labeled data points 'D', a query point 'q', and an integer 'k' representing the number of neighbors.
Output: The most common class among the 'k' nearest neighbors of 'q'.
1. For i ← 1 to m do:
   Calculate the distance between the query point 'q' and the i-th point in the data set 'D'.
2. Sort the points in 'D' based on their distance to 'q'.
3. Select the first 'k' points from this sorted list. These points are the 'k' nearest neighbors of 'q'.
4. Count the frequency of each class among the 'k' nearest neighbors.
5. Determine the most common class among these neighbors.
6. Return the most common class as the prediction for the class of 'q'.
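
The KNN steps above can be sketched directly in standard-library Python. This is an illustrative brute-force version, not the scikit-learn implementation used in the experiments; the names `euclidean` and `knn_predict` and the toy points are assumptions.

```python
import math
from collections import Counter


def euclidean(p, q):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def knn_predict(points, labels, query, k):
    """Return the majority class among the k training points nearest to query."""
    # Compute the distance from the query to every training point, then sort.
    distances = sorted(
        (euclidean(p, query), label) for p, label in zip(points, labels)
    )
    # Keep the labels of the k closest points and take a majority vote.
    nearest = [label for _, label in distances[:k]]
    return Counter(nearest).most_common(1)[0][0]


# Toy data: class 1 clusters near (1, 1), class 2 near (5, 5).
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels = [1, 1, 1, 2, 2, 2]
prediction = knn_predict(points, labels, (1.1, 1.0), k=3)
print(prediction)
```

Sorting all m distances costs O(m log m) per query, which illustrates why the method becomes computationally intensive on large datasets.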

The KNN algorithm assigns the class label based on majority voting among the k nearest neighbors:

$$\hat{y} = \arg\max_{c} \sum_{i=1}^{k} I(y_i = c) \tag{6}$$

where $y_i$ is the class of the i-th neighbor, $\hat{y}$ represents the predicted class label for the new data point, and $\arg\max$ denotes the argument that maximizes the expression that follows it. The term $\sum_{i=1}^{k} I(y_i = c)$ is a sum of indicator functions, where each indicator function is equal to 1 if the i-th nearest neighbor belongs to the class being evaluated ($c$) and 0 otherwise.

The Gaussian distribution for continuous features is a probability density function that describes the likelihood of observing a specific value for a feature given its mean and standard deviation, as in (7) [35]:

$$P(x_i \mid c) = \frac{1}{\sqrt{2\pi\sigma_{i,c}^2}} \exp\left(-\frac{(x_i - \mu_{i,c})^2}{2\sigma_{i,c}^2}\right) \tag{7}$$

where $\mu_{i,c}$ is the mean of feature $i$ for class $c$ and $\sigma_{i,c}^2$ represents the variance of feature $i$ for class $c$ (so that $\sigma_{i,c}$ is its standard deviation). The normalization constant $\frac{1}{\sqrt{2\pi\sigma_{i,c}^2}}$ ensures that the total area under the probability density curve integrates to 1, and the bell-shaped curve is defined by the exponent.
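
As a quick check of the Gaussian density described above, the sketch below evaluates it for a given mean and variance and confirms numerically that the curve integrates to approximately 1. The function name `gaussian_pdf` and the integration grid are assumptions made for illustration.

```python
import math


def gaussian_pdf(x, mean, variance):
    """Gaussian probability density for a value given a mean and a variance."""
    coeff = 1.0 / math.sqrt(2.0 * math.pi * variance)
    return coeff * math.exp(-((x - mean) ** 2) / (2.0 * variance))


# Peak of the standard normal density (mean 0, variance 1).
peak = gaussian_pdf(0.0, 0.0, 1.0)

# Crude Riemann sum over [-8, 8] to illustrate that the area is about 1.
step = 0.01
area = sum(gaussian_pdf(-8.0 + i * step, 0.0, 1.0) * step for i in range(1601))
print(peak, area)
```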

Algorithm 2 (the Gaussian NB algorithm) is a popular classification method in machine learning, based on the principle of Bayes' theorem. This algorithm classifies data based on the probability of each category, assuming that each feature follows a normal (Gaussian) distribution. Initially, the algorithm calculates the prior probability of each category based on its presence in the training set. Then, it computes the mean and variance for each feature within each category. When classifying a new instance, the algorithm calculates the probability of each feature in this instance under each category using the Gaussian probability density function. The probability of the category is determined by multiplying these probabilities together and then multiplying by the category's prior probability. Finally, the instance is classified into the category that achieves the highest probability. This method is efficient but relies on the 'naive' assumption of independence among variables, which may not be accurate in all scenarios.

Algorithm 2. The Gaussian naïve Bayes
Input: Training set (features X and labels Y), Test instance (x)
Output: Predicted label for the test instance
1. Calculate Prior Probabilities:
   For each class 'c' in labels Y:
      Compute P(c) = Number of instances in class 'c' / Total number of instances
2. Calculate Mean and Variance for each Feature:
   For each feature 'f' in the feature set X:
      For each class 'c' in labels Y:
         Calculate mean(f, c) = Mean of feature 'f' in class 'c'
         Calculate variance(f, c) = Variance of feature 'f' in class 'c'
3. Classify the Test Instance:
   For each class 'c' in labels Y:
      Initialize likelihood = P(c)
      For each feature 'f' in the test instance x:
         Calculate the probability density of x[f] using Gaussian distribution with mean(f, c) and variance(f, c)
         Multiply likelihood by this probability density
4. Determine the Class with the Highest Likelihood:
   Predict the label as the class with the maximum likelihood
5. Return the Predicted Label
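
The steps of Algorithm 2 can be sketched as follows. This is a simplified illustration rather than the scikit-learn `GaussianNB` used in the experiments. Two choices are assumptions of this sketch: log-probabilities are summed instead of multiplying raw densities (to avoid numerical underflow), and a small constant is added to each variance to avoid division by zero. The function names and toy data are hypothetical.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(X, y):
    """Estimate the prior and per-feature (mean, variance) for each class."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    model = {}
    for label, rows in by_class.items():
        prior = len(rows) / len(X)
        stats = []
        for column in zip(*rows):
            mean = sum(column) / len(column)
            var = sum((v - mean) ** 2 for v in column) / len(column)
            stats.append((mean, var + 1e-9))  # smoothing term (assumption)
        model[label] = (prior, stats)
    return model


def predict_gaussian_nb(model, x):
    """Return the class with the highest log prior plus log likelihood."""
    best_label, best_score = None, -math.inf
    for label, (prior, stats) in model.items():
        score = math.log(prior)
        for value, (mean, var) in zip(x, stats):
            # Log of the Gaussian density for this feature and class.
            score += -0.5 * math.log(2 * math.pi * var) - (value - mean) ** 2 / (2 * var)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy data with two well-separated classes (illustrative only).
X = [[1.0, 2.0], [1.2, 1.8], [0.8, 2.2], [4.0, 6.0], [4.2, 5.8], [3.8, 6.2]]
y = [1, 1, 1, 2, 2, 2]
model = fit_gaussian_nb(X, y)
print(predict_gaussian_nb(model, [1.1, 2.0]))
```

Working in log space leaves the predicted class unchanged, since the logarithm is monotonic, while keeping the computation stable when many features are involved.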

Finally, for each feature $f$, the importance score can be calculated based on the RF algorithm as in (8):

$$\text{Importance}(f) = \frac{1}{N} \sum_{t=1}^{N} \sum_{n \in S_t(f)} p(n) \times \Delta I(n) \tag{8}$$

where $N$ is the total number of trees in the RF model, $S_t(f)$ is the set of nodes in tree $t$ that split on $f$, $p(n)$ is the proportion of samples reaching the nodes that split on $f$ in tree $t$, and $\Delta I(n)$ is the decrease in impurity in tree $t$ caused by the split on $f$.
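
The importance score described above (an average over trees of the sample-weighted impurity decreases at nodes that split on a given feature) can be illustrated with a small hand-built forest. Everything in this sketch is hypothetical: the dictionary layout of a node, the feature names, and the numeric values are chosen only to show the arithmetic.

```python
def feature_importance(forest, feature):
    """Average over trees of proportion * impurity decrease at splits on feature."""
    total = 0.0
    for tree in forest:
        for node in tree:
            if node["feature"] == feature:
                total += node["proportion"] * node["impurity_decrease"]
    return total / len(forest)


# Each tree is a list of split nodes; all values are illustrative, not measured.
forest = [
    [
        {"feature": "albumin", "proportion": 1.0, "impurity_decrease": 0.25},
        {"feature": "age", "proportion": 0.5, "impurity_decrease": 0.5},
    ],
    [
        {"feature": "age", "proportion": 1.0, "impurity_decrease": 0.25},
    ],
]

print(feature_importance(forest, "age"))
print(feature_importance(forest, "albumin"))
```

In practice these per-node quantities come from the fitted trees themselves; libraries typically also normalize the scores so that they sum to one, which this sketch omits.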

The RF algorithm (see Algorithm 3) is an ensemble learning method primarily used for classification and regression. It constructs multiple decision trees during training, leveraging the randomness introduced by two key techniques: bootstrap sampling and feature randomness. In bootstrap sampling, each tree is trained on a random sample of the data, allowing for diverse training sets for each tree. For each node of these trees, a random subset of features is considered for splitting, rather than evaluating all available features, which adds to the randomness and helps in reducing correlation between trees [36]. This combination of techniques enables RF to achieve high accuracy and robustness, as it effectively mitigates overfitting by averaging the predictions from multiple trees. For classification tasks, the algorithm outputs the mode of the classes predicted by individual trees, while for regression, it computes the average of their predictions.

Algorithm 3. The random forest
Input: A training set, number of trees 'N', and number of features to consider 'K'.
Output: A collection of decision trees.
1. Initialize an empty forest (a collection of trees).
2. For each tree 't' from 1 to 'N':
   a. Generate a random sample of the training set (with replacement), called a bootstrap sample.
   b. Build a decision tree 'Tree_t' on this bootstrap sample.
      - At each node of the tree, randomly select 'K' features without replacement.
      - Choose the best split from these 'K' features to split the node.
      - Grow the tree to the largest extent possible without pruning.
   c. Add 'Tree_t' to the forest.
3. For classification tasks, the random forest output is the mode of the classes predicted by individual trees. For regression tasks, it is the average of the predictions.
4. Return the forest.
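
The random forest procedure above can be sketched compactly in standard-library Python. This is an illustration under stated assumptions, not the scikit-learn `RandomForestClassifier` used in the experiments: splits are chosen by Gini impurity, a `max_depth` cap is added as a safety limit (Algorithm 3 grows trees without pruning), and all function names and toy data are hypothetical.

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a non-empty list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def majority(labels):
    return Counter(labels).most_common(1)[0][0]


def build_tree(X, y, n_features, rng, depth=0, max_depth=5):
    """Grow a tree; internal nodes are tuples, leaves are class labels."""
    if len(set(y)) == 1 or depth == max_depth:
        return majority(y)
    # Randomly select K candidate features without replacement.
    features = rng.sample(range(len(X[0])), n_features)
    best = None
    for f in features:
        for threshold in sorted({row[f] for row in X}):
            left = [i for i, row in enumerate(X) if row[f] <= threshold]
            right = [i for i, row in enumerate(X) if row[f] > threshold]
            if not left or not right:
                continue
            score = (len(left) * gini([y[i] for i in left])
                     + len(right) * gini([y[i] for i in right])) / len(y)
            if best is None or score < best[0]:
                best = (score, f, threshold, left, right)
    if best is None:
        return majority(y)
    _, f, threshold, left, right = best
    grow = lambda idx: build_tree([X[i] for i in idx], [y[i] for i in idx],
                                  n_features, rng, depth + 1, max_depth)
    return (f, threshold, grow(left), grow(right))


def predict_tree(tree, x):
    while isinstance(tree, tuple):
        f, threshold, left, right = tree
        tree = left if x[f] <= threshold else right
    return tree


def fit_forest(X, y, n_trees, n_features, seed=0):
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        # Bootstrap sample: draw len(X) indices with replacement.
        idx = [rng.randrange(len(X)) for _ in range(len(X))]
        forest.append(build_tree([X[i] for i in idx], [y[i] for i in idx],
                                 n_features, rng))
    return forest


def predict_forest(forest, x):
    """Classification output: the mode of the individual tree predictions."""
    return majority([predict_tree(tree, x) for tree in forest])


# Toy data: class 1 clusters near (1, 1), class 2 near (5, 5).
X = [[1.0, 1.0], [1.2, 0.8], [0.9, 1.1], [1.1, 1.2],
     [5.0, 5.0], [5.2, 4.9], [4.8, 5.1], [5.1, 5.3]]
y = [1, 1, 1, 1, 2, 2, 2, 2]
forest = fit_forest(X, y, n_trees=15, n_features=1, seed=0)
print(predict_forest(forest, [0.5, 0.5]), predict_forest(forest, [6.0, 6.0]))
```

Using an odd number of trees avoids ties in the majority vote for a two-class problem such as this one.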

3. RESULTS AND DISCUSSION
This section presents an overview of the evaluation criteria and benchmarks used in our experiment, focusing on the evaluation metrics. It then delves into the data analysis, providing a detailed examination of the results. Finally, we compare our findings with previous research to highlight the improvements and contributions of our study.

3.1. Implementation details
This subsection outlines the implementation specifics of our model comparison, which is crucial for assessing the feasibility, quality, and efficiency of our work. To achieve this objective, the proposed approach was implemented on a laptop equipped with an Intel® Core™ i7-9850H CPU running at 2.6 GHz, 32 GB of RAM, and the Windows 11 ×64 operating system. Python was chosen for the application development due to its extensive libraries and capabilities. The classification and evaluation processes were carried out using the scikit-learn (sklearn) package, while data processing was handled with the Pandas library. For data visualization and further data manipulation, the Matplotlib and NumPy libraries were utilized.

3.2. Dataset
The experiments were conducted using a liver disease patient dataset [37]. This dataset includes ten variables: age, gender, total bilirubin, direct bilirubin, alkaline phosphatase (Alkphos), alanine aminotransferase (Sgpt), aspartate aminotransferase (Sgot), total proteins, albumin, and the albumin and globulin ratio. It also contains a classification field labeled by experts, indicating either "1" for liver patient or "2" for non-liver patient. The liver disease patient dataset exemplifies big data, characterized by its volume, complexity, velocity, variety, veracity, and value. With 32,000 records, each containing ten attributes, the dataset's size necessitates advanced big data techniques for storage and analysis. Its complexity, encompassing factors such as age, gender, and various biochemical markers, requires sophisticated processing methods. Although the dataset is static, its utility in rapid analysis for developing effective machine learning models highlights its velocity.

3.3. Evaluation metrics
To gauge the effectiveness of a particular classification algorithm, it is imperative to assess its performance. To evaluate the models in this paper, we explored several performance evaluation metrics, including accuracy, precision, recall, the F1-score, and the AUC score. In our study, we systematically elaborate on the following metrics to appraise each classification algorithm:
a. Accuracy: the accuracy of a binary classification model can be expressed mathematically as (9):

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{9}$$

where TP is the number of liver disease observations correctly classified as liver disease at the threshold, TN is the number of normal liver observations correctly classified as the absence of liver disease at the threshold, FP is the number of normal liver observations incorrectly classified as liver disease at the threshold, and FN is the number of liver disease observations incorrectly classified as the absence of liver disease at the threshold.
b. Precision: the number of observations correctly identified as liver disease divided by the total number of observations the model classifies as liver disease.

$$\text{Precision} = \frac{TP}{TP + FP} \tag{10}$$
c. Recall: the number of liver disease cases correctly identified by the model divided by the total number of actual liver disease cases in the test set.

$$\text{Recall} = \frac{TP}{TP + FN} \tag{11}$$
d. F1 score: the harmonic mean of precision and recall, calculated as (12):

$$F1 = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \tag{12}$$
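The four definitions in (9) to (12) can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `classification_metrics` and the counts passed to it are hypothetical and are not taken from the study.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute accuracy, precision, recall and F1 from confusion-matrix counts.

    Follows equations (9)-(12); zero denominators yield 0.0 instead of an error.
    """
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical counts chosen only for illustration (not from the paper).
example = classification_metrics(tp=80, tn=90, fp=10, fn=20)
for name, value in example.items():
    print(f"{name}: {value:.4f}")
```

Because F1 is a harmonic mean, it is pulled toward the smaller of precision and recall, which is why a model with very high recall but low precision can still have a modest F1-score.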
3.4. Result analysis
In this paper, we compared three distinct machine learning models: Gaussian NB, KNN, and RF. We used both train-test split and cross-validation methods during this evaluation. The dataset was randomly split into an 80% training set and a 20% testing set, ensuring data balance through stratified random sampling. To further assess and contrast the classifiers' effectiveness, we employed the ROC curve, with the total area under the ROC curve (AUC) serving as a key performance metric. For a useful classifier, AUC values range from 0.5 (no better than chance) to 1 (perfect discrimination), reflecting the classifiers' discrimination and predictive capabilities.
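A stratified 80/20 split of the kind described above can be sketched as follows. The helper `stratified_split`, the seed, and the toy labels are assumptions made for illustration; the paper does not state which implementation was used. In practice, scikit-learn's `train_test_split` with its `stratify` argument is a common way to obtain the same effect.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=42):
    """Return (train_idx, test_idx) so each class keeps its share in both sets."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)  # random sampling within each class
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Hypothetical imbalanced labels: 70 positive and 30 negative samples.
labels = [1] * 70 + [0] * 30
train_idx, test_idx = stratified_split(labels)

test_positive_share = sum(labels[i] for i in test_idx) / len(test_idx)
print(len(train_idx), len(test_idx), test_positive_share)
```

The test set keeps the 70/30 class ratio of the full data, which is the property stratification is meant to guarantee.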
Our comparative analysis revealed significant variations in model performance. As shown in Table 1, the KNN model exhibited strong classification capabilities, achieving high precision, recall, and F1-scores across both classes, resulting in an overall accuracy of 95%. In contrast, the Gaussian NB model achieved a lower overall accuracy of 55.7%, despite having a high recall rate. The RF classifier stood out with near-perfect performance, achieving precision, recall, and F1-score values of 97%, 96%, and 95%, respectively, and the highest overall accuracy of 97.3%. These findings highlight the critical importance of model selection, with the RF model emerging as the most robust and accurate for this classification task.
Table 1. Performance metrics of the three classification models: KNN, Gaussian NB, and random forest

| Model | Accuracy | Precision | Recall | F1-Score | AUC |
|---|---|---|---|---|---|
| KNN | 95% | 94% | 93% | 93% | 93% |
| Gaussian NB | 55.7% | 39% | 96% | 55% | 68% |
| Random Forest | 97.3% | 97% | 96% | 95% | 95% |
Figure 2 presents the performance metrics of the three models, including precision, recall, F1-score, and support. The RF model, shown in Figure 2(a), achieves the highest accuracy at 97.3% and an F1-score of 95%. The KNN model, depicted in Figure 2(b), follows with a precision of approximately 94%, recall of about 93%, and an F1-score of 93%. Although the Gaussian NB model, illustrated in Figure 2(c), has a lower accuracy of 55.7% and an F1-score of 55%, it excels in recall with a value of 96%, highlighting its strength in identifying positive instances.
Figure 3 presents the confusion matrices for the three classification models. The KNN model in Figure 3(a) correctly identified 8,439 instances as true positives but also incorrectly labeled 333 instances as false positives and missed 430 positive instances, classifying them as false negatives. The Gaussian NB model in Figure 3(b), however, exhibited a much higher number of false positives at 5,358, while correctly identifying 3,371 true positives. The balance between false positives and true positives is crucial, as it affects model decisions depending on the application context. The RF model in Figure 3(c) demonstrated exceptional performance, with only 7 false positives and 351 false negatives, resulting in a high count of 8,765 true positives. The choice of model depends on the specific task requirements and the acceptable trade-offs between false positives and false negatives, which vary across different domains and applications. The confusion matrix is essential for evaluating these trade-offs and making informed model selection decisions, as illustrated in Figure 3. While KNN and Gaussian NB have higher false positive rates, RF produces the fewest false positives and false negatives, resulting in a superior true positive count.
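As a rough cross-check, precision and recall can be recomputed directly from the counts quoted above. The sketch below takes the labels in the text at face value (true positives, false positives, false negatives); the dictionary layout and function names are assumptions for illustration, and the false negative count for Gaussian NB is left as `None` because it is not stated. The results need not coincide exactly with Table 1, whose values may be averaged over both classes.

```python
# Counts as quoted in the text for Figure 3; true negatives are not reported.
reported_counts = {
    "KNN": {"tp": 8439, "fp": 333, "fn": 430},
    "Gaussian NB": {"tp": 3371, "fp": 5358, "fn": None},  # FN not stated
    "RF": {"tp": 8765, "fp": 7, "fn": 351},
}


def precision(tp, fp):
    """Equation (10): share of predicted positives that are correct."""
    return tp / (tp + fp)


def recall(tp, fn):
    """Equation (11): share of actual positives that are found."""
    return tp / (tp + fn)


results = {}
for model, c in reported_counts.items():
    results[model] = {
        "precision": precision(c["tp"], c["fp"]),
        "recall": recall(c["tp"], c["fn"]) if c["fn"] is not None else None,
    }

for model, r in results.items():
    rec = "n/a" if r["recall"] is None else f"{r['recall']:.4f}"
    print(f"{model}: precision={r['precision']:.4f}, recall={rec}")
```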
Figure 4 displays the ROC curves for the KNN, Gaussian NB, and RF models. In this study, we evaluated these models based on their discriminative power, as depicted by their ROC curves. The ROC curve illustrates a model's capacity to differentiate between positive and negative classes across various threshold settings. The area under the ROC curve (AUC) quantifies overall model performance, with values nearing 1.0 indicating high accuracy and values around 0.5 suggesting random classification. Our findings show that the RF model, Figure 4(a), achieves the highest AUC value at 95%, demonstrating exceptional class differentiation ability. The KNN model, Figure 4(b), follows with an AUC of 93%, indicating high but slightly lower performance. The Gaussian NB model, Figure 4(c), has an AUC of 86%, reflecting a less robust classification ability. The ROC curves and their AUC values provide crucial insights into the models' effectiveness, helping to inform decisions about their appropriateness for specific applications or tasks. These values highlight RF's superior class distinction capability compared to KNN and Gaussian NB, offering valuable guidance for classification task selection.
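To make the relationship between thresholds, the ROC curve, and the AUC concrete, the sketch below builds ROC points by sweeping a threshold over predicted scores and integrates them with the trapezoidal rule. The function names and the toy labels and scores are hypothetical; this is not the code used to produce Figure 4. Library routines such as scikit-learn's `roc_auc_score` compute the same quantity.

```python
def roc_points(y_true, scores):
    """Return (FPR, TPR) points obtained by lowering the decision threshold."""
    pos = sum(y_true)
    neg = len(y_true) - pos
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

    points = [(0.0, 0.0)]
    tp = fp = 0
    prev_score = None
    for i in order:
        # Emit a point only when the score changes, so ties share one threshold.
        if prev_score is not None and scores[i] != prev_score:
            points.append((fp / neg, tp / pos))
        if y_true[i] == 1:
            tp += 1
        else:
            fp += 1
        prev_score = scores[i]
    points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under a piecewise-linear curve via the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical labels and scores for illustration only.
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc_value = auc(roc_points(y_true, scores))
print(auc_value)
```

A classifier whose scores rank every positive above every negative reaches an AUC of 1.0, while identical scores for all samples give 0.5, matching the interpretation given in the text.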
Figure 5 presents the correlation matrix of the input variables used by the KNN, Gaussian NB, and RF models, providing insights into the linear relationships between various liver function tests and demographic data. The matrix reveals that Age has a negligible correlation with other variables, indicating its limited linear impact on liver-related tests. Total and Direct Bilirubin features show a strong positive correlation (0.887), suggesting a direct relationship in liver function. Liver enzymes, such as Alkphos, Sgpt, and Sgot, exhibit moderate to high correlations, particularly between Sgpt and Sgot (0.783), reflecting their interconnected roles in liver
health. Total proteins and albumin (ALB) also display a strong correlation (0.776), underscoring their combined importance in liver function assessments. The A/G ratio shows a significant positive correlation with albumin (0.683), which is expected because albumin is the numerator of this ratio. Gender, represented as binary variables, shows a strong negative correlation between its categories (-0.928), as expected for complementary indicator variables. This matrix is crucial for understanding the interdependencies among liver-related variables and can guide in-depth analysis and model development, particularly in identifying potentially redundant or highly predictive variables. The correlations among liver enzymes and proteins are notably strong, with gender exhibiting a significant negative correlation between its binary categories, offering critical insights for precise liver function analysis.
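The entries of such a matrix are Pearson correlation coefficients. A minimal sketch is given below, assuming hypothetical indicator columns `male` and `female` (the actual column names and encoding in the dataset are not specified here). When two indicator columns are exact complements, the coefficient is -1; the reported value of -0.928 is close to that limit.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


# Hypothetical one-hot gender columns that are exact complements.
male = [1, 0, 1, 1, 0, 1, 0, 1]
female = [1 - m for m in male]
r_gender = pearson(male, female)

# A perfectly linear positive relationship for comparison.
x = [1.0, 2.0, 3.0, 4.0]
y = [2 * v + 1 for v in x]
r_linear = pearson(x, y)

print(r_gender, r_linear)
```

Pairs of features with coefficients near +1 or -1 carry largely the same information, which is why the matrix helps flag potentially redundant variables.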
Figure 2. Classification report of the three models: (a) KNN, (b) Gaussian NB, and (c) RF. RF demonstrates the highest accuracy and F1-score, followed by KNN, while Gaussian NB lags behind in accuracy and F1-score but excels in recall
Figure 3. Confusion matrices for (a) KNN, (b) Gaussian NB, and (c) RF. KNN and Gaussian NB have higher false positive rates, while RF exhibits minimal false positives and negatives, resulting in a higher true positive count
[Panel titles: Receiver operating characteristic (ROC) curve - KNN; Receiver operating characteristic (ROC) curve - RF; Receiver operating characteristic (ROC) curve - Gaussian NB]

Figure 4. ROC curves for (a) RF, (b) KNN, and (c) Gaussian NB, showcasing their discriminative power. The AUC values indicate RF's superior ability to distinguish between classes compared to KNN and Gaussian NB. These curves provide valuable insights into model performance, aiding informed decision-making for classification tasks
Figure 5. Correlation matrix between variables, highlighting relationships between liver function tests and demographic data. Strong correlations are evident among liver enzymes and proteins, with gender displaying a notable negative correlation between its binary categories. This matrix offers critical insights for precise liver function analysis
3.5. Comparison with previous works
Table 2 provides a summary of previous research that employed different machine learning models to predict liver disease. Each model has a different level of accuracy, precision, recall, F1-score, and AUC. Singh et al. [10] implemented logistic regression (LR) with an accuracy of 74.36%. Priya et al. [22] used SVM, achieving an accuracy of 71.35%. Ghosh et al. [25] applied backpropagation, reporting an accuracy of 73.2% and precision of 65.7%. Bahramirad et al. [26] also used logistic regression, achieving an accuracy of 73.39% and a precision of 57.69%. Thirunavukkarasu et al. [27] employed decision trees (DT), achieving a notable accuracy of 81%. Nahar et al. [28] used logistic regression, with an accuracy of 73.97%. Vijayarani and Dhayanand [29] utilized AdaBoost, achieving an accuracy of 70.25%. In another study, they used SVM and achieved a higher accuracy of 79.66% with 76.6% precision.

In this study, the KNN model achieved a remarkable accuracy of 95%, with precision, recall, and F1-score values of 94%, 93%, and 93%, respectively, demonstrating its effectiveness in classification. The Gaussian NB model displayed lower performance, with an accuracy of 55.7%, precision of 39%, recall of 96%, and F1-score of 55%, indicating less effectiveness compared to the other models. The random forest model emerged as the top performer, boasting an accuracy of 97.3% and well-balanced precision, recall, and F1-score values of 97%, 96%, and 95%, respectively. By leveraging a large and complex dataset, this study not only maintained but also improved the prediction accuracy for liver diseases. This can contribute positively to enhancing the ability to predict liver diseases effectively, thus facilitating early diagnosis and timely intervention.
Table 2. Comparison of previous papers on liver disease prediction using machine learning models

| Paper | Model | Accuracy | Precision | Recall | F1-Score | AUC |
|---|---|---|---|---|---|---|
| Singh et al. [10] | LR | 74.36% | - | - | - | - |
| Priya et al. [22] | SVM | 71.35% | - | - | - | - |
| Sontakke et al. [1] | Back Propagation | 73.2% | 65.7% | - | - | - |
| Bahramirad et al. [26] | Logistic | 73.39% | 57.69% | - | - | - |
| Thirunavukkarasu et al. [27] | DT | 81% | - | - | - | - |
| Nahar et al. [28] | LR | 73.97% | - | - | - | - |
| Vijayarani et al. [29] | AdaBoost | 70.25% | - | - | - | - |
| | SVM | 79.66% | 76.6% | - | - | - |
| Results of this study | KNN | 95% | 94% | 93% | 93% | 93% |
| | Gaussian NB | 55.7% | 39% | 96% | 55% | 68% |
| | RF | 97.3% | 97% | 96% | 95% | 95% |