International Journal of Electrical and Computer Engineering (IJECE)
Vol. 6, No. 4, August 2016, pp. 1880~1888
ISSN: 2088-8708, DOI: 10.11591/ijece.v6i4.9800

Journal homepage: http://iaesjournal.com/online/index.php/IJECE
Efficient Feature Subset Selection Algorithm for High Dimensional Data

Smita Chormunge¹, Sudarson Jena²
¹Department of Computer Science and Engineering, GITAM University, Hyderabad, India
²Department of Information Technology, GITAM University, Hyderabad, India
Article Info

Article history:
Received Dec 26, 2015
Revised Jun 13, 2016
Accepted Jun 27, 2016

ABSTRACT

Feature selection approaches solve the dimensionality problem by removing irrelevant and redundant features. Existing feature selection algorithms take more time to obtain a feature subset for high dimensional data. This paper proposes a feature selection algorithm based on the information gain measure for high dimensional data, termed IFSA (Information gain based Feature Selection Algorithm), to produce an optimal feature subset in efficient time and to improve the computational performance of learning algorithms. The IFSA algorithm works in two folds: first, apply a filter on the dataset; second, produce the small feature subset by using the information gain measure. Extensive experiments are carried out to compare the proposed algorithm with other methods with respect to two different classifiers (Naive Bayes and IBK) on microarray and text data sets. The results demonstrate that IFSA not only produces an optimal feature subset in efficient time but also improves classifier performance.
Keyword:
Classifiers
Feature selection
Filters
Information gain
K-nearest neighbors
Naive Bayes
Copyright © 2016 Institute of Advanced Engineering and Science. All rights reserved.
Corresponding Author:
1. INTRODUCTION

High dimensional data, such as gene or DNA data, contains a large number of attributes, which increases the computational cost and decreases the learning accuracy. The growth of data size, in terms of both the number of instances and the number of features, has become a great challenge for feature selection algorithms [1]. To deal with these problems, two main dimensionality reduction approaches are typically used: feature extraction and feature selection [2].
The feature extraction method creates a new feature set, whereas the feature selection method searches for a good subset of the original set by removing irrelevant and redundant data. Irrelevant features contain no useful information about the classification problem, whereas redundant features contain information that is already present in more informative features. Feature selection is useful in statistical pattern recognition, machine learning, data mining, and statistics. It is also effective in enhancing learning efficiency, increasing predictive accuracy, and reducing complexity.
Given the input data D, tabled as N samples with M features X = {xi, i = 1…M}, and the target classification variable c, the feature selection problem is to find, from the M-dimensional observation space R^M, a subspace of m features, R^m, that "optimally" characterizes c [3]. There are three general approaches to feature selection: filters, wrappers, and embedded methods [4].
Filter, wrapper, and embedded methods are distinguished by their evaluation functions, which have traditionally been classified into three categories: uncertainty measures, distance measures, and dependence measures. The data-intrinsic category includes distance, entropy, and dependence measures. According to recent research, evaluation functions are divided into five categories: distance, information, dependence, consistency, and classifier error rate [5].
Most feature selection algorithms [6]-[10] have been proposed for classification techniques. The aim of all of these approaches is to search for an optimal feature set and improve classifier performance. Many of these feature selection algorithms use statistical measures such as mutual information, correlation, and the information gain measure. Others use improved search approaches such as population-based heuristic search, genetic algorithms, and ant colony optimization.
Krier et al. proposed a methodology combining hierarchical constrained clustering of spectral variables and selection of clusters by mutual information [11]. Van Dijck and Van Hulle [12] presented the same methodology as Krier; the only difference is that every cluster contains consecutive features. To remove redundant features, both methods used agglomerative hierarchical clustering. Features that are irrelevant or redundant also affect the speed and accuracy of learning algorithms [13].
M. A. Hall used a correlation measure to address the problem of feature selection for machine learning. The central hypothesis is that good feature sets contain features that are highly correlated with the class, yet uncorrelated with each other. CFS (Correlation based Feature Selection) is an algorithm that couples this evaluation formula with an appropriate correlation measure and a heuristic search strategy [14].
L. Yu et al. introduced FCBF, a fast filter method that can identify relevant features as well as redundancy among relevant features without pairwise correlation analysis. It is applied as a preprocessing step to machine learning, filtering the data in a way that is effective for dimensionality reduction, removing irrelevant data, increasing learning accuracy, and improving result comprehensibility [15]. Sakar et al. proposed Predictive Mutual Information (PMI), a hybrid measure of relevance that is not only based on MI but also accounts for the predictability of signals from one another, as in KCCA. The authors show that PMI has better feature detection capability than MI and KCCA, especially in catching suspicious coincidences that are rare but potentially important, not only for subsequent experimental studies but also for building computational predictive models; this is demonstrated on two toy data sets and a real intrusion detection system data set [16].
Song Q. et al. [17] proposed a fast clustering based feature selection algorithm (FAST) based on the MST (minimum spanning tree) method. The FAST algorithm works in two steps: in the first step, irrelevant features are removed, and in the second step, redundant features are eliminated using a graph-theoretic clustering method. Because features in different clusters are relatively independent, the clustering based strategy of FAST has a high probability of producing a subset of useful and independent features.
Mitra et al. proposed an unsupervised feature subset selection method using feature similarity to remove redundancy. The authors use a new measure called the maximal information compression index to calculate the similarity between two random variables for feature selection [18].
Battiti [19] introduced a mutual information based feature selection algorithm called MIFS. This algorithm considers both feature-feature and feature-class mutual information for feature selection. To search for optimal features, it uses a greedy technique that maximizes information about the class label.
Kwak et al. [20] extended MIFS with an algorithm called MIFS-U to overcome its limitations and obtain a better estimate of the mutual information between input features and output classes.
Peng et al. [3] introduced a mutual information based feature selection method called mRMR (Max-Relevance and Min-Redundancy) that minimizes redundancy among features and maximizes dependency between a feature subset and a class label.
The RELIEF algorithm is an instance based attribute ranking scheme. It deals with incomplete, noisy, and multiclass data sets. It assigns a relevance weight to each feature, samples an instance randomly from the data, and then locates its nearest neighbor from the same and the opposite class. The attribute values of the nearest neighbors are compared with the sampled instance and used to update the relevance score of each attribute [21]. Relief was extended by Kononenko [22].
Chi-squared is a common statistical test that measures the divergence from the distribution expected if one assumes the feature occurrence is actually independent of the class value. It behaves unpredictably for very small expected counts, which are common in text classification because word features occur rarely [23].
This paper proposes a feature selection algorithm termed IFSA (Information gain based Feature subset Selection Algorithm) to remove irrelevant, noisy, and redundant features and produce a small feature subset in efficient time as well as improve classifier performance. The IFSA algorithm is based on the information gain measure. IFSA works in two folds: the first is to apply a filter on the whole dataset, which puts the data in a uniform form; the second is to remove irrelevant and redundant features and produce an optimal feature subset by using the information gain measure. For the experimental results we used microarray and text data sets whose number of features ranges from 2000 to more than 7000. The proposed algorithm is compared with other well known feature selection algorithms such as Relief and chi-squared. Naive Bayes and K-nearest neighbor classifiers are used for analyzing the results.
The rest of the paper is organized as follows. Details of the proposed IFSA algorithm are discussed in Section 2. Section 3 presents a brief description of the classification methods used for analysis, experimental results are discussed in Section 4, and finally Section 5 presents the concluding remarks.
2. IFSA FEATURE SUBSET SELECTION ALGORITHM

This section discusses the proposed algorithm, named IFSA, for obtaining an optimal feature subset using a filter and the information gain measure. When high dimensional data are evaluated with some well known feature selection algorithms, it takes more computational time to obtain a good feature subset. To overcome this, we propose a feature selection algorithm based on the information gain measure. IFSA works in two folds: the first is to apply a discrete unsupervised attribute filter on the whole dataset D with features F{x1…xn} to prepare the data in a uniform manner; the second is to compute the information gain measure on the filtered data and obtain an optimal feature subset by removing irrelevant and redundant features.
2.1. Attribute Filter

In machine learning, most classification tasks learn from nominal class values, but some data sets contain features that are mixed variables, such as ordinal or continuous as well as nominal. Many machine learning algorithms have been developed to deal with such mixed variables; rather than handling mixed variables directly, some machine learning algorithms work on uniform data. One of the most common methods of accomplishing this is called discretization. Discretization is the process of transforming continuous valued attributes to nominal ones. Discretization methods are classified into three categories: supervised versus unsupervised, global versus local, and static versus dynamic. The class label is used for discretization in supervised methods, whereas unsupervised discretization has no knowledge of the class label; it divides the range of observed values for a feature into p equal sized bins, where p is a parameter provided by the user. The difference between global and local methods is based on when discretization is performed: discretizing features prior to induction is a global method, whereas carrying out discretization during the induction process is a local method. Dynamic methods search the space of possible k values (numbers of intervals) simultaneously for all features [24]. A minimal sketch of this unsupervised binning step is given below.
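The paper's experiments apply this filter through Weka; purely as a hedged illustration, an equal-width unsupervised binning step of the kind described above could look like the following Python sketch (the function name and the default of p = 10 bins are assumptions, not taken from the paper).

```python
import numpy as np

def equal_width_discretize(X, p=10):
    """Unsupervised equal-width discretization: map every continuous column of X
    to integer bin labels 0..p-1 by splitting its observed [min, max] range into
    p equal-sized intervals (p is the user-defined parameter described above)."""
    X = np.asarray(X, dtype=float)
    X_disc = np.zeros(X.shape, dtype=int)
    for j in range(X.shape[1]):
        col = X[:, j]
        lo, hi = col.min(), col.max()
        if hi == lo:                                     # constant feature -> single bin
            continue
        inner_edges = np.linspace(lo, hi, p + 1)[1:-1]   # p-1 interior cut points
        X_disc[:, j] = np.digitize(col, inner_edges)     # bin index in 0..p-1
    return X_disc
```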
2.2. Information Gain

Information gain is one of the simplest attribute ranking methods, widely used in text categorization applications and nowadays for microarray data analysis and image data analysis. If A is an attribute and C is the class, Equations A1 and A2 show the entropy of the class before and after observing the attribute [25].

H(C) = −Σc p(c) log2 p(c),   H(C|A) = −Σa p(a) Σc p(c|a) log2 p(c|a)   (A1)
The amount by which the entropy of the class decreases reflects the additional information about the class provided by the attribute and is called information gain. Each attribute Ai is assigned a score based on the information gain between itself and the class:

IGi = H(C) − H(C|Ai) = H(Ai) − H(Ai|C) = H(Ai) + H(C) − H(Ai, C)   (A2)
Information gain is thus the difference between the prior uncertainty of the class and the expected posterior uncertainty after observing a feature. For example, feature Y is preferred to feature Z if the information gain from feature Y is greater than that from feature Z. Information gain is used with a ranker search methodology [25]. A small sketch of this computation on discretized data is shown below.
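To make Equations A1 and A2 concrete, the following is a minimal Python sketch (an illustration, not the authors' Weka-based implementation) of the information gain of one discretized feature with respect to the class labels; both inputs are assumed to be integer-coded NumPy arrays.

```python
import numpy as np

def entropy(values):
    """H(V) = -sum_v p(v) log2 p(v), estimated from observed frequencies."""
    _, counts = np.unique(values, return_counts=True)
    p = counts / counts.sum()
    return float(-np.sum(p * np.log2(p)))

def information_gain(feature, labels):
    """IG = H(C) - H(C|A) for one discretized feature A and class C (Eqs. A1, A2)."""
    h_c = entropy(labels)
    h_c_given_a = 0.0
    for a in np.unique(feature):
        mask = feature == a
        h_c_given_a += mask.mean() * entropy(labels[mask])   # p(a) * H(C | A = a)
    return h_c - h_c_given_a
```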
IFSA Algorithm

Input:   Data sample D with features F(x1...xn)
         Ɵ  Threshold value
Output:  Optimal feature subset
Steps:
1. Apply attribute filter on training dataset D
2. Compute information gain by using equations A1 and A2:
   IG = H(Y) − H(Y/X)    // X & Y are features
3. If (IG > Ɵ), include the feature in the optimal subset
4. Verify classifier performance
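Putting the four steps together, a compact Python sketch (reusing the equal_width_discretize and information_gain helpers from the earlier sketches; the threshold Ɵ, bin count, and array inputs are illustrative assumptions, not the authors' Weka implementation) might read:

```python
import numpy as np

def ifsa_select(X, y, threshold, bins=10):
    """IFSA sketch: filter (discretize), score every feature by information gain,
    and keep the indices of features whose gain exceeds the threshold Ɵ.
    Step 4 (verifying classifier performance) is done separately, e.g. by
    10-fold cross-validation of Naive Bayes and IBK on X[:, selected]."""
    X_disc = equal_width_discretize(X, p=bins)              # Step 1: attribute filter
    gains = np.array([information_gain(X_disc[:, j], np.asarray(y))
                      for j in range(X_disc.shape[1])])     # Step 2: IG per feature
    selected = np.flatnonzero(gains > threshold)            # Step 3: threshold test
    return selected, gains
```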
The IFSA algorithm uses an unsupervised discretization approach that divides the range of observed values for a feature into p equal sized bins, where p is a parameter defined by the user. This method prepares the dataset in a uniform fashion. Once the data are arranged uniformly, the information gain is calculated to find the relevant and non-redundant feature subset. H(Y) and H(Y/X) are computed using equations A1 and A2.
Datasets are evaluated with a 10-fold (k) cross-validation strategy, in which the data is randomly split into 10 mutually exclusive subsets of approximately equal size. A learning algorithm is trained and tested 10 times; each time it is tested on one of the k folds and trained using the remaining k−1 folds. The information gain measure ranks the features, and a user defined threshold value Ɵ is then applied: features whose gain value is greater than the threshold are included in the optimal subset. For the experimental work the threshold value is set to -1.7. To verify and analyze the results we used two classifiers, the Naïve Bayes and the IBK classifier. Computational time to obtain the optimal feature subset is evaluated for the different datasets by using IFSA under the 10-fold cross-validation strategy. A sketch of this evaluation protocol is given below.
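The evaluation itself was run in Weka [30]; purely as a hedged stand-in, an equivalent 10-fold protocol could be sketched in Python with scikit-learn (the library choice and parameter values are assumptions, not taken from the paper):

```python
import time
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier

def evaluate_subset(X_selected, y):
    """10-fold cross-validation of Naive Bayes and an IBK-like k-NN classifier
    on the IFSA feature subset, reporting the mean F-measure and elapsed time."""
    classifiers = [("Naive Bayes", GaussianNB()),
                   ("IBK (k-NN)", KNeighborsClassifier(n_neighbors=1))]
    for name, clf in classifiers:
        start = time.time()
        scores = cross_val_score(clf, X_selected, y, cv=10, scoring="f1_weighted")
        print(f"{name}: F-measure = {scores.mean():.3f}, "
              f"time = {time.time() - start:.2f} s")
```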
The time complexity of the proposed algorithm is analyzed by considering its two steps. The first step passes over the whole set of n features; the next step computes the information gain for the n features, which gives O(n + n) = O(2n). The overall complexity is therefore O(n) in the number of features.
3. CLASSIFICATION METHODS USED FOR ANALYSIS
3.1. Naive Bayes classifier

The Naïve Bayes classifier is a simple probabilistic classifier that assumes independence among attributes. Compared to other classifiers it provides better classification accuracy on gene expression data. This classifier learns from the training data and predicts for a test instance the class with the highest posterior probability. Let C = {c1, c2, ..., cl} be the set of l classes, and let W = {w1, ..., wm} be the set of features enclosed in these classes. Given a new document d, the probability that d belongs to class ci is given by Bayes rule,

p(ci|d) = p(d|ci) p(ci) / p(d)   (A3)

The Naive Bayes classifier requires a number of parameters that is linear in the number of features in a learning problem, which gives it high scalability. Many classifiers use expensive iterative operations, but Naive Bayes evaluates a closed-form expression for maximum-likelihood training in linear time [26].
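As a purely illustrative sketch of Equation A3 under the attribute-independence assumption (not Weka's NaiveBayes implementation; the probability tables are assumed to have been estimated beforehand):

```python
import math

def naive_bayes_predict(x, priors, cond_probs, eps=1e-9):
    """Return argmax_ci  p(ci) * prod_j p(x_j | ci)  (Equation A3; the evidence
    p(d) is a common factor and is dropped).  priors maps class -> p(ci);
    cond_probs maps class -> list of per-feature dicts {value: p(value | ci)}."""
    best_class, best_log_score = None, float("-inf")
    for ci, p_ci in priors.items():
        log_score = math.log(p_ci)
        for j, v in enumerate(x):
            log_score += math.log(cond_probs[ci][j].get(v, eps))  # independence assumption
        if log_score > best_log_score:
            best_class, best_log_score = ci, log_score
    return best_class
```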
3.2. K-nearest neighbor (IBK)

The K Nearest-Neighbors classifier is a non-linear classifier which is used in this experimental study. The output of this classifier depends on whether k-NN is used for classification or for regression. In k-NN classification, the output is a class membership: an object is classified by a majority vote of its neighbors, with the object being assigned to the class most common among its k nearest neighbors. For example, given a parameter k and an example x, the k-NN technique considers the k training examples closest to x according to their distance in the feature space {0, 1}^K and gives as predicted class the dominant class among those k examples [27]. In k-NN regression, the output is the property value for the object, calculated as the average of the values of its k nearest neighbors; the algorithm can thus also produce continuous values for a target [28]. It is sensitive to the local structure of the data.
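A minimal majority-vote sketch of the classification case described above (Euclidean distance and the default k are illustrative choices; the experiments use Weka's IBk):

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x, k=3):
    """Classify x by a majority vote among its k nearest training examples."""
    dists = np.linalg.norm(np.asarray(X_train) - np.asarray(x), axis=1)  # distances to x
    nearest = np.argsort(dists)[:k]                                      # k closest indices
    return Counter(np.asarray(y_train)[nearest]).most_common(1)[0][0]
```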
4. RESULTS AND ANALYSIS

This section presents the details of the datasets used for the empirical study, the experimental results obtained for the proposed algorithm (IFSA), and comparison results with the Relief and Chi-squared feature selection methods. The study is on different datasets such as Colon cancer, SRBCT, Lymphoma, Leukemia, CNS, DBWorld_bodies, and DBWorld_bodies_stemmed, which have a very high number of features [29]. A summary of the datasets used for the empirical study is shown in Table 1.
4.1. Dataset Description

Colon Cancer contains 62 samples collected from patients. Among them, 40 tumor biopsies are from tumors (labeled as "negative") and 22 normal biopsies (labeled as "positive") are from healthy parts of the colons of the same patients; the total number of genes to be tested is 2000. SRBCT is gene expression data (2308 genes for 83 samples) from the microarray experiments of Small Round Blue Cell Tumors (SRBCT); a total of 63 training samples and 25 test samples are provided. Lymphoma is a broad term encompassing a variety of cancers of the lymphatic system; its total number of genes is 4026 and the number of samples is 62.
There are altogether three types of lymphomas: the first category is Chronic Lymphocytic Lymphoma, the second type is Follicular Lymphoma, and the third type is Diffuse Large B-cell Lymphoma. The Leukemia dataset contains expression levels of 7129 genes taken over 72 samples; labels indicate which of two variants of leukemia is present in the sample (AML, 25 samples, or ALL, 47 samples). The CNS microarray dataset represents a heterogeneous group of tumors about which little is known biologically. Its total number of genes is 7129 and the number of samples is 42, as shown in Table 1.
Table 1. Summary of datasets used for empirical study

Datasets                  Features  Instances  Domain
Colon Cancer              2000      62         Microarray
SRBCT                     2308      83         Microarray
DBWorld_bodies_stemmed    3722      64         Text
Lymphoma                  4026      62         Microarray
DBWorld_bodies            4703      64         Text
Leukemia                  7129      72         Microarray
CNS                       7129      60         Microarray
DBWorld_bodies and DBWorld_bodies_stemmed are text datasets. They contain messages from the DBWorld mailing list, classified into conference announcements and everything else. DBWorld_bodies contains 4703 attributes and 64 instances; DBWorld_bodies_stemmed contains 3722 features and 64 instances. The Weka tool [30] is used for analyzing the results.
4.2. Comparison Evaluation

We analyze the performance of the Naïve Bayes and IBK classifiers by computing the time and accuracy for all datasets using the IFSA, Relief, and Chi-squared feature selection algorithms. Table 2 shows the evaluation results of the feature selection algorithms for the Naïve Bayes classifier. It reports the time and F-measure for all datasets using the IFSA algorithm and the other two methods, Chi-squared and Relief. From the results of Table 2 we find that IFSA takes less computational time to obtain a feature subset for all datasets compared to the Relief method, whereas the accuracy of IFSA, Relief, and Chi-squared is almost the same. The computational time of the Relief method increases with the number of features. Comparing the IFSA algorithm with the Chi-squared method, the computational time for the SRBCT dataset is 0.45 sec for IFSA and 0.63 sec for Chi-squared, and for the text data DBWorld_bodies_stemmed it is 0.14 sec for IFSA and 0.17 sec for Chi-squared. So the computational time for the SRBCT and DBWorld_bodies_stemmed datasets is less for IFSA than for the Chi-squared method, while for the Colon Cancer, Lymphoma, DBWorld_bodies, Leukemia, and CNS datasets the computational time is less for Chi-squared than for IFSA.

Table 3 shows the evaluation of the K-nearest neighbor classifier on the microarray and text datasets. Here also IFSA performs better than the Relief feature subset selection method; Relief takes a large amount of time for the Leukemia and CNS datasets, 6.14 and 5.61 sec respectively. The IFSA algorithm is more time-efficient for Colon Cancer, SRBCT, and DBWorld_bodies_stemmed than the Chi-squared method. The accuracy of the K-NN classifier for all datasets using IFSA, Relief, and Chi-squared is almost the same.
Table 2. Evaluation of feature selection methods for Naïve Bayes classifier

                          IFSA                  Relief                Chi-squared
Datasets                  Time (s)  F-measure   Time (s)  F-measure   Time (s)  F-measure
Colon Cancer              0.42      0.555       1.91      0.555       0.36      0.555
SRBCT                     0.45      0.988       2.84      0.988       0.63      0.988
DBWorld_bodies_stemmed    0.14      0.754       1.48      0.754       0.17      0.754
Lymphoma                  0.95      0.953       4.3       0.936       0.52      0.953
DBWorld_bodies            0.31      0.736       3.24      0.736       0.16      0.736
Leukemia                  1.64      0.972       5.16      0.972       0.92      0.972
CNS                       1.3       0.622       4.72      0.607       0.86      0.622
Table 3. Evaluation of feature selection methods for IBK classifier

                          IFSA                  Relief                Chi-squared
Datasets                  Time (s)  F-measure   Time (s)  F-measure   Time (s)  F-measure
Colon Cancer              0.19      0.774       1.55      0.774       0.28      0.774
SRBCT                     0.2       0.842       3.28      0.842       0.44      0.842
DBWorld_bodies_stemmed    0.16      0.591       2.67      0.591       0.19      0.591
Lymphoma                  0.44      0.97        3.16      0.97        0.63      0.97
DBWorld_bodies            0.23      0.502       2.44      0.502       0.19      0.502
Leukemia                  1.28      0.873       6.14      0.873       0.61      0.873
CNS                       0.99      0.571       5.61      0.571       0.78      0.571
Figure 1 shows the comparison graph of the feature subset selection algorithms for the Naïve Bayes classifier. It compares the three feature selection algorithms in terms of the number of features against the computational time measured in seconds. From the results we observe that the Relief method takes a large amount of time to compute the optimal feature subset compared to the other two methods. IFSA works efficiently for datasets whose features range from 2000 to 4000, while the chi-squared method is efficient for large dimensional datasets like Leukemia and CNS, whose features number more than 7000.

Figure 2 shows the comparison graph of the feature subset selection algorithms for the IBK classifier. It compares the IFSA, Relief, and Chi-squared feature selection methods based on computational time for the different datasets. In this graph we observe that the computational time of the Relief method for the IBK classifier is very high for all datasets compared to the other two algorithms. The computational time of the IFSA algorithm is better than that of the Relief algorithm, and compared with the Chi-squared method the time varies with the dataset.

The performance of the IBK and Naïve Bayes classifiers for the IFSA algorithm is shown in Figure 3. It shows that the computational performance of the IBK classifier is better than that of the Naïve Bayes classifier when using the proposed algorithm. The performance of the IBK and Naïve Bayes classifiers using the Relief and chi-squared methods is shown in Figures 4 and 5 respectively; here we observe that the computational time varies for both classifiers according to the dataset. Figure 6 represents the accuracy of the classifiers using the IFSA algorithm; based on these observations, the Naïve Bayes classifier is favorable from the accuracy point of view. Overall, the computational performance of the classifiers improved by using the proposed algorithm as compared with the Relief algorithm and the Chi-squared method.
[Figure 1. Comparison graph of feature selection algorithms for Naïve Bayes classifier]
[Figure 2. Comparison graph of feature selection algorithms for IBK classifier]
[Figure 3. Performance of classifiers for IFSA algorithm]
[Figure 4. Performance of classifiers for Relief algorithm]
[Figure 5. Performance of classifiers for Chi-squared method]
[Figure 6. Classifiers accuracy using IFSA algorithm]
5. CONCLUSION

In this paper, IFSA (Information gain based Feature Selection Algorithm) has been proposed to obtain optimal feature subsets in efficient time and improve the computational performance of learning algorithms. IFSA works in two folds: the first is to apply a filter on the training dataset; the second is to remove irrelevant and redundant features by using the information gain measure. The computational time to obtain the optimal feature subset using the proposed algorithm is calculated for high dimensional data. The IFSA algorithm performs remarkably well in terms of classification performance on microarray and text datasets as compared to existing algorithms like the Relief and Chi-squared algorithms, with respect to the two classifiers Naïve Bayes and IBK. The results show that, on the datasets used for the empirical study, the IFSA algorithm is more efficient in computational time than the Relief algorithm and the Chi-squared method. Further, this work can be extended to improve the accuracy of classifiers by using different feature selection measures.
REFERENCES

[1] T. Hlaing, "Feature Selection and Fuzzy Decision Tree for Network Intrusion Detection," International Journal of Informatics and Communication Technology (IJ-ICT), vol/issue: 1(2), pp. 109~118, 2012. ISSN: 2252-8776.
[2] P. A. Estévez, et al., "Normalized Mutual Information Feature Selection," IEEE Transactions on Neural Networks, vol/issue: 20(2), pp. 189-201, 2009.
[3] H. Peng, et al., "Feature Selection Based on Mutual Information: Criteria of Max-Dependency, Max-Relevance, and Min-Redundancy," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol/issue: 27(8), pp. 1226-1238, 2005.
[4] I. Guyon and A. Elisseeff, "An Introduction to Variable and Feature Selection," J. Machine Learning Research, vol. 3, pp. 1157-1182, 2003.
[5] M. Dash and H. Liu, "Feature Selection for Classification," Intelligent Data Analysis, vol/issue: 1(3), pp. 131-156, 1997.
[6] H. Frohlich, et al., "Feature selection for support vector machines by means of genetic algorithm, in: Tools with Artificial Intelligence," Proceedings, 15th IEEE International Conference on, IEEE, pp. 142–148, 2003.
[7] S. W. Lin, et al., "An intelligent algorithm with feature selection and decision rules applied to anomaly intrusion detection," Applied Soft Computing, vol/issue: 12(10), pp. 3285–3290, 2012.
[8] L. Yu and H. Liu, "Redundancy based feature selection for microarray data," Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining, ACM, pp. 737–742, 2004.
[9] D. K. Bhattacharyya and J. K. Kalita, "Network Anomaly Detection: A Machine Learning Perspective," CRC Press, 2013.
[10] S. Nemati, et al., "A novel aco–ga hybrid algorithm for feature selection in protein function prediction," Expert Systems with Applications, vol/issue: 36(10), pp. 12086–12094, 2009.
[11] C. Krier, et al., "Feature Clustering and Mutual Information for the Selection of Variables in Spectral Data," Proc. European Symp. Artificial Neural Networks Advances in Computational Intelligence and Learning, pp. 157-162, 2007.
[12] G. V. Dijck and M. M. V. Hulle, "Speeding Up the Wrapper Feature Subset Selection in Regression by Mutual Information Relevance and Redundancy Analysis," Proc. Int'l Conf. Artificial Neural Networks, 2006.
[13] R. Kohavi and G. H. John, "Wrappers for Feature Subset Selection," Artificial Intelligence, vol/issue: 97(1/2), pp. 273-324, 1997.
[14] M. A. Hall, "Correlation-Based Feature Selection for Discrete and Numeric Class Machine Learning," Proc. 17th Int'l Conf. Machine Learning, pp. 359-366, 2000.
[15] L. Yu and H. Liu, "Feature Selection for High-Dimensional Data: A Fast Correlation-Based Filter Solution," Proc. 20th Int'l Conf. Machine Learning, vol/issue: 20(2), pp. 856-863, 2003.
[16] Sakar C. O. and Kursun O., "Hybrid Method for Feature Selection Based on Mutual Information and Canonical Correlation Analysis," 20th International Conference on Pattern Recognition (ICPR), IEEE, 2010.
[17] Q. Song, et al., "Fast Clustering Based Feature Subset Selection Algorithm for High Dimensional Data," IEEE Transactions on Knowledge and Data Engineering, vol/issue: 25(1), pp. 1-14, 2013.
[18] P. Mitra, et al., "Unsupervised feature selection using feature similarity," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol/issue: 24(3), pp. 301–312, 2002.
[19] R. Battiti, "Using mutual information for selecting features in supervised neural net learning," IEEE Transactions on Neural Networks, vol/issue: 5(4), pp. 537–550, 1994.
[20] N. Kwak and C. H. Choi, "Input feature selection for classification problems," IEEE Transactions on Neural Networks, vol/issue: 13(1), pp. 143–159, 2002.
[21] K. Kira and L. A. Rendell, "The Feature Selection Problem: Traditional Methods and a New Algorithm," Proc. 10th Nat'l Conf. Artificial Intelligence, pp. 129-134, 1992.
[22] I. Kononenko, "Estimating Attributes: Analysis and Extensions of RELIEF," Proc. European Conf. Machine Learning, pp. 171-182, 1994.
[23] G. Forman, "An Extensive Empirical Study of Feature Selection Metrics for Text Classification," J. Machine Learning Research, vol. 3, pp. 1289-1305, 2003.
[24] M. A. Hall, "Correlation-Based Feature Subset Selection for Machine Learning," PhD dissertation, Univ. of Waikato, 1999.
[25] M. A. Hall and G. Holmes, "Benchmarking Attribute Selection Techniques for Discrete Class Data Mining," IEEE Transactions on Knowledge and Data Engineering, vol/issue: 15(3), 2003.
[26] I. S. Dhillon, et al., "A Divisive Information Theoretic Feature Clustering Algorithm for Text Classification," J. Machine Learning Research, vol. 3, pp. 1265-1287, 2003.
[27] M. A. Hall, et al., "Fast Binary Feature Selection with Conditional Mutual Information," J. Machine Learning Research, vol. 5, pp. 1531-1555, 2004.
[28] M. Abdar, et al., "Comparing Performance of Data Mining Algorithms in Prediction Heart Diseases," International Journal of Electrical and Computer Engineering (IJECE), vol/issue: 5(6), pp. 1569~1576, 2015. ISSN: 2088-8708.
[29] Datasets can be downloaded from: https://archive.ics.uci.edu/ml/datasets/DBWorld+emails, http://csse.szu.edu.cn/staff/zhuzx/Datasets.html, http://repository.seasr.org/Datasets/UCI/arff/.
[30] R. R. Bouckaert, et al., "WEKA Manual for Version 3-7-10," 2013.
BIOGRAPHIES OF AUTHORS

Smita Chormunge is a PhD student of Computer Science and Engineering at GITAM University, Hyderabad, India. She obtained her Master degree and Bachelor degree in Computer Science Engineering from SRTMU and Pune University respectively. Her research interests are in feature selection and data mining. She has published innovative works in journals and top conference proceedings such as ACM, IEEE, and Springer.

Sudarson Jena is currently working as Associate Professor in the Dept. of Information Technology, GITAM University, Hyderabad. He received the M.Tech degree in Computer Science and Engineering from JNTU-Hyderabad and the Ph.D degree in Computer Science from Sambalpur University, in 2008. Dr. Jena has so far published more than 65 technical papers in refereed journals and conference proceedings. One Ph.D scholar has successfully completed under his guidance at JNTU-Hyderabad in 2015, and 15 other Ph.D research scholars are working under him in different universities in India. Dr. Jena was the Joint-Editor of the journal The Chanakya during 2005-2007 and has been a Mentor of the Interscience Research Network (IRNet) since 2012. He serves on the editorial boards of the Asian Journal of Engineering and Technology, International Journal of Computer Science Engineering, American Journal of Intelligent Systems, Journal of Grid and Distributed Computing, and ten other journals. Dr. Jena is an invited speaker and has acted as a Chairperson and Co-Chair for various international and national conferences/workshops and other research groups. He is a Member of IEEE, a Life Member of the Computer Society of India (CSI) and ISTE, and a Senior Member of the International Association of Computer Science and Information Technology (IACSIT). His research interests include Parallel and Distributed Systems, Data Mining, Reliability and Performance Evaluation of Interconnection Networks, Grid Computing, Cluster Computing, and Soft Computing.