International Journal of Electrical and Computer Engineering (IJECE)
Vol. 6, No. 4, August 2016, pp. 1811~1817
ISSN: 2088-8708, DOI: 10.11591/ijece.v6i4.10255
Journal homepage: http://iaesjournal.com/online/index.php/IJECE
Incremental Learning on Non-stationary Data Stream Using Ensemble Approach

Meenakshi Anurag Thalor, Shrishailapa Patil
Article Info

ABSTRACT

Article history:
Received Jan 20, 2016
Revised Mar 15, 2016
Accepted Apr 1, 2016
Incremental learning on a non-stationary distribution has been shown to be a very challenging problem in machine learning and data mining, because the joint probability distribution between the data and the classes changes over time. Many real-time problems suffer from concept drift as they change with time. For example, in an advertisement recommendation system, customer behavior may change depending on the season of the year, on inflation, and on new products made available. An extra challenge arises when the classes to be learned are not represented equally in the training data, i.e., the classes are imbalanced, as most machine learning algorithms work well only when the training data is balanced. The objective of this paper is to develop an ensemble based classification algorithm for non-stationary data streams (ENSDS) with a focus on two-class problems. In addition, we present an exhaustive comparison of the proposed algorithm with state-of-the-art classification approaches using different evaluation measures like recall, f-measure and g-mean.
Keywords:
Concept drift
Ensemble
Incremental learning
Non-stationary data
Stream
Copyright © 2016 Institute of Advanced Engineering and Science. All rights reserved.
Corresponding Author:
1. INTRODUCTION
Non-stationary data is time series data where the data at time t is not equal to the data at time t+1. The time series Y_t is non-stationary if, for all values and every time period, Eq. (1) and Eq. (2) are true.
E(Y_t) ≠ μ (not having constant mean)          (1)

Var(Y_t) ≠ σ² (not having constant variance)   (2)
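As a quick illustration of these conditions, the minimal Python sketch below (the window size of 500 and the synthetic series are arbitrary choices for illustration only) compares the mean and variance of consecutive windows of a series; for a non-stationary stream they visibly change over time.

import numpy as np

def window_stats(series, window=500):
    """Return the mean and variance of each consecutive window of the series."""
    n_windows = len(series) // window
    means = np.array([series[w * window:(w + 1) * window].mean() for w in range(n_windows)])
    variances = np.array([series[w * window:(w + 1) * window].var() for w in range(n_windows)])
    return means, variances

# Example: a series whose mean and variance shift halfway through (a simple sudden drift).
rng = np.random.default_rng(0)
y = np.concatenate([rng.normal(0.0, 1.0, 5000), rng.normal(3.0, 2.0, 5000)])
means, variances = window_stats(y)
print(means)      # the window means are clearly not constant, cf. Eq. (1)
print(variances)  # nor are the window variances, cf. Eq. (2)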
Conventional data mining and machine learning [1] algorithms assume that each dataset is produced from a single, static and hidden function. That is, the function (model/classifier) generating the data at training time is the same as that at testing time.
Whereas in a data stream, data is continuously arriving and the function generating instances at time t need not be the same function at time t+1. This difference in the underlying function is called concept drift [2]. Thus, past data may become irrelevant for the current context, and it is defined by Eq. (3).

P_t(x, y) ≠ P_{t+1}(x, y)          (3)
In recent data mining applications, drift can occur at any moment of time, so it is necessary to take some measures to handle drifted data streams. In this paper we use an ensemble based incremental learning algorithm to handle the aforementioned phenomena. In ensemble based classification [3] a set of classifiers
whose individual predictions are combined in some way is used to classify unseen data. The approach in ensemble systems [4] is to create many classifiers and combine their outputs in such a way that this combination improves the performance compared to a single classifier. Figure 1 shows the basic steps of ensemble based classification.
Figure 1. Ensemble based Learning (Step 1: create multiple datasets D1, D2, …, Dt; Step 2: build multiple classifiers C1, C2, …, Ct; Step 3: combine the classifiers into C*)
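To make the three steps of Figure 1 concrete, the following sketch (an illustrative example, not the authors' implementation; the use of scikit-learn's GaussianNB and unweighted majority voting are assumptions) builds one Naïve Bayes classifier per data chunk and combines their predictions by majority vote.

import numpy as np
from sklearn.naive_bayes import GaussianNB

def train_ensemble(chunks):
    """Steps 1-2: train one base classifier per dataset/chunk."""
    return [GaussianNB().fit(X, y) for X, y in chunks]

def majority_vote(classifiers, X):
    """Step 3: combine the classifiers' outputs by a simple majority vote."""
    votes = np.stack([clf.predict(X) for clf in classifiers])  # shape: (t, n_samples)
    return np.apply_along_axis(lambda col: np.bincount(col).argmax(), 0, votes)

# Toy usage with two random chunks of a two-class problem.
rng = np.random.default_rng(1)
chunks = [(rng.normal(size=(200, 3)), rng.integers(0, 2, 200)) for _ in range(2)]
ensemble = train_ensemble(chunks)
print(majority_vote(ensemble, rng.normal(size=(5, 3))))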
So this work introduces a classification algorithm for non-stationary data using an ensemble based approach which explicitly and simultaneously addresses the aforementioned phenomena. Classifiers used for learning on non-stationary data streams generally follow one of the following methodologies.
1.1. Adaptive Methods
These are truly incremental algorithms [5],[6] as they learn data incrementally, as the instances arrive (instance-by-instance, and efficiently with a single pass through the data). Adaptive methods are generally used for concept drift detection algorithms. The first step is to detect concept drift, such as through the use of novelty detection algorithms, and upon detection, to adapt the classifiers to this change of concept.
1.2. Wrapper Methods
These require that the data must in some way be collected into chunks so that a traditional classifier can be used for learning. Wrapper methods are generally used for passive drift detection algorithms. In passive drift detection algorithms it is assumed that drift occurs, and the model is constructed under this assumption; the actual level of concept drift, or even whether it actually occurs, may not be measured.
Wrapper methods have the following advantages over adaptive ones:
• We can use traditional classifier types, like support vector machines, for learning.
• We can construct the ensemble in parallel.
• Ensemble design is faster compared to adaptive methods, which rely on change detection algorithms.
• Ability to deal with reoccurring concepts – to learn on past batches of data and reuse this information to classify new instances of new concepts with class distributions similar to old concepts.
• We can use ensemble methods to combine classifiers trained over different intervals of time, which further improves the performance over a single classifier. Thus ensembles are often built from past subsets (batches) of data that can be reused to classify new instances from a new concept whose class distribution is similar to that of an old concept.
The performance of all these wrapper algorithms depends upon the size of the data chunks (block/batch). Bigger blocks can result in more accurate classifiers, as the classifiers get more data for training, but a block can then contain too many different concept drifts. Smaller blocks are better for drifted data streams, but usually lead to poorer classifiers as the training data is less.
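The sketch below (illustrative only; the chunk size and the test-then-train protocol are assumptions chosen to show the trade-off discussed above) gives the typical wrapper-style loop: the stream is consumed in blocks of a fixed size, and each block is first used to evaluate the current ensemble and then to train a new base classifier.

import numpy as np
from sklearn.naive_bayes import GaussianNB

def stream_in_chunks(X, y, chunk_size):
    """Yield the stream block by block, in arrival order."""
    for start in range(0, len(X), chunk_size):
        yield X[start:start + chunk_size], y[start:start + chunk_size]

def test_then_train(X, y, chunk_size=250):
    """Wrapper-style processing: evaluate on each incoming chunk, then learn from it."""
    ensemble, accuracies = [], []
    for X_chunk, y_chunk in stream_in_chunks(X, y, chunk_size):
        if ensemble:  # test the current ensemble on the new chunk before training on it
            votes = np.stack([clf.predict(X_chunk) for clf in ensemble])
            pred = np.apply_along_axis(lambda c: np.bincount(c).argmax(), 0, votes)
            accuracies.append((pred == y_chunk).mean())
        ensemble.append(GaussianNB().fit(X_chunk, y_chunk))  # one new classifier per block
    return accuracies

# Larger chunk_size -> more training data per classifier; smaller chunk_size -> faster reaction to drift.
rng = np.random.default_rng(2)
X, y = rng.normal(size=(2000, 3)), rng.integers(0, 2, 2000)
print(test_then_train(X, y, chunk_size=250))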
2. RELATED WORK
The first experiment with ensembles on data streams was proposed by Street and Kim with their Streaming Ensemble Algorithm (SEA) [7], where a chunk of d instances is read from the data stream and used to build a classifier. As a fixed ensemble size was used, they compare the newly generated classifier against a pool of previously trained classifiers (from previous chunks), and if the current classifier improves the quality of the ensemble it is included at the cost of the worst classifier. SEA uses a simple majority vote and may not be able to perform in recurring environments.
Wang et al. proposed the Accuracy Weighted Ensemble (AWE) [8], which builds a classifier on each incoming data chunk and uses that chunk to evaluate the performance of all existing classifiers in the ensemble. The weight of each classifier is the difference between the error rate of a random classifier and the mean square error of the classifier on the current chunk. The mean square errors of old classifiers are high, and thus the weights of old classifiers are small.
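In the formulation of Wang et al. [8] this weighting is usually written as follows (the notation below is reconstructed from the description above and may differ slightly from the original paper):

w_i = MSE_r − MSE_i,   MSE_i = (1/|S_n|) Σ_{(x,c)∈S_n} (1 − f_c^i(x))²,   MSE_r = Σ_c p(c)(1 − p(c))²

where S_n is the most recent chunk, f_c^i(x) is the probability that classifier i assigns to the true class c of instance x, and p(c) is the prior probability of class c.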
Brzezinski and Stefanowski proposed the Accuracy Updated Ensemble (AUE) [9], which is derived from AWE. It uses the same principles of chunk-based ensembles but with incremental base components/classifiers. It not only builds new classifiers, but also conditionally updates existing classifiers on the new chunk rather than just adjusting their weights. The updating of existing base classifiers makes AUE better than AWE in case of gradual drift, but the conditional updating of base classifiers is less accurate for sudden drift.
Robi Polikar et al. proposed Learn++.NSE (Nonstationary Environment) [10]-[14], which generates classifiers sequentially using batches of examples/instances (not a true online learner, as it converts the online data stream into a series of chunks of a fixed size). At each time step one new classifier is trained on the recent distribution, using an instance weighting distribution. In Learn++.NSE each classifier's weight is computed using a weighted average of its prediction error on the old and current batches, and finally weighted majority voting is used to obtain the ensemble's output.
Most recently, Brzezinski and Stefanowski proposed AUE2 [15], which introduces a new weighting function, does not require cross-validation on the existing classifiers, does not keep a classifier buffer, prunes its base learners, and always unconditionally updates its components. Classifiers are updated after every chunk, so they can react to gradual drifts. It can react to sudden and gradual drifts but not to reoccurring concepts. Compared to Learn++.NSE, AUE2 incrementally trains existing component classifiers, retains only k of all the created components, and uses a different weighting mechanism which ensures that components will have non-zero weights.
3. ENSEMBLE FOR NON-STATIONARY DATA STREAM (ENSDS)
ENSDS uses a similar organizational framework as Learn++.NSE: we build an ensemble of classifiers from the data arriving at time t and evaluate the performance of the existing classifiers on recent data; then all generated classifiers are combined using weighted majority voting to provide predictions on unseen data. One of the major differences in ENSDS is that we are not updating a set of weights for each instance; rather, only a uniform weight is considered. The ENSDS algorithm is described in detail below, with its mathematical model given in Figure 2.
Input: For each dataset D^t, where t = 1, 2, …
Training data: {x_i ∈ X; y_i ∈ Y = {1, …, c}}, i = 1, …, m instances
Description: Supervised learning algorithm to handle non-stationary data streams
Pseudocode: Do for t = 1, 2, …
1. Initialize instance weights D_i = 1/m, ∀i
2. Call the base classifier with D^t, obtain h_t : X → Y
3. Evaluate all existing classifiers h_k on D^t:
   ε_k^t = Σ_i D_i · [h_k(x_i) ≠ y_i], for k = 1, …, t
   If ε_t^t > 1/2, generate a new h_t
   If ε_k^t > 1/2 (k < t), set ε_k^t = 1/2
4. Compute the weight for the kth classifier using a nonlinear sigmoid function of its age: w_k^t = Sig(t − k)
5. Calculate the classifier voting weights: V_k^t = ln(1 / Σ_j w_k^j · ε_k^j), for k = 1, …, t
6. Obtain the final hypothesis: H(x) = argmax_c Σ_k V_k^t · [h_k(x) = c]

Figure 2. The mathematical model of the algorithm ENSDS
We build the kth classifier on data drawn from the current training dataset. After formation of the kth classifier, the performance of the existing classifiers is evaluated over the current training dataset and we obtain ε_k^t, which is the error of the kth classifier on the current training dataset. If the error generated by the current classifier is more than 0.5, that is, half of the predictions are wrong, then a new classifier is generated for the current distribution. If the error generated by one of the previous classifiers is more than 0.5, then its ε is set to 0.5. We are not normalizing ε, as its value remains between 0 and 0.5, and the voting power of a classifier having ε = 0.5 will remain low. A nonlinear sigmoid function is used to set the weight of a classifier. Because of this, if a classifier is evaluated more than once, its sigmoid weight increases. The weight of a classifier is assigned based on its performance on previous distributions as well as on recent distributions, so a weighted average for the classifier is computed in step 4.
When a classifier is generated, its w = 1; after each evaluation on the recent environment its w keeps getting updated. If a classifier does not perform well on the recent environment, then its weighted error (w·ε) increases.
In step 5 the weighted error average is computed to determine the voting weight of the classifiers. The voting power of each classifier is computed using the logarithm of the inverse of its weighted error average. If the weighted error average is high, a classifier gets less voting power. The time complexity of ENSDS is O(t·k·O(x·m) + k·t·m), which is less than that of Learn++.NSE, where O(x·m) is the time complexity of the Naïve Bayes classifier, x is the number of features, m is the number of instances in the training set, k indicates the number of classifiers, and t indicates the number of data chunks to be predicted.
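The following Python sketch (a minimal illustration, not the authors' Java/MOA implementation; the sigmoid form Sig(a) = 1/(1 + e^(−a)) and the use of scikit-learn's GaussianNB are assumptions) follows the steps of Figure 2: uniform instance weights, per-chunk error evaluation with errors capped at 0.5, sigmoid-weighted error averaging, and logarithmic voting weights.

import numpy as np
from sklearn.naive_bayes import GaussianNB

EPS = 1e-6  # guards against log(0) and division by zero

class ENSDS:
    """Sketch of the ENSDS loop: one new Naive Bayes per chunk, weighted majority vote."""

    def __init__(self):
        self.classifiers = []   # h_1, ..., h_t
        self.errors = []        # errors[k] = list of epsilon_k over the chunks seen since h_k was built

    def partial_fit(self, X_chunk, y_chunk):
        # Steps 1-2: uniform instance weights (1/m) and a new base classifier on the chunk.
        self.classifiers.append(GaussianNB().fit(X_chunk, y_chunk))
        self.errors.append([])
        # Step 3: evaluate every existing classifier on the current chunk.
        for k, clf in enumerate(self.classifiers):
            eps = np.mean(clf.predict(X_chunk) != y_chunk)  # uniform weights -> plain error rate
            self.errors[k].append(min(eps, 0.5))            # errors above 0.5 are capped at 0.5

    def voting_weights(self):
        weights = []
        for eps_history in self.errors:
            # Step 4: sigmoid weights grow with each re-evaluation of the classifier (assumed form).
            ages = np.arange(len(eps_history))
            sig = 1.0 / (1.0 + np.exp(-ages))
            # Step 5: voting power = log of the inverse of the weighted error average.
            weighted_error = np.sum(sig * np.array(eps_history)) / (np.sum(sig) + EPS)
            weights.append(np.log(1.0 / (weighted_error + EPS)))
        return np.array(weights)

    def predict(self, X):
        # Step 6: weighted majority vote over all classifiers.
        w = self.voting_weights()
        classes = self.classifiers[0].classes_
        scores = np.zeros((len(X), len(classes)))
        for weight, clf in zip(w, self.classifiers):
            pred = clf.predict(X)
            for ci, c in enumerate(classes):
                scores[:, ci] += weight * (pred == c)
        return classes[np.argmax(scores, axis=1)]

Usage would mirror the test-then-train loop sketched earlier for wrapper methods: predict on each incoming chunk with the current ensemble and then pass the same chunk to partial_fit.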
4. COMPARATIVE EVALUATION AND ANALYSIS
In the following subsections we describe the tested datasets, evaluation measures, experimental setup, and a comparative analysis of the experiment results.
4.1. Datasets
For the comparison of ENSDS and the existing algorithm (Learn++.NSE) we use different datasets with different batch sizes. The proposed algorithm is tested over a synthetic dataset as well as real-time datasets.
1. SEA: The SEA dataset [16] with 50000 examples is synthetically generated and consists of two classes and three features, of which only two features are relevant, the third being noise.
2. IBM_EOD: The IBM_EOD dataset contains stock data of the IBM Company, where we consider open, high, low, close, volume and rate of change in closing price to find the stock index movement (Up, Down) as the classification task. For training, data from the period 2-Jan-2000 to 16-Feb-2016 (3999 examples) is fetched, and for testing, data from the period 2-Jan-2001 to 16-Feb-2016 (3802 examples) is fetched using Google Finance. The purpose of considering stock data is that stock market data is high-frequency data which is complex, non-stationary, chaotic and non-linear, and it suits our research topic. Concept drift can occur in the stock market for a number of reasons; for example, traders' preference for stocks changes over time, and increases in a stock's value may be followed by decreases.
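As an illustration of how such a dataset can be prepared (a sketch only; the column names, the use of pandas, and the next-day labelling rule are assumptions, since the paper does not specify its exact preprocessing), the rate of change and an Up/Down label can be derived from daily OHLCV data as follows:

import pandas as pd

def prepare_stock_dataset(df):
    """df: daily OHLCV data with columns Open, High, Low, Close, Volume (assumed names)."""
    df = df.copy()
    # Rate of change of the closing price relative to the previous trading day.
    df["RateOfChange"] = df["Close"].pct_change()
    df["NextClose"] = df["Close"].shift(-1)
    # Drop the first row (no rate of change) and the last row (no next close).
    df = df.dropna()
    # Up/Down label for the next day's movement (assumed labelling rule).
    labels = (df["NextClose"] > df["Close"]).map({True: "Up", False: "Down"})
    features = df[["Open", "High", "Low", "Close", "Volume", "RateOfChange"]]
    return features, labels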
4.2. Evaluation Measures
In addition to accuracy, other evaluation metrics [17] are also used for evaluation purposes. These metrics are defined as:
1. Precision: It is a measure of exactness as given in Eq. (4), which states how many positive examples are actually labelled correctly out of the total positive predictions.

Precision = TP / (TP + FP)          (4)
2. Recall: It is a measure of completeness as given in Eq. (5), which states how many positive examples are actually labelled correctly.

Recall = TP / (TP + FN)          (5)
3. F-measure: It is used to evaluate the balance between Recall and Precision as given in Eq. (6). Here β is a coefficient to adjust the relative importance of precision versus recall and can vary from 0 to 1.

F-measure = (1 + β²) · Recall · Precision / (β² · Recall + Precision)          (6)
4. G-Mean: It is used to evaluate the degree of inductive bias in terms of a ratio of positive accuracy and negative accuracy as given in Eq. (7). It is used to measure the balanced performance of a model between the classes.

G-mean = √( (TP / (TP + FN)) · (TN / (TN + FP)) )          (7)
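The short sketch below (illustrative; β = 1 is an assumed default) computes these measures directly from the counts of a two-class confusion matrix:

import math

def stream_metrics(tp, fp, tn, fn, beta=1.0):
    """Precision, recall, F-measure and G-mean from two-class confusion-matrix counts."""
    precision = tp / (tp + fp)                         # Eq. (4)
    recall = tp / (tp + fn)                            # Eq. (5)
    f_measure = ((1 + beta ** 2) * recall * precision  # Eq. (6)
                 / (beta ** 2 * recall + precision))
    negative_accuracy = tn / (tn + fp)
    g_mean = math.sqrt(recall * negative_accuracy)     # Eq. (7)
    return precision, recall, f_measure, g_mean

print(stream_metrics(tp=80, fp=10, tn=85, fn=20))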
4.3. Experimental Setup
For the experimental analysis, all tested algorithms are implemented in Java using the MOA and WEKA frameworks. The experiments were conducted on a machine equipped with an Intel(R) Core(TM) i3-2120 CPU @ 3.30 GHz, 2 core(s), 4 logical processor(s) and 4 GB of RAM. Here we have used different batch sizes for comparison purposes. However, the optimal batch size is different for each stream.
Figure 3 shows the performance improvement of ENSDS over Learn++.NSE in classifying the SEA dataset, where we consider Naïve Bayes as the base classifier, different batch sizes, and no pruning strategy. Learn++.NSE gives an accuracy of approximately 88% while ENSDS gives accuracy up to 95% on the SEA dataset, which comprises sudden drifts. The Precision, Accuracy, F-measure and G-mean of ENSDS are higher for each batch size compared to Learn++.NSE. There is a continuous improvement in precision and accuracy, with a slight decrease in the recall of ENSDS.
Figure 3. Comparative analysis of Learn++.NSE and ENSDS over SEA dataset
After analysis of ENSDS and Learn++.NSE on the SEA dataset, we can conclude that 2500 is the optimal batch size for the ENSDS algorithm, as we obtain maximum values for all tested evaluation measures.
Figure 4 depicts the performance of Learn++.NSE and ENSDS, respectively, in classifying the stock index movement over the IBM_EOD dataset, where we consider Naïve Bayes as the base classifier, different batch sizes, and no pruning strategy. Here the Accuracy and Recall of ENSDS are consistently better compared to Learn++.NSE, and Precision shows fluctuation for both Learn++.NSE and ENSDS.
Figure 4. Performance Analysis of Learn++.NSE and ENSDS over Stock dataset
After analysis of ENSDS and Learn++.NSE on the Stock dataset, we can conclude that 150 is the optimal batch size for the ENSDS algorithm.
5. CONCLUSION
From the implementation and analysis of ENSDS we can conclude that the performance of ENSDS on different evaluation measures is better compared to Learn++.NSE. The selection of the optimal batch size varies from dataset to dataset. For the SEA dataset the optimal batch size is 2500 for the ENSDS algorithm, and for the Stock dataset the optimal batch size is 150 for the ENSDS algorithm. The state-of-the-art Learn++.NSE algorithm performs well in case of sudden and gradual drifts, but performance can still be improved using the ENSDS algorithm, as its time complexity is lower. This paper does not emphasize a drift detection mechanism, as we
believe that in a real-time environment drift occurs very frequently; a drift detection mechanism can be added to the proposed algorithm as future work.
REFERENCES
[1] Y. Wang and Q. Li, "Review on the Studies and Advances of Machine Learning Approaches," TELKOMNIKA Indonesian Journal of Electrical Engineering, vol/issue: 12(2), pp. 1487-1494, 2014.
[2] J. M. Torres, et al., "Unifying view on dataset shift in classification," Pattern Recognition, vol. 45, pp. 521-530, 2011.
[3] M. A. Thalor and S. T. Patil, "Review of Ensemble Based Classification Algorithms for Nonstationary and Imbalanced Data," IOSR Journal of Computer Engineering, vol. 16, pp. 103-107, 2014.
[4] R. Polikar, "Ensemble Based Systems in Decision Making," IEEE Circuits and Systems Magazine, vol/issue: 6(3), pp. 21-45, 2006.
[5] J. Read, et al., "Batch-incremental versus instance-incremental learning in dynamic and evolving data," IDA, pp. 313-323, 2012.
[6] F. Han, et al., "A New Incremental Support Vector Machine Algorithm," TELKOMNIKA Indonesian Journal of Electrical Engineering, vol/issue: 10(6), pp. 1171-1178, 2012.
[7] W. N. Street and Y. Kim, "A streaming ensemble algorithm (SEA) for large-scale classification," Int. Conference on Knowledge Discovery & Data Mining, pp. 377-382, 2001.
[8] H. Wang, et al., "Mining concept-drifting data streams using ensemble classifiers," Proc. ACM SIGKDD Int. Conf. Knowl. Disc. Data Min., pp. 226-235, 2003.
[9] D. Brzezinski and J. Stefanowski, "Accuracy Updated Ensemble for Data Streams with Concept Drift," Proceedings of the 6th International Conference on Hybrid Artificial Intelligent Systems, pp. 155-163, 2011.
[10] R. Elwell and R. Polikar, "Incremental Learning of Concept Drift in Non-stationary Environments," IEEE Trans. on Neural Networks, vol. 22, pp. 1517-1531, 2011.
[11] R. Elwell and R. Polikar, "Incremental Learning of Variable Rate Concept Drift," International Workshop on Multiple Classifier Systems (MCS 2009), in Lecture Notes in Computer Science, vol. 5519, pp. 142-151, 2009.
[12] M. Karnick, et al., "Learning concept drift in nonstationary environments using an ensemble of classifiers based approach," International Joint Conference on Neural Networks, pp. 3455-3462, 2008.
[13] M. Muhlbaier and R. Polikar, "An Ensemble Approach for Incremental Learning in Nonstationary Environments," Multiple Classifier Systems, pp. 490-500, 2007.
[14] M. D. Muhlbaier and R. Polikar, "Multiple Classifiers Based Incremental Learning Algorithm For Learning In Nonstationary Environments," Proceedings of the Sixth International Conference on Machine Learning and Cybernetics, vol. 6, pp. 3618-3623, 2007.
[15] D. Brzezinski and J. Stefanowski, "Reacting to Different Types of Concept Drift: The Accuracy Updated Ensemble Algorithm," IEEE Transactions on Neural Networks and Learning Systems, vol/issue: 25(1), pp. 81-94, 2014.
[16] http://users.rowan.edu/~polikar/research/nse/ for SEA Dataset.
[17] J. Davis and M. Goadrich, "The Relationship between Precision-Recall and ROC Curves," in Proceedings of the 23rd International Conference on Machine Learning, pp. 233-240, 2006.