TELKOMNIKA Indonesian Journal of Electrical Engineering
Vol. 14, No. 1, April 2015, pp. 163 ~ 172
DOI: 10.11591/telkomnika.v14i1.7233
Received December 24, 2014; Revised March 20, 2015; Accepted March 28, 2015
Near Optimal Convergence of Back-Propagation Method using Harmony Search Algorithm
Abdirashid Salad Nur*1, Nor Haizan Mohd Radzi2, Siti Mariyam Shamsuddin3
Soft Computing Research Group, Faculty of Computing, Universiti Teknologi Malaysia (UTM), 81310 Skudai, Johor, Malaysia
*Corresponding author, e-mail: salaadnuur@yahoo.com1, haizanradzi@utm.my2, sitimariyams@gmail.com3
Abstract

Training Artificial Neural Networks (ANNs) is of great significance and a difficult task in the field of supervised learning, as performance depends on the underlying training algorithm as well as the achievement of the training process. In this paper, three training algorithms, namely the Back-Propagation (BP) algorithm, the Harmony Search Algorithm (HSA) and a hybrid of BP and HSA called BPHSA, are employed for the supervised training of the Multi-Layer Perceptron (MLP) feed forward type of Neural Networks (NNs), giving special attention to the hybrid BPHSA. A suitable structure for data representation of NNs is implemented in BPHSA-MLP, HSA-MLP and BP-MLP. The proposed method is empirically tested and verified on five benchmark classification problems, namely the Iris, Glass, Cancer, Wine and Thyroid datasets. The MSE, training time and classification accuracy of the hybrid BPHSA are compared with the standard BP and the meta-heuristic HSA. The experiments showed that the proposed method has better results in terms of convergence error and classification accuracy compared to BP-MLP and HSA-MLP, making BPHSA-MLP a promising algorithm for neural network training.
Keywords: artificial neural networks, harmony search, backpropagation, classification problem
Copyright © 2015 Institute of Advanced Engineering and Science. All rights reserved.
1. Introduction
Artificial Neural Networks (ANNs) are considered to be powerful tools in pattern classification and prediction of future events, with adaptability, noise filtering and the ability to learn from their surroundings [2], [5-6], [11, 24]. The process of training a neural network is an optimization task in which a set of connection weights of the ANN is determined in order to minimize the error. The connection weights are initially given indiscriminately to every neuron, and later these weights are modified iteratively until the desired or near targeted output value is obtained by altering the network weights accordingly [14]. When the training process has ended, unseen data known as the test dataset is used to test the generalization ability of the classifier. The training problem of an ANN requires powerful optimization methods, since the determination of the connection weights is a crucial task and contributes significantly to the output value [24].

Back-Propagation (BP) is a gradient based algorithm and has been widely used for training artificial neural networks. The BP algorithm computes the network's output and reduces the mean square error (MSE) between the actual output and the targeted output by adjusting the weights [3, 6, 8, 14, 24]. Gradient-based algorithms are designed for local search, where the search area is largely determined by the search starting point. Although there are no doubts concerning BP performance in some non-linearly separable problems, it leads to slow convergence and opens the possibility of getting stuck in local minima.

In order to overcome the local minimum problem, researchers have applied meta-heuristic global optimization algorithms such as Genetic Algorithms (GA) [7-9], [12], Particle Swarm Optimization (PSO) [9, 11, 13, 22], Ant Colony Optimization (ACO) [1, 23] and the recently introduced Harmony Search Algorithm (HSA) [6, 14, 20] to seek the optimal network weights. These global search algorithms are claimed to have produced better results since they are able to expand the search space in order to avoid the local minima problem. Apart from training an ANN with a global optimization algorithm alone, BP has also been combined with global search techniques such as GA and PSO, as can be seen in the work conducted by [1, 4, 10], [15-16], [18]. In these hybrid forms of BP-GA and BP-PSO, the GA and PSO are used to initialize and modify the weights of the BP network.
In this paper, a hybrid of BP and HSA, known as BPHSA, is employed to train the feed forward neural network (FFNN). This algorithm exploits the capability of HSA in global search to avoid the local minima problem faced by BP. HSA is employed to adjust the weights and biases of the network whenever BP fails to generate near optimal weights. The scheme is empirically tested and verified with five benchmark datasets from the University of California at Irvine (UCI) Machine Learning Repository. These datasets include: Iris, Cancer, Glass, Wine and Thyroid. The experimental results of the proposed scheme showed that BPHSA is an efficient and good candidate for training feed forward neural networks. The paper is organized as follows: Section 2 describes the training algorithms, namely BP, HSA and the proposed hybrid BPHSA technique. Section 3 describes the data representation and problem initialization used to train neural networks with BPHSA. The experimental setup, datasets, structure of the network and its implementation as well as parameter tuning are presented in Section 4. Section 5 explains the experimental analysis and the performance of the proposed model, and Section 6 compares the algorithms. Finally, the conclusion of this work is summarized in Section 7.
2. Training Algorithms
In this study several algorithms are used to train ANNs: the Back Propagation (BP) algorithm, the Harmony Search Algorithm (HSA) and a hybrid of BP and HSA known as BPHSA. A brief description of each algorithm is given in the following paragraphs.
2.1. Back Propagation Algorithm
Back-propagation (BP) is one of the most famous supervised training algorithms for FFNN [8, 15]. BP aims to minimize the total MSE (Mean Square Error) between the expected and actual output. This MSE is applied to monitor the exploration of the BP algorithm in the weight space. In addition, when using BP for training an ANN, it is unnecessary to predetermine the exact design of the ANN network architecture and parameters [11]. Equation (1) presents the formula which is used to adjust the weights in the BP algorithm:
w_ji^l(k+1) = w_ji^l(k) - μ ∂E/∂w_ji^l        (1)
where w_ji^l is the connection weight from neuron i in layer l-1 to neuron j in layer l, E is the network error, and μ is the learning rate, a positive number which is utilized to supervise the learning steps; it is usually a small positive number [19].
The training process of BP consists of two mechanisms: (1) forward and (2) back propagation. In the forward pass, the input information units are transmitted from the input layer via the hidden layer and then to the output layer. In the backward pass, the errors are back-propagated along the original connection path. Modifying the weights of the neurons in each layer can reduce the error. BP has the capability of local search, generating near locally optimal weights; however, its global search ability is weak. Therefore, to acquire the benefits of both techniques and to avoid their weaknesses in order to enhance the learning ability of the Multi-Layer type of feed forward neural network, we focus on how the weights are changed.
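The weight update of Equation (1) can be illustrated with a short sketch. The snippet below is a minimal NumPy illustration of one gradient-descent update for a single sigmoid layer trained on squared error; the names (W, x, target, mu) are illustrative and not taken from the paper, and the paper's own implementation was written in MATLAB.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def bp_weight_update(W, x, target, mu=0.7):
    """One gradient step w(k+1) = w(k) - mu * dE/dw for a single sigmoid
    layer trained on squared error (illustrative sketch only)."""
    net = W @ x                      # net input to the layer
    out = sigmoid(net)               # actual layer output
    err = out - target               # derivative of 0.5*(out-target)^2 w.r.t. out
    delta = err * out * (1.0 - out)  # chain rule through the sigmoid
    grad = np.outer(delta, x)        # dE/dW
    return W - mu * grad             # Equation (1)
```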
2.2. Harmony Search Algorithm (HSA)
HSA is a new meta-heuristic optimization algorithm derived from the improvisation process of musicians in a musical ensemble. It was proposed by Geem et al. in 2001. The solution vector is analogous to a harmony in music, and both the local and global search schemes are analogous to musical requirements. Thus HSA can easily be applied to solving optimization problems [25]. For instance, the musicians test and play a tone on their instruments to select the perfect tone (or outcome) in harmony with the rest of the band. For creating a new harmony, either a tone from the harmony memory is played with modifications, an existing tone from the memory is played, or a totally new tone from a range of acceptable tones is used (played). Only the best harmonies in the memory are saved and remembered until better ones are discovered and exchanged with the worst harmonies in the memory. The following steps of HSA show how the best solution vector is obtained during the search, as assessed by the objective function for each solution vector [6, 20].
Step 1: Define and initialize all HS parameters (HMS, HMCR, PAR and NI) and the problem.
Step 2: Initialize the Harmony Memory (HM) with random vectors:
        W_i(j) = LB_j + r(UB_j - LB_j)  for j = 1, 2, ..., n; i = 1, 2, ..., HMS, where r ∈ (0, 1)
Step 3: Use the objective function to improvise a new harmony vector:
        While (j ≤ n) do
            If (r1 < HMCR) then H_new(j) = H_a(j), where a ∈ {1, 2, ..., HMS} and r1 ∈ (0, 1)
                If (r2 < PAR) then H_new(j) = H_new(j) ± r3 * BW, where r2 and r3 ∈ (0, 1)
            Else H_new(j) = LB_j + r(UB_j - LB_j), where r ∈ (0, 1)
        End loop
Step 4: If the created harmony is better than the worst one in the memory, then replace it.
Step 5: If the termination criteria are met then stop; otherwise repeat Steps 3 and 4.
Step 6: The best solution of the obtained harmonies is stored in the harmony memory (HM).
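Steps 2 and 3 above can be written compactly in code. The following is a minimal sketch of harmony-memory initialization and the improvisation rule for minimizing a user-supplied objective function f; the parameter names follow the paper (HMS, HMCR, PAR, NI), but the bandwidth default BW and the implementation details are assumptions for illustration only.

```python
import numpy as np

def harmony_search(f, n, lb=-1.0, ub=1.0, HMS=21, HMCR=0.95, PAR=0.7, BW=0.05, NI=100):
    """Minimal Harmony Search sketch for minimizing f over n variables."""
    rng = np.random.default_rng()
    HM = lb + rng.random((HMS, n)) * (ub - lb)     # Step 2: random harmony memory
    fitness = np.array([f(h) for h in HM])

    for _ in range(NI):                            # Steps 3-5
        new = np.empty(n)
        for j in range(n):
            if rng.random() < HMCR:                # memory consideration
                new[j] = HM[rng.integers(HMS), j]
                if rng.random() < PAR:             # pitch adjustment
                    new[j] += (rng.random() * 2 - 1) * BW
            else:                                  # random selection within bounds
                new[j] = lb + rng.random() * (ub - lb)
        new = np.clip(new, lb, ub)

        worst = np.argmax(fitness)                 # Step 4: replace worst if better
        f_new = f(new)
        if f_new < fitness[worst]:
            HM[worst], fitness[worst] = new, f_new

    best = np.argmin(fitness)                      # Step 6: best harmony in HM
    return HM[best], fitness[best]
```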
2.3. The Proposed Algorithm (BPHSA)
Hybridization refers to the presence of problem dependent information in an overall search sample. Hybridization can be distinguished into strong and weak hybridization [21, 26]. The first refers to knowledge representation using a specific operator, whereas in weak hybridization several algorithms are combined so that one improves the result of another separately, or one algorithm is used as an operator of the other. The hybridization approach used in this work is the combination of two algorithms (weak hybridization) where one of them acts as an operator on the other. Hence, we combine HSA with the BP algorithm into what we call BPHSA. HSA is used to generate new weights for the MLP whenever BP fails. The hybrid BPHSA exploits the benefits obtained from both algorithms and avoids their weaknesses in order to enhance the learning ability of the Multi-Layer type of feed forward neural network. The following are the steps which must be followed to use the hybrid form of BPHSA, named "BPHSA-MLP", in the training of supervised NNs:
Step 1: Preprocess dataset: The dataset is normalized in order to scale all data into the range [0, 1]. The normalized data are randomly divided into training and testing samples; the chosen testing data are removed from the initialized data, while the remaining data are assigned as the training set to investigate the effectiveness of the model towards the MLP learning.
Step 2: Determine the number of input, hidden and output nodes for the MLP architecture: a three layer network architecture is created corresponding to the dataset requirements. For instance, the number of input nodes is determined based on the dataset attributes (variables). The number of output nodes depends on the dataset's desired output, while the number of hidden nodes is determined based on Equation (2); therefore the number of input layer nodes, hidden layer nodes and output layer nodes differs from dataset to dataset.
Step 3: Initialize weights & bias randomly: the weights and bias of the MLP are randomly initialized. Input patterns are presented to the network to start the training.
Step 4: Generate net output of MLP and evaluate error: Here, the network output is calculated and compared to the target value, and then the errors are evaluated by comparing them to the threshold value. If the error is greater than the threshold value and the steady state of BP is not met, then the weights and bias of the MLP are adjusted using BP. However, if the errors are not reduced gradually to contribute to the network's generalization ability after six attempts without change, BP is unable to train the network efficiently due to a local minima or overfitting problem. We choose six iterations as the value of the steady state parameter because, if BP attempts to decrease the error for six iterations and the errors remain unchanged, BP will surely be stuck in local minima, which consumes convergence time before finally reaching the maximum number of learning iterations unsuccessfully; therefore, it does not make sense to wait until BP reaches the maximum iterations without achieving the training goal or reducing the error rate.
Step 5: Run HSA to generate new values to adjust the weights and bias of MLP: The Harmony Search Algorithm is called only when BP fails to train the network, falls into local minima, or the generalization ability is poor because of suboptimal weights or an overfitting problem. Therefore, an early stopping criterion is required to prevent the network from being trapped in local minima or overfitted. The BP steady state mechanism is used to exchange the training scenario from BP to HSA and to stop the training earlier if BP does not pass this process control after six iterations, which supports the network in deciding whether to continue or stop the training. The error rate rising instead of decreasing, or the error rate remaining unchanged after six iterations, shows the failure of BP; this is what we call the steady state of BP. If the steady state of BP is met, then HSA is called to produce new weight values to continue the learning process of the MLP and prevent network training disturbance and failure (a sketch of this switching logic is given after Step 7).
Step 6: Update weights & bias for MLP: The values in the Harmony Memory generated by HSA are used as the weights and bias of the MLP neurons. The network weights and bias are then adjusted following BP's failure; the updated weights are used in order to decrease the errors and to finish the learning process successfully.
Step 7: Stop training and save results: The training is stopped if the training goal is met or the maximum number of iterations is reached. Then, the results obtained from the training process are returned to prepare the network for testing and to compare the algorithms in terms of error convergence and classification accuracy for both training and testing.
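To make the control flow of Steps 4-7 concrete, the sketch below outlines the switching between BP and HSA. It assumes a helper bp_epoch() that performs one BP pass and returns the current MSE, a helper harmony_search_weights() that returns a new weight vector from the harmony memory, and a setter set_weights(); these helpers and the exact bookkeeping are illustrative assumptions, not the authors' MATLAB code.

```python
def train_bphsa(bp_epoch, harmony_search_weights, set_weights,
                max_epochs=5000, goal=0.005, steady_state=6):
    """Sketch of the BPHSA control loop: run BP until the error stalls for
    `steady_state` epochs, then ask HSA for fresh weights and resume BP."""
    prev_mse, stalled = None, 0
    for epoch in range(max_epochs):
        mse = bp_epoch()                            # Step 4: one BP pass
        if mse <= goal:                             # Step 7: training goal reached
            return epoch, mse
        stalled = stalled + 1 if prev_mse is not None and mse >= prev_mse else 0
        prev_mse = mse
        if stalled >= steady_state:                 # Step 5: BP steady state detected
            set_weights(harmony_search_weights())   # Step 6: HSA supplies new weights
            prev_mse, stalled = None, 0
    return max_epochs, prev_mse                     # stopped at maximum iterations
```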
3. Training Neural Networks using BPHSA
The feed forward training process mainly involves deciding the connection weights among the neurons which minimize the error. HSA can be used for both continuous and discrete optimization problems, and it can therefore also be used to optimize NN weights.
3.1. Data Representation
The NN weights are taken from the harmony vector in the harmony memory (HM), which contains several data strings representing the weights of the input-to-hidden layer processing elements, the hidden-to-output processing elements, and the hidden and output biases [26].
Figure 1. FFANN sample for weight vector representation
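The mapping from a flat harmony vector to the MLP parameters sketched in Figure 1 can be expressed as follows. This is a minimal illustration assuming a single hidden layer; the slicing order (input-to-hidden weights, hidden-to-output weights, hidden biases, output biases) is one reasonable convention and not necessarily the exact layout used by the authors.

```python
import numpy as np

def vector_to_mlp(vec, n_in, n_hid, n_out):
    """Slice a flat harmony vector into MLP weight matrices and bias vectors."""
    sizes = [n_in * n_hid, n_hid * n_out, n_hid, n_out]
    assert len(vec) == sum(sizes), "harmony vector length must match the architecture"
    parts = np.split(np.asarray(vec), np.cumsum(sizes)[:-1])
    W_ih = parts[0].reshape(n_hid, n_in)   # input -> hidden weights
    W_ho = parts[1].reshape(n_out, n_hid)  # hidden -> output weights
    b_h, b_o = parts[2], parts[3]          # hidden and output biases
    return W_ih, W_ho, b_h, b_o
```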
3.2. Problem Initialization
The objective (fitness) function of the hybrid BPHSA used for training the NN is the MSE. The lower and upper bounds are taken as [-1, 1]. All solution vectors (harmonies) in the HM are randomly created using equation (1). These generated solutions are used as weights for the NN:
W_i(j) = LB_j + r(UB_j - LB_j)   for j = 1, 2, ..., n; i = 1, 2, ..., HMS, where r ∈ (0, 1)        (1)
4. Experimental setup
4.2. Datasets
The performance of the proposed model is tested using five datasets obtained from the University of California at Irvine (UCI) Machine Learning Repository. These datasets are: Iris, Cancer, Glass, Wine and Thyroid.
Iris dataset: one of the most popular and well known datasets used in classification problems, which can be found in the pattern recognition literature. It consists of three classes: (1) iris setosa, (2) iris versicolor and (3) iris virginica. Each class has fifty instances of iris plants. Its classification is based on sepal length, sepal width, petal length and petal width.
Cancer dataset: originally, this dataset was created by Dr. William H. Wolberg from reported clinical cases at the University of Wisconsin Hospital. It consists of 699 instances, of which 458 are benign examples and 241 are malignant examples.
Wine dataset: this dataset comes from the chemical analysis of wines grown in the same region but derived from three different cultivars (groups of plants chosen for desirable features). The dataset consists of 178 instances and 13 continuous attributes across three classes, all of which are separable; it has been used by many others for comparing several classifiers.
Glass dataset: this dataset contains 214 instances and 9 continuous attributes. The data were used to classify glass types that were of interest in criminological investigation. Glass left at the scene of a crime can serve as evidence if its type is correctly determined. Many chemical measurements are employed to characterize each glass sample.
Thyroid disease dataset: this is the largest dataset among those used in this experiment. It consists of 7200 instances with 21 features, categorical and continuous, of which 15 are binary and the remaining 6 continuous. The aim of classification is to decide whether a patient referred to the clinic is hypothyroid. It contains three types of patients: hyper-function, normal function and subnormal function. The data are divided into single training and testing sets with 3772 and 3428 instances respectively.
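Step 1 of the BPHSA procedure scales every attribute of these datasets into [0, 1]. A minimal min-max normalization sketch is shown below, using scikit-learn's built-in copy of the Iris data purely as an example; the other UCI datasets would be loaded and scaled in the same way.

```python
import numpy as np
from sklearn.datasets import load_iris

X, y = load_iris(return_X_y=True)              # 150 samples, 4 attributes, 3 classes

# Min-max normalization of each attribute into [0, 1], as in Step 1
X_min, X_max = X.min(axis=0), X.max(axis=0)
X_norm = (X - X_min) / (X_max - X_min)

print(X_norm.min(axis=0), X_norm.max(axis=0))  # all zeros and all ones per attribute
```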
4.3. Datasets Partition
Each dataset is divided according to a ten-fold cross partition scheme; therefore there are two subsets (10% and 90%) for training and testing purposes, respectively. This process is applied to all datasets except the Thyroid dataset. The training set is responsible for computing the gradient and updating the bias and the weights of the network. During training, the errors are controlled at each iteration using the BP steady state. The test set is used after training to compare the different models and to examine the ability of the classifier to classify. The experimental results obtained from all three algorithms for all datasets are depicted in the experiments and discussion section. The hybrid BPHSA and the other algorithms are executed in five independent runs to train the algorithms for good learning capability, and the test sets are applied to evaluate the generalization ability of the proposed scheme. The Thyroid dataset is not subjected to this procedure because it is originally partitioned into single training and testing sets.
4.4. Implementation
A three layered network architecture for feed forward NNs is used for all datasets. The number of nodes in every layer highly depends on the dataset representation. The selection of hidden layer nodes is an open subject and there is no standard formula or procedure that yields the optimal number of hidden layer nodes. However, in this work, Equation (2) is used to determine the number of hidden layer nodes [17]. As a result, the number of processing elements in the hidden layer for the Iris, Glass, Wine, Cancer and Thyroid datasets is 3, 4, 6, 4 and 8, respectively:
Number of hidden neurons = √(InNode × OutNode)        (2)

where InNode = number of input nodes
      OutNode = number of output nodes
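Equation (2) can be applied directly to each dataset's input and output dimensions. The short sketch below computes the hidden-layer size for a couple of hypothetical (InNode, OutNode) pairs; rounding to the nearest integer is an assumption, since the paper reports the final counts but not the rounding rule.

```python
import math

def hidden_nodes(in_nodes, out_nodes):
    """Equation (2): number of hidden neurons = sqrt(InNode * OutNode)."""
    return round(math.sqrt(in_nodes * out_nodes))

# Example with hypothetical layer sizes
for name, n_in, n_out in [("Iris", 4, 3), ("Wine", 13, 3)]:
    print(name, hidden_nodes(n_in, n_out))
```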
Moreover, the tangent sigmoid transfer function, also known as tansig, is used to describe the final output of the nodes. The tansig function is used as the transfer function between the input layer and the hidden layer, and between the hidden layer and the output layer. It is derived from the hyperbolic tangent function and has the ability to deal directly with negative numbers; the following is the equation of the tansig transfer function:
F(x) = 2/(1 + exp(-2x)) - 1        (3)
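Equation (3) is the same transfer function that MATLAB provides as tansig; a NumPy version, mathematically identical to tanh(x), is sketched below for illustration.

```python
import numpy as np

def tansig(x):
    """Equation (3): tansig(x) = 2 / (1 + exp(-2x)) - 1, equivalent to tanh(x)."""
    return 2.0 / (1.0 + np.exp(-2.0 * x)) - 1.0

x = np.array([-2.0, 0.0, 2.0])
print(tansig(x), np.tanh(x))   # the two agree to floating-point precision
```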
In this work, the Mean Square Error (MSE) is employed to measure the error of the NN. This function is continuous, monotonous and differentiable, and it has a single minimum. The MSE can be defined as:
MSE = Σ (DesiredOutput - NetworkOutput)²        (4)
where:
DesiredOutput is the wanted or target value;
NetworkOutput is the actual output produced by the network.
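Equation (4) in code form, a minimal sketch assuming both outputs are arrays of the same shape and averaging the squared differences over all elements:

```python
import numpy as np

def mse(desired, actual):
    """Equation (4): mean of squared differences between target and network output."""
    desired, actual = np.asarray(desired), np.asarray(actual)
    return np.mean((desired - actual) ** 2)

print(mse([1.0, 0.0, 0.0], [0.9, 0.1, 0.0]))  # 0.00666...
```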
For comparison purposes, a standard BP and a standard HSA are used to train feed forward NNs with the same network architecture and the same fitness function in order to evaluate the proposed BPHSA technique. The learning rate is set to 0.7, the learning iteration limit is set to 5000 and the error threshold is set to 0.005. The steady state parameter, which is used as the exchange mechanism between the standard BP and HSA, is set to 6. The meaning of steady state is explained in Section 2.3, Step 5. The initial values of the weights are randomly selected between -1 and 1. All algorithms are coded and executed on the same computer using Matlab. The HSA parameters are shown in Table 1.
Table 1. HSA parameter settings

No.  Parameter  Value
1    LB         -1
2    UB         1
3    PAR        0.7
4    HMCR       0.95
5    HMS        21
6    NI         100
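For convenience, the training settings above and the HSA parameters of Table 1 can be collected into a single configuration. The sketch below is merely a grouping of the values reported in the paper, with key names chosen for readability rather than taken from the authors' code.

```python
# Training and HSA settings as reported in Section 4.4 and Table 1
CONFIG = {
    "learning_rate": 0.7,       # BP learning rate
    "max_iterations": 5000,     # maximum learning iterations (epochs)
    "error_threshold": 0.005,   # MSE training goal
    "steady_state": 6,          # BP-to-HSA exchange parameter
    "weight_init_range": (-1, 1),
    # Harmony Search parameters (Table 1)
    "LB": -1, "UB": 1, "PAR": 0.7, "HMCR": 0.95, "HMS": 21, "NI": 100,
}
```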
5. Experiments and Discussion
In this section, we present the experimental results obtained from the execution of five independent runs using three training algorithms: standard BP, standard HSA and the proposed BPHSA method for MLP. The average, median and standard deviation of the MSE, along with the classification accuracy, learning iterations and training time, were taken into consideration as the performance parameters of the algorithms.
In the new scheme, BPHSA-MLP, the MLP network is initially trained with standard BP. The learning process of the MLP is only passed to HSA whenever we believe BP has failed to converge. In other words, the BP-MLP halts the training whenever the predefined number of iterations of the steady state parameter is met. To reinitiate the learning process of the MLP, the HSA continues the learning by adjusting the weights and biases accordingly. The training process can be stopped when either the minimum error of 0.005 is met or the maximum epoch of 5000 is reached. According to BP, the learning can be stopped when either the aforementioned criteria are met or the steady state parameter equals 6. The steady state parameter is used either to stop the training process of the MLP if BP is involved in the network training, or to transfer the training process of the MLP between the BP and HSA algorithms to avoid the local minima.
Table 2. Results of BPHSA-MLP, HSA-MLP and BP-MLP on Iris Dataset

Algorithm    MSE      Median  Std. Dev.    Accuracy (%)  Time (s)           Epoch
BPHSA-MLP    0.00598  0.0050  0.002126617  98.5          16                 1087
HSA-MLP      0.0323   0.0097  0.032619243  93.9          61.2               5000
BP-MLP       0.06662  0.0715  0.01156058   96.82         Stopped at 75.6 s  Not converged
As can be seen in Table 2, the results show that BPHSA-MLP converged at epoch 1087 in 16 seconds with an accuracy of 98.5% and an MSE of 0.0059. These results indicate that BPHSA-MLP has the fastest convergence speed, the highest accuracy rate, the smallest error and the shortest time compared to BP-MLP and HSA-MLP.
Table 3. Results of BPHSA-MLP, HSA-MLP and BP-MLP on Glass Dataset

Algorithm    MSE       Median  Std. Dev.    Accuracy (%)  Time (s)         Epoch
BPHSA-MLP    0.01493   0.0191  0.005922795  96.06         61.2             5000
HSA-MLP      0.01634   0.0175  0.004010985  91.5          80.4             5000
BP-MLP       0.037406  0.0445  0.01156058   93.96         Stopped at 24 s  Not converged
The experimental results presented in Table 3 show that both BPHSA-MLP and HSA-MLP reached the maximum learning iteration, which is 5000, in 61.2 and 80.4 seconds, respectively. However, BPHSA-MLP outperformed HSA-MLP in terms of MSE and correct classification rate. On the other hand, BP-MLP stopped the training due to the steady state parameter, to prevent the network from being trapped at a local minimum. Conversely, the BPHSA-MLP scheme showed its ability to escape from local minima.
Table 4. Results of BPHSA-MLP, HSA-MLP and BP-MLP on Wine Dataset

Algorithm    MSE       Median   Std. Dev.      Accuracy (%)  Time (s)           Epoch
BPHSA-MLP    0.004992  0.00499  0.00000447214  98.66         7                  439
HSA-MLP      0.01138   0.00723  0.011862375    96.02         40                 5000
BP-MLP       0.1896    0.189    0.08726457     94.74         Stopped at 21.6 s  Not converged
From Table 4, it can be seen that BPHSA-MLP offers the fastest convergence rate, the smallest MSE and the highest accuracy rate compared to HSA-MLP and BP-MLP. The BPHSA-MLP converged at iteration 439 in 7 seconds with an MSE of 0.004992 and an accuracy rate of 98.66%. However, the HSA-MLP reached the maximum learning iteration with an accuracy rate of 96.02% and an MSE of 0.01138 in 40 seconds. Unlike HSA-MLP, the BP-MLP did not converge to the minimum error and stopped the training process after 21.6 seconds with an MSE of 0.1896 and a 94.74% accuracy rate. Hence, the BPHSA-MLP has the capability of avoiding local minima.
Table 5. Results of BPHSA-MLP, HSA-MLP and BP-MLP on Cancer Dataset

Algorithm    MSE      Median  Std. Dev.    Accuracy (%)  Time (s)           Epoch
BPHSA-MLP    0.01752  0.0204  0.006069349  97.28         75                 5000
HSA-MLP      0.02342  0.0242  0.006335771  95.14         90                 5000
BP-MLP       0.04896  0.0538  0.024012976  94.28         Stopped at 68.4 s  Not converged
For the Cancer dataset, both BPHSA-MLP and HSA-MLP reached the maximum learning iteration, as shown in Table 5. However, BPHSA has the smallest error rate and the shortest time as well as the highest accuracy rate in comparison to HSA-MLP and BP-MLP. The BP-MLP stopped the training at epoch 1576 in 68.4 seconds with a 94.28% accuracy rate and a 0.04896 MSE error rate.
Table 6. Results of BPHSA-MLP, HSA-MLP and BP-MLP on Thyroid Dataset

Algorithm    MSE      Median  Std. Dev.    Accuracy (%)  Time (s)            Epoch
BPHSA-MLP    0.0305   0.0307  0.001312631  94.3          306                 5000
HSA-MLP      0.05004  0.05    0.000960729  92.44         313.8               5000
BP-MLP       0.03796  0.0336  0.007386677  93.54         Stopped at 536.4 s  Not converged
According to the results presented in Table 6, both BPHSA-MLP and HSA-MLP met the maximum iteration, in 306 and 313.8 seconds, respectively. Although BP-MLP was trapped in local minima at epoch 4777, it has a better correct classification rate and a smaller error compared to HSA-MLP. Yet HSA-MLP is better than BP-MLP and BPHSA in standard deviation. Also, it has a shorter time compared to BP-MLP. For correct classification rate, MSE and training time, BPHSA-MLP is the best of all.

The results showed that BPHSA is better than both standard BP-MLP and HSA-MLP in terms of convergence rate, training time and classification accuracy, with clear results on all five datasets, which shows the contribution of the technique to the classifier's performance. However, in some cases BP is better than both HSA and BPHSA, such as in having the shortest training time. Yet BPHSA-MLP outperforms BP-MLP and HSA-MLP in most cases; therefore the ability of the classifier has been increased and the proposed model showed its ability to train the MLP type of FFANN. In addition, the proposed model contributed to generating accurate results and reduced the amount of error in most of the experiments of this hybrid form of BP and HSA. In conclusion, the proposed technique has shown its capability and validity to contribute to the learning enhancement of the feed forward NN towards accurate results in classification problems.
6. Comparison between BP, HSA and BPHSA
In this section, a comparative analysis is carried out in order to compare the BP-MLP, HSA-MLP and BPHSA-MLP algorithms. This comparison is based on the correct classification percentage of each technique for all five datasets used in this experiment. Figure 2 shows the correct classification percentage for all datasets. For the Iris, Glass, Cancer and Wine datasets, BP-MLP is the poorest in both classification accuracy and convergence speed among these algorithms. However, BPHSA-MLP has the fastest convergence speed and the highest correct classification rate with minimum error in reasonable time for the aforementioned datasets. For the Cancer and Thyroid datasets, the results show that BPHSA-MLP has better classification accuracy compared to HSA-MLP and BP-MLP. Although BP-MLP stopped learning at iteration 4777 in 68.4 seconds, it has the least convergence error among these algorithms for the Cancer dataset. Furthermore, HSA-MLP has the worst correct classification results in comparison with BPHSA-MLP and BP-MLP for the Cancer and Thyroid datasets, but in convergence speed HSA-MLP outperforms BP-MLP on all the aforementioned datasets.
Figure 2. Correct classification percentage comparison between learning algorithms
In terms of convergence rate, BPHSA-MLP has the fastest convergence speed among these algorithms in most of these experiments, with the ability to escape local minima and reduce the errors accordingly. However, BP-MLP suffered from a lack of continuity of the training process owing to the need to meet the steady state parameter within a specified number of iterations (meaning the errors are unchanged during that predefined number of iterations), which forces BP to halt the training. HSA-MLP was in second place in most of these experimental results. Moreover, no local minimum problem was encountered by HSA-MLP on any of the benchmark datasets. The combination of BP-MLP and HSA-MLP has solved the low performance of BP-MLP. This new scheme is able to perform a global search and has the potential to avoid the local minimum problem. As a result, BPHSA-MLP has better convergence speed and the best classification accuracy.
7. Conclusion
In this paper, a hybrid Back propagation and Harmony Search Algorithm called BPHSA-MLP is employed for NNs of the feed forward type to solve classification problems. The training performance and generalization ability of the BPHSA scheme were verified and tested using five classification benchmark datasets. The sum of squared errors, training time and accuracy were compared with standard HSA and standard BP. The experimental results show that the BPHSA-MLP scheme can successfully train feed forward type NNs with reasonable time, low MSE and high accuracy. The BPHSA-MLP is better than the compared algorithms in training and in the classification of test patterns. Therefore, the BPHSA technique can be a good candidate to train feed forward type NNs for classification problems. The scheme can also be used in training both supervised and unsupervised models.
References
[1] Wei G. Evolutionary Neural Network Based on New Ant Colony Algorithm. Eighth International Symposium on Computational Intelligence and Design. 2008; 1: 318-321.
[2] S Narayan, GA Tagliarini, EW Page. Enhancing MLP networks using a distributed data representation. IEEE Trans. Syst., Man, Cybern. B, Cybern. 1996; 26: 143.
[3] Castro PAD, FJ Von Zuben. Training multilayer perceptrons with a Gaussian Artificial Immune System. Evolutionary Computation (CEC), IEEE Congress. 2011: 1250-1257.
[4] ZW Geem, JH Kim, C Bae. Trenchless Water Pipe Condition Assessment Using Artificial Neural Network. Boston: Massachusetts. 2007: 1-9.
[5] Sunila Godara, Nirmal. Intelligent and Effective Decision Support System Using Multilayer Perceptron. International journal of engineering research and applications (IJERA). 2011; 1(3): 513-518.
[6] Katten A, Abdullahi R, Salam A. Harmony Search based supervised training of artificial neural networks. In proceedings of the first international conference on intelligent systems, modeling and simulation (ISMS 2010). 2010; 7(10): 105-110.
[7] Ming Z, W Lipo. Intelligent trading using support vector regression and multilayer perceptrons optimized with genetic algorithms. Neural Networks (IJCNN), the International Joint Conference. 2010: 1-5.
[8] Meng Joo E, L Fan. Genetic Algorithms for MLP Neural Network parameters optimization. Chinese Control and Decision Conference (CCDC). 2009: 3653-3658.
[9] Correa BA, AM Gonzalez. Evolutionary Algorithms for Selecting the Architecture of a MLP Neural Network: A Credit Scoring Case. IEEE 11th International Conference on Data Mining Workshops (ICDMW). 2011: 725-732.
[10] Liang L, L Shi-bao, et al. A Combinatorial Search Method Based on Harmony Search Algorithm and Particle Swarm Optimization in Slope Stability Analysis. CiSE International Conference on Computational Intelligence and Software Engineering. 2009: 1-4.
[11] Haza Nuzly Abdull Hamed SMSNS. Particle Swarm Optimization for Neural Network Learning Enhancement. Jurnal Teknologi. 2008; 49(D): 13-26.
[12] Raul Ramos-Pollan NGdPMAGL. Optimizing the area under the ROC curve in multilayer perceptron-based classifiers. IARIA. 2011: 75-81.
[13] Masood Zamani, Alireza Sadeghian. A Variation of Particle Swarm Optimization for Training of Artificial Neural Networks. In: Computational Intelligence and Modern Heuristics, Al-Dahoud Ali (Ed.). ISBN: 978-953-7619-28-2. InTech. 2010. Available from: http://www.intechopen.com/books/computational-intelligence-and-modern-heuristics.
[14] Kulluk S, L Ozbakir, et al. Training neural networks with harmony search algorithms for classification problems. Eng. Appl. Artif. Intell. 2011; 25(1): 11-19.
[15] Xin Sun, Qiongxin Liu, Lei Zhang. A BP Neural Network Model Based on Genetic Algorithm for Comprehensive Evaluation. Circuits, Communications and System (PACCS), Third Pacific-Asia Conference. 2011: 1-5.
[16] H Shayeghi, HA Shayanfar, G Azimi. A hybrid Particle Swarm Optimization Back Propagation Algorithm for Short Term Load Forecasting. International journal on Technical and physical problems (IJTPE). 2010; 4(2): 12-22.
[17] Siti Mariyam Shamsuddin, Razana Alwee, Maslina Darus. Study of Cost Functions in Three Term Backpropagation for Classification Problems. IEEE World Congress on Nature & Biologically Inspired Computing. 2009; 978(1): 564-570.
[18] Jing-Ru Zhang, Jun Zhang, Tat-Ming Lock, Michael R Lyu. A hybrid Particle Swarm Optimization-Back-Propagation Algorithm for Feed Forward Neural Network Training. Elsevier, Applied Mathematics and Computation. 2007: 1026-1037.
[19] Shulin Wang, Shuang Yin, Minghui Jiang. Hybrid Neural Network Based on GA-BP for Personal Credit Scoring. IEEE, Fourth International Conference on Natural Computation. 2008: 209-214.
[20] Saeed Tavakoli, Ehsan Valian, Shahram Mohanna. Feedforward Neural Network Using Intelligent Global Harmony Search. Springer. 2012: 125-131.
[21] E Alba, JF Chicano. Training neural networks with GA hybrid algorithms. In Proc. of GECCO 2004, ser. LNCS, K Deb et al. (Ed.). Springer Verlag, Berlin, Germany. 2004; 3102: 852-863.
[22] Marcio Carvalho, Teresa B Ludermir. Seventh International Conference on Hybrid Intelligent Systems, IEEE. 2007: 336-339.
[23] Masaya Yoshikawa, Kazuo Otani. Ant Colony Optimization Routing Algorithm with Tabu Search. Proceedings of the International Multi-Conference of Engineers and Computer Scientists. Hong Kong. 2010; 3(1): 1-4.
[24] Alireza Askarzadeh, Alireza Rezazadeh. Artificial neural network training using a new efficient optimization algorithm. Applied Soft Computing. 2013; (13): 1206-1213.
[25] Quan-Ke Pan, PN Suganthan, M Fatih Tasgetiren, JJ Liang. A self-adaptive global best harmony search algorithm for continuous optimization problems. Applied Mathematics and Computation, Elsevier. 2010; (216): 830-848.
[26] Abdirashid Salad Nur, Nor Haizan Mohd Radzi, Ashraf Osman Ibrahim. Artificial Neural Network Weight Optimization: A Review. TELKOMNIKA Indonesian Journal of Electrical Engineering. 2014; 12(9): 6897-6902.