TELKOMNIKA, Vol. 11, No. 10, October 2013, pp. 5588 ~ 5593
ISSN: 2302-4046

Received April 18, 2013; Revised June 23, 2013; Accepted July 10, 2013
A Novel Method to Optimize the Structure of BP Neural Networks

Changming Qiao, Shuli Sun*
Institute of Electronic Engineering, Heilongjiang University, Harbin, 150080, China, Ph.: +86-13796204098
*Corresponding author, e-mail: hlju501@126.com
Abstract

For a long time there has been no good method to determine the number of neurons in the hidden layer of a BP neural network. To address this problem, a novel algorithm based on the Akaike Information Criterion (AIC) is proposed in this paper to optimize the structure of BP neural networks. At the same time, this paper gives upper and lower bounds for the classical AIC to overcome its shortcomings. The simulation experiment shows that this method can select a more suitable network structure and can ensure the minimal output error with the optimal structure of the network.

Keywords: BP neural networks, AIC, network structure

Copyright © 2013 Universitas Ahmad Dahlan. All rights reserved.
1. Introduction

Neural network algorithms have been studied for many years and have achieved quite good results in many fields, such as artificial intelligence, information fusion, pattern recognition, fault diagnosis, intelligent control and so on [1-3]. Among all the neural network algorithms to date, the BP neural network (BPNN) is the most frequently and widely applied. It is a typical multi-layer feed-forward neural network and can solve many difficult problems with very complex nonlinearity. However, some shortcomings of the classical BPNN algorithm have been found in its long-term use. One of them is the lack of an effective method to determine the number of neurons in the hidden layer; in many practical applications, the method of "trial and error" or an "empirical formula" is still used. For this problem, many experts and scholars have studied different solutions.
Reference [4] proposed to determine a range for the number of neurons in the hidden layer with an empirical formula first, and then to expand the range and find the optimal value within it. But the essence of this method is still trial and error, so its practical value is limited. Reference [5] proposed an adaptive merging and growing algorithm (AMGA), which combines a genetic algorithm with a growth algorithm. Although this algorithm has a certain value, the problems of not knowing when to terminate the algorithm and of high computational complexity are very obvious, especially when dealing with large-scale classification problems. References [6-8] proposed algorithms based on agents. These algorithms are relatively good and can be used widely, but their computational complexity is so high that they are not suitable for the hardware implementation of neural networks. Reference [9] proposed a method to determine the scale of the hidden layer for single-hidden-layer binary neural networks. The theory of this method is rigorous and has high theoretical significance for practical applications. But in the end, that paper also pointed out that whether the upper bound determined by the method is truly an upper bound still needs to be discussed and proved.
For this problem, this paper proposes to use the AIC, which is used to determine model order in system identification theory, to optimize the number of neurons in the hidden layer of a BPNN. Moreover, this paper analyses the shortcomings of the AIC in detail and gives a solution. The simulation experiment shows that this method can select the most suitable number of neurons in the hidden layer quickly, and the mean square error (MSE) of the entire network output is minimal. At the same time, it is easy to implement and has low computational complexity.
2. BPNN and Its Defects

In 1974, P. Werbos proposed a learning algorithm suitable for multi-layer networks in his doctoral thesis [15]. Later, in 1986, the U.S. PDP group studied the algorithm deeply and proposed the BP algorithm, so a neural network trained by this algorithm is called a BPNN. The typical BPNN has a network topology with 3 layers [10], including an input layer, a hidden layer and an output layer, as shown in Figure 1, where $x_1, x_2, \dots, x_n$ are the network inputs, $y_1, y_2, \dots, y_l$ are the network outputs, and $W_{ij}$ and $V_{ki}$ are the connection weights. Generally, the BP algorithm includes four steps: the forward propagation of the samples, the calculation of the output error, the backward propagation of the error, and the adjustment of the weights and thresholds.
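As an illustration of these four steps, the following minimal NumPy sketch performs one BP iteration for a single sample on a 3-layer sigmoid network. The layer sizes, learning rate and weight initialization are illustrative assumptions, not values from the paper.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Illustrative sizes: P inputs, M hidden neurons, K outputs (all assumed here).
P, M, K, lr = 2, 5, 1, 0.1
rng = np.random.default_rng(0)
W = rng.normal(scale=0.5, size=(M, P))   # input-to-hidden weights
b1 = np.zeros(M)                         # hidden-layer thresholds
V = rng.normal(scale=0.5, size=(K, M))   # hidden-to-output weights
b2 = np.zeros(K)                         # output-layer thresholds

def bp_step(x, y):
    """One iteration of the four BP steps for a single sample (x, y)."""
    global W, b1, V, b2
    # 1) forward propagation of the sample
    h = sigmoid(W @ x + b1)
    a = sigmoid(V @ h + b2)
    # 2) calculation of the output error
    err = a - y
    # 3) backward propagation of the error
    delta_out = err * a * (1.0 - a)                # sigmoid derivative at output
    delta_hid = (V.T @ delta_out) * h * (1.0 - h)
    # 4) adjustment of the weights and thresholds (gradient descent)
    V  -= lr * np.outer(delta_out, h)
    b2 -= lr * delta_out
    W  -= lr * np.outer(delta_hid, x)
    b1 -= lr * delta_hid
    return 0.5 * float(err @ err)                  # squared error of this sample
```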
The hidden layer of a BPNN can be considered an internal representation of the input layer. It is mainly used to extract the characteristics that distinguish one kind of input pattern from other input patterns and to pass them to the output layer. This process can be seen as a process of adaptive weight adjustment. According to the Kolmogorov theorem, if a 3-layer neural network consists of Sigmoid-type neurons and the number of neurons in the hidden layer is large enough, the network can realize an arbitrary nonlinear mapping with arbitrary accuracy. But this does not mean that more is better; the performance of a neural network is usually evaluated by calculating the output error on test samples.
Barron considers that the error comes from two aspects [11]: the approximation error (bias) and the estimation error (variance). When the number of neurons in the hidden layer increases, the approximation error decreases gradually, but the estimation error gradually increases at the same time, so a balance between them is needed. If the number of neurons in the hidden layer is excessive, the network is likely to fit the noise and other redundant information included in the data. This results in over-training, and the network will fall into local minimum points with high probability. On the contrary, if the number of neurons in the hidden layer is too small, the accuracy of the network output is low and cannot reflect the nonlinear relationship between the input and output data. Therefore, how to determine an appropriate number of neurons in the hidden layer is the key to constructing an effective BPNN.
3. AIC and Its Defects

The overview of AIC: In 1973, the Japanese scholar Akaike proposed a selection criterion for statistical models called the Akaike Information Criterion (AIC) [14-15], as shown in formula (1).
$$AIC(k) = -2\ln(L) + 2k \qquad (1)$$
where $k$ is the number of parameters that can be adjusted independently in the model, reflecting the complexity of the model, and $L$ is the maximum of the model likelihood function, reflecting the fitting accuracy. The process of this algorithm is: first estimate the parameters with the maximum likelihood method, and then calculate the value of the likelihood function and AIC(k); the model is the best-fitted model when the value of AIC(k) is minimal. The AIC gives a suitable model by striking a balance between fitting accuracy and model complexity.
Specifically, assume the real number of adjustable parameters in the system is $n_0$, and calculate the value of AIC(k) starting from $k = 1$. In the beginning, $k$ is far less than $n_0$ and the fitting accuracy must be poor, so the former term $-2\ln(L)$ in formula (1) is larger and plays the leading role; with the increase of $k$, the value of $-2\ln(L)$ gradually decreases, and when $k$ gets close to $n_0$, the latter term $2k$ gradually increases and plays the leading role. Therefore, a minimum point appears at $n_0$.
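To make this selection procedure concrete, the following sketch scans candidate orders and keeps the one with the smallest AIC(k); fit_and_loglik is a hypothetical user-supplied routine that fits a k-parameter model by maximum likelihood and returns the maximized log-likelihood ln(L).

```python
import math

def select_order_by_aic(data, k_max, fit_and_loglik):
    """Scan k = 1..k_max and return the order with the smallest AIC(k).

    fit_and_loglik(data, k) is a hypothetical helper that fits a model with k
    independently adjustable parameters by maximum likelihood and returns ln(L).
    """
    best_k, best_aic = None, math.inf
    for k in range(1, k_max + 1):
        loglik = fit_and_loglik(data, k)
        aic = -2.0 * loglik + 2.0 * k        # formula (1)
        if aic < best_aic:
            best_k, best_aic = k, aic
    return best_k, best_aic
```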
The analysis of defects: Firstly, the classical AIC considers only the effects caused by the residuals and the model order from the mathematical aspect, and lacks a physical understanding of the model. Therefore, it may fail to identify all modes during mode analysis. Moreover, the AIC has no lower bound on the model order, so it is quite possible that the model order given by the AIC is less than the real value in applications.
Secondly, although there is generally an upper bound on the model order in practical applications, it may be absent in some extreme cases. If so, the calculation process of the AIC will run endlessly, which means the AIC will not converge.
4. Derivation and Improvement

Theoretical derivation: It can be seen from formula (1) that the maximum of the model likelihood function $L$ is the only value that needs to be calculated when the AIC is applied to the BPNN. Generally, assume the parameters that need to be adjusted in a neural network model are $\theta \in R^m$, including all connection weights and thresholds, the input variables are $X \in R^k$ and the output variables are $Y \in R^l$. Then a neural network model can be expressed as:

$$Y = f(X, \theta); \qquad \theta \in R^m,\; X \in R^k,\; Y \in R^l \qquad (2)$$
Assume the training sample set of the network is $(x_i, y_i),\ i = 1, 2, \dots, N$, where $N$ is the number of training samples; $x_i = (x_i^{(1)}, x_i^{(2)}, \dots, x_i^{(P)})$ is the input vector and $P$ is the number of neurons in the input layer; $y_i = (y_i^{(1)}, y_i^{(2)}, \dots, y_i^{(K)})$ is the output vector and $K$ is the number of neurons in the output layer. Also, assume the response function of the output neurons is the Sigmoid. After the network is fully trained, when the $i$th sample is input, if the actual input and the ideal input of the $k$th output neuron are $c_i^{(k)}$ and $d_i^{(k)}$ respectively, the input error of this output neuron is:

$$\varepsilon_i^{(k)} = c_i^{(k)} - d_i^{(k)} \qquad (3)$$
It can be known from the central limit theorem that the errors $\varepsilon_i^{(k)}$ can be seen as independent random variables whose probability distribution obeys a normal distribution, that is, $\varepsilon_i^{(k)} \sim N(\mu_k, \sigma_k^2)$, where:

$$\mu_k = \frac{1}{N}\sum_{i=1}^{N}\varepsilon_i^{(k)}, \qquad \sigma_k^2 = \frac{1}{N}\sum_{i=1}^{N}\left(\varepsilon_i^{(k)} - \mu_k\right)^2 \qquad (4)$$
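Assuming the reconstruction of formula (4) above, the per-neuron statistics can be computed directly from the matrix of errors, for example:

```python
import numpy as np

def error_statistics(eps):
    """eps: array of shape (N, K) holding the errors eps[i, k] of formula (3).

    Returns the per-output-neuron mean mu_k and variance sigma_k^2 of
    formula (4), computed over the N training samples.
    """
    mu = eps.mean(axis=0)                     # mu_k, shape (K,)
    sigma2 = ((eps - mu) ** 2).mean(axis=0)   # sigma_k^2, shape (K,)
    return mu, sigma2
```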
The conditional probability density function of $d_i^{(k)}$ is [14]:

$$f(d_i^{(k)} \mid x_i, \theta) = f(\varepsilon_i^{(k)} \mid \theta) = \frac{1}{\sqrt{2\pi}\,\sigma_k}\exp\left(-\frac{\left(\varepsilon_i^{(k)} - \mu_k\right)^2}{2\sigma_k^2}\right) \qquad (5)$$
According to the Sigmoid transfer function, we can obtain the conditional probability density of the output neurons:

$$f(a_i^{(k)} \mid x_i, \theta) = \frac{1}{\sqrt{2\pi}\,\sigma_k\, a_i^{(k)}\left(1 - a_i^{(k)}\right)}\exp\left(-\frac{\left(\ln\dfrac{a_i^{(k)}}{1 - a_i^{(k)}} - \mu_k\right)^2}{2\sigma_k^2}\right) \qquad (6)$$

where $a_i^{(k)}$ is the actual output of the $k$th output neuron for the $i$th sample.
Assume the network output error is:

$$r_i^{(k)} = y_i^{(k)} - a_i^{(k)} \qquad (7)$$
then:

$$f(r_i^{(k)} \mid x_i, \theta) = f(a_i^{(k)} \mid x_i, \theta) \qquad (8)$$
The maximum likelihood function is:

$$L = \prod_{i=1}^{N}\prod_{k=1}^{K} f(r_i^{(k)} \mid x_i, \theta) \qquad (9)$$
Substituting formula (9) into formula (1):

$$AIC\_BP = -2\sum_{i=1}^{N}\sum_{k=1}^{K}\ln\left(f(r_i^{(k)} \mid x_i, \theta)\right) + 2(P + K + 1)M \qquad (10)$$

where $P$, $M$ and $K$ are the numbers of neurons in the input, hidden and output layers respectively.
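The following sketch evaluates AIC_BP for one trained network. It follows the reconstruction of formulas (3)-(10) given above: the ideal and actual inputs of each output neuron are recovered from the outputs through the logit transform, the error statistics of formula (4) are estimated from them, and the parameter count is taken as (P+K+1)M. All of these are assumptions of that reconstruction rather than code from the paper.

```python
import numpy as np

def logit(p, tiny=1e-12):
    p = np.clip(p, tiny, 1.0 - tiny)
    return np.log(p / (1.0 - p))

def aic_bp(y_ideal, a_actual, P, M, K):
    """Sketch of formula (10) for one candidate hidden-layer size M.

    y_ideal, a_actual: (N, K) arrays of ideal and actual sigmoid outputs.
    The ideal input d and the actual input c of each output neuron are taken
    as the logits of y_ideal and a_actual, so eps = c - d is the error of (3).
    """
    a = np.clip(a_actual, 1e-12, 1.0 - 1e-12)
    eps = logit(a_actual) - logit(y_ideal)                   # formula (3)
    mu, sigma2 = eps.mean(axis=0), eps.var(axis=0) + 1e-12   # formula (4)
    # log-density of the actual output: Gaussian in the input error (5)
    # plus the Jacobian term 1/(a(1-a)) of the Sigmoid change of variables (6)
    log_f = (-0.5 * np.log(2.0 * np.pi * sigma2)
             - np.log(a * (1.0 - a))
             - (eps - mu) ** 2 / (2.0 * sigma2))
    return float(-2.0 * log_f.sum() + 2.0 * (P + K + 1) * M)  # formula (10)
```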
The improved method: Consider the shortcomings of the AIC described in Section 3.2, namely that the parameter $M$ in formula (10) lacks a lower bound and that its upper bound is not clear. The following formula is used to determine the lower bound of $M$:

$$M_{\min} = \log_2 P \qquad (11)$$
Generally, the parameter $M$ has an upper bound, but in some extreme cases, especially when the determined lower bound is not good, the calculation can easily run endlessly. In fact, it is not necessary to run up to the real upper bound of the model in the calculation process; only a staged upper bound of $M$ is needed, given by the following formula:

$$M_{\max} = \sqrt{P + K} + a \qquad (12)$$

where $a$ is a constant between 0 and 10.
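A small helper for the two bounds, assuming the reconstructions of formulas (11) and (12) above together with rounding up to integers; with P = 2, K = 1 and a = 3 it reproduces the initial bounds 1 and 5 reported in the experiment of Section 5.

```python
import math

def hidden_layer_bounds(P, K, a=3):
    """Staged search bounds for the hidden-layer size M.

    Assumes M_min = log2(P) (formula 11) and M_max = sqrt(P + K) + a
    (formula 12), both rounded up to the nearest integer.
    """
    m_min = max(1, math.ceil(math.log2(P)))
    m_max = math.ceil(math.sqrt(P + K) + a)
    return m_min, m_max

# Example: the experiment in Section 5 (P = 2 inputs, K = 1 output, a = 3)
# gives (1, 5), matching the initial bounds reported there.
print(hidden_layer_bounds(2, 1, a=3))
```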
The implementation steps: According to the method described above, the implementation steps are as follows:

Step 1: Determine $P$ and $K$ according to the practical application of the system;
Step 2: Calculate the initial lower bound $M_{\min}$ and upper bound $M_{\max}$ of $M$;
Step 3: Train the network in a loop and calculate the AIC_BP values over the range $M_{\min} \sim M_{\max}$;
Step 4: If the minimum point of AIC_BP does not appear, set $M_{\min} = M_{\max}$ and $M_{\max} = 2M_{\max}$, then go to Step 3; otherwise, stop training.
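The four steps can be sketched as the loop below; train_bpnn is a hypothetical helper that trains a BP network with M hidden neurons and returns the ideal and actual outputs on the training set, while hidden_layer_bounds and aic_bp are the sketches given earlier.

```python
def select_hidden_size(X, Y, P, K, a=3, max_doublings=5):
    """Steps 1-4: scan M over staged ranges until a minimum of AIC_BP appears.

    train_bpnn(X, Y, M) is a hypothetical helper returning (y_ideal, a_actual)
    for a network with M hidden neurons trained on the samples (X, Y).
    """
    m_min, m_max = hidden_layer_bounds(P, K, a)          # Step 2
    scores = {}
    for _ in range(max_doublings):
        for M in range(m_min, m_max + 1):                # Step 3
            if M not in scores:
                y_ideal, a_actual = train_bpnn(X, Y, M)
                scores[M] = aic_bp(y_ideal, a_actual, P, M, K)
        best = min(scores, key=scores.get)
        if best < m_max:                                 # minimum appeared inside the range
            return best, scores
        m_min, m_max = m_max, 2 * m_max                  # Step 4: widen the range and repeat
    return min(scores, key=scores.get), scores
```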
5. Experiment Analysis

The experiment is to complete a complex nonlinear function regression. The function is:

$$y = 20 + x_1^2 + x_2^2 - 10\cos(2\pi x_1) - 10\cos(2\pi x_2) \qquad (13)$$
First, 1000 groups of input data were randomly generated, of which 800 groups serve as training samples and 200 groups as testing samples. Then the algorithm is tested 3 times with 10%, 20% and 30% noise respectively, the noise being added randomly. According to formula (13), it is clear that there are 2 nodes in the input layer and 1 node in the output layer. We can also calculate that the initial lower bound ($M_{\min}$) and upper bound ($M_{\max}$) of the neurons in the hidden layer are 1 and 5 respectively when the value of $a$ is 3. Moreover, the network is trained following the algorithm in Section 4.3, and the values of AIC_BP and the output MSE are calculated; the results are shown in Figures 1 and 2.
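A possible sketch of this data generation follows; the input range [-2, 2], the Gaussian noise model scaled to the stated percentage of the target's standard deviation, and the random seed are all illustrative assumptions not specified in the paper.

```python
import numpy as np

def make_dataset(n=1000, n_train=800, noise_pct=0.10, seed=0):
    """Generate the regression data set described in Section 5.

    Inputs are drawn uniformly from [-2, 2]^2 (an assumed range) and targets
    follow formula (13); zero-mean Gaussian noise with standard deviation
    noise_pct of the target's standard deviation is added (an assumed model).
    """
    rng = np.random.default_rng(seed)
    x = rng.uniform(-2.0, 2.0, size=(n, 2))
    y = (20.0 + x[:, 0] ** 2 + x[:, 1] ** 2
         - 10.0 * np.cos(2.0 * np.pi * x[:, 0])
         - 10.0 * np.cos(2.0 * np.pi * x[:, 1]))
    y_noisy = y + rng.normal(scale=noise_pct * y.std(), size=n)
    return (x[:n_train], y_noisy[:n_train]), (x[n_train:], y_noisy[n_train:])

(train_x, train_y), (test_x, test_y) = make_dataset(noise_pct=0.10)
```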
It can be seen from Figure 1 that the minimal value of AIC_BP appears when the number of neurons is 36, 44 and 54 respectively. Moreover, with the increase of neurons in the hidden layer, AIC_BP decreases first and then increases. When the percentage of noise increases, the number of neurons required for the best structure of the BPNN also increases. The results clearly show that the algorithm proposed in this paper is correct. At the same time, the output MSE shows the same trend as AIC_BP, as shown in Figure 2. Also, the minimal value of the MSE appears at the same point as that of AIC_BP. Finally, the BPNN is tested with the structure of 2x36x1.
Figure 3 shows that the fitting result is very good; it almost coincides with the real image of the function, with only small errors at the edge of the image. Thus, the nonlinear mapping ability of the network with this structure is very good.
[Figure: AIC_BP value versus the number of neurons in the hidden layer, with curves for 10%, 20% and 30% noise.]

Figure 1. The change trend of AIC_BP
[Figure: MSE value versus the number of neurons in the hidden layer, with curves for 10%, 20% and 30% noise.]

Figure 2. The change trend of output MSE
[Figure: (a) Real function image, (b) BP neural network fitting image, (c) Error image.]

Figure 3. Fitting results and error
6. Conclusion

When using a BP neural network, the number of neurons in the hidden layer can usually only be determined empirically. To address this problem, this paper uses the AIC criterion for model order from information theory, gives its upper and lower bounds, and then proposes an optimal selection method for the BP neural network structure based on the improved AIC criterion. The simulation result shows that with this method we can select the optimal model structure suitable for the practical problem and obtain very satisfactory output results with this structure. However, it is very important to determine the initial lower bound of the AIC criterion properly, otherwise the calculation may not converge; if so, the initial lower bound calculated from formula (11) should be properly reduced before starting to select the network structure.
Acknowledgements

This work was supported in part by the Natural Science Foundation of China under Grant NSFC-60874062, the Program for High-qualified Talents under Grant Hdtd2010-03, and the Electronic Engineering Province Key Laboratory.
References

[1] H Zhao, J Zhang. Pipelined Chebyshev Functional Link Artificial Recurrent Neural Network for Nonlinear Adaptive Filter. IEEE Trans. Systems, Man, and Cybernetics, Part B: Cybernetics. 2010; 40: 162-172.
[2] Omer Deperlioglu, Utku Kose. An educational tool for artificial neural networks. Computers and Electrical Engineering. 2011; 37: 392-402.
[3] Abe T, Saito T. An approach to prediction of spatio-temporal patterns based on binary neural networks and cellular automata. IEEE International Joint Conference on Neural Networks. 2008; 2494-2499.
[4] Yan Hong, Guan Yan-ping. Method to Determine the Quantity of Internal Nodes of Back Propagation Neural Networks and Its Demonstration. Control Engineering of China. 2009; 16(S1): 100-102.
[5] Islam MM, Sattar MA, Amin F, et al. A New Adaptive Merging and Growing Algorithm for Designing Artificial Neural Networks. IEEE Trans on Systems, Man, and Cybernetics—Part B: Cybernetics. 2009; 39(3): 705-718.
[6] Gao Peng-yi, Chen Chuan-bo, Qin Sheng. A Novel Algorithm to Optimize the Hidden Layer of Neural Networks. Computer Engineering & Science, China. 2010; 32(5): 30-33.
[7] YU Zhijun. RBF Neural Networks Optimization Algorithm and Application on Tax Forecasting. TELKOMNIKA Indonesian Journal of Electrical Engineering. 2013; 11(7).
[8] Patricia Melin, Victor Herrera, Danniela Romero, Fevrier Valdez, Oscar Castillo. Genetic Optimization of Neural Networks for Person Recognition based on the Iris. TELKOMNIKA Indonesian Journal of Electrical Engineering. 2012; 10(2): 309-320.
[9] Lu Yang, Yang Juan, Wang Qiang. The Upper Bound of the Minimal Number of Hidden Neurons for the Parity Problem in Binary Neural Networks. Science China. 2012; 42(3): 352-361.
[10] Li Ying, Wang Zheng, Ao Zhi-guang. Optimization for Breakout Prediction System of BP Neural Network. Control and Decision, China. 2010; 25(3): 453-456.
[11] Barron AR. Approximation and estimation bounds for artificial neural networks. Machine Learning. 1994; 14: 115-133.
[12] Hannan EJ. The Estimation of the Order of an ARMA Process. The Annals of Statistics. 1980; 8(5): 1071-1081.
[13] Stoica P, Selen Y. Model-order selection: a review of information criterion rules. IEEE Signal Processing Magazine. 2004; 21: 36-47.
[14] Xie Xiao-heng, He You-hua. On the Error of Kernel Estimation for Conditional PDF and Its Optimal Bandwidth Selection. OR Transactions, China. 2008; 12(3): 13-22.
[15] Werbos PJ. Beyond regression: New tools for prediction and analysis in the behavioral sciences. [PhD thesis]. Cambridge (MA): Harvard University, 1974.