TELKOMNIKA Indonesian Journal of Electrical Engineering
Vol. 14, No. 1, April 2015, pp. 130 ~ 139
DOI: 10.11591/telkomnika.v14i1.7272
Received January 1, 2015; Revised March 10, 2015; Accepted March 25, 2015
Hardware Implementation of FIR Neural Network for Applications in Time Series Data Prediction

Kuldeep S. Rawat¹, G. H. Massiha*²
¹Department of Technology, Elizabeth City State University, North Carolina, USA 27909
²Department of Industrial Technology, University of Louisiana at Lafayette, Louisiana 70504
*Corresponding author, e-mail: ksrawat@mail.ecsu.edu¹, massiha@louisiana.edu²
Abstract
Time series data prediction is used in several applications in the areas of science and engineering. Time series prediction models have traditionally been implemented using statistical approaches, but recently neural networks are being applied to time series prediction because of their inherent properties and capabilities. A variation of the standard neural network called the finite impulse response (FIR) neural network has proven highly successful in achieving a high degree of prediction accuracy across various time series prediction applications. These applications are time critical and involve huge amounts of computation that run slowly on a general purpose processor; hence, dedicated hardware is required. In this paper, the authors present a hardware implementation of an FIR neural network for applications in time series data prediction. The implementation is divided into (i) an off-board part, where the training algorithm and neural network configuration are implemented in Matrix Laboratory (MATLAB) and simulated with various benchmark time series data sets, and (ii) an on-board part, where the entire system is modeled in a hardware description language (HDL). The simulation experiment, hardware building blocks, implementation framework, and hardware design flow are discussed in this paper. The hardware resource utilization and timing information are also reported.
Keywords: neural network, FPGA, time-series data, HDL, electronics, rapid prototyping
Copyright © 2015 Institute of Advanced Engineering and Science. All rights reserved.
1. Introduction
A time series is a set of observations generated sequentially in time. Often the problem of interest in these observations is the prediction of some future values based on the recent past. This problem of time series prediction has applications in such areas as hydrology, transportation, telecommunications, quality control, the financial world, and industrial processes [1].
The goal of time series prediction is to find a function $f: \mathbb{R}^{N} \rightarrow \mathbb{R}$ that gives an estimate of x at time t + d, so that:

$\hat{x}(t+d) = f\big(x(t), x(t-1), \ldots, x(t-N+1)\big)$    (1)

$\hat{x}(t+d) = f\big(\mathbf{y}(t)\big)$    (2)

where $\mathbf{y}(t)$ is the N-ary vector of lagged x values.
Artificial neural networks (ANNs) are general function approximators that have been applied in pattern recognition, classification, and process control [1, 2]. Recently, they are being used in areas of prediction where regression and other related statistical techniques have traditionally been used. One class of ANNs is the feedforward networks, which have one or more hidden layers of neurons, also referred to as hidden units [2]. The nonlinearities of the hidden units allow the network to extract higher-order statistics and are particularly valuable when the size of the input layer is large. The architectural graph shown in Figure 1 illustrates the layout of a multilayer feedforward neural network. For brevity, the network in Figure 1 is referred to as a 4-3-3-4 network in that it has 4 source nodes, 2 hidden layers of 3 hidden neurons, and 4 output neurons.
Figure 1. Fully connected feedforward network
The standard neural network method of performing time series prediction is to approximate the function f with a neural network architecture, using a set of N-tuples (finite sequences of data points) as input and a single output as the target value of the network. This method is often called the sliding window technique, as the N-tuple input slides over the training set [2]. The basic architecture of a sliding window method for time series prediction is shown in Figure 2.
Figure 2. Sliding window based time series predictor
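As an illustration of the sliding window scheme in Figure 2, the short Python sketch below builds (input, target) pairs from a one-dimensional series; the window length N and the prediction horizon d follow Equations (1) and (2). The function name and the data values are illustrative only and are not taken from the authors' MATLAB code.

```python
# Illustrative sketch of the sliding-window scheme of Figure 2 (not the
# authors' MATLAB code): each input is the N most recent samples and the
# target is the sample d steps ahead, as in Equations (1)-(2).
def sliding_window(series, n_lags, horizon=1):
    """Return lists of (lagged input vector, target value) pairs."""
    inputs, targets = [], []
    for t in range(n_lags - 1, len(series) - horizon):
        window = series[t - n_lags + 1:t + 1]      # x(t-N+1), ..., x(t)
        inputs.append(list(reversed(window)))      # x(t), x(t-1), ..., x(t-N+1)
        targets.append(series[t + horizon])        # x(t+d)
    return inputs, targets

# Example: predict one step ahead from the last 3 samples.
x = [0.1, 0.4, 0.2, 0.8, 0.5, 0.9]
X, y = sliding_window(x, n_lags=3, horizon=1)
print(X[0], y[0])   # [0.2, 0.4, 0.1] 0.8
```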
A time series data predictor can be implemented either in software or in hardware. Software solutions are comparatively slow because large numbers of computations are involved. Several researchers have adopted hardware implementation with great success [3, 4]. These hardware implementations facilitate the use of neural networks in real-time applications. Such real-time applications range from predicting voice traffic demand, temperature prediction of a blast furnace, and prediction of product quality in a chemical process, to rainfall-runoff modeling [1].
These applications involve huge amounts of computation that run slowly on a general-purpose processor [5]. Using dedicated hardware, one can achieve higher speed and real-time analysis of data and prediction results. Recently, reconfigurable hardware solutions in the form of FPGAs have offered high performance along with the ability to be electrically reprogrammed to accommodate changes in the design and algorithm [6-8].
This work focuses on the FPGA implementation of a finite impulse response (FIR) neural network for time series data prediction. The first step involves simulation of the temporal backpropagation training algorithm using Matrix Laboratory (MATLAB). The outcome of the MATLAB simulation is the set of design parameters, i.e., the FIR neural network topology that gives the best prediction. This information is later used for the hardware implementation, where the design is modeled in a hardware description language (HDL).
The rest of the paper is organized as follows. The next section presents the FIR neural network model. Section 3 presents in detail the building blocks of the FIR neural network. In Section 4, the authors discuss the hardware implementation framework; both the simulation experiment and the FPGA implementation are discussed in this section. Finally, relevant conclusions are presented.
2. FIR Neural Network
The finite impulse response (FIR) neural network, first proposed by Eric Wan, had great success at the Santa Fe Institute (SFI) time series prediction competition [9, 10]. The FIR neural network design outperformed the other competitors in the prediction tasks. In an FIR neural network, each neuron is extended to be able to process temporal features by replacing each synaptic weight with an FIR filter [2, 10]. An FIR neuron model is shown in Figure 3, with the corresponding time delay neural network representation shown in Figure 4.
Figure 3. FIR neuron model
An FIR neural network's input layer consists of FIR filters, feeding the data into the neurons in the hidden layer. Similar to conventional feedforward networks, an FIR neural network may have one or several hidden layers. The output layer consists of FIR neurons that receive their inputs from the previous hidden layer. The network shown in Figure 4 consists of three layers, with a single output neuron and two neurons in the hidden layer.
Figure 4. A time delay neural network representation of an FIR neural network
As seen in Figure 4, all the connections are delayed (time processed) before passing on to the neurons in the next layer. In effect, the network is unable to learn temporal features that are longer than its filter lengths summed together. Consequently, selection of the lengths of the FIR filters is quite critical in achieving good prediction performance.
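The following Python sketch illustrates the behaviour described above for a single FIR neuron: each synapse filters the recent history of its input and the filtered sums are squashed by a sigmoid (Equation (4)). It is a conceptual model only; the function names, tap values, and bias are hypothetical and do not come from the paper's implementation.

```python
import math

# Illustrative model of a single FIR neuron (Figure 3), not the hardware design:
# every synapse is an FIR filter applied to the recent history of its input,
# and the filtered sums plus a bias are squashed by a sigmoid.
def fir_synapse(history, taps):
    """history[0] is the newest sample; taps are the filter coefficients."""
    return sum(h * w for h, w in zip(history, taps))

def fir_neuron(histories, tap_sets, bias):
    """histories[i] is the recent history seen on synapse i (newest first)."""
    s = bias + sum(fir_synapse(h, w) for h, w in zip(histories, tap_sets))
    return 1.0 / (1.0 + math.exp(-s))     # sigmoid activation, Equation (4)

# Example: one neuron, two synapses, 3-tap filters (hypothetical values).
histories = [[0.5, 0.2, 0.1], [0.9, 0.4, 0.0]]
tap_sets  = [[0.3, 0.1, -0.2], [0.2, 0.2, 0.5]]
print(fir_neuron(histories, tap_sets, bias=0.05))
```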
3. Building Blocks of FIR Neural Network
The basic computing modules used in the FIR neural network design are addition, multiplication, and the unit delay. Such circuits as the adder tree, the multiplier-accumulator unit, and the FIR filter can be designed using these basic modules. In this work, the authors used a digit-serial architecture that combines the area efficiency of a bit-serial architecture with the time efficiency of a bit-parallel architecture [11, 12]. In the digit-serial approach, data words are divided into digits of size N, which are processed in one clock cycle. Architectures based on the digit-serial approach offer a better overall solution considering the tradeoffs between speed, efficient area utilization, throughput, I/O pin limitations, and power consumption. The digit-serial approach also leads to a regular layout and the possibility of building a pipeline with it. A brief description of the digit-serial architecture of each computational unit used in the FIR neural network implementation is as follows.
3.1. Digit-serial Adder and Digit-serial Multiplier
A basic element in a digit-serial arithmetic implementation is the digit-serial adder shown in Figure 5(a). The two operands, A and B, are fed one digit at a time into the digit-serial adder. The addition is done N bits at a time, with the carry rippling from one full adder to the next. The carry-out from the digit-serial adder is fed back into the first full adder during the next clock cycle, when the next digits of the inputs have arrived. The digit-serial adder is further used to form the digit-serial multiplier.
Figure 5. (a) A digit-serial adder module; (b) digit-serial multiplier module
Multipliers are used to multiply the weight values with the incoming time series data. The simplest approach is add-shift multiplication, as shown in Figure 5(b). The N-bit operands for multiplication are stored in two registers and multiplied with each other in N steps. A 2-input AND gate generates each partial product.
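A behavioural Python sketch of the digit-serial addition described above is given below; it processes N bits per "clock cycle" and feeds the carry-out back into the next cycle, mirroring Figure 5(a). The digit width, operand values, and function names are illustrative assumptions, not the VHDL design.

```python
# Behavioural sketch of the digit-serial adder of Figure 5(a) (not the VHDL):
# operands arrive least-significant digit first, N bits per clock cycle, and
# the carry-out of each digit is fed back into the next cycle.
def digit_serial_add(a_digits, b_digits, digit_bits):
    """a_digits/b_digits: lists of digits, least-significant digit first."""
    carry, out = 0, []
    for a_d, b_d in zip(a_digits, b_digits):
        s = a_d + b_d + carry
        out.append(s & ((1 << digit_bits) - 1))   # the digit produced this cycle
        carry = s >> digit_bits                   # carry fed back next cycle
    return out, carry

# Example with N = 2 (2-bit digits): 13 + 7 = 20.
a = [0b01, 0b11]          # 13 = 0b1101 split into digits 01, 11 (LSD first)
b = [0b11, 0b01]          #  7 = 0b0111 split into digits 11, 01
digits, carry = digit_serial_add(a, b, digit_bits=2)
value = sum(d << (2 * i) for i, d in enumerate(digits)) + (carry << (2 * len(digits)))
print(digits, carry, value)   # [0, 1] 1 20
```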
3.2. Digit-serial Pipelined Multiplier
In order to increase the throughput of the digit-serial multiplier, each digit-serial multiplier module (DSMM) is connected in a systolic array fashion to implement a very fine-grained pipeline [13]. The bits of the multiplier are supplied one digit at a time, starting with the least significant digit, while the bits of the multiplicand are supplied as a parallel word. Each partial product is shifted and then added to the previous partial products. Figure 6 shows a digit-serial pipelined multiplier connected in systolic array fashion. Pipelining is done in order to limit the critical path propagation delays between registers. In the digit-serial pipelined multiplier shown in Figure 6, the pipelining limits the propagation to a 2-bit adder in the digit-serial multiplier with N = 2.
Figure 6. Digit-serial pipelined multiplier
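The sketch below models the digit-serial shift-and-add behaviour of Figures 5(b) and 6 in Python: the multiplier arrives one digit at a time (least significant digit first), the multiplicand is available as a parallel word, and each partial product is shifted and accumulated. It abstracts away the systolic pipelining and registers; all names and values are illustrative.

```python
# Behavioural sketch of digit-serial multiplication (Figures 5(b)/6), not the
# systolic VHDL: one multiplier digit per clock, multiplicand as a full word.
def digit_serial_multiply(multiplicand, multiplier_digits, digit_bits):
    acc = 0
    for i, d in enumerate(multiplier_digits):          # one digit per clock
        partial = multiplicand * d                      # partial product
        acc += partial << (digit_bits * i)              # shift, then accumulate
    return acc

# Example with N = 2 (2-bit digits): 11 x 6 = 66, 6 = digits 10, 01 (LSD first).
print(digit_serial_multiply(11, [0b10, 0b01], digit_bits=2))
```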
3.3. Multiplier-accumulator Unit
In the multiply-accumulate (MAC) operation, the synaptic filter coefficients (weights) are multiplied with the time series data and the results are added to an accumulator. Figure 7 shows the pipelined structure of the multiply-accumulate unit. The MAC unit consists of a multiplier followed by an adder and an accumulator register, which stores the result when clocked. The output of the register is fed back to one input of the adder, so that on each clock cycle the output of the multiplier is added to the register.
Figure 7. Multiply-accumulate unit
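A minimal register-level sketch of the MAC behaviour in Figure 7 follows; on every simulated clock the multiplier output is added to the accumulator register that feeds back into the adder. The class name and the weight/data pairs are hypothetical.

```python
# Register-level sketch of the MAC unit in Figure 7 (illustrative only): on
# each clock the multiplier output is added to the accumulator register.
class MAC:
    def __init__(self):
        self.acc = 0                      # accumulator register

    def clock(self, weight, sample):
        self.acc += weight * sample       # register output feeds back into the adder
        return self.acc

mac = MAC()
for w, x in [(2, 3), (1, 5), (4, -1)]:    # hypothetical weight/data pairs
    mac.clock(w, x)
print(mac.acc)                             # 2*3 + 1*5 + 4*(-1) = 7
```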
3.4. Digit-serial FIR Filter Design
As discussed earlier, in the case of an FIR neural network, a finite impulse response filter replaces the synaptic weights [10]. An FIR filter can be implemented using just three digital hardware elements: a unit delay (a latch), a multiplier, and an adder. Figure 8 shows a tapped delay line implementation of an FIR filter. The unit delay simply updates its output once per sample period, using the value of the input as its new output value. An FIR filter can be represented as the convolution sum given in Equation (3). Notice that at each k we have access to the M + 1 samples x(k), x(k-1), x(k-2), ..., x(k-M).

$y(k) = \sum_{m=0}^{M} h(m)\, x(k-m)$    (3)
Figure 8. Tapped delay line implementation of an FIR filter
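The tapped delay line of Figure 8 and the convolution sum of Equation (3) can be sketched in Python as follows; the coefficient values are hypothetical and the fixed-point details of the hardware are omitted.

```python
from collections import deque

# Sketch of the tapped delay line of Figure 8 (illustrative, not the VHDL):
# the delay line holds the M+1 most recent samples and Equation (3) forms the
# output as the convolution sum y(k) = sum_m h(m) x(k-m).
class TappedDelayLineFIR:
    def __init__(self, coeffs):
        self.h = list(coeffs)
        self.x = deque([0.0] * len(coeffs), maxlen=len(coeffs))  # x(k), ..., x(k-M)

    def step(self, sample):
        self.x.appendleft(sample)                      # newest sample enters the line
        return sum(h_m * x_m for h_m, x_m in zip(self.h, self.x))

fir = TappedDelayLineFIR([0.5, 0.3, 0.2])              # hypothetical coefficients
for k, xk in enumerate([1.0, 0.0, 2.0, 1.0]):
    print(k, fir.step(xk))
```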
3.5. Sigmoid Generator
The activation function, or squashing function, limits the output of a neuron. The activation function used in this implementation is a sigmoid function. The sigmoid function was chosen because it is differentiable, which is an important property for applying the learning algorithm. The sigmoid function is defined in Equation (4).

$f(x) = \dfrac{1}{1 + e^{-\lambda x}}$    (4)
where λ is the slope parameter of the sigmoid function. This function is a nonlinear function of x and is bounded between the values of 0 and 1. The sigmoid function was implemented as a lookup table (LT). The LT implementation has a major advantage in that it does not require special hardware. Other implementations, such as piecewise approximation, are also a good choice for an inexpensive hardware implementation, but are more suitable for a hardwired design [14].
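A lookup-table sigmoid along the lines described above can be sketched as follows; the table size (8-bit signed input) and the slope parameter λ are assumptions made for illustration, not the values used in the actual design.

```python
import math

# Illustrative sketch of a lookup-table sigmoid generator (the paper's LT):
# the table is precomputed once and simply indexed at run time.
LAMBDA = 1.0 / 16.0                                    # assumed slope parameter
TABLE = [1.0 / (1.0 + math.exp(-LAMBDA * x)) for x in range(-128, 128)]

def sigmoid_lut(x):
    """x is an 8-bit signed integer in [-128, 127]."""
    return TABLE[x + 128]

print(sigmoid_lut(-128), sigmoid_lut(0), sigmoid_lut(127))
```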
3.6. The Architecture of a Neuron
The most important part of a neuron is the pipelined multiplier, which performs high-speed multiplication of the synaptic signals (time series data) with the filter weights, as shown in Figure 9. An 8-bit data format is chosen for the synaptic signals as well as for the weights. As seen in Figure 9, each neuron has a local read-only memory (ROM) that stores as many coefficient values as there are connections (filter taps) to the previous layer. An eighteen-bit accumulator is used to add the signals from the pipeline to the neuron's bias value, which is stored in a separate register. The output of the accumulator register is buffered before passing on to the sigmoid generator. A register holds the neuron's computing result until it is ready to write the value to a shared output bus. All neurons of a layer use the same bus to address the sigmoid generator. The sigmoid function is programmed as a lookup table that is downloaded with the bitstream file during FPGA synthesis.
Figure 9. Architecture of a neuron with its input and output signals
In Figure 9, the controller block generates a proper sequence of signals to control the timing for the nodes in each layer. It generates addresses for the coefficient ROM and controls the timing of multiplication, accumulation, use of the output bus, preloading of the bias, and enabling and reading of the sigmoid function LT. The controller is modified according to the number of neurons that need to use the common bus.
In a layered feedforward network, not all the neurons have to be connected together; rather, data are passed from one layer to the next. Every neuron has its own multiplier, and all neurons may receive one data word in parallel, feed it to a pipelined multiplier, accumulate the multiplication results, and write them to the common bus leading to the next layer, one after another.
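To summarise the datapath of Figure 9, the following Python sketch models a neuron behaviourally: 8-bit samples are multiplied by coefficients from a local ROM, summed into an accumulator preloaded with the bias, and the result addresses a sigmoid lookup table. The 18-bit saturation and the scaling used to form the table address are assumptions for illustration; the exact quantisation scheme is not detailed in the paper.

```python
import math

# Behavioural sketch of the neuron datapath in Figure 9 (illustrative, not the
# VHDL): 8-bit signals and ROM coefficients are multiplied and summed into an
# 18-bit accumulator together with the bias, then the result addresses a
# sigmoid lookup table. Saturation and address scaling are assumed here.
def neuron_datapath(samples, coeff_rom, bias, sigmoid_table):
    acc = bias                                         # preload the bias register
    for x, w in zip(samples, coeff_rom):               # one MAC per clock cycle
        acc += x * w
    acc = max(-(1 << 17), min((1 << 17) - 1, acc))     # keep within 18 bits (assumed saturation)
    index = (acc >> 10) + 128                          # assumed scaling to an 8-bit LUT address
    return sigmoid_table[max(0, min(255, index))]

table = [1.0 / (1.0 + math.exp(-x / 16.0)) for x in range(-128, 128)]
print(neuron_datapath([10, -3, 7], [25, 40, -12], bias=500, sigmoid_table=table))
```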
4. Hardware Implementation Framework for FIR Neural Network
When implemented in hardware, neural networks can take full advantage of their inherent parallelism and run orders of magnitude faster than software implementations [3, 6, 15]. The implementation of the FIR neural network is divided into two parts, off-board simulation and on-board hardware implementation, as shown in Figure 10.
Figure 10. A complete implementation framework for the FIR neural network
4.1. Off-board Simulation Experiment
The first part is an off-board simulation, where the temporal backpropagation learning algorithm is implemented in MATLAB. Details on the temporal backpropagation algorithm can be found in [2], [9-10]. The simulation program reads a time series data file, with the network configuration supplied by the user. The time series data sets used for the experiments were the benchmark sets of load in an electrical net and fluctuations in a far-infrared laser. In addition, one set of synthetically generated time series was also used. This time series was generated by numerically integrating the equation of motion for a damped, driven particle in a potential field. The equation of motion was integrated with a simple fixed-step 4th-order Runge-Kutta routine that generated 100,000 data points.
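A fixed-step 4th-order Runge-Kutta integration of the kind described above can be sketched in Python as follows; the damped, driven oscillator used here is only a stand-in, since the paper does not give the exact potential or parameters of the synthetic series.

```python
import math

# Generic fixed-step RK4 integrator (illustrative; the actual equation of
# motion and parameters used by the authors are not specified in the paper).
def rk4_step(f, state, t, h):
    k1 = f(state, t)
    k2 = f([s + 0.5 * h * k for s, k in zip(state, k1)], t + 0.5 * h)
    k3 = f([s + 0.5 * h * k for s, k in zip(state, k2)], t + 0.5 * h)
    k4 = f([s + h * k for s, k in zip(state, k3)], t + h)
    return [s + (h / 6.0) * (a + 2 * b + 2 * c + d)
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)]

def damped_driven(state, t):
    # Hypothetical dynamics: x'' = -0.1 x' - x + 0.5 cos(1.2 t)
    x, v = state
    return [v, -0.1 * v - x + 0.5 * math.cos(1.2 * t)]

state, h, series = [1.0, 0.0], 0.01, []
for n in range(100_000):              # 100,000 samples, as in the paper
    state = rk4_step(damped_driven, state, n * h, h)
    series.append(state[0])           # the particle position forms the time series
print(series[:3])
```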
Table 1 shows the number of data samples used to train the networks and to test the network generalization ability for each time series. The training data starts from the beginning of the series and the test data starts from the end of the training data. The normalized mean square error (NMSE) given in Equation (5) was used as a performance measure for prediction accuracy. The idea is to minimize the mean square error (MSE) so as to achieve better prediction results. In Equation (5), σ² is the variance of the desired outputs d_i and N is the number of patterns.

$\mathrm{NMSE} = \dfrac{1}{\sigma^{2} N} \sum_{i=1}^{N} \big(x_{i} - d_{i}\big)^{2}$    (5)
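Equation (5) can be computed directly as in the following sketch, where the variance of the desired outputs normalises the mean squared prediction error; the sample values are illustrative.

```python
# Illustrative computation of the NMSE of Equation (5): the mean squared
# prediction error normalised by the variance of the desired outputs.
def nmse(predictions, targets):
    n = len(targets)
    mean_d = sum(targets) / n
    variance = sum((d - mean_d) ** 2 for d in targets) / n
    mse = sum((x - d) ** 2 for x, d in zip(predictions, targets)) / n
    return mse / variance

print(nmse([0.9, 0.2, 0.6], [1.0, 0.0, 0.5]))   # small value indicates good prediction
```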
Table 1. Training and test data set sizes

Time Series                             Training Set    Test Set
Load in electrical net                  1500            500
Fluctuations in a far-infrared laser    1000            1000
Synthetically generated time series     5000            5000
The FIR neural network configuration used during simulation had two hidden layers and one nonlinear output neuron. Different combinations of prediction order and number of neurons in the hidden layers were tried in an effort to find the configuration that would model the data most effectively. The outputs of off-board learning are the filter coefficients and the design parameters (number of hidden layers, number of hidden units, number of filter taps) of the best possible FIR neural network topology for the time series data input. This information is later used for the hardware implementation. The final implemented architecture was based on the FIR neural network topology obtained for the synthetically generated time series. The final FIR neural network configuration obtained was a 1:10:10:1 fully connected feedforward network with the topological description shown in Table 2.
Table 2. Topology of the final FIR neural network

Topology     # of taps per synapse    Training set error    Test set error
1:10:10:1    20:4:4                   0.5683                0.5838
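Interpreting Table 2 so that 20:4:4 gives the filter length per synapse for the first hidden, second hidden, and output layers respectively (an assumption consistent with the text), a quick count of the coefficients that must be stored in the per-neuron ROMs is:

```python
# Hypothetical back-of-the-envelope view of the final topology in Table 2
# (1:10:10:1, taps per synapse 20:4:4): synapse and coefficient counts give a
# rough idea of the coefficient-ROM storage needed per layer.
layers = [1, 10, 10, 1]
taps = [20, 4, 4]
for (n_in, n_out), t in zip(zip(layers, layers[1:]), taps):
    synapses = n_in * n_out
    print(f"{n_in}->{n_out}: {synapses} synapses x {t} taps = {synapses * t} coefficients")
```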
4.2. On-board Hardware Implementation
Equipped with the design parameter values obtained from the off-board simulation, one can proceed to the on-board hardware implementation. The target device used for the implementation was a XILINX XC4000 series FPGA. Figure 11 shows a XILINX XC4000 series FPGA development board [16, 17].
Figure 11. XILINX XC4000 series FPGA development board
The XILINX XC4000 series FPGA architecture is shown in Figure 12. It consists of an array of programmable function units called Configurable Logic Blocks (CLBs), linked by programmable interconnect resources. The internal signal lines interface to the package through programmable Input/Output Blocks (IOBs). FPGAs are configured by setting each of their programmable elements (i.e., logic cells, routing network, and I/O cells) to a desired state. These are programmed via programmable switches (PSMs).
The XILINX XC4000 series FPGA used for the implementation has 576 CLBs in the form of a 24 x 24 matrix. In addition, it has 1,536 flip-flops and 192 IOBs. Each CLB consists of function generators, flip-flops, SRAM, and fast carry logic for addition [16, 17].
Figure 12. XILINX XC4000 series architecture
The hardware implementation involves modeling each computation module in a hardware description language (HDL). In this implementation, Very-High-Speed-Integrated-Circuit HDL (VHDL) was used to model the hardware building blocks, and the functionality was tested in ModelSim [18]. All the modules were then integrated to form the complete FIR neural network system. The VHDL description of the final topology was synthesized onto the FPGA using the XILINX synthesis tool (XST). The FPGA design flow is summarized in Figure 13.
Figure 13. FPGA synthesis steps
The hardware description, when run through XST, generates a netlist of the hardware design in XILINX netlist format (XNF). This netlist file describes the connectivity of the FIR neural network design and is passed through the XILINX Placement and Routing (PAR) tool for an efficient design layout. The output of the placement and routing operation is a bitstream file that contains all the FPGA configuration information. This bitstream is subsequently downloaded onto the XILINX XC4000 series FPGA. The resource utilization for the final FIR neural network design, as reported by XST, is summarized in Table 3.
The maximum critical path delay was reported as 220 nsec. Thus, the FIR neural network board can be operated at a maximum speed of 4.54 MHz (1/220 nsec). The implemented FIR neural network hardware can work with a host PC as a dedicated coprocessor, offloading the majority of the computation.
Table 3. Hardware resource utilization for the FIR neural network

Resources                           Number of units used
Configurable Logic Blocks (CLBs)    2204
Input/Output Blocks (IOBs)          182
Flip-flops or Latches               3862
5. Conclusion
This paper investigated finite impulse response (FIR) neural network design and its hardware implementation for time series prediction. The network incorporates FIR filters, which give dynamic connectivity to the network and facilitate temporal processing of signal propagation in the synapses. To demonstrate potential applications, three different time series data sets were selected. One of the data sets, the synthetically generated one, was used to obtain the final FIR neural network topology for hardware implementation. A complete design framework for implementing the FIR neural network on an FPGA was presented. The training algorithm was implemented off-board using MATLAB. The on-board, or hardware, implementation procedure was in compliance with the FPGA design flow. The complete VHDL description was synthesized onto a XILINX XC4000 series FPGA using the XILINX synthesis tool (XST). The synthesis tool also reported the resource utilization and the maximum critical path delay. The final FIR neural network design can be operated at a maximum frequency of 4.54 MHz.
Reconfigurability, adaptability, and scalability are the main features of this FIR neural network hardware design. For a new application, only the filter coefficients and biases need to be reconfigured on the FPGA, without changing the basic design. The design can be easily expanded by just adding more nodes with the same configuration. The hardware can work independently or act as a coprocessor with a host computer running the off-board temporal backpropagation algorithm.
References
[1] Widrow B, Rumelhart D, Lehr M. Neural networks: Applications in industry, business, and science. Communications of the ACM. 1994; 37: 93-105.
[2] Haykin S. Neural Networks: A Comprehensive Foundation. 2nd Ed. Upper Saddle River, NJ: Prentice Hall. 1999.
[3] Restrepo HF, Hoffmann R, Perez-Uribe A, Teuscher C, Sanchez E. A networked FPGA-based hardware implementation of a neural network application. In Proceedings of the 2000 IEEE Symposium on Field-Programmable Custom Computing Machines. 2000: 337-338.
[4] Beiu V. Optimal VLSI implementations of neural networks: VLSI-friendly learning algorithms. In: Taylor JG (ed.). Neural Networks and Their Applications. Chichester, UK: John Wiley. 1996: 255-276.
[5] Waldeck P, Bergmann N. Evaluating software and hardware implementations of signal-processing tasks in an FPGA. In Proceedings of the 2004 IEEE International Conference on Field-Programmable Technology. 2004: 299-302.
[6] Jung S, Kim S. Hardware implementation of a real-time neural network controller with a DSP and an FPGA for nonlinear systems. IEEE Transactions on Industrial Electronics. 2007; 54(1): 265-271.
[7] Jihong L, Deqin L. A survey of FPGA-based hardware implementation of ANNs. In Proceedings of the 2005 International Conference on Neural Networks and Brain (ICNN&B). 2005; 2: 915-918.
[8] Zhu JM, Gunther BK. Towards an FPGA based reconfigurable computing environment for neural network implementations. In Proceedings of the 9th International Conference on Artificial Neural Networks. 1999; 2: 661-666.
[9] Wan E. Temporal backpropagation for FIR neural networks. In Proceedings of the International Joint Conference on Neural Networks (IJCNN). 1990; 1: 575-580.
[10] Wan E. Finite Impulse Response Neural Networks with Applications in Time Series Prediction. PhD Dissertation. Stanford University; 1993.
[11] Aggoun A, Ibrahim MK, Ashur A. Bit-level pipelined digit-serial array processors. IEEE Transactions on Circuits and Systems II: Analog and Digital Signal Processing. 1998; 45(7): 857-868.
[12] Hartley RI, Parhi KK. Digit-Serial Computation. Boston, MA: Kluwer Academic. 1995.
[13] Kim CH, Kwon S, Hong CP. A fast digit-serial systolic multiplier for finite field GF(2^m). In Proceedings of the 2005 Asia and South Pacific Design Automation Conference (ASP-DAC '05). 2005; 2: 1268-1271.
[14] Zhang M, Vassiliadis S, Delgado-Frias JG. Sigmoid generators for neural computing using piecewise approximations. IEEE Transactions on Computers. 2002; 45: 1045-1049.
[15] Rafael G, Cerdá J, Ballestar F, Mocholi A. Artificial neural network implementation on a single FPGA of a pipelined on-line backpropagation. In Proceedings of the 13th International Symposium on System Synthesis (ISSS '00). 2000: 225-230.
[16] XILINX. The Programmable Logic Data Book. Xilinx, Inc. 1996.
[17] Parnell K, Mehta N. Programmable Logic Design Quick Start Handbook. Xilinx, Inc. 2003.
[18] Ashenden PJ. The Designer's Guide to VHDL (Systems on Silicon). 2nd Ed. San Francisco, CA: Morgan Kaufmann Publishers. 2002.