TELKOMNIKA, Vol. 13, No. 4, December 2015, pp. 1145~1152
ISSN: 1693-6930, accredited A by DIKTI, Decree No: 58/DIKTI/Kep/2013
DOI: 10.12928/TELKOMNIKA.v13i4.1894
Received July 14, 2015; Revised September 15, 2015; Accepted October 2, 2015
FPGA Implementation of Low-Area Square Root Calculator
Aiman Zakwan Jidin*1, Tole Sutikno2
1 Faculty of Engineering Technology, Universiti Teknikal Malaysia Melaka
2 Department of Electrical Engineering, Universiti Ahmad Dahlan
*Corresponding author, e-mail: aimanzakwan@utem.edu.my1, tole@ee.uad.ac.id2
Abstract
Square root is one of the mathematical operations that are widely used in digital signal processing. Implementing it in hardware such as an FPGA provides several advantages compared with the performance offered in software. Several algorithms can be used for this calculation, but they are difficult to implement in an FPGA. This paper presents a model of an FPGA-based square root calculator that requires very few resources and thus occupies a very small area of the FPGA. The model is designed to suit the needs of medium-speed and low-speed applications, which do not need very high processing speed, while optimizing the number of resources utilized. The modified non-restoring algorithm is used in this design to compute the square root. The design is coded in RTL VHDL and implemented on an Altera DE2 board for hardware validation. The implementation produced very precise square root results, with low-latency computation and low area consumption, for the various input data widths tested.
Keywords: FPGA, VHDL, Square Root, Area Optimization
Copyright © 2015 Universitas Ahmad Dahlan. All rights reserved.
1. Introduction
Square root is an arithmetic operation that is widely used in applications such as image and audio processing, scientific computation, computer graphics and digital communications [1-4]. Recently, many studies have implemented the square root calculator in hardware such as a Field Programmable Gate Array (FPGA) in order to achieve high-speed computation. The main motivation for implementing the square root calculation in hardware is to reduce the delays present in its computation, thus producing a very fast result.
In many VLSI applications nowadays, it is vital to create designs that not only produce correct and accurate results but also provide very high processing speed, where the execution delay is typically in the order of a few nanoseconds or even less, as in computer graphics applications. However, in order to achieve the best performance from the system, certain criteria such as power consumption and resource utilization need to be sacrificed. For instance, the computation speed in hardware can be increased by introducing techniques such as pipelining and parallel computing [5]. The latter will certainly speed up the process; however, creating multiple parallel paths for the same operation increases the number of resources (i.e. adders, registers) used and, in consequence, the area or size of the design also becomes bigger. This is an unnecessary cost when developing medium-speed and low-speed applications, in which the sampling or operating frequency is less than 10 MHz: they do not require very high speed computation with very small delays, yet they would still use the same amount of resources and thus occupy the same area in the hardware. For example, applications like Direct Torque Control (DTC) for machines, which implemented the square root calculation in FPGA to achieve better estimation of flux and torque, only operated at a minimum sampling period of 5 µs [6]. In this case, the redundancy introduced by the parallel computation can be eliminated, thereby optimizing the resource usage.
Another aspect that needs to be considered is the algorithm to be implemented in hardware for the square root calculator. Unlike other basic mathematical operations such as addition, subtraction or multiplication, a square root calculation is very difficult to implement in hardware, as its algorithm is more complex than the others. Often, it is quite hard to obtain an exact result through hardware implementation [7].
In fact, various algorithms can be chosen in order to implement a square root in FPGA. For example, there are several so-called estimation methods, such as the Newton-Raphson method [8], the Babylonian method [9] and the Taylor-series expansion method [10]. There are also digit-by-digit calculation methods, which are more suitable for FPGA implementation. The Vedic duplex method, which is based on 16 formulae from ancient Indian mathematics, uses the duplex operation in order to find the square root of a number. The algorithm includes several steps such as the division of the operand into groups of 2 bits, the inspection of each group, and the quotient extraction [11-12]. In addition, other studies have implemented the restoring method and the non-restoring method. The non-restoring method is preferable to the restoring method, owing to its capability to reduce the hardware resource utilization, since it does not restore the remainder [13-14].
This paper presents an effective way to design a low-area square root calculator, which is implemented by using FPGA. The main contribution of this paper is the development of the square root calculator by using the modified non-restoring method, which is coded in synthesizable VHSIC Hardware Description Language (VHDL). In addition, the implementation strategy proposed in this paper is to share common hardware resources, thus eliminating circuit redundancy and optimizing the design area. In the results and analysis section, the performance of the proposed design, in terms of resource usage, speed, latency and power dissipation, is analyzed and compared with that obtained from the non-optimized design, which was developed previously by using the same algorithm [13].
2. Research Method
In this section, the theory of the algorithm for computing square root results in hardware is explained. Next, the implementation strategy in FPGA is described. The whole implementation is designed by using VHDL. All digital computations are performed in unsigned binary, since the square root only accepts non-negative numbers as the radicand.
2.1. Modified Non-Restoring Method
A square root equation can be written as follows:
$$Q = \sqrt{D} \qquad (1)$$
where D is the radicand and Q is the square root of D. In digital computation, D is denoted by an n-bit unsigned number, represented as $D = D_{n-1}D_{n-2}\ldots D_1D_0$.
For every pair of bits of the radicand, the integer part of the square root has one bit. Therefore, the resulting square root Q is represented by m = n/2 bits: $Q = Q_{m-1}Q_{m-2}\ldots Q_1Q_0$.
In the conventional non-restoring digit-by-digit calculation, only a pair of bits from the radicand is taken in order to compute the partial square root result at each iteration, starting from the most significant bits. This pair is appended to the current remainder, which is shifted left by two bits first. The procedure then consists of appending 01 to the partial square root obtained so far and subtracting the resulting value from the current remainder. If the new remainder is non-negative, the newly developed square root bit is 1; otherwise the bit is set to 0, 11 is appended to the current partial square root, and an addition is performed at the next iteration instead of a subtraction. The operation keeps iterating until the last bit of the square root is calculated.
Figure 1 demonstrates how the square root of 169 ($D = 10101001_2$) is calculated by using the digit-by-digit non-restoring algorithm. In this case, it can be observed that the square root is equal to 13 ($Q = 1101_2$).
Figure 1. The example of non-restoring digit-by-digit calculation to compute the square root of 169
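The conventional procedure described above can be sketched in software as follows. This is a minimal Python model, not the authors' implementation: the function name `nonrestoring_isqrt` and its interface are hypothetical, and the final remainder correction is the customary last step of non-restoring square root algorithms rather than something stated in the text.

```python
def nonrestoring_isqrt(d: int, n: int) -> tuple[int, int]:
    """Return (root, remainder) for an n-bit unsigned radicand d, n even."""
    if n % 2 or not 0 <= d < (1 << n):
        raise ValueError("expected an even width n and 0 <= d < 2**n")
    q = 0  # partial square root
    r = 0  # partial remainder, allowed to become negative
    for i in range(n // 2 - 1, -1, -1):
        pair = (d >> (2 * i)) & 0b11        # next pair of radicand bits, MSB first
        if r >= 0:
            r = 4 * r + pair - (4 * q + 1)  # append 01 to q and subtract
        else:
            r = 4 * r + pair + (4 * q + 3)  # append 11 to q and add
        q = 2 * q + (1 if r >= 0 else 0)    # new square root bit
    if r < 0:
        r += 2 * q + 1                      # customary final remainder correction
    return q, r


print(nonrestoring_isqrt(169, 8))  # (13, 0)
```

For the radicand 169 used in Figure 1, the four iterations consume the bit pairs 10, 10, 10 and 01 and yield the root 13 with a zero remainder.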
The modified non-restoring algorithm differs slightly in the way it solves a square root problem: it provides a simpler solution that only performs subtraction and only appends 01. The addition operation is therefore removed, so fewer hardware resources are required. In this case, if the result of the subtraction would be negative, no subtraction takes place, i.e. the current remainder is left unchanged. Figure 2 illustrates the same calculation as in the previous example, carried out with the modified non-restoring algorithm.
Figure 2. The example of modified non-restoring digit-by-digit calculation to compute the square root of 169
Figure 3 shows the pseudocode of the calculating procedure used to solve square root problems with the modified non-restoring algorithm. Here, the radicand is denoted by an n-bit binary number. Thus, n/2 iterations are performed to calculate the final square root result Q, which is denoted by an n/2-bit binary number.
Figure 3. The pseudocode of modified non-restoring digit-by-digit calculation to solve square root computation
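As an illustration of the modified procedure, the following Python sketch models the n/2 iterations in software. It is not the authors' pseudocode or VHDL: the name `modified_nonrestoring_isqrt` is hypothetical, and it assumes that "no subtraction takes place" means the shifted remainder is simply kept when the trial subtraction would be negative.

```python
def modified_nonrestoring_isqrt(d: int, n: int) -> tuple[int, int]:
    """Return (root, remainder) for an n-bit unsigned radicand d, n even."""
    if n % 2 or not 0 <= d < (1 << n):
        raise ValueError("expected an even width n and 0 <= d < 2**n")
    q = 0  # partial square root, grows by one bit per iteration
    r = 0  # partial remainder, never negative in this variant
    for i in range(n // 2 - 1, -1, -1):
        r = (r << 2) | ((d >> (2 * i)) & 0b11)  # shift in the next bit pair
        trial = (q << 2) | 1                    # partial root with 01 appended
        if r >= trial:                          # subtraction stays non-negative
            r -= trial
            q = (q << 1) | 1                    # new root bit is 1
        else:                                   # assumed: remainder is kept as-is
            q = q << 1                          # new root bit is 0
    return q, r


print(modified_nonrestoring_isqrt(169, 8))  # (13, 0)
```

Only a comparison and a subtraction appear inside the loop, which reflects the claim that the addition path of the conventional algorithm is no longer needed.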
2.2. Hardware Implementation Strategy
As can be seen from the algorithm explained in Section 2.1, the square root calculator design can be partitioned into two different groups, namely the Partial Remainder Calculator (PRC), which calculates the remainder value, and the Partial Square Root Calculator (PSC), which determines the value of each bit in the square root result.
In order to achieve a very high speed square root calculation in hardware, pairs of PRC and PSC blocks are implemented in parallel, where each pair is used to calculate the partial square root result generated by one pair of bits of the radicand. The number of pairs to be implemented depends on the number of bits of the radicand: an n-bit radicand requires the implementation of n/2 pairs of PRC and PSC in parallel. Figure 4 illustrates the block diagram of the hardware implementation for a square root calculator with an 8-bit radicand.
Figure 4. The hardware implementation of 8-bit radicand square root calculator
As a matter of fact, all those pairs are exactly alike and perform the same operation. The only differences are their inputs and outputs. Therefore, in order to optimize the utilization of the hardware resources, it is proposed to share a single pair of PRC and PSC blocks, which is common to the computation of all square root bits. This method eliminates all the redundant circuitry in the design and thus significantly reduces the hardware resource usage, especially for square root calculators with wide radicands.
Figure 5 shows the block diagram representing the hardware implementation of the low-area square root calculator. In order to allow the hardware sharing, the computation of each square root bit is conducted sequentially. Consequently, some latency is introduced in the square root calculation time, as only one bit of the square root is produced at each clock cycle.
Figure 5. The hardware implementation of area-optimized square root calculator
The computation latency can be determined from the number of bits used for the radicand. For an n-bit radicand, the square root calculation is completed after n/2 clock cycles. Even though the proposed implementation provides a slower square root calculator than those of previous studies, it should be suitable for applications that do not require very high processing speed, while the area of the design, i.e. the hardware resource utilization, can be optimized.
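The sequential sharing of one PRC/PSC pair can be illustrated with a small cycle-level model. The sketch below is written in Python under stated assumptions and is not derived from the authors' VHDL: the class `SharedSqrtUnit`, its registers and its `load`/`clock` interface are hypothetical, and each call to `clock()` stands for one clock cycle in which the single shared pair produces one square root bit.

```python
class SharedSqrtUnit:
    """Hypothetical model: one PRC/PSC pair reused on every clock cycle."""

    def __init__(self, n: int):
        if n % 2:
            raise ValueError("radicand width n must be even")
        self.n = n
        self.load(0)

    def load(self, d: int) -> None:
        """Latch a new radicand and reset the working registers."""
        if not 0 <= d < (1 << self.n):
            raise ValueError("radicand does not fit in n bits")
        self.d, self.q, self.r = d, 0, 0
        self.remaining = self.n // 2  # bit pairs still to be processed
        self.cycles = 0

    @property
    def done(self) -> bool:
        return self.remaining == 0

    def clock(self) -> None:
        """One clock cycle: consume one bit pair, produce one root bit."""
        if self.done:
            return
        self.remaining -= 1
        pair = (self.d >> (2 * self.remaining)) & 0b11
        shifted = (self.r << 2) | pair               # PRC: shift in the pair
        trial = (self.q << 2) | 1                    # partial root with 01 appended
        bit = 1 if shifted >= trial else 0           # PSC: decide the new bit
        self.r = shifted - trial if bit else shifted
        self.q = (self.q << 1) | bit
        self.cycles += 1


unit = SharedSqrtUnit(8)
unit.load(169)
while not unit.done:
    unit.clock()
print(unit.q, unit.cycles)  # 13 4
```

For the 8-bit radicand the result is available after 4 calls to `clock()`, matching the n/2 clock cycle latency stated above.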
3. Results and Analysis
The proposed design was successfully implemented on the Altera DE2 Development Board, which uses a Cyclone II FPGA. Synthesizable VHDL code is used to configure the design. Moreover, the coding method makes the design scalable, so it is very simple to vary the width of the square root radicand in order to perform different tests.
3.1. Functional Simulation Verification
In order to verify the functionality and to analyze the performance of the design, several simulations and hardware verification using the FPGA were conducted. Figure 6 shows the simulation results of the square root calculation for a 12-bit radicand using the proposed design. Based on these results, the outputs of the computation are correct and accurate. The latency in producing the square root can also be observed from this result.
Figure 6. The simulation result of square root computed by the proposed design
Meanwhile, Figure 7 shows the 8-bit radicand square root calculation results obtained from the hardware validation using the FPGA. Both the radicand and the output from the FPGA are displayed in the SignalTap II Logic Analyzer. These results show that the square root has been correctly implemented in FPGA hardware, producing accurate results.
Figure 7. The hardware validation result of square root computed by the proposed design, displayed in SignalTap II Logic Analyzer
3.2. Design Performance Analysis
Table 1 shows the resource utilization and the processing time for the various configurations tested. These data are obtained from the compilation report generated once the design compilation is completed in the Altera Quartus II software. From this table, it can clearly be seen that the number of logic elements (LE) used in the area-optimized designs is much lower than in the non-optimized (speed-optimized) designs, for example 243 LEs against 2710 LEs for a 64-bit radicand, although the number of registers utilized is higher.
Table 1. Comparison between non-optimized vs. area-optimized designs, in terms of resources utilization and computation speed
Optimization      Radicand width (bit)   Logic Elements   Registers   Fmax (MHz)   Latency (clock cycles)   Minimum computation time (ns)
NO OPTIMIZATION   8                      52               15          100.1        1                        9.99
NO OPTIMIZATION   16                     196              27          46.8         1                        21.3
NO OPTIMIZATION   32                     713              51          19.1         1                        52.3
NO OPTIMIZATION   64                     2710             99          7.1          1                        140.8
AREA              8                      39               27          174.3        4                        22.9
AREA              16                     78               48          88.8         8                        90.1
AREA              32                     123              89          110.8        16                       144.4
AREA              64                     243              170         77.6         32                       412.4
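To make the area comparison concrete, the short Python sketch below computes the relative reduction in logic elements for each radicand width. The numbers are copied from Table 1; the dictionary names and the helper `le_saving_percent` are hypothetical and serve only this illustration.

```python
# Radicand width -> (logic elements, registers), values copied from Table 1.
NON_OPTIMIZED = {8: (52, 15), 16: (196, 27), 32: (713, 51), 64: (2710, 99)}
AREA_OPTIMIZED = {8: (39, 27), 16: (78, 48), 32: (123, 89), 64: (243, 170)}


def le_saving_percent(width: int) -> float:
    """Percentage of logic elements saved by the area-optimized design."""
    base_le = NON_OPTIMIZED[width][0]
    optimized_le = AREA_OPTIMIZED[width][0]
    return 100.0 * (base_le - optimized_le) / base_le


for width in sorted(NON_OPTIMIZED):
    extra_regs = AREA_OPTIMIZED[width][1] - NON_OPTIMIZED[width][1]
    print(f"{width:2d}-bit: {le_saving_percent(width):5.1f}% fewer LEs, "
          f"{extra_regs} more registers")
```

The saving grows with the radicand width, from 25% at 8 bits to about 91% at 64 bits, while the register count increases in every case.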
In terms of computation time, the speed-optimized designs produce the results faster than the area-optimized designs. The minimum computation time in Table 1 is based purely on the Fmax obtained, where
$$\text{Minimum computation time} = \text{Latency} \times \frac{1}{F_{\max}} \qquad (2)$$
If the same clock frequency is used for both the speed-optimized and the area-optimized designs, the computation time of the latter is n/2 times longer than that of the former, where n is the width of the radicand in bits, because the area-optimized design needs n/2 clock cycles instead of one.
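Equation (2) can be checked against the Table 1 entries with a few lines of Python. This is only a worked example: the function names are hypothetical, Fmax is taken in MHz and the result is returned in nanoseconds.

```python
def min_computation_time_ns(latency_cycles: int, fmax_mhz: float) -> float:
    """Equation (2): latency in clock cycles times the clock period, in ns."""
    return latency_cycles * 1000.0 / fmax_mhz


def area_optimized_latency(n: int) -> int:
    """Latency of the shared PRC/PSC design: n/2 clock cycles for n bits."""
    return n // 2


# 8-bit area-optimized design from Table 1: 4 cycles at 174.3 MHz.
print(round(min_computation_time_ns(area_optimized_latency(8), 174.3), 2))

# Same clock for both 64-bit designs (7.1 MHz, the non-optimized Fmax).
single_cycle = min_computation_time_ns(1, 7.1)
sequential = min_computation_time_ns(area_optimized_latency(64), 7.1)
print(round(sequential / single_cycle))  # 32, i.e. n/2
```

The first value is approximately 22.9 ns, in agreement with Table 1, and the second shows the n/2 factor that separates the two designs when they run at a common clock frequency.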
3.3. Power Dissipation Analysis
Since the area-optimized configuration produces a smaller design with fewer resources than the non-optimized configuration, its power consumption is lower, as shown in Table 2; the difference is small, from 0.2 mW for the 8-bit radicand to 2.5 mW for the 64-bit radicand. The information shown in this table is collected from the PowerPlay Power Analyzer Tool, a feature available in the Altera Quartus II software. This reduction may help to mitigate the problem of overheating, which has been encountered in many applications.
Table 2. The comparison of power dissipation between non-optimized and area-optimized square root design
Radicand width (bit)   Thermal power dissipation for non-optimized configuration (mW)   Thermal power dissipation for area-optimized configuration (mW)
8                      85.4                                                             85.2
16                     87.2                                                             86.6
32                     90.6                                                             89.1
64                     97.1                                                             94.6
4. Conclusion
This paper has described an alternative hardware implementation of a square root calculator that consumes a small design area, i.e. low hardware resource utilization, with low power dissipation, by using the modified non-restoring algorithm. The design functionality has been verified via simulations and also hardware verification using FPGA, where it produced correct and accurate outputs. The analysis also showed the improvement offered by the proposed design in terms of resource usage as well as power consumption. Despite being slower than previous designs, it may well be useful in low-speed applications that require a small design area and low power consumption.
References
[1] Vijeyakumar KN, Sumathy V, Vasakipriya P, Dinesh Babu A. FPGA implementation of Low Power High Speed square root circuits. IEEE International Conference on Computational Intelligence & Computing Research (ICCIC). Coimbatore. 2012: 1-5.
[2] Kachhwal P, Rout BC. Novel square root algorithm and its FPGA Implementation. International Conference on Signal Propagation and Computer Technology (ICSPCT). Ajmer. 2014: 158-162.
[3] Yamin L, Wanming C. Implementation of Single Precision Floating Point Square Root on FPGAs. IEEE Symposium on FPGA for Custom Computing Machines. Napa. 1997: 226-232.
[4] Xiaojun W. Variable Precision Floating-Point Divide and Square Root for Efficient FPGA Implementation of Image and Signal Processing Algorithms. PhD Thesis. Boston: Northeastern University Boston, Massachusetts; 2007.
[5] Xiumin W, Yang Z, Qiang Y, Shihua Y. A New Algorithm for Designing Square Root Calculators Based on FPGA with Pipeline Technology. Ninth International Conference on Hybrid Intelligent Systems. Shengyang. 2009; 1: 99-102.
[6] Sutikno T, Idris NRN, Jidin AZ, Jidin A. A Model of FPGA-based Direct Torque Controller. TELKOMNIKA Indonesian Journal of Electrical Engineering. 2013; 11(2): 747-753.
[7] Dong-Guk H, Dooho C, Howon K. Improved Computation of Square Roots in Specific Finite Fields. IEEE Transactions on Computers. 2009; 58: 188-196.
[8] Liang-Kai W, Schulte MJ. Decimal floating-point Square Root Using Newton-Raphson Iteration. 16th IEEE International Conference on Application-Specific Systems, Architecture Processors (ASAP). Samos. 2005: 309-315.
[9] Kosheleva O. Babylonian Method of Computing The Square Root: Justifications Based on Fuzzy Techniques and on Computational Complexity. Annual Meeting of the North American Fuzzy Information Processing Society (NAFIPS). Cincinnati. 2009: 1-6.
[10] Taek-Jun K, Sondeen J, Draper J. Floating-Point Division and Square Root Implementation Using A Taylor-Series Expansion Algorithm. 15th IEEE International Conference on Electronics, Circuits and Systems (ICECS). St. Julian's. 2008: 702-705.
[11] Banerjee A, Ghosh A, Das M. High Performance Novel Square Root Architecture Using Ancient Indian Mathematics for High Speed Signal Processing. Advances in Pure Mathematics. 2015; 5(8): 428-441.
[12] Kaur J, Grewal NS. Design and FPGA Implementation of a Novel Square Root Evaluator based on Vedic Mathematics. International Journal of Information & Computation Technology. 2014; 4(15): 1531-1537.
[13] Sutikno T, Jidin AZ, Jidin A, Idris NRN. Simplified VHDL Coding of Modified Non-Restoring Square Root Calculator. International Journal of Reconfigurable and Embedded Systems. 2012; 1(1): 37-42.
[14] Rahman A, Abdullah AK. New efficient hardware design methodology for modified non-restoring square root algorithm. International Conference on Informatics, Electronics & Vision (ICIEV). Fukuoka. 2014: 1-6.