International Journal of Reconfigurable and Embedded Systems (IJRES)
Vol. 1, No. 3, November 2012, pp. 108~122
ISSN: 2089-4864
Journal homepage: http://iaesjournal.com/online/index.php/IJRES
Design and Development of Texture Filtering Architecture for GPU Application Using Reconfigurable Computing
Krishna Bhushan Vutukuru*, Sanket Dessai**
* Department of Computer Engineering, M. S. Ramaiah School of Advanced Studies, Bengaluru
** Department of Computer Engineering, M. S. Ramaiah School of Advanced Studies, Bengaluru
Article Info
Article history:
Received Jun 17, 2012
Revised Oct 11, 2012
Accepted Oct 28, 2012
ABSTRACT
Graphics Processing Units (GPUs) have become an integral part of today's mainstream computing systems. They are also being used as reprogrammable General Purpose GPUs (GP-GPUs) to perform complex scientific computations. Reconfigurability is an attractive approach to embedded systems because it allows hardware-level modification. Hence, there is a high demand for GPU designs based on reconfigurable hardware. The texture filter unit is designed to process geometric data such as vertices and convert them into pixels on the screen. This process involves a number of operations, such as circle and cube generation, rotation, and scaling. The texture filter unit is designed with all the hardware necessary to handle the different filtering operations. The designed texture filtering units are modelled in Verilog on Altera Quartus II and simulated using ModelSim tools. The functionality of the modelled blocks is verified using test inputs in the simulator. Circle and cube coordinates are generated for circle and cube generation. The work can form the basis for designing a complete reconfigurable GPU.
Keyword:
Texture Filtering
Texel
GPU
Pixel
Reconfigurable
Copyright © 2012 Institute of Advanced Engineering and Science. All rights reserved.
Corresponding Author:
Sanket Dessai,
Department of Computer Engineering,
M. S. Ramaiah School of Advanced Studies,
#470-P, Peenya Industrial Area, Peenya 4th Phase, Bengaluru-560058, Karnataka, India.
Email: sanketdessai@gmail.com
1. INTRODUCTION
Computer graphics has become an important technique in many applications, such as CAD tools, games, film, and virtual reality. Although many techniques are used in 3D computer graphics, texture mapping is one of the most successful and popular techniques in high-quality image synthesis [5]. In particular, texture mapping creates the appearance of complexity without the tedium of modelling and rendering every 3D detail of a surface. Moreover, texture mapping is the basis of other mapping techniques such as shadow mapping, environment mapping, and bump mapping. However, the greatest weakness of texture mapping is that it requires high memory bandwidth to fetch the texture image data. The use of a cache is important in improving the processing speed of a system. A well-tuned cache hierarchy and organization can increase system performance and save bandwidth on the system bus.
During the texture mapping process, a 'texture lookup' takes place to find out where on the texture each pixel centre falls. Since the textured surface may be at an arbitrary distance and orientation relative to the viewer, one pixel does not usually correspond directly to one texel [25]. Some form of filtering has to be applied to determine the best color for the pixel. Insufficient or incorrect filtering shows up in the image as artefacts (errors in the image), such as 'blockiness', jaggies, or shimmering.
The texture unit is one of the major components of a GPU and consists of three main parts: address generator(s), texture cache(s), and texture filter(s). A reconfigurable architecture is chosen for the following two reasons: (1) area reduction is better achieved using all-purpose texture filters, because they are not idle as the required filter type varies during run-time; (2) the reconfiguration overhead is low, because the configuration does not switch very often.
2. TEXTURE FILTERING CONCEPTS
In 3D computer graphics, the surfaces of a 3D object are represented as a set of triangles. Drawing a 2D image onto such a surface is called texture mapping. The image drawn onto the surface is called a texture map, and its individual element, the texture pixel, is called a texel. Texture mapping consists of two steps: the first is a transform from the 2D texture space to the 3D object space, and the second is a transform from the 3D object space to the 2D screen space [17]. The composition of the two transforms is a rational linear projective transform, as shown in Equation (1). Here x_s, y_s and u, v are the coordinate values of a pixel in screen space and of the corresponding texel in texture space, and a through i are constants.
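\[ x_s = \frac{a u + b v + c}{g u + h v + i}, \qquad y_s = \frac{d u + e v + f}{g u + h v + i} \tag{1} \]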
All the triangles that make up a 3D object are decomposed into spans by a span rasterizer for display. A span is a set of successive horizontal pixels in a triangle. Under the mapping, a span maps to an arbitrarily directed straight line in the texture image, as shown in Figure 1. This line-preservation property can be explained as follows. When an arbitrary line in screen space, y_s = A x_s + B, is mapped into texture image space, the corresponding line, v = A'u + B', is obtained by substituting x_s and y_s from Equation (1) and rearranging [17].
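The line-preservation argument can be checked numerically. The sketch below assumes that Equation (1) has the usual rational linear form x_s = (a u + b v + c)/(g u + h v + i), y_s = (d u + e v + f)/(g u + h v + i); the function names and any coefficient values passed in are illustrative and are not taken from the paper.

```python
from fractions import Fraction

def texture_to_screen(coeffs, u, v):
    """Rational linear projective map from texture (u, v) to screen (xs, ys)."""
    a, b, c, d, e, f, g, h, i = coeffs
    w = g * u + h * v + i  # common denominator
    return Fraction(a * u + b * v + c) / w, Fraction(d * u + e * v + f) / w

def texture_line(coeffs, A, B):
    """Return (A', B') of the texture-space line v = A'u + B' that maps onto
    the screen-space line ys = A*xs + B."""
    a, b, c, d, e, f, g, h, i = coeffs
    # Substituting xs and ys into ys = A*xs + B and multiplying by the common
    # denominator gives: d*u + e*v + f = A*(a*u + b*v + c) + B*(g*u + h*v + i).
    denom = e - A * b - B * h  # coefficient of v after collecting terms
    if denom == 0:
        raise ValueError("texture-space line is vertical (u = constant)")
    return Fraction(A * a + B * g - d, denom), Fraction(A * c + B * i - f, denom)
```

Every texel on the returned line should land on the original screen-space line when pushed through `texture_to_screen`, which is the property the span argument relies on.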
Figure 1. Mapping of Triangle Spans
2.1 Texture Cache and Triangle Span
The efficiency of a cache memory depends on the locality of reference in data accesses. Both spatial and temporal locality are present in texture mapping [13]. Mipmap filtering increases spatial locality in texture access, since the level of the map is selected to closely match the level of detail being drawn on the screen. That is, because of mipmap filtering, a movement of one pixel in screen space maps to a movement of nearly one texel in texture space. One texture image can be mapped to several polygons in a single frame or in consecutive frames. Therefore, temporal locality in texel access is also present. In bilinear or trilinear filtering, multiple texels are needed for a single pixel. This also contributes to temporal locality, because some of the texels for a pixel are likely to overlap with texels for neighbouring pixels. Because of this locality, a texture cache can be used to improve system performance and to reduce the bandwidth required on the system bus.
A rendered scene can be decomposed into a set of spans, and cache replacements can be classified by triangle span. If a replacement occurs between texels residing in the same span, it is called an intra-span replacement. If a replacement occurs between texels of two different spans, it is called an inter-span replacement. In cache operation, every cache miss other than a cold miss causes exactly one cache replacement. The number of such cache misses can therefore be obtained by counting the replacements that occur in the cache.
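The counting rule above can be illustrated with a small simulation. This is a minimal sketch that assumes a direct-mapped cache and treats a fill of an empty line as a cold miss; the cache organization, the function name, and the access format are assumptions made for illustration, not details given in the paper.

```python
def count_replacements(accesses, num_lines, line_size):
    """Classify texel accesses against a direct-mapped cache.

    accesses: iterable of (span_id, texel_address) pairs.
    Returns counts of hits, cold (empty-line) fills, and intra-span and
    inter-span replacements.
    """
    lines = {}  # line index -> (tag, span_id of the access that filled it)
    stats = {"hits": 0, "cold": 0, "intra_span": 0, "inter_span": 0}
    for span_id, address in accesses:
        block = address // line_size
        index = block % num_lines
        tag = block // num_lines
        entry = lines.get(index)
        if entry is None:
            stats["cold"] += 1        # nothing is evicted, so no replacement
        elif entry[0] == tag:
            stats["hits"] += 1
            continue                  # line already holds this block
        elif entry[1] == span_id:
            stats["intra_span"] += 1  # evicted texels came from the same span
        else:
            stats["inter_span"] += 1  # evicted texels came from another span
        lines[index] = (tag, span_id)
    return stats
```

The number of non-cold misses is then `stats["intra_span"] + stats["inter_span"]`, which matches the statement that misses can be counted through replacements.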
2.2 Geometrical Methods
The class of texture analysis techniques that falls under the heading of geometrical methods is characterized by the definition of texture as being composed of "texture elements" or primitives. The method of analysis generally depends upon the geometric properties of these texture elements. Once the texture elements have been identified in the image, there are two major approaches to analyzing the texture. One computes statistical properties from the extracted texture elements and uses these as texture features. The other tries to extract the placement rule that describes the texture. The latter approach may involve geometric or syntactic methods of analyzing texture.
Each stream client may access its dedicated stream buffer every cycle if there is data available to be read or space available to be written. The eight stream buffers serving the clusters are accessed eight words at a time, one word per cluster. The eight stream buffers serving the network interface are accessed two words at a time [20]. The other six stream buffers are accessed a single word at a time. The peak bandwidth of the stream buffers is therefore 86 words per cycle, allowing peak stream demand to exceed the SRF bandwidth during short transients. Stream buffers are bidirectional, but may only be used in a single direction for the duration of each logical stream transfer.
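The 86 words per cycle figure follows directly from the three buffer groups described above. The short sketch below only restates that arithmetic; the group labels and the function name are illustrative.

```python
# (label, number of stream buffers, words accessed per buffer per cycle)
STREAM_BUFFER_GROUPS = [
    ("clusters", 8, 8),
    ("network interface", 8, 2),
    ("other clients", 6, 1),
]

def peak_words_per_cycle(groups):
    """Sum buffers * words-per-access over all stream buffer groups."""
    return sum(count * words for _label, count, words in groups)
```

That is, 8 x 8 + 8 x 2 + 6 x 1 = 64 + 16 + 6 = 86 words per cycle.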
2.3 Voronoi Tessellation Features
Voronoi tessellation has been proposed because of its desirable properties in defining local spatial neighborhoods and because the local spatial distributions of tokens are reflected in the shapes of the Voronoi polygons. First, texture tokens are extracted, and then the tessellation is constructed. Tokens can be as simple as points of high gradient in the image or as complex as line segments or closed boundaries. In order to apply geometrical methods to gray level images, tokens must first be extracted from the images. The following simple algorithm is used to extract tokens from input gray level textural images.
1. Apply a Laplacian-of-Gaussian (LoG or ∇²G) filter to the image. For computational efficiency, the ∇²G filter can be approximated with a difference of Gaussians (DoG) filter. The size of the DoG filter is determined by the sizes of the two Gaussian filters.
2. Select those pixels that lie on a local intensity maximum in the filtered image. A pixel in the filtered image is said to be on a local maximum if its magnitude is larger than six or more of its eight nearest neighbors. This results in a binary image.
3. Perform a connected component analysis on the binary image using eight nearest neighbors. Each connected component defines a texture primitive (token).
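Steps 2 and 3 of the token extraction can be sketched as follows. The sketch assumes that the LoG or DoG filtered image from step 1 is already available as a list of rows and that border pixels are skipped; both function names and the border handling are assumptions made for illustration.

```python
from collections import deque

# Offsets of the eight nearest neighbours of a pixel.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]

def local_maxima(filtered):
    """Step 2: mark pixels whose magnitude exceeds at least six of their
    eight nearest neighbours. Border pixels are left at 0 (assumption)."""
    rows, cols = len(filtered), len(filtered[0])
    binary = [[0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            centre = abs(filtered[r][c])
            larger_than = sum(1 for dr, dc in NEIGHBOURS
                              if centre > abs(filtered[r + dr][c + dc]))
            if larger_than >= 6:
                binary[r][c] = 1
    return binary

def connected_components(binary):
    """Step 3: group set pixels into 8-connected components (tokens)."""
    rows, cols = len(binary), len(binary[0])
    seen = [[False] * cols for _ in range(rows)]
    tokens = []
    for r in range(rows):
        for c in range(cols):
            if binary[r][c] and not seen[r][c]:
                seen[r][c] = True
                queue = deque([(r, c)])
                token = []
                while queue:  # breadth-first flood fill
                    y, x = queue.popleft()
                    token.append((y, x))
                    for dy, dx in NEIGHBOURS:
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and binary[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
                tokens.append(token)
    return tokens
```

Each list returned by `connected_components` holds the pixel coordinates of one token, which could then serve as a site for the Voronoi tessellation.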
The texture features based on Voronoi polygons have been used for segmentation of textured images. The segmentation algorithm is edge based, using a statistical comparison of the neighboring collections of tokens. A large dissimilarity among the texture features is evidence for a texture edge.
3. REQUIREMENT ANALYSIS
The requirement analysis for modelling the texture filtering architecture covers the FPGA device and its components. The texture filtering unit needs to be verified, and its output can be monitored on a terminal for the multiprocessing tasks. In order to debug the processor and monitor its state, a 'JTAG Debugger' must be present in the system. Since the texture filtering unit model consumes a large number of slices, an advanced FPGA device is required. To meet these requirements for a professional model of the texture filtering unit architecture, the available Altera Cyclone II FPGA was selected.
3.1 Voronoi Tessellation Features
The design consists of developing a programmable graphics processing unit with as many aspects as possible coded in hardware, including object and edge generation. Texture mapping is a shading technique for image synthesis in which a texture image is mapped onto a surface in a three-dimensional scene, much as wallpaper is applied to a wall. If a table needs to be modelled, a rectangular box can be used for the table top and four cylinders for the legs. Unadorned, this table model would look quite dull when rendered. The realism of the rendered image can be enhanced immensely by mapping a wood grain pattern onto the table top, using the values in the texture to define the colour at each point of the surface. The advantage of texture mapping is that it adds much detail to a scene while requiring only a modest increase in rendering time. Texture mapping does not affect hidden surface elimination, but merely adds a small incremental cost to the shading process. The technique generalizes easily to curved surfaces.
Texture mapping can be used to define many surface parameters besides color. These include the perturbation of surface normal vectors to simulate bumpy surfaces, transparency mapping to modulate the opacity of a translucent surface, specularity mapping to vary the glossiness of a surface, and illumination mapping to model the distribution of incoming light in all directions.
In all of the varieties of texture mapping mentioned above, geometric mappings are fundamental. Two-dimensional mappings are used to define the parameterization of a surface and to describe the transformation between the texture coordinate system and the screen coordinate system.
The graphics unit takes operations in a very long instruction word format that has a one-to-one correspondence with a high-level scripting language, which provides a means of moving objects and features in a scene dynamically at run-time.
The high-level design shares many similarities with multi-cycle pipelines, such as intermediate memories and registers. However, unlike a regular processor, the co-processor has one pipeline that operates on multiple pieces of data in parallel, much as a vector processor does in a single-instruction multiple-data fashion.
There are three components in the circuit: an object generation pipeline that generates the edges of the target shape; a transformation pipeline that performs transformations on the unit objects; and a rasterizing pipeline that generates the points for the VGA controller to display. The design makes certain tradeoffs because of the constraints imposed by the FPGA used to synthesize the circuit.
First, the transformation pipeline does not employ a generalized 4x4 matrix multiply, because of the limited number of multipliers on the FPGA. Instead, the transformation pipeline is currently designed as an operate-and-accumulate module, with intermediate data values stored in registers. Alternatively, with more multipliers available, one data set could be transformed in one cycle by first generating a reduced matrix transformation. Second, the available memory on the FPGA is limited to 8.5 megabytes at most, of which about 512 kilobytes is memory designed to be read within a single cycle of asserting the desired address.
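The operate-and-accumulate idea can be sketched in software as a 4x4 transform that performs one multiply-accumulate per step instead of sixteen multiplications at once. This is an illustrative model only: the function name, the step counter, and the use of homogeneous coordinates are assumptions, and the paper's Verilog module is not reproduced here.

```python
def transform_point_mac(matrix, point):
    """Apply a 4x4 matrix to a homogeneous point [x, y, z, w] using a single
    shared multiplier: one multiply-accumulate per step.

    Returns the transformed point and the number of steps taken.
    """
    result = [0, 0, 0, 0]
    steps = 0
    for row in range(4):
        acc = 0  # accumulator register, cleared for each output component
        for col in range(4):
            acc += matrix[row][col] * point[col]  # one use of the multiplier
            steps += 1
        result[row] = acc  # intermediate value held in a register
    return result, steps
```

With one multiplier the transform takes 16 steps per point; a design with more multipliers could trade area for fewer steps, which is the tradeoff the paragraph above describes.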
3.2 High Level Design
The graphics platform was designed to generate shapes and objects by computing the edges of the object, to perform a number of transformations, and to rasterize the scene into a VGA buffer. The module houses three concurrent pipelines: one to generate an edge list, one to compute the transformations, and one to rasterize the transformed points. These three pipelines form implicit producer-consumer constructs, with each one waiting for the previous one to complete before continuing.
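The three-stage producer-consumer structure can be sketched as three functions, each consuming the complete buffer produced by the previous stage. The unit square, the scale-and-translate transform, and the simple DDA rasterizer below are stand-ins chosen for illustration; none of these names or choices come from the paper.

```python
def generate_edges():
    """Stage 1: edge list of a unit square (stand-in for an object generator)."""
    corners = [(0, 0), (1, 0), (1, 1), (0, 1)]
    return [(corners[k], corners[(k + 1) % 4]) for k in range(4)]

def transform_edges(edges, scale, offset):
    """Stage 2: scale, then translate, both endpoints of every edge."""
    def apply(p):
        return (p[0] * scale + offset[0], p[1] * scale + offset[1])
    return [(apply(a), apply(b)) for a, b in edges]

def rasterize_edges(edges):
    """Stage 3: convert each edge into pixels with a simple DDA walk."""
    pixels = set()
    for (x0, y0), (x1, y1) in edges:
        steps = max(abs(x1 - x0), abs(y1 - y0), 1)
        for k in range(steps + 1):
            pixels.add((round(x0 + (x1 - x0) * k / steps),
                        round(y0 + (y1 - y0) * k / steps)))
    return pixels

def render(scale, offset):
    """Run the stages in order; each waits for the previous buffer."""
    edge_buffer = generate_edges()
    moved_buffer = transform_edges(edge_buffer, scale, offset)
    return rasterize_edges(moved_buffer)
```

For example, `render(2, (1, 1))` yields the eight perimeter pixels of a square spanning (1, 1) to (3, 3).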
4. HARDWARE DESIGN
As described in the high-level design, there are three pipelines: one to generate the edges of objects, one to compute transformations on the points, and one to rasterize the computed elements. The first two pipelines store all their data in M4K blocks in order to make it available to the downstream pipeline. Since the last pipeline also serves the VGA controller, it stores its data in SRAM.
Because of time constraints, the circle and cube generators have been implemented as part of the project. The other sections that play an important role in the texture filtering unit are the scaling and rotator units. The pipeline selects only one generator; on the positive clock edge it initializes the values for that generator and raises the initialize line for it. The generator raises its 'busy' line when there is a valid edge on its output lines [(Ax, Ay, Az), (Bx, By, Bz)], and raises 'done' when it has finished.
4.1 Circle Generator Unit
The circle generator unit draws two-dimensional circles at the specified origin and scale. The flow chart of this unit is shown in Figure 2. The unit is active when the respective arguments are available at the positive edge of the clock cycle. If the reset signal is high, all registers in the circle generator unit are cleared. During the done_looping case, the sine and cosine tables are queried, whether the init signal is high or low. During the looping case, the circle is drawn according to the sine and cosine values.
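The table-driven circle generation can be sketched as follows. This is a software illustration, not the paper's Verilog: the table depth of 64 entries, the floating-point values, and the function name are assumptions. Each edge is emitted as a pair of endpoints in the style of the [(Ax, Ay, Az), (Bx, By, Bz)] outputs.

```python
import math

TABLE_SIZE = 64  # assumed depth of the sine and cosine lookup tables
SIN_TABLE = [math.sin(2 * math.pi * k / TABLE_SIZE) for k in range(TABLE_SIZE)]
COS_TABLE = [math.cos(2 * math.pi * k / TABLE_SIZE) for k in range(TABLE_SIZE)]

def circle_edges(origin, scale, step):
    """Generate edges (A, B) approximating a circle in the z = 0 plane.

    origin: (x, y) centre, scale: radius, step: table index increment.
    """
    ox, oy = origin
    edges = []
    index = 0
    while index < TABLE_SIZE:  # the 'looping' case
        nxt = min(index + step, TABLE_SIZE)
        wrapped = nxt % TABLE_SIZE  # the last edge closes the circle
        a = (ox + scale * COS_TABLE[index], oy + scale * SIN_TABLE[index], 0.0)
        b = (ox + scale * COS_TABLE[wrapped], oy + scale * SIN_TABLE[wrapped], 0.0)
        edges.append((a, b))
        index = nxt
    return edges  # reaching here corresponds to the 'done' condition
```

A larger step produces fewer, longer edges and therefore a coarser polygon, which is one way a step-size input can change the drawn shape.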
Figure 2. Flow Chart for Circle Generator Unit
4.2 Cube Generator Unit
The cube generator unit draws cubes at the specified origin and scale. The flow chart of this unit is shown in Figure 4. The unit is active when the respective arguments are available at the positive edge of the clock cycle. If the reset signal is high, all registers in the cube generator unit are cleared. For this model, eleven possible combinational cases have been written. A basic model of the cube is considered initially, as shown in Figure 3 (a). The dimensions of the cube depend on three main axes [(Ax, Ay, Az)] and three imaginary axes [(Bx, By, Bz)]. When the Ay axis is high, the cube changes its shape in the y-axis direction, as shown in Figure 3 (b); when the Ax axis is high, the cube changes its shape in the x-axis direction, as shown in Figure 3 (c); and when the Az axis is high, the cube changes its shape in the z-axis direction, as shown in Figure 3 (d). Similarly, for the other combinations the cube shape changes accordingly. The loop runs until all possible cases have been checked.
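As a point of reference for the edge outputs, the twelve edges of an axis-aligned cube at a given origin and scale can be enumerated as below. This is a sketch under stated assumptions: it does not reproduce the eleven combinational cases of the Verilog model, and the function name and the tuple representation of each edge as endpoints A and B are illustrative.

```python
from itertools import product

def cube_edges(origin, scale):
    """Return the 12 edges of an axis-aligned cube as (A, B) endpoint pairs.

    origin: (x, y, z) of the minimum corner, scale: edge length.
    """
    def corner(bits):
        return tuple(o + scale * b for o, b in zip(origin, bits))

    edges = []
    for bits in product((0, 1), repeat=3):  # the 8 corners
        for axis in range(3):
            if bits[axis] == 0:  # step along one axis to the adjacent corner
                other = list(bits)
                other[axis] = 1
                edges.append((corner(bits), corner(tuple(other))))
    return edges
```

Each corner contributes one edge for every axis along which it sits at the low end, giving 8 x 3 / 2 = 12 edges.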
(a) (b) (c) (d)
Figure 3. Stages of Cube Generation
4.3 Scaling Unit
Scaling is the process of resizing a digital image. It is a non-trivial process that involves a trade-off between efficiency, smoothness, and sharpness. As the size of an image is increased, the pixels that make up the image become increasingly visible, making the image appear "soft". The flow chart of the scaling unit is shown in Figure 5. Conversely, reducing an image tends to enhance its smoothness and apparent sharpness. Apart from fitting a smaller display area, image size is most commonly decreased in order to produce thumbnails. Enlarging an image is commonly done to make smaller imagery fit a bigger screen in full screen mode, for example. When "zooming" an image, it is not possible to discover any more information than already exists in the image, and image quality inevitably suffers. However, there are several methods of increasing the number of pixels that an image contains which even out the appearance of the original pixels.
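One common method of increasing the pixel count while evening out the original pixels is bilinear interpolation. The sketch below is an illustration of that general technique, assuming a grayscale image stored as a list of rows; it is not presented as the method used by the paper's scaling unit, and the function name is hypothetical.

```python
def scale_bilinear(image, new_width, new_height):
    """Resize a grayscale image (list of rows) with bilinear interpolation."""
    old_height, old_width = len(image), len(image[0])
    out = []
    for j in range(new_height):
        # Map the output row back to a (fractional) source row.
        y = j * (old_height - 1) / (new_height - 1) if new_height > 1 else 0.0
        y0 = int(y)
        y1 = min(y0 + 1, old_height - 1)
        fy = y - y0
        row = []
        for i in range(new_width):
            x = i * (old_width - 1) / (new_width - 1) if new_width > 1 else 0.0
            x0 = int(x)
            x1 = min(x0 + 1, old_width - 1)
            fx = x - x0
            # Blend the four surrounding source pixels.
            top = image[y0][x0] * (1 - fx) + image[y0][x1] * fx
            bottom = image[y1][x0] * (1 - fx) + image[y1][x1] * fx
            row.append(top * (1 - fy) + bottom * fy)
        out.append(row)
    return out
```

The new pixels are weighted averages of existing ones, so the enlarged image looks smoother but contains no information beyond the original, consistent with the remark about zooming.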
4.4 Rotator Unit
The rotation process is used to rotate or turn an object by the angle of rotation required by the user. A rotation transformation is generated by specifying a rotation axis and a rotation angle. The parameters are the rotation angle θ and a position (x_r, y_r), called the rotation point, about which the object is to be rotated, as shown in Figure 6. The flow chart is shown in Figure 7.
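The standard two-dimensional rotation about a pivot can be sketched as below, using x' = x_r + (x - x_r) cos θ - (y - y_r) sin θ and y' = y_r + (x - x_r) sin θ + (y - y_r) cos θ. The symbols θ and (x_r, y_r) and the function name are assumptions made for illustration; the paper's rotator unit is not reproduced here.

```python
import math

def rotate_about(point, pivot, theta):
    """Rotate a 2D point counter-clockwise by theta radians about a pivot."""
    x, y = point
    xr, yr = pivot
    c, s = math.cos(theta), math.sin(theta)
    # Translate to the pivot, rotate, and translate back.
    return (xr + (x - xr) * c - (y - yr) * s,
            yr + (x - xr) * s + (y - yr) * c)
```

In hardware, the cosine and sine values would typically come from a lookup table rather than being computed on the fly.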
5. VERIFICATION, TESTING AND VALIDATION
Testing of software or hardware is conducted on a complete system to evaluate the system's agreement with its specified requirements. Testing involves operation of a system or application under controlled conditions and evaluating the results.
5.1 Test Vector for Pipeline Multiplier
The test case result of the pipelined multiplier is shown in Figure 8. The time taken to perform the compilation using the pipelined adder is 60 ps. In this model, the bits of the two numbers are multiplied in parallel.
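The idea of multiplying bits in parallel and summing the results through pipelined adders can be modelled in software as below. This is a behavioural sketch under assumptions: an unsigned 8-bit operand width and a balanced adder tree with one tree level per pipeline stage. The paper does not give the multiplier's internal structure, so the names and the organization here are illustrative.

```python
def partial_products(a, b, width):
    """One partial product per bit of b; hardware forms these in parallel."""
    return [(a << i) if (b >> i) & 1 else 0 for i in range(width)]

def pipelined_multiply(a, b, width=8):
    """Multiply two unsigned integers by summing partial products in an
    adder tree. Returns (product, number of adder stages)."""
    level = partial_products(a, b, width)
    stages = 0
    while len(level) > 1:
        if len(level) % 2:
            level.append(0)  # pad so that values pair up
        # One pipeline stage: all pairs are added at the same time.
        level = [level[k] + level[k + 1] for k in range(0, len(level), 2)]
        stages += 1
    return level[0], stages
```

For 8-bit operands the tree needs three adder stages, so a new product could be issued every cycle once the pipeline is full.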
Figure 4. Flow Chart for Cube Generator Unit
Figure 5. Flow Chart for Scaling Unit
Figure 6. Rotation of an Object through an Angle about the Rotation Position
Figure 7. Flow Chart for Rotator Unit
Figure 8. Test Case Result of Pipelined Multiplier
Figure 9. Test Case Result of Circle Generator When Reset Signal is High
Figure 10. Test Case Result of Circle Generator When Init Signal is High
5.2 Test Vector for Circle Generator
This unit is used to draw circular objects in an image. Initially, when the reset signal is high and all other signals are low, the registers in the unit are cleared, as shown in Figure 9. When the reset signal is low and the lock and init signals are high, the step size must be specified in order to draw a circle. Figure 10 shows that the outputs [(Ax, Ay, Az), (Bx, By, Bz)] of the block carry the coordinates of the circle. It has also been observed that, for the given step size 0000000000000011, the shape of the circle changes along the Ax, Ay, Bx, and By axes.
5.3 Test Vector for Cube Generator
This unit is used to draw cubical objects in an image. Initially, when the reset signal is high and all other signals are low, the registers in the unit are cleared, as shown in Figure 11. When the reset signal is low and the lock and init signals are high, the step size must be specified in order to draw a cube-shaped object. Figure 12 shows that the coordinates of the cube change according to the counter.
5.4 Test Vector for Rotator
This unit is used to rotate the objects in an image. When the reset signal is low and the lock and init signals are high, the coordinates of the objects in the image that need to be rotated are received from the respective functions, as shown in Figure 13. The coordinates of the objects in the image change according to the axis.
Figure 11. Test Case Result When Reset Signal is High For Cube
Figure 12. Test Case Result When Init Signal is High For Cube