Indonesian J
ournal of Ele
c
trical Engin
eering and
Computer Sci
e
nce
Vol. 2, No. 1,
April 201
6, pp. 115 ~ 13
1
DOI: 10.115
9
1
/ijeecs.v2.i1.pp11
5-1
3
1
115
Re
cei
v
ed
Jan
uary 19, 201
6
;
Revi
sed Ma
rch 2
2
, 2016;
Acce
pted Ma
rch 3
0
, 2016
New De
sign of Network on Chip Based on Virtual
Routers
Mohamed Fe
hmi Chatme
n*
1
,
Adel Ba
ganne
2
and Rach
ed
Tour
ki
3
1
Microelectro
n
i
cs Lab
orator
y,
F
a
cult
y
of Sci
ences Mo
nastir
,
5000, T
unisia
2
Labor
ator
y
of Scienc
e an
d T
e
chn
o
lo
g
y
Info
rmation,
comm
unic
a
tion
and k
n
o
w
l
e
d
ge, UB
S Universit
y
, B
P
921
16, 56
32
1 Lori
ent, F
r
ance
3
Microelectro
n
i
cs
Labor
ator
y
,
F
a
cult
y
of
Scie
nces Mon
a
stir, 500
0, T
unisia
*Corres
p
o
ndi
n
g
author, em
ail
:
mohamed.fe
h
m
i.chatmen
@
g
m
ail.com
A
b
st
r
a
ct
Netw
ork is con
s
ider
ed the
mo
st conven
ient
w
a
y
to commu
n
icate
betw
een
different IP int
egrate
d
into the sa
me
chip. Studi
es
have b
e
e
n
d
e
vel
ope
d to pr
opos
e netw
o
rk
s w
i
th impr
ove
d
perfor
m
a
n
ce
in
term
s
of latenc
y, power consum
pti
on, thr
o
ughput
and
quality of serv
ice. M
o
st
of thes
e
networks have
been
desi
gne
d bas
e
d
on the 2-d
i
me
nsi
ona
l
net
w
o
rk structure. Recently, w
i
th the introd
uc
tion of the ne
w
structure of 3D
integr
ated circ
uits
(3D IC), ne
w
w
o
rks have used th
is type
of circuit to d
e
s
i
gn
3 di
mensi
o
n
s
on-ch
ip n
e
tw
orks. T
he adv
ant
age
bro
ught
by
this new
st
ruc
t
ure is to r
educ
e the
avera
ge
nu
mb
er of h
o
p
s
crossed
fro
m
t
he s
ource
to t
h
e d
e
stin
ation,
w
h
ich i
m
prove
s
the thr
o
u
g
h
p
u
t an
d th
e
aver
age
lat
ency
of
the
netw
o
rk.
Ke
y
w
ords
: chi
p
, latency, on-c
h
ip n
e
tw
orks, pow
er
consu
m
pt
ion, throu
g
h
put
, virtual routers
Copy
right
©
2016 In
stitu
t
e o
f
Ad
van
ced
En
g
i
n
eerin
g and
Scien
ce. All
rig
h
ts reser
ve
d
.
1. Introduc
tion
3D network
differs from
2D
network
by the ability to int
egrate more IP (i
ntellectual
property) with les
s
latenc
y [1
], particula
rl
y for routers that are di
stan
t from each ot
her.
The mai
n
a
d
vantage
of the
3D
structu
r
e
comp
ar
ed to
2D
stru
ctu
r
e i
s
that all
ro
uters are
clo
s
e to e
a
ch other, m
a
king
ea
ch ro
uter bette
r conne
cted: in
fact, a ro
uter may h
a
ve
6
con
n
e
c
ted
ne
ighbo
ring
ro
u
t
ers i
n
a
3
D
netwo
rk,
wh
il
e in
a 2
D
n
e
twork; it m
a
y
have at m
o
st
4
con
n
e
c
ted n
e
i
ghbo
rs.
This
gives th
e 3
D
netwo
rk mo
re
altern
ative p
a
ths fo
r the
transmi
s
sion
of
the packet to its destin
a
tion
, which h
e
lp
s us av
oid n
e
twork
satu
ration
for low pa
ckets load
s.
Figure 1
sho
w
s th
e extra
possibl
e pat
hs that
can
be follo
wed
by packet to
rea
c
h its
destin
a
tion in
the 3D network
com
pare
d
to the 2D
netwo
r
k; in f
a
ct, we have
at least 2 extra
alternative pa
ths (the n
u
mb
er of path
s
de
pend
s on
the
routing al
go
rithm use
d
on the network).
These adva
n
t
ages n
eed
e
d
the invoca
tion of much
more
re
sou
r
ce
s; in
cludi
n
g
port
s
numbe
r u
s
ed
for each ro
uter (o
ne ro
uter ha
s a
maximum of
7 ports in
st
ead of 5) a
n
d
the
multiplexing units asso
cia
t
ed
to
e
a
ch extra
outp
u
t
port. In a
ddit
i
on, the
stru
cture
of the
3D
network itself show
s other issues
especi
ally
regarding the
employm
ent
of TSV
(T
hrough Silicon
Via).
Figure 1. Advantage of 3
D
netwo
rk
com
parin
g to a 2D network
Evaluation Warning : The document was created with Spire.PDF for Python.
ISSN: 25
02-4
752
IJEECS
Vol.
2, No. 1, April 2016 : 115 –
131
116
In
th
is
w
o
rk
,
w
e
p
r
es
en
t a n
e
w
ne
tw
ork
arc
h
ite
c
tu
re
b
a
s
e
d
on
th
e
c
o
nc
ep
t of vir
t
u
a
l
route
r
s, which can play th
e same role
of a
3D network, but with
fewer
re
sou
r
ce
s and bett
e
r
performanc
e
in terms
of lat
enc
y, in fac
t, ther
e a
r
e two
versio
ns of th
is ne
w archite
c
ture:
-
The first ve
rsion i
s
name
d
RV
NO
C
(Re
duced
Re
so
u
r
ce
s
Virtual
Network on Chip
) and
is
desi
gne
d to u
s
e le
ss
re
sou
r
ce
s
and
hav
e re
sp
ectful
a
v
erage
laten
cy, for that, every RV
NO
C
route
r
is shari
ng the NI link
with few othe
rs.
-
The se
co
nd
versio
n is na
med LVNOC (Red
uced L
a
tency Virtua
l Netwo
r
k o
n
Chip) a
nd is
desi
gne
d to
have a
mini
mum of
laten
c
y with
an
o
p
timized
am
o
u
nt of
re
sou
r
ce
s, for that,
every LVNO
C route
r
ha
s i
t
s own lin
k to the NI.
Figure 2
sho
w
s the
pro
p
o
s
ed
archite
c
t
u
re
(c)
with the conventional arc
h
itec
tures
,
(a) for
the 2D net
wo
rk an
d (b
) for
the 3D net
wo
rk.
Figure 2.
Net
w
or
k
s
st
r
u
ct
u
r
es
Both the LV
NO
C a
nd
RVNOC net
works
co
nsi
s
t
on a
nu
mb
er of
se
para
ted sub
netwo
rks, wh
ich ca
n only be con
n
e
c
te
d to network
interfac
es
; in fac
t
the routers
having
the
same
co
ordinates i
n
thei
r own
sub n
e
tworks have
the sam
e
n
e
twork inte
rf
ace. Th
e set of
r
o
u
t
er
s
ha
ving
th
e
s
a
me
ne
tw
o
r
k
in
ter
f
ac
e form the
RG route
r
(Gl
o
bal Ro
uter).
Other works on the 3D ne
twork have tri
ed to
redu
ce
the numbe
r o
f
vertical movements
(alon
g
the z axis: TSV) beca
u
se the TSV links
are compli
cate
d, expen
si
ve [19, 9] and sh
ow
con
s
id
era
b
le
power con
s
u
m
ption [10].
Our archite
c
t
u
re
s
rem
o
ve
these
lin
ks (we
did
not
n
eed
TSV); in
fact, ou
r a
r
ch
itecture
s
are 2
D
stru
ctures b
u
t ca
n supp
ort a
number
of IP equivalent to a 3D network for a
n
y
dimen
s
ion
s
.
These a
r
chitectures
were devel
op
ed and sim
u
lated in VHDL usin
g the
MODELSIM6
.
5 tool.
2. Related
Wor
k
s
The re
ason for migration from 2
D
network to a
3D
ne
tw
o
r
k
is
to
in
c
r
eas
e
throughput; for
the sam
e
nu
mber of no
d
e
s, we
have
a lowe
r averag
e num
b
e
r of hop
s i
n
a 3D n
e
twork
comp
ared to
a 2D net
work [3, 26]. But the 3
D
n
e
tw
o
r
k
itself sh
ows
other ch
allen
ges espe
ciall
y
rega
rdi
ng
su
rface
a
nd
e
nergy
con
s
u
m
ption [1
7,
18] cau
s
ed
by the
use
of TSV an
d
the
difficulties
of fabrication t
hat make th
eir inte
g
r
atio
n limited. Fo
r this, redu
ci
ng the n
u
mb
er of
vertical lin
ks
(TSV) has b
e
en the goal o
f
many work
s: the work of [7] sho
w
s a n
e
twork
stru
cture
in whi
c
h the
vertical links are pla
c
ed i
rre
gula
r
ly. Bartza
s in [8] also h
a
s trie
d to redu
ce
the
numbe
r of T
SV by prop
o
s
ing
sp
ecifi
c
paths for
d
a
ta flow. S.
Pasri
c
h
a
in [
2
] pro
p
o
s
e
s
th
e
seri
alization
method a
s
a solutio
n
to re
duce the num
ber of TSV.
Yan G
h
idini
i
n
[5] h
a
s pro
posed
red
u
ci
ng the
nu
mb
er
of TSV b
a
s
ed
on
the
principl
e of
multiplexing
and
also a
d
o
p
ting the
seri
alizatio
n m
e
thod [4]
which is in
spi
r
ed
form the
work o
f
Pasri
c
h
a
. Thi
s
p
r
in
ciple h
a
s
be
en u
s
e
d
i
n
a net
wo
rk
called LASIO
whi
c
h i
s
the
Herme
s
net
work
expan
sion.
Liu Che
ng p
r
opo
sed in [6] to share TS
V between a
d
jacent route
r
s to re
du
ce their total
numbe
r, but t
o
a
c
hieve thi
s
, an
additio
n
a
l po
rt was
a
dded t
o
the
structu
r
e
of th
e ro
uters in
e
a
ch
layer formin
g the 3D network. Efsthios So
tirio
u
in [13] has propo
se
d a heterog
ene
ou
s
architectu
re
o
f
netwo
rk on
chip
form
ed b
y
route
r
s with
2D
and
3
D
structu
r
e to
mi
nimize
the
use
Evaluation Warning : The document was created with Spire.PDF for Python.
IJEECS
ISSN:
2502-4
752
Ne
w de
sign o
f
Netwo
r
k on
Chip Ba
sed o
n
Virtual Ro
uters
(M
oham
e
d
Fehm
i Chat
m
en)
117
of TSV. Efsthios
criti
c
ized t
he ex
ce
ssive
use
of the
s
e l
i
nks in
wo
rks
offering
simil
a
r a
r
chite
c
tures
[14, 15, 16] beca
u
se they negle
c
ted the
param
et
ers related to the physi
cal impl
ementation.
In [11] the author ha
s re
du
ced the n
u
mb
er of
TSV sh
owin
g that with only a few
spe
c
ific
loc
a
tions
of TSV, it is
poss
i
ble to ac
hiev
e a go
od tradeoff. In [12] t
he author has
tried to reduc
e
the numbe
r
of TSV using a techni
qu
e of routi
ng,
but this pa
cket tran
smi
ssi
on techni
que
gene
rate
s co
nsid
era
b
le lat
ency for a
sig
n
ificant tr
affic. All these wo
rks have trie
d
to improve 3
D
netwo
rk pe
rforma
nce in t
e
rm
s
of su
rface co
nsum
ption
by
a
c
ti
ng on
the T
SV numbe
r
but
assumin
g
tha
t
this minimization deg
rad
e
s net
wo
rk p
e
rform
a
n
c
e regarding the
averag
e laten
c
y.
In this paper
we sh
ow a n
e
w archite
c
tu
re of NOC
b
a
s
ed on the p
r
incipl
e of virtual route
r
s. This
architectu
re i
s
define
d
by
a 2D dim
e
n
s
i
on but
with t
he pe
rform
a
n
c
e of a 3
D
n
e
twork a
nd with
much le
s
s
re
sou
r
c
e
s.
Our pap
er is
defined
on
th
ree
pri
n
ci
pal
parts,
in th
e f
i
rst
one
we e
x
plain the
st
ructure
and th
e fun
c
tioning
of ou
r low l
a
ten
c
y
route
r
, in the
se
co
nd p
a
rt
we
explain
how we
u
s
e
the
route
r
to e
s
t
ablish ou
r n
e
w n
e
two
r
ks stru
ct
ures,
and in th
e l
a
st pa
rt, we
com
pare th
e
perfo
rman
ce
s of our netwo
rks to the othe
rs.
3.
Architec
ture
of the Route
r
3.1
Composi
t
ion
and Struc
t
u
r
e
Our router
whose archite
c
ture
is in
spire
d
from [20] is desig
ned to have minimal
latency
for se
ndin
g
the pa
cket fro
m
the sou
r
ce
to the
destin
a
tion. In fact, the packet i
s
ro
uted thro
ugh
multiplexing units
(CROS
SBAR) whic
h
operate acco
rding to the c
o
mbinat
ional logic
,
and
only
the routing
u
n
it ope
rate
s
a
t
the
clo
c
k ed
ge. Thi
s
unit
read
s the
he
a
der of the
pa
cket an
d di
re
ctly
allocates the
destin
a
tion p
o
rt in one cl
o
ck e
dge.
Figure 3.
Structure of the router
Even if we think that ou
r b
u
ffered route
r
is t
he faste
s
t router, it doe
sn’t su
ppo
rt q
uality of
servi
c
e an
d has no virtual
cha
nnel
s. No
te that
the ha
lf counter uni
t is prese
n
t only for RVNO
C
route
r
and n
o
t
for the LVNOC.
Evaluation Warning : The document was created with Spire.PDF for Python.
ISSN: 25
02-4
752
IJEECS
Vol.
2, No. 1, April 2016 : 115 –
131
118
3.2
Principal Functions o
f
th
e Rou
t
ing Unit
Our
route
r
ru
ns a
c
co
rdin
g
to the ha
nd
sha
k
e
co
mm
unication p
r
o
t
ocol. Th
e switchi
ng
mech
ani
sm a
dopted for o
u
r
netwo
rk is the Virtual Cut Throu
gh To
avoid dea
dlo
c
ks.
P
r
es
en
ce of
requ
e
s
t
?
R
e
ad
pack
e
t
h
e
ader
St
or
e
p
a
cket
Te
s
t
i
f
X
@
D
>
X
@
S
?
C
o
n
t
i
n
u
e
w
i
t
h
ada
pt
a
t
i
ve a
l
go
r
i
t
h
m
E
a
s
t
p
o
r
t
i
s
f
r
e
e
w
i
th
p
r
io
r
i
t
y
?
nor
t
h po
r
t
i
s
f
r
e
e
w
i
th
p
r
io
r
i
t
y
?
So
ut
h
p
o
r
t
i
s
f
r
ee
w
i
th
p
r
io
r
i
t
y
?
w
e
st p
o
rt i
s
fre
e
w
i
th
pri
o
ri
t
y
?
tra
n
sm
it
p
a
ck
et
Wa
i
t
f
o
r
ackn
ow
l
g
e
m
e
n
t
T
h
e pack
e
t
h
a
s
been
s
e
n
t
?
E
n
d o
f
t
r
an
s
m
i
s
s
i
on
an
d
w
a
it
fo
r a
n
o
t
h
e
r
re
q
u
e
s
t
wai
t
tr
a
n
s
m
it
p
a
c
k
e
t
?
W
a
i
t
unt
i
l
t
h
e
o
u
t
p
ut
af
t
e
r
X
Y
i
s
f
r
e
e
Fr
e
e
me
mo
r
y
NO
ye
s
ye
s
NO
ye
s
ye
s
ye
s
ye
s
ye
s
NO
NO
NO
NO
ye
s
NO
NO
Te
s
t
i
f
NBB
<
=
S
N
BB
?
ye
s
A
p
p
lic
a
tio
n
o
f
t
h
e
X
Y
al
gor
i
t
h
m
NO
Figure 4. Ope
r
ating p
r
in
cipl
e of an eleme
n
tary route
r
In the pre
s
en
ce of reque
st,
the routing
u
n
it
read
s the
packet he
ade
r and it sta
r
ts storin
g
the pa
cket in
the sto
r
ag
e
memory
unit
(buffer).
Mea
n
whil
e, the routing
u
n
it seeks
wheth
e
r the
dire
ction de
si
gnated by th
e co
rre
sp
ond
ing hea
der
i
s
free (a
ccordi
ng to the XY algorithm
) a
n
d
that there a
r
e
no other
req
uest
s
(with hi
gher
pr
io
rity)
desi
gnating t
he sa
me out
put port, then
it
route
s
the pa
cket throu
gh
the multiplexing units
(spe
cifying the ap
prop
riate valu
es for the in
p
u
t
sele
ction of the multiplexi
ng unit) an
d
repo
rts
that
the port is unavailabl
e to other po
ssible
requ
est
s
, if
not (th
e
di
re
ction
sp
ecifie
d by
the
XY algo
rithm i
s
unavail
able
)
, it see
k
s th
e
availability of other
port
s
,
starting
with t
he ports
associated with t
he shortest path, to route t
he
packet thro
ug
h.
In ca
se
all p
o
r
ts a
r
e
unava
ilable, the
pa
cket
already
store
d
in
the
memory
wait
s for the
availability of the port de
sig
nated by the XY algorithm
favorited by a higher p
r
io
rity to the recen
t
s
requ
est
s
(d
esignating the
same
output p
o
rt). (See Fi
g
u
re 4
)
Evaluation Warning : The document was created with Spire.PDF for Python.
IJEECS
ISSN:
2502-4
752
Ne
w de
sign o
f
Netwo
r
k on
Chip Ba
sed o
n
Virtual Ro
uters
(M
oham
e
d
Fehm
i Chat
m
en)
119
If the sendin
g
is com
p
leted
,
the routing
unit
rele
ases
the data sto
r
ed in the me
mory and
turns the
stat
e of the
asso
ciated
output
port to
avail
able.
Note th
at all po
rts are initially at t
he
available state.
For the RV
NOC router, th
e routing u
n
it send
s
pa
cke
t
s to the local
output port o
n
ly at a
corre
s
p
ondin
g
half-cy
cle
s
(the RVNOC
route
r
ope
rat
e
s only for the half cycle specifie
d by the
half cycl
e counter unit);for the ot
her hal
f-cycles, the
out
put port is set to
0 (the
reasons will be
explained lat
e
r in se
ction
4).
The figure 4
sho
w
s the pri
n
cipl
e of the rout
ing
algo
rithm usi
ng an
example of a
packet
transmissio
n from the loca
l port to the eastern
output
port, note that the NNB si
gnal is u
s
e
d
to
swit
ch
betwe
en the
ad
apti
v
e algo
rithm
and th
e XY ro
uting al
gorith
m
, whi
c
h
will
be mo
re
detai
led
later in this p
aper
3.3
Packet
Hea
d
er and Instanta
neou
s Routing
The p
a
cket
head
er
conta
i
ns the i
n
formation
ne
ce
ssary to rout
e the pa
cket
from the
sou
r
ce to the destin
a
tion. For ou
r network the pa
cket head
er is d
e
fined on 3
2
bits.
Packet head
e
r
:
Y
@
S
X@S
@IP S
@IP D
P. length
Y
@
D
X@D
Whe
r
e we de
noted by:
X@D: the co
ordin
a
te of de
stination r
out
er followi
ng the x-axis defi
ned on 4 bit
s
.
Y@D : the co
ordin
a
te of de
stination r
out
er alon
g the y-axis defin
ed
on 4 bits.
P.length: the packet si
ze th
at is defined
on 16 bits.
@IP D: the IP addre
ss rel
a
tive to NI w
h
ich
is
corre
s
pondi
ng to the destinatio
n route
r
and is
defined o
n
4 bits.
@IP S: the source IP address rela
tive to NI and is d
e
fined on 4 bi
ts.
X@S: the co
ordin
a
te of the sou
r
ce ro
uter
followi
ng the x-axis an
d
is defined o
n
4 bits.
Y@S: the co
ordin
a
te of the sou
r
ce ro
uter
alon
g the y-axis an
d is d
e
fined on 4 bi
ts
As explaine
d
in section 3
.
2, the routing
unit sele
cts the output
port instantl
y
when
readi
ng th
e h
eade
r. Inde
e
d
, a state
sig
nal (STA.P)
i
s
a
ssi
gne
d to
each o
u
tput
port that i
s
se
t to
0 if the port is free a
nd
se
t to 1 if not.
So we
have
5 state
s
sig
n
als ea
ch
of which i
s
a
s
soci
ated
with an o
u
tpu
t
port. The st
ates
signal
s
are STA.PW,
STA.PE, ST
A.PN, STA.PS, and STA.PL,
and are respectively as
sociated to output ports following t
he directions WEST, EAST, NORTH,
SOUT
H, and
LOCAL.T
he
state sign
al value is
set to
0
whe
n
the po
rt is fre
e
(wh
en no
packet
is
being tran
sferred to the a
ssociate
d
outpu
t port and
n
o
alrea
d
y store
d
packet
s
are
waiting for th
e
port availabilit
y). Otherwi
se, the st
ate signal value associated to the output port is set to 1.
Table 1. State sign
als u
n
d
e
r differe
nt transmi
ssion
s
scena
rio
s
STA.PW STA.PS
STA.PN
STA.PE
X\{w
}->
w
OR
BX\{
B
w
}
->
w
X
\
{S
}->
S
OR
BX\{
BS}
->S
X\{
N
}->N
OR
BX\{
B
N}->N
X
\
{E
}->
E
OR
BX\{
BE}
->E
0
0
0
0+P
0 0
0 0
0
0
0+P
1
0 0
0 1
0
0
1
0+P
0 0
1 0
0
0+P
1
1
0 0
1 1
0
1
0
0+P
0 1
0 0
0
1
0+P
1
0 1
0 1
0
1
1
0+P
0 1
1 0
0+P
1
1
1
0 1
1 1
1
0
0
0+P
1 0
0 0
1
0+P (
y
?
)
0+P (
y
?
)
1
1
0
0
1
1
0
1
0+P
1 0
1 0
1
0+P
1
1
1 0
1 1
1
1
0
0+P
1 1
0 0
1
1
0+P
1
1 1
0 1
1
1
1
0+P
1 1
1 0
1
1
1
1
1 1
1 1
Evaluation Warning : The document was created with Spire.PDF for Python.
ISSN: 25
02-4
752
IJEECS
Vol.
2, No. 1, April 2016 : 115 –
131
120
•
0: Port is in th
e avai
lab
l
e stat
e
•
1: the port is
unav
ail
abl
e
•
0 +
P: the port is availa
ble a
nd
w
e
fa
vor
i
te the transmiss
io
n in his d
i
rectio
n
•
0 +
P (y?): F
a
vors the tra
n
s
m
ission
in th
e
directi
on
of the
port if this min
i
mizes the
dist
ance
alo
ng th
e
y
-
axis
•
X \ {
w
}: set of inp
u
t ports exc
l
udi
ng W
EST
Input Port
•
X \ {S}: set of
inp
u
t ports exc
l
udi
ng SOUT
H Input Port
•
X \ {E}: set of
inp
u
t ports exc
l
udi
ng the EAS
T
Input port
•
X \ {N}: set of
inp
u
t ports
exc
l
udi
ng the NOR
T
H
Input Port
•
-> Ds : one of the entr
y
p
o
rts is transferri
ng
data to the dir
e
ction
of the out
put port Ds
• BX
: all buffers
•
BX \ {BDe} ->
Ds : one from the set of
memories
e
x
cl
udi
ng the mem
o
r
y
ass
o
ci
ated
w
i
th th
e entr
y
port
BDe is transfer
r
ing d
a
ta to the
output port Ds
or
w
a
itin
g for the ava
ila
bil
i
t
y
of the output p
o
rt.
The Table 1
sho
w
s the values a
s
soci
ated with
the different state signal
s by con
s
ide
r
ing
a packet tra
n
smi
ssi
on from the lo
cal
input
po
rt and its p
a
cket heade
r in
dicatin
g
a X@D
coo
r
din
a
te g
r
eater th
an X
@
R (X
@D a
nd X@R
are
respe
c
tively the de
stinatio
n ro
uter
and
the
current router coordi
nates follo
wi
ng
x-axis) in this case the
routing unit facilitates t
he
transmission of
the packet
to the EAST
direction if its state si
gnal
i
s
set to 0. If
not, we send the
packet
acco
rding to
the
NORT
H
or to t
he SO
UT
H
d
epen
ding
on
their availabilit
y, but if both
are
available
the
routing
unit
send
s the
pa
cket to
t
he
direction
that mi
nimize
s the
d
i
stan
ce
between
the de
stinatio
n IP and the
curre
n
t route
r
acco
rding
to
the y-axis. In
ca
se b
o
th si
g
nals
are
set t
o
1, the packet will be se
nt to the WEST dire
ction.
If all state signal
s are set to 1 then the alrea
d
y
stored
packet waits the availability of
the EAST
output port, this time wi
th
a higher priorit
y
comp
ared to recent que
rie
s
.
3.4
Priorit
y
Of Input Ports an
d the Elimination of Liv
e
Lock
Our
route
r
u
s
es the d
a
ta stored in m
e
m
o
ry if
there i
s
a tran
smissi
on erro
r or
al
l output
ports a
r
e bu
sy, otherwise, we can use a
n
y avail
able output port to
route the pa
ckets which can
gene
rate
live lock proble
m
s
(the
pa
cket n
e
ver re
ach
e
s its de
stination
)
. An
d to
avoid t
h
is
probl
em, a
si
gnal
calle
d NB is u
s
ed
to i
ndicate to the
add
re
ssed router th
e nu
mber
of times the
packet
ha
s n
o
t followe
d t
he p
a
th spe
c
ified by XY
algorith
m
in t
he al
rea
d
y crosse
d
route
r
s.
Acco
rdi
ng
to the
value ca
rried by
the NB
and co
m
p
a
r
ed to
the th
resh
old T
N
B.
The
routing
u
n
it
deci
d
e
s
whi
c
h routing
alg
o
rithm to
u
s
e
:
either c
onti
nue
s
with the
ada
ptive rou
t
ing, or ju
st u
s
e
s
XY. The thre
shold d
epe
nd
s on the
netwo
rk
si
ze
(f
or ex
ample fo
r the
3x3 network;
the TNB i
s
set
to 3 a
nd fo
r t
he 4x4
net
wo
rk the T
N
B i
s
set to
4).
T
h
e
ch
oice of th
e
TNB
wa
s m
a
de after ma
ki
ng
some
pe
rformances mea
s
ureme
n
ts th
at we
won’t
c
onsi
der i
n
thi
s
pa
pe
r, in fa
ct this
algo
rithm
whi
c
h i
s
a
no
vel one, d
o
e
s
n’t sho
w
b
e
tter pe
rform
a
n
c
e
s
comp
are
d
to the
“loo
k ahea
d”
ba
se
d
one
s. We o
n
ly adopted
this algorit
hm becau
se
of its implementation
simplicity and
th
e
perfo
rman
ce
s improveme
n
t compa
r
ed to
the determin
i
stic XY. In a
ddition, the establi
s
hme
n
t of
this algo
rithm
is not con
s
id
ered a
s
an ai
m in this pap
er.
Our n
e
two
r
k operate
s
a
c
cording to F
I
FO sched
uli
ng algo
rithm,
the first inp
u
t port
addressin
g
th
e output p
o
rt
will be
se
rve
d
first. In
so
me cases, m
u
ltiple inp
u
t p
o
rts
add
re
ss
the
same
outp
u
t
port at th
e
sa
me time
(call
ed h
e
re
in
s
t
an
t r
e
qu
es
ts)
,
s
o
the
F
I
F
O
a
l
g
o
r
ithm is
no
t
appli
c
able,
a
nd the
reque
st havin
g the
high
est
NB
numbe
r i
s
th
e on
e
who
h
a
s th
e hi
ghe
st
prio
rity. In ca
se
whe
r
e b
o
th req
u
e
s
ts
h
a
s the
so
me
NB then
an a
r
bitra
r
y pri
o
rit
y
assi
gnme
n
t is
con
s
id
ere
d
. T
h
is
prio
rity is
defined
a
s
fol
l
ows:
PE
>
P
S
>
P
W
> PN
>
PL with
Px is
the priority
asso
ciated wi
th the input port x. Note th
at the lo
cal input port has t
he lowe
st pri
o
rity and this for
the simple
re
aso
n
that other pa
ckets
co
ming from oth
e
r directio
ns
are old
e
r.
Evaluation Warning : The document was created with Spire.PDF for Python.
IJEECS
ISSN:
2502-4
752
Ne
w de
sign o
f
Netwo
r
k on
Chip Ba
sed o
n
Virtual Ro
uters
(M
oham
e
d
Fehm
i Chat
m
e
n)
121
(a)
(b
)
(c)
Unavailable output port
Routing the pack
et
w
i
th adaptivel
y
Routing the pack
et under th
e X
Y
algorithm
Figure 5. Net
w
ork be
havio
r to
avoid dea
dlock an
d live lock
(a)
Behavior of the routin
g alg
o
rithm if the pac
ket is carrying a numb
er
of two miss-routing
comp
ared to XY (NB = 2)
(b)
Behavior of t
he routing
al
gorithm
if the
pa
cket
is in
a situ
ation
where
all
outp
ut port
s
a
r
e
unavailabl
e, i
n
this
ca
se
the pa
cket a
l
ready
sto
r
ed
in mem
o
ry
waits for th
e
next po
rt
availability after the XY direction
(c)
Behavior
of the routing
alg
o
rithm if the
p
a
ck
et receive
s
a
num
ber of
three
miss-ro
uting(NB
=
3) in thi
s
case the NB i
s
e
qual to the T
N
B
then the
XY algorithm
will be
applie
d even if the
r
e
are bette
r alternative path
s
.
3.5
Packe
t S
w
i
t
c
h
ing Principle
Our router i
s
actually ope
rating in a virtual
cut thro
u
gh pa
cket switching type with one
modificatio
n
whi
c
h i
s
the
storage
strate
gy; in fact
the
route
r
store
s
immediately
the pa
cket in
the
buffer re
gardl
ess the state
of t
he addre
s
sed po
rt. This doe
s not mean that we must save
the
entire
packet before
sendi
ng it, but
it
will be
sent
as soon as the
destination
port i
s
avail
a
ble.
The insta
n
t st
orag
e wa
s ne
ce
ssary for two re
ason
s:
The first on
e is to have two packet
s
ad
dre
ssi
ng the
same o
u
tput port at the sa
me time
(the ro
uting u
n
it see
s
that the out
put po
rt is free and g
i
ves acce
ss to both input p
a
ckets) this will
result the lo
ss of data
of
one of two p
a
ckets. In
su
ch
ca
se, the
routing
unit d
e
tects thi
s
error
immediately
(the ro
uting
u
n
it always te
sts if it
ha
s
given a
c
ce
ss to an
outp
u
t port
to multi
p
le
input pa
ckets) and sto
p
s
sendin
g
the one with lowe
r priority witho
u
t risk of lo
si
ng data be
ca
use
they are alrea
d
y stored in
memory. Tho
s
e two a
c
tion
s ope
rate a
s
combi
nato
r
ial
function
s.
The
se
con
d
reason i
s
to
h
a
ve a
sen
d
in
g erro
r (re
c
ei
ving an
error
ackno
w
le
dgm
ent from
the de
stinatio
n ro
uter) in t
h
is
ca
se th
e
routing
uni
t starts
to res
end
the
pack
et as
s
o
on
as
t
he
output port is
available.
The communi
cation p
r
oto
c
ol adopte
d
for our n
e
two
r
k is the hand
shake: In pre
s
ence of
a requ
est, th
e routing
unit
se
nd
s
an
ackno
w
le
dgme
n
t
(a
ck = 00)
to
in
dicat
e
t
he re
ceipt of the
head
er
and t
hen
sen
d
s (a
ck =
01
) to i
ndicate the
b
eginni
ng of t
h
e receipt of
the rest of t
he
packet. It se
nds a
n
a
ckn
owle
dgme
n
t (ack=11
) to i
n
dicate the succe
ssful
re
ceipt of the entire
p
a
c
k
e
t. An
ac
kn
ow
le
dg
men
t
(
a
c
k
=
1
0)
is
se
n
t
in
ca
se of tran
smi
s
sion
error. T
he treatm
ent
of
these
t
w
o co
mmuni
cation obsta
cle
s
we
re
not spe
c
if
i
e
d by othe
r o
n
-chip n
e
two
r
k de
sig
ners (i
t is
imperative to store the packet fo
r each
communi
cation to ensur
e that no packet
will be lost).
4.
Net
w
o
r
k Arc
h
itec
ture With
Virtual
Ro
uters
4.1.
The Ne
w
Ne
tw
o
r
ks Struc
ture
Our net
works (RVNOK
or
LVNO
C) a
r
e
comp
os
ed by
glob
al route
r
s RG, whe
r
e
each RG
route
r
com
p
ri
se
s N elem
en
tary route
r
s:
- A base
rout
er (R0) i
s
equ
ivalent to the route
r
having
the coo
r
din
a
te z=0 for a 3
D
network.
- N virtual
ro
uters:
the first one
(R1) is equival
ent t
o
the
route
r
R(X, Y, 1
)
h
a
ving the
sa
me
coo
r
din
a
tes (X, Y) as the
base route
r
b
u
t with
the co
ordin
a
te z=1
and the se
co
nd is equival
ent
to the ro
uter
having the
sa
me coordinat
es
(X, Y)
a
s
t
h
e ba
se
ro
uter b
u
t with th
e co
ordinate
z=2.
The n
th
route
r
is equivale
nt to R(X, Y, N).
Evaluation Warning : The document was created with Spire.PDF for Python.
ISSN: 25
02-4
752
IJEECS
Vol.
2, No. 1, April 2016 : 115 –
131
122
Figure 6. Net
w
ork a
r
chitecture with 2 virtual route
r
s
In other
wo
rd
s we repla
c
e
the ro
uters h
a
vi
ng the
sa
me XY co
ordi
nates
but n
o
t the z by
a singl
e rout
er kno
w
n a
s
RG. Thi
s
allo
ws to eli
m
ina
t
e the Z+ a
n
d
Z- di
re
ction
s
a
s
well
as t
h
e
multiplexing e
n
tities a
s
soci
ated to ea
ch
route
r
for
each level (layer). This leads t
o
a large gai
n in
terms of em
pl
oyed re
sou
r
ces. For exa
m
ple, con
s
id
eri
ng RG
with o
n
ly two virtual routers (N=2
):
Figure 7. Architecture of a global route
r
with 2 virtual route
r
s
For the
first
v
e
rsio
n (
R
VN
OC
), the R
G
rout
e
r
ha
s
1
2
input/outp
u
t
ports foll
owing the
dire
ction
s
Ea
st, West,
No
rth, South e
q
u
ivalent
to
2
4
on
e-way
p
o
rts an
d a
si
ngle i
nput/ou
t
put
port (lo
c
al p
o
r
t) equival
ent
to 2 one-wa
y ports, in
ste
ad of 3 route
r
s with 1
4
in
put/output po
rts
fo
llo
w
i
ng
th
e d
i
r
e
c
t
io
ns
Ea
s
t, We
s
t,
N
o
r
t
h
,
So
u
t
h, Z
+
, Z-
(
E
qu
iva
l
e
n
t
to
28
p
o
r
t
s)
an
d 3
input/output
ports to the l
o
cal
nod
e (e
quivalent to
6 unidi
r
e
c
tion
al po
rts).
Tot
a
lly, we h
a
ve
26
ports fo
r glob
al route
r
s in
st
ead of 34 po
rts, this is a re
ductio
n
of 23
%.
For the
se
co
nd version
(L
VNOC) the
RG ha
s th
e sa
me po
rts nu
mber i
n
all di
rectio
ns
but with 2 extra lo
cal po
rts,
so we
have
30 po
rts in
ste
a
d of 34 (p
re
sente
d
by
the 3D ro
uter), this
is a red
u
ctio
n
of 12%.
4.2.
Comparis
on of Our Ne
w
Net
w
o
r
k S
t
r
u
ctur
e
w
i
th the 2d Struc
ture
Our
archite
c
ture
can
be
d
e
fined a
s
a
set of 2D
n
e
tworks
se
parate
d
from e
a
ch
other, the
links bet
wee
n
these
differe
nt netwo
rks i
s
do
ne at th
e
netwo
rk interface
(NI). An
d at ea
ch
NI, we
have a num
b
e
r of co
nne
ct
ed nod
es
eq
ual to the nu
mber of
sep
a
r
ate net
works. This structu
r
e
has
several a
d
vantage
s co
mpared to 2D network:
4.2.1.
Number O
f
L
i
nks
The propo
se
d stru
ctu
r
e is
optimize
d
co
mpared to a
2D net
wo
rk [
27]; in fact, the numb
e
r
of links fo
r the conve
n
tion
al 2D net
work is determi
ne
d by the equa
tion
.
1
:
∗
1
∗
∗
1
Eq.1
Evaluation Warning : The document was created with Spire.PDF for Python.
IJEECS
ISSN:
2502-4
752
Ne
w de
sign o
f
Netwo
r
k on
Chip Ba
sed o
n
Virtual Ro
uters
(M
oham
e
d
Fehm
i Chat
m
e
n)
123
With N: num
b
e
r of netwo
rk colum
n
s
M: number of network line
s
For ou
r RV
NOC st
ru
cture,
the numbe
r
of links i
s
det
ermin
ed by the equatio
n
.
2
:
∗
1
∗
∗
1
∗
Eq.2
For ou
r LVNOC st
ru
cture,
the numbe
r of
links i
s
det
ermin
ed by the equatio
n
.
3
:
∗
1
∗
∗
1
∗
Eq.3
With:
Ns: nu
mbe
r
o
f
column
s in a singl
e network.
Ms: numb
e
r o
f
lines in a sin
g
le network.
NNE: num
ber of sepa
rate n
e
tworks.
To get an ide
a
of the redu
ction rat
e
, let us con
s
id
e
r
a
2D network
with si
ze 5x5
(N
= M = 5)
with
possibility to connect 25 cores. After applying
.
1
we have a total of 6
5
bidirec
t
ional link
s
.
The eq
uivale
nt stru
cture
asso
ciated
with our
archi
t
ecture i
s
de
fined by a set of three 2
dimen
s
ion
a
l netwo
rks of
3x3 (NEE =
N1 = M
1
= 3
)
with po
ssibi
lity to connect 27 cores, af
ter
applying
.
2
we get a total of
45 bidirectio
n
a
l links for RVNOC
netwo
rk.
For the LV
NOC net
wo
rk,
after applyin
g
.
3
, we get a 63
bidire
ctional
links. So we
can
con
n
e
c
t
much m
o
re
core
s to our p
r
opo
sed a
r
chitectures
with less links.
4.2.2. Algorithmic
Complexity
Algorithmi
c
complexity ne
ce
ssarily de
p
end
s on th
e
netwo
r
k
si
ze
(espe
c
ially re
gardi
ng the
issue
of live l
o
ck) an
d
sin
c
e the
si
ze
of
our net
work
i
s
m
u
ch le
ss t
han th
e
2D o
n
es be
ca
use
it is
divided by NNE, we ca
n say that our a
r
chitect
u
re offers a
significant redu
ction on
the
comp
utationa
l complexity
4.2.3. Late
nc
y
It is clear tha
t
with incre
a
sing the si
ze
of
the network, the averag
e latency in
creases,
our ne
w arch
itecture al
wa
ys sho
w
s a redu
ced
size
comp
ared to
a 2D netwo
rk, even with
a
large
r
num
be
r of conn
ect
ed co
re
s, so
certainl
y ou
r archite
c
tu
re
presents a
redu
ce
d late
ncy
(co
n
si
de
ring t
he low inj
e
cti
on rate for th
e RVNO
C
structure)
com
p
ared to its e
q
u
ivalent in the
2D net
works.
We'
r
e talki
n
g about a
smalle
r ne
twork si
ze
becau
s
e we
con
s
ide
r
for ea
ch
comm
uni
cati
on a
sepa
rat
e
net
work
(whi
ch
alway
s
h
a
s a
red
u
ce
d
size
co
mpared to
a
2D
netwo
rk
).
Figure 8 sho
w
s
a compa
r
i
s
on
between
the laten
c
y
of a 2D n
e
two
r
k of si
ze 5x5
with both
RVNO
C 3x3
and LVNOC
3x3:
Figure 8. The
average late
ncy of a 2D n
e
twork
of si
ze
5x5 compa
r
e
d
with both RVNOC
3x3 and
LVNO
C 3x
3
Evaluation Warning : The document was created with Spire.PDF for Python.
ISSN: 25
02-4
752
IJEECS
Vol.
2, No. 1, April 2016 : 115 –
131
124
Figure 8
sho
w
s th
at the p
e
rform
a
n
c
e
o
f
the RVNOC is b
e
tter tha
n
the 2
D
NO
C only fo
r
low inj
e
ctio
n
rate. Th
e LV
NO
C al
ways sho
w
s
redu
ced l
a
ten
c
y
comp
ared to
other net
wo
rks.
(Re
a
son
s
will
be explaine
d
later in this p
aper).
4.3.
The Time Multiplexing Principle
In addition t
o
its rol
e
of
deliverin
g d
i
ffer
ent pa
ckage
s
to diffe
rent de
stinati
ons, the
routing
unit al
so allo
ws to synchroni
ze th
e different
lev
e
ls of the net
work (he
r
e we talk abo
ut the
RVNO
C). In
fact, the rout
ers bel
ongi
n
g
to the
sa
m
e
sepa
rate
n
e
twork
ope
r
a
t
e at the
sa
me
perio
d. Which is not th
e
same fo
r oth
e
r route
r
s h
a
ving a
differe
nt index; Ba
se
route
r
s are al
so
functioni
ng o
n
their own perio
d (half
-
cycle): t
he hal
f-cycle
cou
n
ting unit co
unt
s the half-cycle
modulo
N-1, the numb
e
r of
cycle
s
introd
uce
d
to
the routing unit (b
etwee
n
1 and
N-1
)
indicate
s
whi
c
h type
of route
r
s is op
erati
ng. A
s
a
n
exampl
e, le
t’s con
s
ide
r
N a
s
the
num
ber
of half-cy
cle
and a net
work with two virt
ual route
r
s:
n = 1: the base route
r
s
(eq
u
ivalent to z
= 0 in 3D
NO
C) a
r
e in op
erating state
n =
2: virtual
route
r
s V
R
1
(equivalent to
r
oute
r
s havi
n
g z =
1 in
3
D
NOC) a
r
e i
n
ope
ratin
g
st
at
e
n =
0: virtual
route
r
s V
R
2
(equivalent to
r
oute
r
s havi
n
g z =
2 in
3
D
NOC) a
r
e i
n
ope
ratin
g
st
at
e
Figure 9. Time division mul
t
iplexing for three el
eme
n
tary route
r
s
The RG ro
ute
r
allo
ws tran
smitting two flits at
the end
of one cy
cle (1 flit for the first half
cy
cle
of
t
he f
i
rst
cy
cle a
n
d
se
con
d
f
lit
in t
he
second
half cycl
e of
the se
co
nd
cycle
)
for
ea
ch
route
r
, ave
r
a
g
ing
a late
ncy of 1
cycl
e f
o
r th
e tran
sm
issi
on
of a
si
ngle flit b
e
tween t
w
o
neig
h
bors
RG routers.
This fu
nctio
n
(Th
e
time
multiplexing) co
ncern
s
o
n
ly the RV
NOC ve
rsi
on
while th
e
LVNO
C all ro
uters
wo
rk
si
multaneo
usly
at the same
clo
ck e
d
ge.
4.4.
Structure o
f
the Loc
al Port for
Rg Ro
uter
4.4.1.
Cas
e
of th
e Rv
noc
Beside the d
e
letion of displacement al
ong the
z-axi
s
(z
+, z-), we redu
ce
d the numbe
r
of local po
rts while mai
n
taining the
sa
me numb
e
r
of cores that
can be
co
n
n
ecte
d in a
3D
netwo
rk.
Act
ually,
a set o
f
N co
re
s ca
n
be
conn
ect
e
d
to a sin
g
le
glo
bal RG
route
r
u
s
in
g one
input p
o
rt a
n
d
on
e o
u
tput
port. Th
e o
u
tput po
rt na
m
ed
DATA_O
UT_L
OCA
L
_
R
G i
s
m
u
ltipl
e
xed
(sh
a
red) bet
wee
n
N o
u
tp
ut port
s
from
N el
ementa
r
y
routers. Thi
s
multiplexing i
s
done using
th
e
multiplexing u
n
it that has a singl
e functio
n
whi
c
h is the
OR gate bet
wee
n
the N o
u
tput sign
als.
In fact, each
signal
ca
rri
es the a
ppropriate
data
within the
approp
r
iate
half cycle,
otherwise, it
is
set to 0
(see
se
ction 3.
2). Th
eref
or
e, a
t
e
a
ch
ha
lf c
y
c
l
e
,
a s
i
ng
le
s
i
g
n
a
l
h
a
s
approp
riate
d
a
ta (and
is n
o
t set
to
zero
)
so th
at
the
result
of the
O
R
g
a
te o
p
e
r
a
t
ions
with
oth
e
r
Evaluation Warning : The document was created with Spire.PDF for Python.