IAES
Inter
national
J
our
nal
of
Robotics
and
A
utomation
(IJRA)
V
ol.
15,
No.
2,
June
2026,
pp.
488
∼
502
ISSN:
2722-2586,
DOI:
10.11591/ijra.v15i2.pp488-502
❒
488
Optimized
mapping
in
2D
and
3D
netw
ork
on
chip
using
Bat
algorithm
Maamar
Bougherara
1,2
,
Rak
Amara
1,3
,
Amina
Guidoum
1
1
Department
of
Computer
Science,
High
Normal
School
of
K
ouba,
Algiers,
Algeria
2
Lim
Laboratory
,
F
aculty
of
Exact
Sciences,
Akli
Mohand
Oulhadj
Uni
v
ersity
of
Bouira,
Algeria
3
L
TIR
Laboratory
,
F
aculty
of
Electronics
and
Computational
Science,
USTHB
Uni
v
ersity
,
Algiers,
Algeria
Article
Inf
o
Article
history:
Recei
v
ed
Aug
17,
2025
Re
vised
May
6,
2026
Accepted
May
13,
2026
K
eyw
ords:
Bat
algorithm
Communication
cost
Mapping
Netw
ork
on
chip
Optimization
ABSTRA
CT
Communication
within
system-on-chip
(SoC)
architectures
has
e
v
olv
ed
signif-
icantly
to
k
eep
pace
with
the
gro
wing
comple
xity
of
modern
applications.
T
o
o
v
ercome
the
limitations
of
traditional
interconnects,
netw
ork-on-chip
(NoC)
has
emer
ged
as
a
scal
able
and
ef
cient
communication
solution.
Although
early
NoC
designs
relied
hea
vily
on
2D
architectures,
their
ph
ysical
and
per
-
formance
constraints
ha
v
e
led
to
the
rise
of
3D
NoC
architectures
,
which
of
fer
better
spatial
inte
gration
and
impro
v
ed
performance.
In
order
to
automate
the
NoC
design
process,
a
number
of
electronic
design
automation
(ED
A)
tools
and
optimization
algorithms
are
emplo
yed
to
help
designers
achie
v
e
ef
cient
and
high-performance
designs.
W
ithin
this
ED
A
frame
w
ork,
one
of
the
most
critical
stages
is
the
core
placement
or
application
mapping
phase,
where
computational
tasks
are
allocated
to
the
processing
elements
of
the
architecture.
This
step
is
v
ery
hard
due
to
its
combinatorial
nature,
and
its
optimization
is
essential
since
it
directly
impacts
communication
cost,
ener
gy
consumption,
and
o
v
erall
system
performance.
T
o
address
this
challenge,
numerous
heuristic
and
metaheuristic
algorithms
ha
v
e
been
e
xplored
for
both
2D
and
3D
NoCs.
In
this
pape
r
,
we
pro-
pose
a
n
adaptation
of
the
bat
algorithm
to
solv
e
the
mapping
problem
in
both
2D
and
3D
NoC
architectures,
with
the
objecti
v
e
of
minimizing
communication
cost.
The
proposed
approach
is
e
v
aluated
and
compared
ag
ainst
other
optimiza-
tion
methods
to
assess
its
ef
fecti
v
eness
in
enhancing
NoC
performance
within
the
ED
A
frame
w
ork.
This
is
an
open
access
article
under
the
CC
BY
-SA
license
.
Corresponding
A
uthor:
Maamar
Bougherara
Department
of
Computer
Science,
High
Normal
School
of
K
ouba
Algiers,
Algeria
Email:
bougherara.maamar@gmail.com
1.
INTR
ODUCTION
As
the
demand
for
high-performance
computing
syst
ems
continues
to
gro
w
,
netw
ork-on-chip
(NoC)
architectures
ha
v
e
emer
ged
as
a
promising
solution
to
address
communication
bottlenecks
in
comple
x
system-
on-chip
(SoC)
designs
[1].
Similar
in
concept
to
traditional
computer
netw
orks,
a
NoC
pro
vides
scalable
and
structured
communication
among
multiple
processing
elements.
Ho
we
v
er
,
due
to
stringent
constraints
in
area,
po
wer
,
and
performance,
NoC
designs
must
be
carefully
optimized
to
meet
the
specic
requirements
of
their
tar
get
applications.
In
NoC-based
systems,
appl
ications
are
typically
di
vided
into
se
v
eral
computational
tasks,
each
encapsulated
in
an
intellectual
property
(IP)
block.
These
IPs
must
be
inte
grated
into
the
NoC
infrastructure
in
a
manner
that
maximizes
performance
while
minimizing
ener
gy
consumption
and
latenc
y
[2].
J
ournal
homepage:
http://ijr
a.iaescor
e
.com
Evaluation Warning : The document was created with Spire.PDF for Python.
IAES
Int
J
Rob
&
Autom
ISSN:
2722-2586
❒
489
A
typical
NoC
consists
of
four
k
e
y
components
[3]:
i)
IP
cores,
which
include
processors,
memory
units,
and
custom
controllers;
ii)
netw
ork
interf
aces
(NIs),
which
serv
e
as
a
bridge
between
the
IP
cores
and
NoC
communication;
iii)
routers,
which
handle
pack
et
forw
arding
and
arbitration;
and
i
v)
ph
ysical
intercon-
nects,
which
dene
the
paths
for
data
transmiss
ion.
Each
tile
consists
of
an
IP
,
a
netw
ork
interf
ace
(NI),
and
a
router
.
The
topology
of
a
NoC
determines
ho
w
tiles
are
connected
[4].
Among
the
a
v
ailable
congurations,
the
2D
mesh
topology
is
the
most
widely
adopted
due
to
its
re
gular
structure,
simplicity
,
and
scalability
.
W
ith
the
increasing
comple
xity
and
inte
gration
density
of
modern
SoCs,
2D
NoC
architectures
f
ace
gro
wing
limitations
in
terms
of
bandwidth,
latenc
y
,
and
ener
gy
ef
cienc
y
.
T
o
addre
ss
these
challenges,
three-dimensional
(3D)
NoC
architectures
ha
v
e
been
proposed
[2].
By
v
ertically
stacking
multiple
2D
layers
and
emplo
ying
through-
silicon
via
(TSV)
technology
[2],
3D
NoCs
signicantly
reduce
the
a
v
erage
communication
distance,
enhance
bandwidth,
and
impro
v
e
performance.
In
a
3D
mesh
topology
,
each
router
connects
to
up
to
six
neighboring
routers
[5],
f
o
ur
in
the
same
layer
and
tw
o
in
adjacent
v
ertical
layers.
This
conguration
enables
both
intra-
layer
and
inter
-layer
communication,
leading
to
better
parallelism,
reduced
latenc
y
,
and
higher
throughput.
Figure
1
sho
ws
2D
and
3D
mes
h
NoC
topologies.
The
design
of
a
NoC-based
system
generally
follo
ws
three
main
st
ages
[6]:
i)
task
assignment,
which
in
v
olv
es
assigning
application
tasks
to
a
predened
library
of
IP
blocks;
ii)
IP
mapping,
where
these
IP
blocks
are
mapped
to
ph
ysical
nodes
(routers)
within
the
NoC;
and
iii)
static
routing,
which
determines
the
communication
paths
between
the
mapped
IPs.
Each
of
these
stages
is
sup-
ported
by
ED
A
tools,
which
e
xplore
the
design
space
to
generate
optimized
hardw
are/s
oftw
are
congurations.
Figure
2
illustrates
a
standard
embedded
system
design
o
w
for
NoC
platforms.
This
paper
introduces
a
no
v
el
model
for
solving
the
IP
mapping
problem
in
both
2D
and
3D
NoCs.
After
modeling
the
problem,
dening
the
main
objecti
v
e
of
the
study
,
and
formulating
the
cost
function,
the
Bat
algorith
is
emplo
yed
for
the
rst
time
as
a
ne
w
method
to
address
the
application
mapping
problem.
(a)
2D
(b)
3D
Figure
1.
Mesh-based
NoC
Figure
2.
T
ypical
embedded
system
design
o
w
for
NoC
platform.
The
structure
of
this
paper
is
as
follo
ws.
Section
2
introduces
the
NoC
generalities.
Section
3
pro
vides
a
re
vie
w
of
prior
research
related
to
mapping
in
NoC
architectures.
In
section
4,
we
present
the
modeling
of
Optimized
mapping
in
2D
and
3D
network
on
c
hip
using
Bat
algorithm
(Maamar
Bougher
ar
a)
Evaluation Warning : The document was created with Spire.PDF for Python.
490
❒
ISSN:
2722-2586
the
mapping
problem
prior
to
applying
a
metaheuristic
approach.
The
bat
algorithm,
used
as
the
core
mapping
strate
gy
,
is
described
in
section
5.
Section
6
analyzes
the
results
obtained
from
the
simulation
e
xperiments.
Finally
,
section
7
presents
the
conclusion
of
the
study
along
with
directions
for
future
w
ork.
2.
NOCS
GENERALITIE
NoC
dra
ws
its
inspiration
from
communication
netw
orks
originally
designed
for
supercomputers
and
is
formed
by
interconnected
on-chip
components
that
communicate
through
pack
et-based
transmission
o
v
er
a
scalable
interconnection
architecture.
NoC
of
fers
numerous
benets,
including
ener
gy
ef
cienc
y
,
reliability
,
bandwidth
scalability
in
comparison
to
traditional
b
us
architectures,
and
reusability
[6].
A
NoC’
s
topology
refers
to
the
arrangement
of
its
components
within
the
on-chip
interconnect,
which
can
be
structured
as
a
ring,
torus,
or
2D
mesh.
It
is
also
characterized
by
other
aspects
such
as
the
communication
mode,
o
w
control
mechanisms
t
o
pre
v
ent
deadlocks,
and
b
uf
fering
policies.
Se
v
eral
steps
are
required
in
the
design
of
a
NoC
based
system.
Initially
,
the
application
is
decomposed
into
a
set
of
parall
el
communication
tasks.
Then,
each
task
is
assigned
to
a
selected
processing
core,
which
is
scheduled
according
to
the
system
requirements.
Finally
,
the
processing
cores
are
mapped
onto
the
NoC
architecture
[5].
This
paper
focuses
on
the
application
mapping
stage,
which
remains
an
open
research
problem.
An
optimal
mapping
can
achie
v
e
up
to
51.7%
communication
ener
gy
sa
vings
compared
to
an
ad
hoc
implementation
[7].
T
o
achie
v
e
high
performance,
an
optimal
mapping
must
be
determined.
Gi
v
en
m
tasks
mapped
onto
a
NoC
with
n
cores,
where
m
≤
n
,
the
number
of
possible
mappings
is
n
!
/
(
nm
)!
.
As
an
NP-hard
combinatorial
optimization
problem,
applica
tion
mapping
is
generally
addressed
using
heuristic
algorithms
to
obtain
suboptimal
solutions.
In
con
v
entional
2D
NoCs,
each
IP
is
connected
to
a
router
via
a
netw
ork
interf
ace,
and
these
routers
are
arranged
in
a
planar
grid
with
wired
links.
Communication
tak
es
place
using
pack
et-switched
data
transfer
,
where
messages
are
brok
en
into
pack
et
s
and
routed
through
the
NoC
[8].
Compared
to
traditional
b
us
-based
systems,
2D
NoCs
of
fer
se
v
eral
adv
antages:
higher
com
munication
bandwidth,
reduced
latenc
y
,
impro
v
ed
scalability
,
and
lo
wer
po
wer
consumption
[9].
Ho
we
v
er
,
as
more
IPs
are
inte
grated
into
a
single
chip,
the
a
v
erage
number
of
hops
(routers
tra
v
ersed
by
a
pack
et)
increases,
leading
to
greater
communication
ener
gy
and
de
graded
performance.
This
issue
becomes
increasingly
problematic
as
system
size
gro
ws
[7].
The
3D
NoC
paradigm
addresses
these
is
sues
by
enabling
v
ertical
communication
through
TSVs.
This
approach
reduces
the
a
v
erage
hop
count
and
signicantly
lo
wers
latenc
y
and
ener
gy
consumption.
Additionally
,
it
enhances
area
utilization,
making
it
suitable
for
lar
ge-scale
multicore
systems.
Consequently
,
3D
NoCs
ha
v
e
g
ained
signicant
interest
in
both
academia
and
industry
as
a
compelling
solution
for
future
high-performance
computing
platforms.
This
w
ork
focuses
on
optimizing
the
mapping
phases
of
NoC
design,
aiming
to
minimize
communication
ener
gy
while
maintaining
high
system
performance.
In
a
3D
NoC,
the
IP
mapping
problem
consists
of
assigning
each
IP
core
to
a
specic
node
within
the
3D
topology
.
This
phase
is
critical,
as
ef
cient
mapping
can
greatly
reduce
communication
cost,
ener
gy
usage,
and
latenc
y
[10].
Due
to
its
combinatorial
nature,
the
IP
mapping
problem
is
considered
NP-hard
and
is
closely
related
to
the
quadratic
assignment
problem
(QAP).
This
comple
xity
mak
es
e
xact
solutions
impractical
for
lar
ge
systems,
especially
within
limited
design
time.
As
3D
NoCs
enable
the
inte
gration
of
a
greater
number
of
IPs
across
multiple
layers
using
TSVs,
ef
cient
and
scalable
mapping
algorithms
are
becoming
increasingly
important.
T
o
this
end,
heuristic
and
metaheuristic
approaches
of
fer
a
viable
alternati
v
e,
pro
viding
near
-optimal
solutions
within
reasonable
time
frames.
3.
RELA
TED
W
ORK
Se
v
eral
approaches
ha
v
e
been
proposed
to
the
mapping
in
NoCs.
These
mapping
techniques
can
be
classied
into
three
principal
cate
gories:
e
xact
approaches
and
heuristic
(or
approximate)
techniques
[11].
W
e
sho
w
in
this
paper
the
most
cited
who
tak
e
into
a
count
single
ob
j
ecti
v
e.
Exact
mapping
methods,
such
as
e
xhausti
v
e
search,
linear
programming,
and
branch-and-bound
algorithms,
rely
on
mathematical
rigor
to
guarantee
globally
optimal
solutions.
An
inte
ger
linear
programming
(ILP)
formulation
for
the
IP
mapping
problem
is
presented
in
[12].
This
approach
models
both
objecti
v
es
and
constraints
as
linear
functions,
with
the
additional
requirement
that
decision
v
ariables
tak
e
inte
ger
v
alues.
The
ILP
model
is
solv
ed
using
the
optimization
tool,
enabling
precise
solution
computation
within
a
mathematically
dened
space.
Despite
their
accurac
y
,
e
xact
methods
suf
fer
from
high
computational
comple
xity
and
si
gnicant
runtime
requirements.
As
a
result,
the
y
are
typically
feasible
only
for
small-scale
mapping
scenar
ios
in
v
olving
a
limited
number
of
IP
IAES
Int
J
Rob
&
Autom,
V
ol.
15,
No.
2,
June
2026:
488-502
Evaluation Warning : The document was created with Spire.PDF for Python.
IAES
Int
J
Rob
&
Autom
ISSN:
2722-2586
❒
491
cores,
where
the
size
of
the
solution
space
remains
manageable
under
current
computing
capabilities.
The
second
cate
gory
encompasses
heuristic-based
approaches,
which
aim
to
pro
vide
ef
cient
and
practical
solutions
to
the
application
mapping
problem
on
NoC
architectures.
The
NMAP
algorithm
[13]
is
a
widely
used
heuristic
that
maps
application
cores
to
tiles
iterat
i
v
ely
.
At
each
step,
a
core
is
selected
and
assigned
to
a
tile,
repeating
the
process
until
all
cores
are
mapped.
While
NMAP
includes
an
iterati
v
e
impro
v
e-
ment
mechanism,
the
quality
of
the
nal
solutions
often
remains
constrained
by
the
initial
mapping.
BMAP
[14]
introduces
a
binomial
mapping
approach
based
on
iterati
v
e
tw
o-w
ay
mer
ging,
taking
into
account
the
traf
c
load
between
cores
to
optimize
communication.
CastNet
[15]
enhances
solution
di
v
ersity
by
le
v
eraging
the
symmetric
characteristics
of
mesh
topologies.
It
selects
multiple
initial
tiles
and
generates
se
v
eral
mapping
options,
ul
timately
choosing
the
best
one
based
on
the
number
of
free
neighboring
tiles
for
each
core.
CHMAP
[16]
calculates
a
priority
v
alue
for
each
core
based
on
its
communication
requirements
and
its
position
within
a
communication
spanning
tree.
The
algorithm
then
determines
the
mapping
order
and
assigns
cores
to
appro-
priate
tiles
according
to
these
computed
priorit
ies.
On
yx
[17],
proposed
in
2009,
introduces
a
lozenge-shaped
mapping
path
and
denes
four
mo
v
ement
patterns
to
assign
priorities
to
tiles.
This
met
hod
has
sho
wn
impro
v
ed
communication
cost
compared
to
earlier
heuristics.
Spiral
[18]
starts
by
mapping
the
highest-priority
task
at
the
center
of
the
mesh,
then
progressi
v
ely
maps
remaining
tasks
outw
ard
in
a
spiral
pattern
from
already
mapped
cores.
T
o
address
hierarchical
communication,
Dageleh
and
Jamali
[19]
proposed
V
-CastNet3D,
a
clustering-
based
mapping
algorithm.
Cores
are
grouped
into
clusters
to
reduce
inter
-cluster
communication,
and
the
CastNet
heuristic
is
then
applied
withi
n
each
cluster
for
mapping
onto
a
2D
NoC
mesh.
F
or
3D
meshes,
certain
techniques
de
v
eloped
for
2D
meshes
ha
v
e
been
adapted.
F
or
e
xample,
NMAP
3D,
ILP3D,
and
PSMAP3D.
Furthermore,
CastNet3D,
also
proposed
by
Nalci
et
al.
[12],
e
xtends
this
approach
to
3D
NoC
architectures
by
optimizing
the
utilization
of
v
ertical
links,
thereby
impro
ving
communication
ef
cienc
y
and
reducing
ener
gy
consumption.
Gi
v
en
the
high
computational
cost
of
e
xact
methods
and
the
limitations
of
current
processing
capabi
l-
ities,
metaheuristic
techniques
ha
v
e
emer
ged
as
practical
and
scalable
alternati
v
es
for
solving
the
IP
mapping
problem
in
NoC
architectures.
Dra
wing
inspir
ation
from
natural
and
biological
phenomena,
these
approaches
aim
to
ef
ciently
e
xplore
lar
ge
solution
spaces
and
produce
near
-optimal
s
olutions
within
reasonable
compu-
tational
time.
In
the
conte
xt
of
2D
NoC
architectures,
se
v
eral
metaheuristic
techniques
ha
v
e
been
proposed:
In
[20],
CGMAP
,
a
genetic
algorithm-based
approach,
replaces
the
traditional
random
initialization
with
a
chaotic
mapping
operator
,
enhanci
ng
the
e
xploration
capabilities
of
the
algorithm.
GBMAP
[21]
emplo
ys
an
e
v
olu-
tionary
algorithm
to
ef
ciently
map
processing
cores
onto
NoC
tiles.
Sw
arm
intelligence-based
approaches
ha
v
e
also
been
e
xplored.
F
or
e
xample,
a
discrete
multi-objecti
v
e
particle
sw
arm
optimization
(PSO)
method
is
proposed
in
[22],
utilizing
deterministic
initial
solutions
to
impro
v
e
performance.
In
[23],
a
h
ybrid
approach
combining
PSO
and
genetic
algorithms
further
enhances
solution
quality
.
Ant
colon
y
optimization
(A
CO)
al-
gorithm
is
used
to
minimize
bandwidth
requirements,
while
a
h
ybrid
A
CO-GA
method
is
proposed
in
[24]
to
reduce
communication
costs.
An
articial
bee
colon
y
(ABC)
algorithm
is
presented,
where
genetic
operators
are
introduced
with
ABC
in
[25]
to
impro
v
e
the
communication-a
w
are
mapping
ef
cienc
y
.
Cat
sw
arm
opti-
mization
(CSO)
[26]
is
applied
to
map
IPs
onto
2D
NoCs
with
notable
results.
Dif
ferential
e
v
olution
(DE)
is
emplo
yed
in
[27]
to
address
communication
cost
optimization
in
2D
NoC
mapping
problems.
While
these
techniques
are
primarily
applied
to
2D
NoCs,
recent
research
has
e
xtended
metaheurist
ic
strate
gies
to
3D
NoC
architectures:
In
[28],
a
PSO-based
algorithm
is
de
v
eloped
for
mapping
IPs
onto
a
partially
v
ertically-connected
3D
mesh
NoC.
A
PSO
based
mapping
method
is
introduced
in
[22]
to
impro
v
e
communication
costs
by
applying
bandwidth
limitation.
A
similar
approach
is
adopted
in
[29],
b
ut
with
the
objecti
v
e
of
minimizing
ener
gy
consumption.
Hang
et
al.
in
[28]
apply
quantum
particle
sw
arm
optimization
(QPSO)
to
the
3D
NoC
mapping
problem,
thereby
reducing
po
wer
consumption.
An
adapti
v
e
genetic
algorithm
based
on
logistic
functions
(LF
A
GA)
is
proposed
in
[30]
to
optimize
the
ener
gy
consumption
of
the
netw
ork.
Finally
,
a
method
based
on
fuzzy
logic
is
proposed
in
[31].
An
ABC-based
method
for
3D
NoC
mapping
is
introduced
in
[32],
demonstrating
impro
v
ed
results
in
communication
optimization.
A
thermal-a
w
are
mapping
technique
is
proposed
in
[33],
which
inte
grates
genetic
algorithms
with
f
uzzy
logic
to
minimize
the
peak
temperature
in
3D
NoC
systems.
Finally
,
in
[34],
simulated
annealing
is
emplo
yed
to
enhance
communication
ef
cienc
y
in
3D
NoC
mappings.
Since
our
paper
e
xplores
the
application
of
the
Bat
algorithm
(B
A),
it
is
important
to
highlight
its
prior
use
in
the
NoC
design
domain.
In
[35],
the
algorithm
w
as
applied
to
the
routing
phase,
which
follo
ws
the
generation
of
an
optimized
mapping.
In
[36],
the
Bat
algorithm
w
as
emplo
yed
for
the
rst
time
in
a
3D
Optimized
mapping
in
2D
and
3D
network
on
c
hip
using
Bat
algorithm
(Maamar
Bougher
ar
a)
Evaluation Warning : The document was created with Spire.PDF for Python.
492
❒
ISSN:
2722-2586
NoC
architecture,
b
ut
the
study
w
as
restricted
to
small-scale
real
benchmarks
with
a
limited
number
of
cores.
Similarly
,
in
[37],
B
A
w
as
applied
solely
to
the
mapping
problem
in
2D
NoC
systems.
In
[38],
B
A
w
as
applied
to
the
dynamic
mapping
problem
in
2D
NoC
systems,
where
se
v
eral
f
aulty
cores
were
introduced
as
a
simulation
scenario.
In
contrast,
the
present
w
ork
proposes
a
comprehensi
v
e
Bat
algorithm–based
mapping
strate
gy
for
both
2D
and
3D
NoC
architectures.
Extensi
v
e
e
xperiments
were
performed
using
real
benchmarks
and
syn-
thetic
applications
generated
via
TGFF
,
enabling
a
thorough
analysis
of
the
algorithm’
s
scalability
and
perfor
-
mance
as
the
number
of
cores
increas
es.
The
obtained
results
clearly
demonstrate
the
ef
cienc
y
and
rob
ustness
of
the
proposed
approach
when
compared
to
other
state-of-the-art
optimization
algorithms.
4.
APPLICA
TION
MAPPING
PR
OBLEM
In
the
conte
xt
of
mapping
problem
in
NoC,
the
application
is
typically
modeled
as
an
IP
core
graph,
while
the
NoC
infrastructure
is
represented
as
a
topology
graph.
The
IP
mapping
problem
consists
of
assigning
logical
IP
cores
dened
by
specic
communication
relationships
and
bandwidth
demands
within
the
IP
core
graph
to
ph
ysical
resource
nodes
in
the
NoC
topology
graph
[8].
4.1.
NoC
topology
graph
The
topology
of
a
NoC
is
abstracted
as
a
directed
graph
T
(
R
,
L
)
,
where:
R
is
the
set
of
nodes,
with
each
node
r
k
∈
R
representing
a
tile
in
the
NoC.
L
is
the
set
of
directed
edges,
where
each
edge
l
k
,l
∈
L
represents
a
ph
ysical
communication
link
between
tiles
r
k
and
r
l
[26].
4.2.
IP
cor
e
graph
Similarly
to
NoC
topology
,
the
IP
core
graph
is
a
high-le
v
el
abstraction
that
represents
the
beha
vior
of
an
application.
It
is
denoted
as
a
directed
graph
G
(
C
,
A
)
,
where:
C
is
the
set
of
nodes,
each
node
c
i
∈
C
representing
an
IP
core,
and
A
is
the
set
of
directed
edges,
where
each
edge
a
i,j
∈
A
represents
communication
from
IP
source
core
c
j
to
destination
core
c
i
[23].
Each
edge
a
i,j
∈
A
is
associated
with
a
weight
w
i,j
,
which
denotes
the
bandwidth
requirement
between
the
IPs
c
j
and
c
i
.
4.3.
Mapping
function
in
NoC
The
mapping
function
denes
the
mapping
of
IP
cores
from
the
IP
core
graph
G
(
C
,
A
)
to
nodes
in
the
NoC
topology
graph
T
(
R
,
L
)
,
with
the
objecti
v
e
of
mini
mizing
communication
ener
gy
[27].
F
or
each
IP
core
c
i
∈
C
,
there
e
xists
a
corres
ponding
node
r
k
∈
R
in
the
NoC
topology
such
that:
map:
V
→
T
map
(
c
i
)
=
r
k
,
∀
c
i
∈
C
∃
r
j
∈
R
Furthermore,
each
IP
core
must
be
mapped
to
one
node,
ensuring
that
for
an
y
tw
o
IPs
c
i
and
c
j
:
map(c
i
)
̸
=
map
(
c
j
)
T
o
satisfy
this
constraint,
the
number
of
nodes
in
the
NoC
topology
graph
must
be
greater
than
or
equal
to
the
number
of
nodes
in
the
IP
core
graph.
If
ne
cessary
,
dummy
nodes
can
be
added
to
the
IP
core
graph
to
equalize
the
sizes
of
both
graphs
[25].
These
dummy
nodes
are
non-communicating
placeholders
and
do
not
participate
in
an
y
data
e
xchanges.
4.4.
Solution
r
epr
esentation
The
mapping
solution
is
represented
as
a
sequence
with
a
permutation
of
the
inte
gers
fr
om
1
to
n
,
where
n
is
the
number
of
nodes
in
the
NoC
topology
graph
[26].
Each
sequence
position
represents
a
node
in
the
NoC
topology
,
and
its
v
alue
denotes
the
ID
of
the
assigned
IP
cor
e.
An
e
xample
mapping
solution
is
sho
wn
in
T
able
1.
T
able
1.
Solution
e
xample
cor
e
number
1
2
3
4
5
6
7
8
9
node
place
8
3
2
1
9
5
6
7
4
4.5.
Communication
cost
The
main
objecti
v
e
of
this
paper
is
to
reduce
communication
cost
by
minimizing
the
number
of
hops
for
each
communication
when
tw
o
tasks
are
mapped
onto
the
NoC.
The
total
communication
cost
is
calculated
using
(1)
[34].
commcost
=
X
i
X
j
w
i,j
∗
nbhops
(
r
k
,
r
l
)
,
(1)
IAES
Int
J
Rob
&
Autom,
V
ol.
15,
No.
2,
June
2026:
488-502
Evaluation Warning : The document was created with Spire.PDF for Python.
IAES
Int
J
Rob
&
Autom
ISSN:
2722-2586
❒
493
where
c
i
=
sour
ce
,
c
j
=
distination
,
map
(
c
i
)
=
r
k
,
map
(
c
j
)
=
r
l
,
c
i
̸
=
c
j
,
and
r
k
̸
=
r
l
.
nbhops
()
is
the
number
of
hops
between
the
source
and
destination,
calculated
using
the
Manhattan
distance,
dened
by
the
follo
wing
function:
H
ops
(
r
k
,
r
l
)
=
|
X
k
−
X
l
|
+
|
Y
k
−
Y
l
|
+
|
Z
k
−
Z
l
|
(2)
where
(
X
k
,
Y
k
,
Z
k
)
and
(
X
l
,
Y
l
,
Z
l
)
represent
the
coordinates
of
the
nodes
r
k
and
r
l
within
a
3D
mesh
NoC,
respecti
v
ely
.
F
or
a
2D
mesh
NoC,
Z
k
and
Z
l
are
equal
to
0
.
5.
APPLICA
TION
MAPPING
WITH
B
A
T
ALGORITHM
5.1.
Bat
in
natur
e
Microbats,
a
subgroup
of
bats,
are
the
only
mammals
capable
of
sustained,
acti
v
e
ight.
The
y
possess
an
e
xtraordinary
ability
kno
wn
as
echolocation,
which
enables
them
to
na
vig
ate
and
hunt
ef
ciently
in
complete
darkness.
T
o
detect
and
locate
their
pre
y
,
microbats
emit
high
frequenc
y
ultrasonic
pu
l
ses
typically
ranging
from
25
to
150
kHz
[39].
Each
short
pulse,
lasting
approximately
5
to
20
milliseconds,
generates
an
echo
upon
striking
an
object
or
pre
y
.
By
analyzing
these
echoes,
the
bat
can
precisely
determine
the
distance,
direction,
and
e
v
en
the
size
of
the
tar
get.
During
the
initial
search
phase,
a
bat
emits
an
a
v
erage
of
10
to
20
pulses
per
second.
Ho
we
v
er
,
as
it
closes
in
on
its
pre
y
,
it
dramatically
increases
the
pulse
rate
up
to
200
pulses
per
second—while
simultaneously
reducing
the
intensity
of
its
emitted
signals
[40].
This
adapti
v
e
strate
gy
helps
a
v
oid
echo
o
v
erlap
and
enhances
localization
accurac
y
.
The
bat’
s
ight
path
adjusts
continuously
in
response
to
real
time
acoustic
feedback,
allo
wing
for
agile
and
precise
mo
v
ements.
This
remarkable
natural
beha
vior—balancing
global
e
xploration
when
the
pre
y
is
distant
with
ne
tuned
e
xploitation
as
it
nears
the
tar
get
has
inspired
the
de
v
elopment
of
a
nature-inspired
optimization
tech-
nique
kno
wn
as
the
Bat
algorithm
(B
A).
Initially
proposed
to
address
comple
x
global
optimization
problems,
the
B
A
mimics
k
e
y
aspects
of
echolocation:
frequenc
y
v
ariation,
loudness
modulation,
and
dynamic
search
adaptation[41].
B
A
has
demonstrated
strong
performa
n
c
e
across
v
arious
technical
domains,
including
multi
objecti
v
e
optimization,
machine
learning,
and
scheduling
problems.
Its
strength
lies
in
its
ability
to
seam-
lessly
combine
broad
e
xploration
of
the
solution
space
with
intensi
v
e
local
reneme
nt,
mirroring
the
intelligent
hunting
strate
gies
of
bats
in
the
wild[42].
5.2.
Bat
algorithm
steps
The
Bat
algorithm,
a
nature
inspired
optimization
method,
w
as
proposed
by
Y
ang
in
2010
[39].
It
is
based
on
the
echolocation
beha
vior
of
microbats,
which
use
high-frequenc
y
sound
pulses
to
locate
pre
y
and
a
v
oid
obstacles
especially
during
twilight
hours.
This
natural
mechanism
w
as
mathematically
modeled
to
create
a
no
v
el
and
ef
cient
optimization
technique
that
simulates
the
dynamic
and
adapti
v
e
ight
patterns
of
bats
in
the
wild.
This
section
presents
the
steps
of
the
Bat
algorithm,
inspired
by
the
real-life
beha
vior
of
bats
in
nature.
T
o
dene
the
algorithm,
Y
ang
introduced
six
k
e
y
steps,
which
are
outlined
belo
w
[40].
a.
Step
1:
Initialization
of
Bat
population
and
problem
parameters
Consider
a
d-dimensional
search
space
represented
by
a
population
of
bats.
The
total
number
of
bats
(ock
size)
is
denoted
by
N
.
The
position
of
each
bat
i
at
iteration
t
is
e
xpressed
as
a
v
ector
in
this
d-dimensional
space
[41].
X
t
i
=
[
x
i,
1
,
x
i,
2
,
x
i,
3
.
.
.
x
i,d
]
(3)
The
Bat
algorithm
be
gins
by
randomly
initializing
a
populati
o
n
of
virtual
bats
and
setting
the
parameters
re-
quired
for
the
optimization
problem.
The
objecti
v
e
is
to
e
v
aluate
the
quality
of
each
candidate
solution
x
by
computing
the
objecti
v
e
function
f
(
x
)
,
as
dened
in
the
problem.
b
.
Step
2:
Bat
population
memory
initialization
Each
bat
i
is
equipped
with
a
memory
that
stores
the
location
of
its
personal
best
position
m
i
.
At
iteration
t
,
this
corresponds
to
the
best
position
found
by
bat
i
while
e
xploring
the
search
space.
During
the
initialization
stage,
the
position
of
each
bat
is
randomly
generated.
Once
generated,
all
solution
v
ectors
are
stored
in
the
bat
memory
[40].
At
this
point,
the
best
global
solution,
denoted
as
X
best
,
is
init
ialized
and
corresponds
to
the
best
bat
position
in
the
memory
according
to
the
objecti
v
e
function.
Optimized
mapping
in
2D
and
3D
network
on
c
hip
using
Bat
algorithm
(Maamar
Bougher
ar
a)
Evaluation Warning : The document was created with Spire.PDF for Python.
494
❒
ISSN:
2722-2586
c.
Step
3:
Bat
mo
v
ement
In
this
step,
each
bat
i
ies
at
a
speed
v
i
assigned
to
a
randomly
generated
frequenc
y
f
i
.
The
ne
w
position
of
bat
i
in
the
search
space
is
updated
as
follo
ws
[41]:
V
t
+1
i
=
V
t
i
+
(
X
t
i
−
X
best
)
f
i
(4)
X
t
+1
i
=
X
t
i
+
V
t
+1
i
(5)
f
i
=
f
min
+
(
f
min
−
f
max
)
β
(6)
where
β
∈
[0
..
1]
.
The
position
of
bat
i
is
updated
based
on
its
pre
vious
position
and
increased
by
a
relati
v
ely
small
v
alue.
This
v
alue
is
inherited
as
the
distance
between
the
global
best
position
and
the
parent
bat
becomes
close
to
the
descendant
bat.
d.
Step
4:
Intensication
of
the
current
bat
population
This
step
introduces
controlled
randomness
into
the
Bat
algorit
hm
through
a
local
search
mechanism.
W
ith
a
certain
probability
dened
by
the
pulse
rate
R
i
,
each
bat
may
perform
a
local
random
w
alk
around
the
best-kno
wn
solution
in
the
population.
A
hist
orically
good
solution
m
i
is
selected
from
the
current
best
indi
viduals,
and
the
ne
w
position
of
bat
i
is
updated
as
follo
ws
[43]:
X
t
+1
i
=
(
m
i
+
ϵA
t
if
r
and
>
R
i
X
t
i
+
V
t
+1
i
other
w
ise
(7)
here,
ϵ
is
a
random
number
dra
wn
from
a
uniform
distrib
ution
[
−
1
..
1]
,
and
A
t
represents
the
a
v
erage
loudness
of
all
bats
in
the
population
at
iteration
t
.
This
strate
gy
allo
ws
bats
to
e
xploit
promising
areas
of
the
search
space
while
still
maintaining
some
de
gree
of
randomness.
e.
Step
5:
Updating
the
bat
population
memory
F
or
each
bat
in
the
population
memory
(BM),
the
ne
wly
generated
solution
x
′
i
,
x
′
j
may
replace
the
current
solution
x
i
,
x
j
under
the
follo
wing
conditions
[44]:
(
f
(
X
t
+1
i
)
<
f
(
X
best
)
R
and
(0
..
1)
<
A
t
i
(8)
Similar
to
natural
beha
vior
,
the
loudness
A
i
and
the
pulse
emission
rate
R
i
are
updated
dynamically
during
the
optimization
process.
As
the
iterations
progress,
A
i
gradually
decreases,
whereas
R
i
increases
according
to
[41]:
A
t
+1
i
=
α
A
t
i
(9)
R
t
+1
i
=
R
0
i
(1
−
e
(
−
σ
×
t
)
)
(10)
where
R
t
i
→
R
0
i
,
A
t
i
→
0
,
and
t
→
∞
.
R
0
i
is
the
maximum
pulse
emission
rate.
α
∈
[0
..
1]
is
the
loudness
damping
f
actor
,
σ
>
0
is
the
pulse
rate
increment
f
actor
,
and
t
is
the
current
it
eration
number
.
The
parameter
α
is
comparable
to
the
cooling
f
actor
in
simulated
annealing.
f.
Step
6:
T
ermination
criterion
The
Bat
algorithm
repeats
steps
3
through
5
until
a
termination
condition
is
satised.
Common
stop-
ping
criteria
include:
−
Reaching
a
maximum
number
of
generations,
−
Exceeding
a
computational
time
limit,
−
A
x
ed
number
of
non-impro
ving
iterations,
−
Achie
ving
a
desired
objecti
v
e
v
alue
(solution
quality
threshold).
IAES
Int
J
Rob
&
Autom,
V
ol.
15,
No.
2,
June
2026:
488-502
Evaluation Warning : The document was created with Spire.PDF for Python.
IAES
Int
J
Rob
&
Autom
ISSN:
2722-2586
❒
495
Once
the
termination
condition
is
met,
the
algorithm
halts
and
returns
the
best
solution
found
during
the
search
process.
The
details
of
the
algorithm
steps
are
presented
in
algorithm
1.
Algorithm
1.
The
Bat
Algorithm
Generate
the
population
of
bats
Ev
aluate
each
bat
using
the
objecti
v
e
function
Initialize
the
memory
of
each
bat
Initialize
the
v
elocity
of
each
bat
Initialize
R
0
i
and
A
0
i
for
each
bat
Initialize
the
frequenc
y
f
i
of
each
bat
using
(6)
Sa
v
e
the
best
global
solution
iter
←
0
while
iter
<
maximum
number
of
iterations
do
f
or
each
bat
i
do
Update
v
elocity
and
position
using
(4)
and
(5)
Generate
a
random
number
r
a
nd
1
for
use
in
(7)
Ev
aluate
the
ne
w
position
of
the
bat
Generate
a
random
number
r
a
nd
2
for
use
in
(8)
Incr
ease
R
i
and
decrease
A
i
using
(9)
and
(10)
Update
the
memory
of
the
bat
Update
the
best
global
solution
end
f
or
iter
←
iter
+
1
end
while
r
etur
n
the
best
global
solution
5.3.
Modeling
Bat
algorithm
f
or
mapping
pr
oblem
In
order
to
model
the
Bat
algorithm
so
that
it
can
ef
fecti
v
ely
addres
s
the
mapping
problem,
we
ha
v
e
detailed
its
steps
by
aligning
them
with
the
specic
requirements
of
the
problem.
a.
Step
1:
Initialization
of
Bat
population
and
problem
parameters
At
this
stage,
the
parameters
required
for
calibrating
the
Bat
Algorithm
are
initialized
by
dening
the
optimization
problem,
objecti
v
e
function,
constraints,
and
k
e
y
algorithm
settings
to
achie
v
e
optimal
perfor
-
mance.
−
N
:
number
of
bats
(population
size).
−
maxt
iter
:
maximum
number
of
iterations.
−
R
0
i
:
maximum
pulse
emission
rate.
−
A
i
:
loudness
of
bat
i
.
−
σ
and
α
:
constants.
−
f
min
and
f
max
:
minimum
and
maximum
frequencies
used
by
the
bats.
b
.
Step
2:
Initialization
of
Bat
positions
and
memories
Randomly
position
N
bats
in
a
d
-dimensional
search
space,
where
each
bat
represents
a
feasible
solution
to
the
placement
problem
according
to
(3).
Each
bat
stores
its
best-found
position
in
memory
,
which
is
initially
set
to
its
current
position
at
iteration
0.
c.
Step
3:
Ev
aluation
of
initial
tness
F
or
each
bat,
e
v
aluate
the
quality
of
its
current
position
by
substituting
the
corresponding
decision
v
ariable
v
alues
into
the
objecti
v
e
function
and
sa
v
e
the
global
best
solution.
d.
Step
4:
Generation
of
ne
w
positions
Each
bat
updates
its
position
i
n
the
sea
rch
space
us
ing
its
v
elocity
and
the
global
best
solution
accord-
ing
to
(4)
and
(5).
The
feasibility
of
the
ne
wly
generated
position
is
e
v
aluated
based
on
(8).
e.
Step
5:
Feasibility
checking
of
ne
w
positions
F
or
each
bat,
the
feasibility
of
the
ne
wly
generated
position
is
e
v
aluated
according
to
the
problem
constraints.
If
the
position
is
feasible,
the
bat
updates
its
current
position
accordingly
.
f.
Step
6:
P
arameter
updating
In
this
step,
each
bat
updates
its
memory
and
adjusts
its
parameters
by
increasing
its
pulse
em
ission
rate
and
decreasing
its
loudness
according
to
(9)
and
(10).
Optimized
mapping
in
2D
and
3D
network
on
c
hip
using
Bat
algorithm
(Maamar
Bougher
ar
a)
Evaluation Warning : The document was created with Spire.PDF for Python.
496
❒
ISSN:
2722-2586
g.
Step
7:
Stopping
criterion
checking
The
loop
comprising
steps
5
to
6
is
repeated
until
the
predened
maximum
number
of
iterations
(iter
max
)
is
reached.
Once
the
stopping
criterion
is
satised,
the
best
solution
stored
in
memory
,
in
terms
of
the
objecti
v
e
function
v
alue,
is
returned
as
the
nal
solution
to
the
optimization
problem.
6.
EXPERIMENT
RESUL
T
T
o
e
v
aluat
e
the
performance
of
a
NoC
architecture,
a
v
ariety
of
benchmarks
are
commonly
utilized
[17].
A
benchmark
typically
consists
of
a
set
of
interdependent
tasks
that
emulate
real
applicati
on
w
orkloads.
These
tasks
represent
computat
ional
elements
and
the
communication
between
them,
allo
wing
designers
to
e
v
aluate
architectural
aspects
s
uch
as
ener
gy
ef
cienc
y
and
communication
cost.
In
this
study
,
tw
o
main
cate-
gories
of
benchmarks
are
considered:
a.
Real-w
orld
benchmarks
These
benchmarks
are
deri
v
ed
from
actual
multimedia
applications
and
are
frequently
used
for
e
v
al-
uating
the
performance
of
NoC
architectures.
Fi
gures
3(a)
to
3(d)
illustrates
four
widely
adopted
benchmarks,
video
object
plane
decoder
(V
OPD),
mo
ving
picture
e
xperts
group
(MPEG4),
picture
in
picture
(PIP),
multi-
windo
w
display
(MWD).
(a)
(b)
(c)
(d)
Figure
3.
Benchmarks
used
in
the
e
xperiments
(a)
V
OPD
–
video
object
plane
decoder
,
(b)
MPEG4
–
mo
ving
picture,
(c)
PIP
–
picture
in
picture,
and
(d)
MWD
–
multi-windo
w
display
e
xperts
group
b
.
Synthetic
benchmarks
These
benchmarks
are
automatically
generated
using
the
task
graphs
for
free
(TGFF)
tool
[45],
which
enables
the
creation
of
v
arious
task
graphs
with
controlled
properties
such
as
task
count,
communication
v
ol-
ume,
and
e
x
ecution
time.
Synthetic
benchmarks
play
an
important
role
in
e
v
aluating
the
performance
of
NoC
architectures
under
di
v
erse
scenarios.
In
this
w
ork,
v
e
benchmarks
generated
using
TGFF
and
commonly
adopted
in
NoC
studies
[46]
are
utilized
for
testing.
6.1.
Setting
Before
presenting
the
e
xperimental
results,
we
detail
the
conguration
parameters
used
to
calibrate
the
algorithm
for
optimal
performance,
as
well
as
the
2D
and
3D
topologies
emplo
yed
(T
ables
X
and
Y).
F
or
real-
w
orld
benchmarks,
V
OPD,
MPEG4,
and
MWD
were
mapped
onto
a
4×4
2D
NoC
and
a
2×4×2
3D
architecture,
while
the
smaller
PIP
benchmark
w
as
mapped
onto
a
2×2×2
3D
topology
.
F
or
synthetic
benchmarks,
TG0,
TG1,
and
TG2
were
mapped
onto
a
6×6
2D
topology
and
a
3×3×3
3D
topology
,
whereas
TG3
and
TG4
were
mapped
onto
an
8×8
2D
topology
and
a
4×4×4
3D
topology
.
Algorithm
calibration
w
as
performed
through
multiple
tests,
with
the
population
size
set
to
500
and
the
maximum
number
of
ite
rations
set
to
1500.
The
initial
loudness
A
i
w
as
set
to
0
.
25
,
the
pulse
emission
rate
r
i
to
0
.
25
,
and
the
frequenc
y
range
w
as
dened
between
0
and
500
.
6.2.
Result
and
discussion
The
results
reported
in
T
able
2
present
the
performance
of
the
Bat
mapping
approach
applied
to
both
2D
and
3D
NoC
topologies
across
Synthetic
benchmarks.
and
The
graphical
representations
in
Figure
4
allo
w
for
a
comparison
of
the
mapping
results
achie
v
ed
by
the
Bat
algorithm
in
2D
and
3D
NoC.
IAES
Int
J
Rob
&
Autom,
V
ol.
15,
No.
2,
June
2026:
488-502
Evaluation Warning : The document was created with Spire.PDF for Python.
IAES
Int
J
Rob
&
Autom
ISSN:
2722-2586
❒
497
The
data
clearly
indicate
that
the
3D
NoC
conguration
consistently
outperforms
its
2D
counte
rpart,
underscoring
the
benets
of
e
xploiting
the
third
dimension
in
task-to-core
mapping.
These
results
demonstrate
that
incorporating
the
additional
spatial
dimension
not
only
optimizes
resource
allocation
b
ut
also
enables
more
ef
cient
handling
of
comple
x
application
graphs,
which
is
often
challenging
in
traditional
2D
NoCs.
T
able
2.
Comparison
of
obtained
results
Benchmarks
#
T
ask
#
Edge
Bat
2D
Bat
3D
TG0
12
13
30300
18200
TG1
16
16
43600
34400
TG2
27
27
89000
61800
TG3
30
34
109400
109300
TG4
50
59
271700
201700
TG0
TG1
TG2
TG3
TG4
0
1
2
3
·
10
5
Syn
thetic
b
enc
hmarks
Comm
unication
Cost
2D
3D
Figure
4.
Comparison
between
2D
and
3D
mapping
T
o
thoroughly
e
v
aluate
t
he
ef
fecti
v
eness
of
our
Bat-based
mapping
approach,
we
rst
compared
it
with
impro
v
ed
v
ersions
of
algorithms
pre
viously
proposed
by
the
same
authors
for
3D
NoC
architectures,
namely
ABC
[25]
and
simulated
annealing
[34].
This
initial
comparison
pro
vides
a
measure
of
the
competiti
v
eness
of
our
method
ag
ainst
already
optimized
reference
techniques.
The
results
presented
in
T
able
3
summarize
the
performance
of
the
Bat
mapping
approach
applied
to
2D
and
3D
NoC
topologies
using
synthetic
benchmarks.
Furthermore,
the
graphical
illustrations
in
Figure
5
pro
vide
a
clear
comparison
of
the
mapping
outcomes
ob-
tained
for
both
2D
and
3D
NoCs,
enabling
a
deeper
analysis
of
the
dif
ferences
between
the
tw
o
architectures
in
comparison
with
Bat,
SA,
and
ABC
approaches.
T
able
3.
Comparison
results
in
synthetic
benchmarks
Benchmarks
SA
2D
ABC
2D
Bat
2D
SA
3D
ABC
3D
B
at
3D
TG0
32400
49300
30300
27800
31700
18200
TG1
39700
63400
43600
37200
40600
34400
TG2
94700
115100
89000
78000
73000
61800
TG3
–
179900
109400
–
121100
109300
TG4
–
326400
271700
–
199900
201700
TG0
TG1
TG2
TG3
TG4
TG0
TG1
TG2
TG3
TG4
0
1
2
3
·
10
5
NoC
2D
NoC
3D
Comm
unication
Cost
SA
ABC
Bat
Figure
5.
Result
obtained
in
the
synthetic
benchmarks
Optimized
mapping
in
2D
and
3D
network
on
c
hip
using
Bat
algorithm
(Maamar
Bougher
ar
a)
Evaluation Warning : The document was created with Spire.PDF for Python.