TELKOMNIKA Telecommunication, Computing, Electronics and Control
Vol. 18, No. 3, June 2020, pp. 1229~1236
ISSN: 1693-6930, accredited First Grade by Kemenristekdikti, No: 21/E/KPT/2018
DOI: 10.12928/TELKOMNIKA.v18i3.13174
On-chip debugging for microprocessor design

Fajar Suryawan, Bana Handaga, Abdul Basith
Department of Electrical Engineering, Universitas Muhammadiyah Surakarta, Indonesia
Article Info

Article history:
Received May 20, 2019
Revised Jan 22, 2020
Accepted Feb 21, 2020

Keywords:
Engineering education
Field programmable gate array
Microprocessor design
Post-silicon debug
Programmable logic
ABSTRACT

This article proposes a closer-to-metal approach to RTL inspection in microprocessor design for use in education, engineering, and research. Signals of interest are tapped throughout the hierarchical microprocessor design, routed up to the top-level entity, and finally displayed on a VGA monitor. The input clock can be fed as slowly as one wishes in order to trace or debug the microprocessor being designed. An FPGA development board, along with its accompanying software package, is used as the design and test platform. The VHDL constructs 'type' and 'record' are key ingredients in the overall design, since they allow simple, clean, and tractable code across the hierarchy. The method is tested on a MIPS single-cycle microprocessor blueprint. The results show that, compared to the standard, widely used simulation method, the technique produces a more consistent display of the true contents of registers, ALU input/output signals, and other wires. This approach is expected to increase the confidence of students and designers, since the reported signal values are the true values. Its use is not limited to the development of microprocessors; every FPGA-based digital design can benefit from it.

This is an open access article under the CC BY-SA license.
Corresponding Author:
Fajar Suryawan,
Department of Electrical Engineering,
Universitas Muhammadiyah Surakarta,
Jl. A. Yani, Tromol Pos 1, Sukoharjo, Jawa Tengah, Indonesia.
Email: Fajar.Suryawan@ums.ac.id

1. INTRODUCTION
Digital device development depends greatly on a precise understanding of how data propagate between basic digital logic units, a level of description known as the register transfer level (RTL). In the design phase, designers often use simulation to check whether their designs meet the logic requirements. An example of this is encountered in senior-level electrical/computer engineering bachelor-degree courses such as Programmable Logic Design or Computer Architecture [1–4]. In such courses, students are asked to design the micro-architecture of a microprocessor based on a given architecture (that is, the assembly language requirement). Students then write HDL code representing the micro-architecture and test the design against a set of instructions. Testing is generally done in simulation and, after a number of testing-coding iterations, a hardware test is performed.

Software simulation is indispensable for its quick setup, fast compilation, and (provided the designer is experienced) accuracy. However, mismatches between simulation and synthesized hardware are not unheard of, even for simple designs. Mismatches also occur between pre-synthesis and post-synthesis simulations. To make matters worse, in post-synthesis (netlist) simulation one can generally monitor only the top-level ports; signals deeper in the hierarchy are inaccessible.

To address this problem, we propose a closer-to-metal approach to register transfer level inspection. Effectively, this is an on-chip debugging technique in which signals of interest are brought up to the top level for output reading.

Journal homepage: http://journal.uad.ac.id/index.php/TELKOMNIKA

The term on-chip debug generally refers to a technique in microprocessor (or other digital device) design where a designer can inject a fault into the device under development to test its fault-tolerant behavior [5–8]. For microprocessors, this is usually done using the JTAG protocol based on the NEXUS Consortium standard. In [9], on-chip debug is used in high-level synthesis for FPGAs. On-chip debug has been a concern since the beginning of the computer era, and FPGAs have also taken part in this field. For example, the work by Jamal et al. [10, 11] proposes faster functional changes during on-chip debug by utilizing an FPGA overlay architecture. Contemporary works in this field, particularly on post-silicon debug, can be found in [12–15]. Indeed, post-silicon debug readiness needs to be prepared early in the design [16–18]. A number of authors extend the idea to other areas such as machine learning [19, 20].

In this article we use the term on-chip debug in a more literal sense: the process of debugging a microprocessor in which the debug capability is embedded in the hardware design. Our contribution lies in the following aspects. First, we propose a simple hardware-oriented debugging method for use in any digital design. We hope this introductory notion will trigger students' creative faculty to solve challenging problems that are otherwise difficult to tackle. Second, we describe, in a tutorial way, the construction of a simple on-chip debugging feature in the design of a microprocessor using 'type' and 'record' in VHDL. We believe this will help students and designers easily duplicate our work.

2. RESEARCH METHOD
This research starts with a list of design requirements for an on-chip debugging feature in a microprocessor:
(a) non-intrusiveness: the debugging feature should be as discreet as possible so as not to obstruct the main design;
(b) meaningful message: the interface to the human reader should be immediately readable;
(c) easy to modify: when the designer wants to tap other signals in the microprocessor design, it should be straightforward to do so.
The main idea of this on-chip debugging feature is shown in Figure 1.

[Figure 1 block labels: Clock; Microprocessor (Control Unit; Datapath: ALU, Registers, Memory); Reporting module; VGA Display]
Figure 1. On-chip debugging principles in this paper.

The next step is building a model microprocessor from scratch using VHDL in an FPGA chip (an Altera DE2-115 development board was used). Here we use a scaled-down version of the MIPS microprocessor architecture [21, 22] as our proof of concept. The MIPS architecture has the advantages of, among others, being simple and consistent for students to follow. MIPS is a RISC architecture and originated as a pedagogical model at Stanford University. We developed the chip-debugging feature based on a MIPS implementation described by Harris and Harris [23]. In this stage a VGA display module was also built, consisting of a VGA sync module, two character memory modules, and a font ROM. The experimental hardware setup is shown in Figure 2.

We extend the single-cycle MIPS instruction set found in the main text of [23] by constructing a number of new instructions and 'rewiring' the datapath as needed. We then verify, using the proposed on-chip debugging technique, that the final result works as expected. The next section discusses the engineering design in more detail.

Figure 2. Hardware setup. An Altera DE2-115 development board is used to test the proposed technique, together with a 1024x768 resolution monitor.

3. PROTOTYPE DESIGN
The design requires several aspects to work seamlessly together: processor design with its signal inspection, information display, and experiment design.

3.1. Processor design and signal inspection
As mentioned before, the approach works by sensing internal microprocessor signals (including memory access signals) and sending them up through the design hierarchy. A hierarchical design implies that many entities and files are used, which poses a new challenge: how to tap signals from different entities in a straightforward and unobtrusive way. The signal tapping shown in Figure 1 is implemented using a shared bus that is available across the hierarchy. Figure 3 shows the organization of the modules that make up the entire microprocessor.

[Figure 3 block labels: Test suite in FPGA; Microprocessor System (clock; Instr. Memory; Data Memory; MIPS Processor with Main decoder, ALU decoder, and Datapath containing PC, Registers (32 bits x 32), Sign extend, Shift, ALU); Reporting module and display system; VGA display]
Figure 3. Module organization and hierarchy. The small red blocks are instantiations of the record entity that encapsulates the tapped signals' information. The red 'cable' then acts as a debug bus across the hierarchy.

The arithmetic logic unit (ALU) performs the arithmetic computation with the help of a register file (32 registers of 32 bits each) that functions as a scratchpad for the ALU. The program counter (PC) acts as a pointer to the current instruction. The sign-extend module extends numbers narrower than 32 bits (for example, in immediate-type instructions) to their 32-bit representation. The shift module functions as a bit-wise shifter. All of these reside in the “datapath” module, which also hosts a number of multiplexers controlling which way data flow. Control is performed by the “controller” module outside the datapath, which decodes 32-bit instructions from the instruction memory. The datapath and the controller form the “MIPS processor” module. Together with the Instruction Memory and Data Memory modules, they make up the complete microprocessor system.
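
As an illustration of the sign-extend step described above, the following is a minimal Python sketch, not the VHDL used in the design. The function names `sign_extend` and `to_signed` are hypothetical; the instruction word is the `addi $7, $3, -9` entry from the test program in Figure 7.

```python
def sign_extend(value: int, bits: int = 16, width: int = 32) -> int:
    """Sign-extend the low `bits` bits of `value` to a `width`-bit word."""
    value &= (1 << bits) - 1                 # keep only the immediate field
    if value & (1 << (bits - 1)):            # top bit set: negative number
        value |= ((1 << width) - 1) & ~((1 << bits) - 1)  # fill upper bits with ones
    return value


def to_signed(word: int, width: int = 32) -> int:
    """Interpret a `width`-bit word as a two's-complement integer."""
    return word - (1 << width) if word & (1 << (width - 1)) else word


imm = 0x2067FFF7 & 0xFFFF                    # immediate field of `addi $7, $3, -9`
extended = sign_extend(imm)
print(hex(extended), to_signed(extended))    # 0xfffffff7 -9
```

A positive immediate such as 0x000c (from `addi $3, $0, 12`) passes through unchanged, since its top bit is clear.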

The signal tapping, shown as red blocks in Figure 3, is a record-type port instantiated at every module of interest. Acting as a “debugging bus”, this record is ready to accept the value of any signal of interest at every level in the hierarchy. Since the bus is logically encapsulated, it does not obstruct the main design. Modifying the content of the bus is straightforward: it is done once, in the definition, without the need to change any code in the instantiation part. The definition and an example of its use are given in Figure 4.

    -- File name: myfuns.vhd
    -- ... Other statements ...

    -- 'Regsbundler' below is for collecting
    -- contents of registers.
    -- There are 32 registers, each 32 bit wide.
    type regsbundler is record
      R00,R01,R02,R03,R04,R05,R06,R07,
      R08,R09,R10,R11,R12,R13,R14,R15,
      R16,R17,R18,R19,R20,R21,R22,R23,
      R24,R25,R26,R27,R28,R29,R30,R31
        : std_logic_vector(31 downto 0);
    end record;

    -- 'Bundler' below is for collecting
    -- signals in datapath.
    -- 'Regsbundler' above is also included.
    type bundler is record
      pc          : std_logic_vector(31 downto 0);
      instr       : std_logic_vector(31 downto 0);
      RA1,RA2,WA3 : std_logic_vector(4 downto 0);
      alua,alub   : std_logic_vector(31 downto 0);
      aluout      : std_logic_vector(31 downto 0);
      RD1,RD2,WD3 : std_logic_vector(31 downto 0);
      regs        : regsbundler; -- registers' contents
    end record;
    -- ...

    -- File name: sc_datapath.vhd
    -- ... Other library declarations ...
    use work.myfuns.all;

    Entity sc_datapath is
      port (
        -- Other port definitions
        -- ...
        Reporter : out bundler
      );
    end sc_datapath;

    Architecture struct of sc_datapath is
      -- ... Other statements ...
      ----- Reporter collects ------------
      Reporter.pc     <= pc;
      Reporter.instr  <= instr;
      Reporter.RA1    <= instr(25 downto 21);
      Reporter.RA2    <= instr(20 downto 16);
      Reporter.WA3    <= writereg;
      Reporter.RD1    <= srca;
      Reporter.RD2    <= writedata;
      Reporter.WD3    <= result;
      Reporter.alua   <= srca;
      Reporter.alub   <= srcb;
      Reporter.aluout <= aluout;
      ------------------------------------
      -- ...

Figure 4. 'Record' type for the construction of the debugging bus. First listing (myfuns.vhd): definition. Second listing (sc_datapath.vhd): example instantiation and usage in the datapath module.
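
The property that makes the record convenient, namely that a consumer can follow the definition rather than a hand-maintained list of signals, can be mimicked in software. The sketch below is only an analogy in Python using dataclasses, not a translation of the VHDL; the class and function names (`RegsBundler`, `Bundler`, `flatten`) and the reduced field set are assumptions made for illustration.

```python
from dataclasses import dataclass, field, fields, is_dataclass


@dataclass
class RegsBundler:
    # Stand-in for R00..R31: one list of 32 register values.
    r: list = field(default_factory=lambda: [0] * 32)


@dataclass
class Bundler:
    # Reduced, illustrative subset of the signals bundled in Figure 4.
    pc: int = 0
    instr: int = 0
    aluout: int = 0
    regs: RegsBundler = field(default_factory=RegsBundler)


def flatten(bundle, prefix=""):
    """Walk a (possibly nested) bundle and return {dotted_name: value}."""
    out = {}
    for f in fields(bundle):
        value = getattr(bundle, f.name)
        name = prefix + f.name
        if is_dataclass(value):
            out.update(flatten(value, name + "."))   # descend into nested bundle
        else:
            out[name] = value
    return out


report = flatten(Bundler(pc=0x44, instr=0x00024280))
print(list(report))   # ['pc', 'instr', 'aluout', 'regs.r']
```

Adding a field to `Bundler` changes the output of `flatten` without touching the code that calls it, which mirrors the "modify once, in the definition" property claimed for the VHDL record.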

3.2. Information display
To show the debugging steps, the values of the signals of interest are displayed on a VGA monitor, one row per clock cycle. The Altera DE2-115 board is equipped with a VGA port, but users must themselves program the VGA synchronization and character generation. Here we adopt the method described in [24]; the arrangement is depicted in Figure 5. In this layout, a phase-locked loop (PLL) is used to step up the clock frequency. It feeds the VGA sync module, which produces the horizontal and vertical synchronization signals used by the VGA display. The VGA sync module also generates the x and y position of the current pixel. This information is used by the character generation and font ROM modules to render the appropriate character at a given time. Interested readers are referred to [24] for further technical details regarding the display arrangement.
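
To give a feel for the numbers involved in the sync and character-generation stages, the Python sketch below works through the timing arithmetic under two assumptions that the paper does not state: that the 1024x768 display is driven in the standard VESA 60 Hz mode (65 MHz pixel clock), and that glyphs are 8x16 pixels. The function names are hypothetical.

```python
# Assumed mode: VESA 1024x768 at 60 Hz (pixel and line counts per frame).
H_VISIBLE, H_FRONT, H_SYNC, H_BACK = 1024, 24, 136, 160
V_VISIBLE, V_FRONT, V_SYNC, V_BACK = 768, 3, 6, 29
PIXEL_CLOCK_HZ = 65_000_000

# Assumed glyph size in pixels (width, height).
FONT_W, FONT_H = 8, 16


def refresh_rate() -> float:
    """Frames per second implied by the pixel clock and total frame size."""
    h_total = H_VISIBLE + H_FRONT + H_SYNC + H_BACK
    v_total = V_VISIBLE + V_FRONT + V_SYNC + V_BACK
    return PIXEL_CLOCK_HZ / (h_total * v_total)


def locate(x: int, y: int):
    """Map a pixel position to (text column, text row, glyph column, glyph row)."""
    return x // FONT_W, y // FONT_H, x % FONT_W, y % FONT_H


print(round(refresh_rate(), 2))                  # 60.0
print(H_VISIBLE // FONT_W, V_VISIBLE // FONT_H)  # 128 48
print(locate(17, 35))                            # (2, 2, 1, 3)
```

Under these assumptions the screen holds 128 columns by 48 rows of text, and the glyph column and row returned by `locate` are what a font ROM lookup would use to pick the pixel within a character.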

The microprocessor system, shown in Figure 5 (and in Figure 3 as the green-outlined box), transmits the debugging signals to the report compiler, where all tapped signals are lined up and sent to the character generation circuit.
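
The "lining up" performed by the report compiler can be pictured as formatting each clock cycle's tapped values into one row of hexadecimal text. The Python sketch below is a software illustration only; the column order in `REPORT_ORDER`, the 48-row screen, and the names `format_row` and `TextScreen` are assumptions, and the sample values are made up.

```python
REPORT_ORDER = ("pc", "instr", "alua", "alub", "aluout")  # hypothetical column order


def format_row(signals, order=REPORT_ORDER) -> str:
    """Render one clock cycle's signals as 8-digit hex words separated by spaces."""
    return " ".join(f"{signals[name] & 0xFFFFFFFF:08x}" for name in order)


class TextScreen:
    """Keeps the most recent `rows` report lines, one line per clock cycle."""

    def __init__(self, rows: int = 48):
        self.rows = rows
        self.lines = []

    def clock(self, signals) -> None:
        self.lines.append(format_row(signals))
        self.lines = self.lines[-self.rows:]     # drop the oldest rows when full


screen = TextScreen(rows=2)
screen.clock({"pc": 0x44, "instr": 0x00024280, "alua": 0, "alub": 7, "aluout": 0x1C00})
print(screen.lines[0])   # 00000044 00024280 00000000 00000007 00001c00
```

In hardware, the same idea amounts to slicing each tapped vector into 4-bit nibbles and writing the corresponding character codes into the character memory.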

[Figure 5 block labels: Crystal Oscillator; FPGA (PLL; VGA Sync; Microprocessor System; Report compilation; Character generator; Font ROM; Pixel Data); VGA display]
Figure 5. Display arrangement, including reporting module.

3.3. Test design
The next step in this work is to design a test that confirms the functionality of the overall setup and serves as a proof of concept for the proposed method. A microprocessor architecture expansion task is chosen as the test: new instructions are introduced to the instruction set. This requires modifying the micro-architecture of our microprocessor and testing the functionality of the new instructions.

As the base architecture we adopt [23], which in turn was inspired by earlier editions of [25]. Specifically, readers are referred to Chapter 6 (Architecture) and Chapter 7 (Microarchitecture) of [23].

The added instructions are shift left logical (sll), shift right logical (srl), and shift right arithmetic (sra). The three instructions are R-type instructions and have the same invocation form. For instance, the format for sll is:

    sll rd, rt, shamt

where rd is the destination register and rt is the source register. The four-byte value stored in rt is shifted left by shamt bits and then stored in rd. The same form holds for srl and sra.

The architecture for these three register-type instructions is shown in Figure 6(a). Following the convention, the six most significant bits (instr[31:26], the op field) are 0, indicating an R-type instruction. The six least significant bits (instr[5:0], the funct field) indicate which R-type instruction is operative; only the last two bits are needed to distinguish the three shifts. The shift amount is placed in the shamt field (instr[10:6]). The source and destination registers are in the rt and rd fields, respectively.
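
The field layout just described can be checked against the machine words listed in Figure 6(a) with a few lines of Python. This is a sketch for cross-checking encodings, not part of the hardware design; `encode_shift` and `decode_fields` are hypothetical helper names, and the funct codes are those shown in the figure.

```python
FUNCT = {"sll": 0b000000, "srl": 0b000010, "sra": 0b000011}


def encode_shift(name: str, rd: int, rt: int, shamt: int) -> int:
    """Assemble an R-type shift: op = 0, rs = 0, then rt, rd, shamt, funct."""
    op, rs = 0, 0
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | FUNCT[name]


def decode_fields(word: int) -> dict:
    """Split a 32-bit instruction word into its six R-type fields."""
    return {
        "op": (word >> 26) & 0x3F,
        "rs": (word >> 21) & 0x1F,
        "rt": (word >> 16) & 0x1F,
        "rd": (word >> 11) & 0x1F,
        "shamt": (word >> 6) & 0x1F,
        "funct": word & 0x3F,
    }


print(f"{encode_shift('sll', rd=8, rt=2, shamt=10):08x}")    # 00024280
print(f"{encode_shift('srl', rd=7, rt=7, shamt=6):08x}")     # 00073982
print(f"{encode_shift('sra', rd=11, rt=10, shamt=7):08x}")   # 000a59c3
```

The two low bits of `funct` (00, 10, 11) are sufficient to tell the three shifts apart, which is why only instr[1:0] needs to reach the shifter.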

Based on the above architecture, a number of modifications are implemented in the microarchitecture. Figure 6(b) shows part of the new microarchitecture design. The shifter module receives instr[10:6] as the shift amount and instr[1:0] as the shift mode selector (which of the three shift commands is operative). The 32-bit data to be shifted comes from the register file, and the shifter output goes to a multiplexer that chooses between two signals: the output of the original ALU or the output of the shifter.

    Field:             op      rs     rt     rd     shamt  funct
    Width:             6 bits  5 bits 5 bits 5 bits 5 bits 6 bits

    sll $8, $2, 10     0       0      2      8      10     0
                       000000  00000  00010  01000  01010  000000   (0x00024280)
    srl $7, $7, 6      0       0      7      7      6      2
                       000000  00000  00111  00111  00110  000010   (0x00073982)
    sra $11, $10, 7    0       0      10     11     7      3
                       000000  00000  01010  01011  00111  000011   (0x000a59c3)
    (a)

[Figure 6(b) block labels: Instr (bit ranges 31:26, 10:6, 5:3, 1:0); ALUControl; SrcA; SrcB; ALU; Shifter; multiplexer ALU_or_Shift (inputs 0 and 1); ALUout]
    (b)

Figure 6. New instructions for the microprocessor: sll, srl, and sra. (a) Bit arrangement within the 32-bit wide instruction, conforming to the MIPS ISA in [23]. (b) Hardware implementation in the micro-architecture (only the modified part is shown).
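
The behavior expected from the shifter and the ALU_or_Shift multiplexer in Figure 6(b) can be summarized by a small Python model, useful as a reference when reading values off the debug display. It is a sketch under assumptions: the names `shifter` and `alu_or_shift` are hypothetical, and the multiplexer select is taken as a plain input because its derivation is not spelled out in the text.

```python
MASK32 = 0xFFFFFFFF


def shifter(data: int, shamt: int, mode: int) -> int:
    """Shift a 32-bit word; `mode` is instr[1:0]: 00 = sll, 10 = srl, 11 = sra."""
    data &= MASK32
    if mode == 0b00:                       # shift left logical
        return (data << shamt) & MASK32
    if mode == 0b10:                       # shift right logical: zeros enter from the left
        return data >> shamt
    if mode == 0b11:                       # shift right arithmetic: sign bit is replicated
        signed = data - (1 << 32) if data & 0x80000000 else data
        return (signed >> shamt) & MASK32
    raise ValueError("mode 01 is not used by sll/srl/sra")


def alu_or_shift(alu_out: int, shift_out: int, select_shift: bool) -> int:
    """Two-input multiplexer choosing between the ALU result and the shifter result."""
    return shift_out if select_shift else alu_out


print(shifter(7, 10, 0b00))                         # 7168
print(shifter(7424, 6, 0b10))                       # 116
print(hex(shifter(-1280 & MASK32, 7, 0b11)))        # 0xfffffff6, i.e. -10
```

The three printed values correspond to the sll, srl, and sra steps of the test program in Figure 7.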

After modifying the microarchitecture, a set of test code is used to confirm its validity. The test code is a continuation of the test code in [23], page 437. It consists of computations involving all instructions, by which a specific state is targeted (for the original test, “value of 7 is stored in memory address 84”). A fault in the implementation of any instruction will prevent the target state from being reached; it is very unlikely that the expected result is produced under a faulty condition.

Figure 7 shows the overall instruction test, written in MIPS assembly language. The blue lines (hexadecimal addresses 0 to 40, and address 64) are the original test from [23], and the green ones (addresses 44 to 60) are the new code. At the end of the test, the computed value (which happens to be 126 = 0x7e) is stored at memory address 84 = 0x54.

    #        Assembly             Description                   Address  Machine
    # ---------------------------------------------------------------------------
    main:    addi $2, $0, 5       # initialize $2 = 5           0        20020005
             addi $3, $0, 12      # initialize $3 = 12          4        2003000c
             addi $7, $3, -9      # initialize $7 = 3           8        2067fff7
             or   $4, $7, $2      # $4 = (3 OR 5) = 7           c        00e22025
             and  $5, $3, $4      # $5 = (12 AND 7) = 4         10       00642824
             add  $5, $5, $4      # $5 = 4 + 7 = 11             14       00a42820
             beq  $5, $7, end     # shouldn't be taken          18       10a7000a
             slt  $4, $3, $4      # $4 = 12 < 7 = 0             1c       0064202a
             beq  $4, $0, around  # should be taken             20       10800001
             addi $5, $0, 0       # shouldn't happen            24       20050000
    around:  slt  $4, $7, $2      # $4 = 3 < 5 = 1              28       00e2202a
             add  $7, $4, $5      # $7 = 1 + 11 = 12            2c       00853820
             sub  $7, $7, $2      # $7 = 12 - 5 = 7             30       00e23822
             sw   $7, 68($3)      # [80] = 7                    34       ac670044
             lw   $2, 80($0)      # $2 = [80] = 7               38       8c020050
             j    sh_test         # should be taken             3c       08000011
             addi $2, $0, 1       # shouldn't happen            40       20020001
    sh_test: sll  $8, $2, 10      # $8 = 7*2^10 = 7168          44       00024280
             addi $7, $8, 256     # $7 = 7168+256 = 7424        48       21070100
             srl  $7, $7, 6       # $7 = 7424/2^6 = 116         4c       00073982
             addi $9, $0, 1280    # initialize $9 = 1280        50       20090500
             sub  $10, $0, $9     # $10 = -1280                 54       00095022
             sra  $11, $10, 7     # $11 = -1280 SRA 7 = -10     58       000a59c3
             sub  $12, $0, $11    # $12 = 0 - (-10) = 10        5c       000b6022
             add  $2, $7, $12     # $2 = 116 + 10 = 126         60       00ec1020
    end:     sw   $2, 84($0)      # write mem[84] = 126         64       ac020054

Figure 7. Testing the new microarchitecture. Blue lines (hexadecimal addresses 0 to 40, and address 64) are test code from [23]. Green lines (addresses 44 to 60) are inserted to test the new instructions implemented in the microarchitecture.
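
The expected final state of the program in Figure 7 can be cross-checked with a small software reference model. The Python sketch below interprets the assembly at the instruction level (it does not decode the machine words and is not the authors' testbench); the tuple layout, the `run` function, and the label table are assumptions introduced for illustration.

```python
def s32(v: int) -> int:
    """Wrap an integer to a signed 32-bit value."""
    v &= 0xFFFFFFFF
    return v - (1 << 32) if v & 0x80000000 else v


# Tuples: (op, a, b, c). ALU ops: (op, dest, src1, src2 or immediate or shamt);
# branches: (beq, rs, rt, label); memory: (sw/lw, rt, offset, base); jump: (j, label).
PROGRAM = [
    ("addi", 2, 0, 5), ("addi", 3, 0, 12), ("addi", 7, 3, -9),      # 0x00..0x08
    ("or", 4, 7, 2), ("and", 5, 3, 4), ("add", 5, 5, 4),            # 0x0c..0x14
    ("beq", 5, 7, "end"), ("slt", 4, 3, 4),                         # 0x18, 0x1c
    ("beq", 4, 0, "around"), ("addi", 5, 0, 0),                     # 0x20, 0x24
    ("slt", 4, 7, 2), ("add", 7, 4, 5), ("sub", 7, 7, 2),           # 0x28..0x30
    ("sw", 7, 68, 3), ("lw", 2, 80, 0),                             # 0x34, 0x38
    ("j", "sh_test", None, None), ("addi", 2, 0, 1),                # 0x3c, 0x40
    ("sll", 8, 2, 10), ("addi", 7, 8, 256), ("srl", 7, 7, 6),       # 0x44..0x4c
    ("addi", 9, 0, 1280), ("sub", 10, 0, 9), ("sra", 11, 10, 7),    # 0x50..0x58
    ("sub", 12, 0, 11), ("add", 2, 7, 12),                          # 0x5c, 0x60
    ("sw", 2, 84, 0),                                               # 0x64
]
LABELS = {"around": 0x28, "sh_test": 0x44, "end": 0x64}


def run(program, labels, max_steps=1000):
    """Execute the program; return (register list, memory dict)."""
    regs, mem, pc, steps = [0] * 32, {}, 0, 0
    while pc // 4 < len(program) and steps < max_steps:
        op, a, b, c = program[pc // 4]
        pc += 4
        steps += 1
        if op == "addi":  regs[a] = s32(regs[b] + c)
        elif op == "add": regs[a] = s32(regs[b] + regs[c])
        elif op == "sub": regs[a] = s32(regs[b] - regs[c])
        elif op == "and": regs[a] = regs[b] & regs[c]
        elif op == "or":  regs[a] = regs[b] | regs[c]
        elif op == "slt": regs[a] = int(regs[b] < regs[c])
        elif op == "sll": regs[a] = s32(regs[b] << c)
        elif op == "srl": regs[a] = s32((regs[b] & 0xFFFFFFFF) >> c)
        elif op == "sra": regs[a] = regs[b] >> c       # Python >> is arithmetic
        elif op == "beq": pc = labels[c] if regs[a] == regs[b] else pc
        elif op == "j":   pc = labels[a]
        elif op == "sw":  mem[regs[c] + b] = regs[a]
        elif op == "lw":  regs[a] = mem[regs[c] + b]
        regs[0] = 0                                    # $0 is hard-wired to zero
    return regs, mem


regs, mem = run(PROGRAM, LABELS)
print(mem[80], mem[84])   # 7 126
```

If any of the shift instructions is modeled incorrectly (for example, sra implemented as a logical shift), the value written to address 84 changes, which is the property the test relies on.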

4. RESULT AND ANALYSIS
Two scenarios are used to assess the proposed method. The first involves simulation of the microprocessor system using ModelSim. The second runs the microprocessor in the Altera FPGA chip. In both scenarios, the same microprocessor module is used (green box in Figure 3). In the first scenario, the ModelSim simulation, a SystemVerilog testbench file is built as a wrapper for the microprocessor. In the second scenario, the microprocessor is synthesized and the resulting netlist is programmed into the FPGA chip. In both scenarios, the test code in its machine form, as shown in the rightmost column of Figure 7, is embedded in the Instruction Memory. The instructions are executed sequentially, with jumps at specific moments.

Debugging in microprocessor design most of the time involves probing the values of internal signals such as the program counter, instruction, register address inputs, register data inputs, memory address, ALU inputs, and ALU output. These are the signals we display in both scenarios.

Figure 8 shows the result for the first scenario. It can be seen that a number of signal values are not resolved: instead of the signal values, 'xxxxxxxx' is shown. In the second scenario, where the on-chip debugging features are employed in the FPGA-based microprocessor chip, all signal values are reported correctly. The result is depicted in Figure 9.

The difference in the signal display (though the final result is the same, implying a correct model/design) might come from an overly tight timing specification, coarse timing resolution, multiply driven signals, different initial states, or simply a bug in the simulation software. The on-chip debugger, on the other hand, shows real data from the hardware (though still limited by the speed capability of the display system). While it is relatively straightforward (but not necessarily easy) to fix bugs on the simulation side, hardware reporting is precious and sometimes the only choice a designer has. The MIPS example shown here serves as a demonstration of this on-chip debugging technique.

Figure 8. ModelSim output. Unresolved signals are indicated.

Figure 9. Output of the on-chip debugging technique.
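
The kind of comparison made between Figure 8 and Figure 9 can also be done mechanically when both traces are available as text. The Python sketch below flags, per clock cycle, signals that are unresolved in the simulation trace or that differ from the hardware trace. It assumes each trace row is a mapping from signal name to an 8-digit hexadecimal string without a "0x" prefix; the function name `compare` and the sample rows are hypothetical and are not taken from the figures.

```python
def compare(sim_rows, hw_rows):
    """Return (cycle, signal, issue, hardware_value) for every disagreement."""
    report = []
    for cycle, (sim, hw) in enumerate(zip(sim_rows, hw_rows)):
        for name, sim_text in sim.items():
            hw_text = hw[name]
            if "x" in sim_text.lower():                    # unresolved in simulation
                report.append((cycle, name, "unresolved in simulation", hw_text))
            elif sim_text.lower() != hw_text.lower():      # resolved but different
                report.append((cycle, name, "mismatch", hw_text))
    return report


# Made-up example rows, for illustration only.
sim_trace = [{"pc": "00000000", "aluout": "xxxxxxxx"},
             {"pc": "00000004", "aluout": "0000000c"}]
hw_trace = [{"pc": "00000000", "aluout": "00000005"},
            {"pc": "00000004", "aluout": "0000000C"}]

for entry in compare(sim_trace, hw_trace):
    print(entry)   # (0, 'aluout', 'unresolved in simulation', '00000005')
```

An empty report would mean the simulation and the hardware agree on every tapped signal in every recorded cycle.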

5. CONCLUSION
This paper proposed a hardware-oriented approach to debugging a chip design in an FPGA by tapping signals from any level in the hierarchy up to the top level. These signals of interest were then displayed through the VGA port provided by the board. The tapping was done using the VHDL 'record' type as a bus with which signals are bundled and transported up the hierarchy. A microprocessor design challenge was used as the test case. The proposed method correctly displayed the internal signals. The approach naturally showed higher fidelity compared to simulation.

While software simulation of hardware designs is indispensable and will continue to get better, a hardware-level debugger and reporting module is invaluable and sometimes the only option. This is even more true when the chip is already in the deployment stage. We hope that this paper will inspire other researchers and students alike to employ the same technique in their designs.

ACKNOWLEDGMENT
This research is partly funded by Universitas Muhammadiyah Surakarta's Doctoral Research Grant. The author would like to thank his students in the Programmable Logic Design and Computer Architecture classes.
REFERENCES
[1] C. Kellett, “A project-based learning approach to programmable logic design and computer architecture,” IEEE Transactions on Education, vol. 55, no. 3, pp. 378–383, 2012.
[2] J. H. Lee, S. E. Lee, H.-C. Yu, and T. Suh, “Pipelined CPU design with FPGA in teaching computer architecture,” IEEE Transactions on Education, vol. 55, no. 3, pp. 341–348, 2012.
[3] W. Richard, D. Taylor, and D. Zar, “A capstone computer engineering design course,” IEEE Transactions on Education, vol. 42, no. 4, pp. 288–294, 1999.
[4] F. Suryawan, “A project-based approach to FPGA-aided teaching of digital systems,” in 4th International Conference on Electrical Engineering, Computer Science and Informatics (EECSI). IEEE, 2017, pp. 590–595.
[5] A. Fidalgo, M. G. Gericota, G. R. Alves, and J. M. Ferreira, “Real-time fault injection using enhanced on-chip debug infrastructures,” Microprocessors and Microsystems, vol. 35, no. 4, pp. 441–452, 2011.
[6] M. Portela-Garcia, C. Lopez-Ongil, M. García-Valderas, and L. Entrena, “Fault injection in modern microprocessors using on-chip debugging infrastructures,” IEEE Transactions on Dependable and Secure Computing, vol. 8, no. 2, pp. 308–314, 2011.
[7] K. D. Maier, “On-chip debug support for embedded systems-on-chip,” in Proceedings of the 2003 International Symposium on Circuits and Systems (ISCAS'03), vol. 5. IEEE, 2003, pp. V–V.
[8] H. Park, J. Xu, J. Park, J.-H. Ji, and G. Woo, “Design of on-chip debug system for embedded processor,” in International SoC Design Conference (ISOCC'08), vol. 3. IEEE, 2008, pp. III-11–III-12.
[9] P. Fezzardi, M. Lattuada, and F. Ferrandi, “Using efficient path profiling to optimize memory consumption of on-chip debugging for high-level synthesis,” ACM Transactions on Embedded Computing Systems (TECS), vol. 16, no. 5s, p. 149, 2017.
[10] A.-S. Jamal, J. Goeders, and S. J. E. Wilton, “An FPGA overlay architecture supporting rapid implementation of functional changes during on-chip debug,” in 2018 28th International Conference on Field Programmable Logic and Applications (FPL). IEEE, 2018.
[11] A.-S. Jamal, “An FPGA overlay architecture supporting software-like compile times during on-chip debug of high-level synthesis designs,” Ph.D. dissertation, University of British Columbia, 2018.
[12] P. Mishra and F. Farahmandi, Post-Silicon Validation and Debug. Cham, Switzerland: Springer, 2019.
[13] H. Oh, T. Han, I. Choi, and S. Kang, “An on-chip error detection method to reduce the post-silicon debug time,” IEEE Transactions on Computers, vol. 66, no. 1, pp. 38–44, Jan. 2017.
[14] H. Oh, I. Choi, and S. Kang, “DRAM-based error detection method to reduce the post-silicon debug time for multiple identical cores,” IEEE Transactions on Computers, vol. 66, no. 9, pp. 1504–1517, Sep. 2017.
[15] Y. Cao, H. Palombo, S. Ray, and H. Zheng, “Enhancing observability for post-silicon debug with on-chip communication monitors,” in 2018 IEEE Computer Society Annual Symposium on VLSI (ISVLSI), July 2018, pp. 602–607.
[16] S. Ray, “SoC instrumentations: Pre-silicon preparation for post-silicon readiness,” in Post-Silicon Validation and Debug. Springer, 2019, pp. 19–32.
[17] R. Abdel-Khalek and V. Bertacco, “Post-silicon platform for the functional diagnosis and debug of networks-on-chip,” ACM Transactions on Embedded Computing Systems (TECS), vol. 13, no. 3s, p. 112, 2014.
[18] M. Abramovici, “In-system silicon validation and debug,” IEEE Design & Test of Computers, vol. 25, no. 3, pp. 216–223, May 2008.
[19] D. Holanda Noronha, R. Zhao, J. Goeders, W. Luk, and S. J. E. Wilton, “On-chip FPGA debug instrumentation for machine learning applications,” in Proceedings of the 2019 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays. ACM, 2019, pp. 110–115.
[20] K. Rahmani and P. Mishra, “Feature-based signal selection for post-silicon debug using machine learning,” IEEE Transactions on Emerging Topics in Computing, pp. 1–1, 2017.
[21] D. A. Patterson and J. L. Hennessy, Computer Organization and Design: the Hardware/Software Interface, 5th ed. Morgan Kaufmann, 2014.
[22] J. L. Hennessy and D. A. Patterson, Computer Architecture: a Quantitative Approach, 5th ed. Morgan Kaufmann, 2012.
[23] D. M. Harris and S. L. Harris, Digital Design and Computer Architecture, 2nd ed. Morgan Kaufmann, 2013.
[24] P. P. Chu, FPGA Prototyping by VHDL Examples. Wiley, 2008.
[25] D. A. Patterson and J. L. Hennessy, Computer Organization and Design: the Hardware/Software Interface, 3rd ed. Morgan Kaufmann, 2005.