TELKOMNIKA Telecommunication, Computing, Electronics and Control
Vol. 18, No. 3, June 2020, pp. 1229–1236
ISSN: 1693-6930, accredited First Grade by Kemenristekdikti, No: 21/E/KPT/2018
DOI: 10.12928/TELKOMNIKA.v18i3.13174

On-chip debugging for microprocessor design

Fajar Suryawan, Bana Handaga, Abdul Basith
Department of Electrical Engineering, Universitas Muhammadiyah Surakarta, Indonesia

Article Info
Article history: Received May 20, 2019; Revised Jan 22, 2020; Accepted Feb 21, 2020
Keywords: Engineering education; Field programmable gate array; Microprocessor design; Post-silicon debug; Programmable logic

ABSTRACT
This article proposes a closer-to-metal approach to RTL inspection in microprocessor design, for use in education, engineering, and research. Signals of interest are tapped throughout the hierarchical microprocessor design, routed up to the top-level entity, and finally displayed on a VGA monitor. The input clock can be made as slow as one wishes in order to trace or debug the microprocessor being designed. An FPGA development board, along with its accompanying software package, is used as the design and test platform. The VHDL constructs 'type' and 'record' are the key ingredients of the overall design, since they allow simple, clean, and tractable code across the hierarchy. The method is tested on a MIPS single-cycle microprocessor blueprint. The result shows that the technique produces a more consistent display of the true contents of registers, ALU input/output signals, and other wires than the standard, widely used simulation method. This approach is expected to increase the confidence of students and designers, since the reported signal values are the true hardware values. Its use is not limited to the development of microprocessors; every FPGA-based digital design can benefit from it.

This is an open access article under the CC BY-SA license.
Corresponding Author:
Fajar Suryawan, Department of Electrical Engineering, Universitas Muhammadiyah Surakarta, Jl. A. Yani, Tromol Pos 1, Sukoharjo, Jawa Tengah, Indonesia.
Email: Fajar.Suryawan@ums.ac.id

Journal homepage: http://journal.uad.ac.id/index.php/TELKOMNIKA

1. INTRODUCTION
Digital device development depends greatly on a precise understanding of how data propagates between basic digital logic units, a level of description known as the register transfer level. In the design phase, designers often use simulation to check whether their designs meet the logic requirements. An example of this is encountered in senior-level electrical/computer engineering bachelor-degree courses such as Programmable Logic Design or Computer Architecture [1–4]. In such courses, students are asked to design a micro-architecture of a microprocessor based on a given architecture (that is, the assembly language requirement). Students then write HDL code representing the micro-architecture and test the design against a set of instructions. Testing is generally done in simulation and, after a number of testing-coding iterations, a hardware test is performed.

Software simulation is indispensable for its quick setup, fast compilation, and (provided the designer is experienced) accuracy. However, mismatches between simulation and synthesized hardware are not unheard of, even for simple designs. Mismatches also occur between pre-synthesis and post-synthesis simulations. To make matters worse, in post-synthesis (netlist) simulation one can generally monitor only the top-level ports; signals deeper in the hierarchy are inaccessible. To address this problem, we propose a closer-to-metal approach to register transfer level inspection. Effectively, this is an on-chip debugging technique where signals of interest are brought up to the top level for output reading.

The term on-chip debug generally refers to a technique in microprocessor (or other digital device) design where a designer can inject a fault in the device under development to test its fault-tolerant behavior [5–8]. For microprocessors, this is usually done using the JTAG protocol based on the NEXUS Consortium standard. In [9], on-chip debug is used in high-level synthesis for FPGAs. On-chip debug has been a concern since the beginning of the computer era, and FPGAs have also played a part in this field. For example, the work by Jamal et al. [10, 11] proposes faster implementation of functional changes during on-chip debug, utilizing an FPGA overlay architecture. Contemporary works in this field, particularly on post-silicon debug, can be found in [12–15]. Indeed, post-silicon debug readiness needs to be prepared early in the design [16–18]. A number of authors extend the idea to other areas such as machine learning [19, 20].

In this article we use on-chip debug in a more literal sense: the process of debugging a microprocessor in which the debug capability is embedded in the hardware design. Our contribution lies in the following aspects. First, we propose a simple hardware-oriented debugging method for use in any digital design. We hope this introductory notion will trigger students' creative faculty to solve challenging problems that are otherwise difficult to tackle. Second, we describe in a tutorial way the construction of a simple on-chip debugging feature in the design of a microprocessor using 'type' and 'record' in VHDL. We believe this will help students and designers easily duplicate our work.

2. RESEARCH METHOD
This research starts with a list of design requirements for an on-chip debugging feature in a microprocessor:
(a) non-intrusiveness: the debugging feature should be as discreet as possible so as not to obstruct the main design;
(b) meaningful message: the interface to the human reader should be immediately readable;
(c) easy to modify: when the designer wants to tap other signals in the microprocessor design, it should be straightforward to do so.
The main idea of this chip debugging feature is shown in Figure 1.

Figure 1. On-chip debugging principles in this paper. (Blocks in the diagram: clock; microprocessor containing the control unit, the datapath with ALU and registers, and memory; reporting module; VGA display.)

The next step is building a model microprocessor from scratch using VHDL in an FPGA chip (an Altera DE2-115 development board was used). Here we use a scaled-down version of the MIPS microprocessor architecture [21, 22] as our proof of concept. The MIPS architecture has the advantages of, among others, being simple and consistent for students to follow. MIPS is a RISC architecture that originated at Stanford University. We developed the chip-debugging feature based on a MIPS implementation described by Harris and Harris [23]. In this stage a VGA display module was also built, consisting of a VGA sync module, two character memory modules, and a font ROM. The experimental hardware setup is shown in Figure 2.

We extend the single-cycle MIPS instruction set found in the main text of [23] by constructing a number of new instructions and 'rewiring' the datapath as needed. We then verify, using the proposed on-chip debugging technique, that the final result works as expected. In the next section we discuss the engineering design in more detail.
Figure 2. Hardware setup. An Altera DE2-115 development board is used to test the proposed technique, together with a 1024x768 resolution monitor.

3. PROTOTYPE DESIGN
The design requires several aspects to work seamlessly together. These are: processor design with its signal inspection, information display, and experiment design.

3.1. Processor design and signal inspection
As mentioned before, the approach works by sensing internal microprocessor signals (including memory access signals) and sending them up through the design hierarchy. Using a hierarchical design implies that many entities and files are involved, which poses a new challenge: how to tap signals from different entities in a straightforward and unobtrusive way. The signal tapping shown in Figure 1 is implemented using a shared bus that is available across the hierarchy. Figure 3 shows the organization of modules that make up the entire microprocessor.

Figure 3. Module organization and hierarchy. The small red blocks are instances of the record type that encapsulates the tapped signals' information. The red 'cable' then acts as a debug bus across the hierarchy. (Blocks in the diagram: a test suite in the FPGA containing the microprocessor system, which consists of the MIPS processor with main decoder, ALU decoder, and datapath (ALU, PC, sign extend, shift, and 32-bit × 32 registers), the instruction memory, the data memory, and the clock; plus the reporting module and display system driving the VGA display.)

The arithmetic logic unit (ALU) performs the arithmetic computation with the help of a set of registers (32 bits × 32) that functions as a scratchpad for the ALU. The program counter (PC) acts as a pointer to the current instruction. The sign-extend module extends numbers narrower than 32 bits (for example, in immediate-type instructions) to their 32-bit representation.
The shift module functions as a bit-wise shifter. All these are in the "datapath" module, which also hosts a number of multiplexers that control which way data flows. Control is done by the "controller" module outside the datapath, which decodes 32-bit instructions from the instruction memory. Datapath and controller form the "MIPS processor" module. Together with the Instruction Memory and Data Memory modules, they make up the complete microprocessor system.

The signal tapping, shown as red blocks in Figure 3, is a record-type signal instantiated at every module of interest. Acting as a "debugging bus", this record is ready to accept the value of any signal of interest at every level in the hierarchy. Since the bus is logically encapsulated, it does not obstruct the main design. Modifying the bus content is straightforward and can be done once in the definition, without the need to change any code in the instantiation part.

    -- File name: myfuns.vhd
    -- ... Other statements ...

    -- 'Regsbundler' below is for collecting
    -- contents of registers.
    -- There are 32 registers, each 32 bits wide.
    type regsbundler is record
      R00,R01,R02,R03,R04,R05,R06,R07,
      R08,R09,R10,R11,R12,R13,R14,R15,
      R16,R17,R18,R19,R20,R21,R22,R23,
      R24,R25,R26,R27,R28,R29,R30,R31 : std_logic_vector(31 downto 0);
    end record;

    -- 'Bundler' below is for collecting
    -- signals in datapath.
    -- 'Regsbundler' above is also included.
    type bundler is record
      pc          : std_logic_vector(31 downto 0);
      instr       : std_logic_vector(31 downto 0);
      RA1,RA2,WA3 : std_logic_vector(4 downto 0);
      alua,alub   : std_logic_vector(31 downto 0);
      aluout      : std_logic_vector(31 downto 0);
      RD1,RD2,WD3 : std_logic_vector(31 downto 0);
      regs        : regsbundler;  -- registers' contents
    end record;
    -- ...

    -- File name: sc_datapath.vhd
    -- ... Other library declarations ...
    use work.myfuns.all;

    entity sc_datapath is
      port (
        -- Other port definitions
        -- ...
        Reporter : out bundler
      );
    end sc_datapath;

    architecture struct of sc_datapath is
      -- ... Other declarations ...
    begin
      -- ... Other statements ...
      ----- Reporter collects ------------
      Reporter.pc     <= pc;
      Reporter.instr  <= instr;
      Reporter.RA1    <= instr(25 downto 21);
      Reporter.RA2    <= instr(20 downto 16);
      Reporter.WA3    <= writereg;
      Reporter.RD1    <= srca;
      Reporter.RD2    <= writedata;
      Reporter.WD3    <= result;
      Reporter.alua   <= srca;
      Reporter.alub   <= srcb;
      Reporter.aluout <= aluout;
      ------------------------------------
      -- ...

Figure 4. 'Record' type for the construction of the debugging bus. First listing: definition. Second listing: example instantiation and usage in the datapath module.

3.2. Information display
To show the debugging steps, the values of the signals of interest are displayed on a VGA monitor, one row per clock cycle. The Altera DE2-115 board is equipped with a VGA port, but users must program the VGA synchronization and character generation themselves. Here we adopt the method described in [24]; the arrangement is depicted in Figure 5. In this layout, a phase-locked loop (PLL) is used to step up the clock frequency. It feeds the VGA sync module, which produces the horizontal and vertical synchronization signals used by the VGA display. The VGA sync module also generates the x and y position of the current pixel. This information is used by the character generation and font ROM modules to render the appropriate character at a given time. Interested readers are referred to [24] for further technical details regarding the display arrangement.

The microprocessor system, shown in Figure 5 and as the green-outlined box in Figure 3, transmits the debugging signals to the report compiler, where all tapped signals are lined up and sent to the character generation circuit.
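The paper does not show how the report compiler turns a tapped value into displayable characters. One plausible building block is sketched below, under the assumption that each 32-bit signal is printed as eight hexadecimal digits and that the font ROM is addressed by 7-bit ASCII codes. The package name `hexfmt`, the types `ascii_t` and `ascii8_t`, and both function names are hypothetical and do not come from the original design.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical helper package (not from the paper): converts tapped values
-- into ASCII codes that a character generator could look up in a font ROM.
package hexfmt is
  subtype ascii_t is std_logic_vector(6 downto 0);
  type ascii8_t is array (0 to 7) of ascii_t;  -- 8 hex digits, leftmost first

  function nibble_to_ascii(n : std_logic_vector(3 downto 0)) return ascii_t;
  function word_to_ascii(w : std_logic_vector(31 downto 0)) return ascii8_t;
end package hexfmt;

package body hexfmt is

  -- Map one 4-bit value to the ASCII code of its hex digit.
  function nibble_to_ascii(n : std_logic_vector(3 downto 0)) return ascii_t is
    variable v : natural range 0 to 15;
  begin
    v := to_integer(unsigned(n));
    if v < 10 then
      return std_logic_vector(to_unsigned(48 + v, 7));       -- '0'..'9'
    else
      return std_logic_vector(to_unsigned(97 + v - 10, 7));  -- 'a'..'f'
    end if;
  end function;

  -- Split a 32-bit word into eight hex digits.
  function word_to_ascii(w : std_logic_vector(31 downto 0)) return ascii8_t is
    variable r : ascii8_t;
  begin
    for i in 0 to 7 loop
      -- digit 0 is the most significant nibble, w(31 downto 28)
      r(i) := nibble_to_ascii(w(31 - 4*i downto 28 - 4*i));
    end loop;
    return r;
  end function;

end package body hexfmt;
```

With such a package, a reporting module could write, for example, `pc_chars <= word_to_ascii(Reporter.pc);` and place the eight resulting codes in the character memory for the current row. Keeping the conversion in a function means that adding a field to the record only requires one more call, which is in the spirit of the "easy to modify" requirement.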
Figure 5. Display arrangement, including the reporting module. (Blocks in the diagram: crystal oscillator; inside the FPGA, the PLL, VGA sync, font ROM, pixel data, microprocessor system, report compilation, and character generator; VGA display.)

3.3. Test design
The next step in this work is to design a test that confirms the functionality of the overall setup and serves as a proof of concept for the proposed method. A microprocessor architecture expansion task is chosen as the test. That is, new instructions are to be added to the instruction set. This requires modifying the micro-architecture of our microprocessor and testing the functionality of the new instructions. As the base architecture we adopt [23], which in turn was inspired by earlier editions of [25]. Specifically, the reader is referred to Chapter 6 (Architecture) and Chapter 7 (Microarchitecture) of [23].

The added instructions are shift left logical (sll), shift right logical (srl), and shift right arithmetic (sra). The three instructions are R-type instructions and have the same invocation form. For instance, the format for sll is:

    sll rd, rt, shamt

where rd is the destination register and rt is the source register. The four-byte value stored in rt is shifted left by shamt bits and then stored in rd. A similar form holds for srl and sra.

The architecture for these three register-type instructions is shown in Figure 6(a). Following the convention, the six most significant bits (instr[31:26], the op field) are 0, indicating an R-type instruction. The six least significant bits (instr[5:0], the funct field) indicate which R-type instruction is operative; only the last two bits are needed to distinguish the three shifts. The shift amount is placed in the shamt field (instr[10:6]).
The source and destination registers are in the rt and rd fields, respectively.

Based on the above architecture, a number of modifications are implemented in the microarchitecture. Figure 6(b) shows part of the new microarchitecture design. The shifter module receives instr[10:6] as the shift amount and instr[1:0] as the shift mode selector (which of the three shift commands is operative). The 32-bit operand comes from the register file, and the shifter output goes to a multiplexer that selects between two signals: the output of the original ALU or the output of the shifter.

(a)
    Field             op      rs      rt      rd      shamt   funct
    Width             6 bits  5 bits  5 bits  5 bits  5 bits  6 bits

    sll $8, $2, 10    0       0       2       8       10      0
                      000000  00000   00010   01000   01010   000000   (0x00024280)
    srl $7, $7, 6     0       0       7       7       6       2
                      000000  00000   00111   00111   00110   000010   (0x00073982)
    sra $11, $10, 7   0       0       10      11      7       3
                      000000  00000   01010   01011   00111   000011   (0x000a59c3)

(b) Datapath fragment. Labels in the diagram: SrcA, SrcB, ALU, ALUControl, ALUout, Shifter, Instr[10:6], Instr[1:0], Instr[31:26], Instr[5:3], ALU_or_Shift, and a two-input (0/1) multiplexer.

Figure 6. New instructions for the microprocessor: sll, srl, and sra. (a) Bit arrangement within the 32-bit wide instruction, conforming to the MIPS ISA in [23]. (b) Hardware implementation in the micro-architecture (only the modified part is shown).
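The shifter described above can be sketched in VHDL as follows. This is a minimal illustration and not the authors' code: the entity name `shifter` and the port names are assumptions, and the mode encoding is simply the two least significant bits of the funct values listed in Figure 6(a) ("00" for sll, "10" for srl, "11" for sra).

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical shifter for sll/srl/sra (names are illustrative only).
entity shifter is
  port (
    a     : in  std_logic_vector(31 downto 0);  -- operand (contents of rt)
    shamt : in  std_logic_vector(4 downto 0);   -- instr(10 downto 6)
    mode  : in  std_logic_vector(1 downto 0);   -- instr(1 downto 0)
    y     : out std_logic_vector(31 downto 0)   -- shifted result
  );
end entity shifter;

architecture behave of shifter is
begin
  process (a, shamt, mode)
    variable n : natural range 0 to 31;
  begin
    n := to_integer(unsigned(shamt));
    case mode is
      -- sll: zeros enter from the right
      when "00"   => y <= std_logic_vector(shift_left(unsigned(a), n));
      -- srl: zeros enter from the left
      when "10"   => y <= std_logic_vector(shift_right(unsigned(a), n));
      -- sra: the sign bit is replicated from the left
      when "11"   => y <= std_logic_vector(shift_right(signed(a), n));
      -- "01" is not used by the three shift instructions
      when others => y <= (others => '0');
    end case;
  end process;
end architecture behave;
```

The only difference between srl and sra is the type the operand is cast to: `shift_right` from `ieee.numeric_std` fills with zeros for `unsigned` and with the sign bit for `signed`. The multiplexer that chooses between this output and the ALU output is left out of the sketch, since its control logic is not spelled out in the text.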
After the microarchitecture is modified, a test program is used to confirm its validity. The test code is a continuation of the test code in [23], page 437. It consists of computations involving all instructions, arranged so that a specific final state is targeted (in the original test, "value of 7 is stored in memory address 84"). A fault in the implementation of any instruction will prevent the target state from being reached; it is very unlikely that a faulty design produces the expected result. Figure 7 shows the overall instruction test, written in MIPS assembly language. The blue lines (addresses 0 to 40 and address 64, in hexadecimal) are the original test from [23], and the green ones (addresses 44 to 60) are the new code. At the end of the test, the computed value (which happens to be 126 = 0x7e) is stored at memory address 84 = 0x54.

    #         Assembly             Description                 Address  Machine
    # ---------------------------------------------------------------------------
    main:     addi $2, $0, 5       # initialize $2 = 5             0    20020005
              addi $3, $0, 12      # initialize $3 = 12            4    2003000c
              addi $7, $3, -9      # initialize $7 = 3             8    2067fff7
              or   $4, $7, $2      # $4 = (3 OR 5) = 7             c    00e22025
              and  $5, $3, $4      # $5 = (12 AND 7) = 4           10   00642824
              add  $5, $5, $4      # $5 = 4 + 7 = 11               14   00a42820
              beq  $5, $7, end     # shouldn't be taken            18   10a7000a
              slt  $4, $3, $4      # $4 = 12 < 7 = 0               1c   0064202a
              beq  $4, $0, around  # should be taken               20   10800001
              addi $5, $0, 0       # shouldn't happen              24   20050000
    around:   slt  $4, $7, $2      # $4 = 3 < 5 = 1                28   00e2202a
              add  $7, $4, $5      # $7 = 1 + 11 = 12              2c   00853820
              sub  $7, $7, $2      # $7 = 12 - 5 = 7               30   00e23822
              sw   $7, 68($3)      # [80] = 7                      34   ac670044
              lw   $2, 80($0)      # $2 = [80] = 7                 38   8c020050
              j    sh_test         # should be taken               3c   08000011
              addi $2, $0, 1       # shouldn't happen              40   20020001
    sh_test:  sll  $8, $2, 10      # $8 = 7 * 2^10 = 7168          44   00024280
              addi $7, $8, 256     # $7 = 7168 + 256 = 7424        48   21070100
              srl  $7, $7, 6       # $7 = 7424 / 2^6 = 116         4c   00073982
              addi $9, $0, 1280    # initialize $9 = 1280          50   20090500
              sub  $10, $0, $9     # $10 = -1280                   54   00095022
              sra  $11, $10, 7     # $11 = -1280 SRA 7 = -10       58   000a59c3
              sub  $12, $0, $11    # $12 = 0 - (-10) = 10          5c   000b6022
              add  $2, $7, $12     # $2 = 116 + 10 = 126           60   00ec1020
    end:      sw   $2, 84($0)      # write mem[84] = 126           64   ac020054

Figure 7. Testing the new microarchitecture. Blue lines (addresses 0 to 40 and address 64) are test code from [23]. Green lines (addresses 44 to 60) are inserted to test the new instructions implemented in the microarchitecture.

4. RESULT AND ANALYSIS
Two scenarios are used to assess the proposed method. The first involves simulation of the microprocessor system using ModelSim. The second runs the microprocessor in the Altera FPGA chip. In both scenarios, the same microprocessor module is used (green box in Figure 3). In the first scenario (the ModelSim simulation), a SystemVerilog testbench file is built as a wrapper for the microprocessor. In the second scenario, the microprocessor is synthesized and the resulting design is programmed into the FPGA chip.

In both scenarios, the test code in its machine form, as shown in the rightmost column of Figure 7, is embedded in the Instruction Memory. The instructions are executed sequentially, with jumps at specific moments.
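The text does not specify how the machine code is embedded in the Instruction Memory. One simple possibility, shown here purely as a sketch, is a constant array holding the words from the rightmost column of Figure 7. The entity name `imem_rom`, the 64-word depth, and the port names are assumptions; the words themselves are copied from Figure 7.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical instruction ROM preloaded with the test program of Figure 7.
entity imem_rom is
  port (
    a  : in  std_logic_vector(5 downto 0);   -- word address, e.g. pc(7 downto 2)
    rd : out std_logic_vector(31 downto 0)   -- instruction at that address
  );
end entity imem_rom;

architecture rtl of imem_rom is
  type rom_t is array (0 to 63) of std_logic_vector(31 downto 0);
  -- Index = byte address / 4; comments give the byte address in hex.
  constant ROM : rom_t := (
     0 => x"20020005",  1 => x"2003000c",  2 => x"2067fff7",  -- 00, 04, 08
     3 => x"00e22025",  4 => x"00642824",  5 => x"00a42820",  -- 0c, 10, 14
     6 => x"10a7000a",  7 => x"0064202a",  8 => x"10800001",  -- 18, 1c, 20
     9 => x"20050000", 10 => x"00e2202a", 11 => x"00853820",  -- 24, 28, 2c
    12 => x"00e23822", 13 => x"ac670044", 14 => x"8c020050",  -- 30, 34, 38
    15 => x"08000011", 16 => x"20020001",                     -- 3c, 40
    17 => x"00024280", 18 => x"21070100", 19 => x"00073982",  -- 44, 48, 4c (new)
    20 => x"20090500", 21 => x"00095022", 22 => x"000a59c3",  -- 50, 54, 58 (new)
    23 => x"000b6022", 24 => x"00ec1020",                     -- 5c, 60     (new)
    25 => x"ac020054",                                        -- 64
    others => x"00000000"                                     -- unused words
  );
begin
  -- Asynchronous read, as needed by a single-cycle processor.
  rd <= ROM(to_integer(unsigned(a)));
end architecture rtl;
```

A constant array keeps the program inside the synthesized design, so the same source can be used unchanged in simulation and on the board. Loading the words from a text file at elaboration time is a common alternative when the program changes often.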
Debugging in microprocessor design most often involves probing the values of internal signals such as the program counter, instruction, register address inputs, register data inputs, memory address, ALU inputs, and ALU output. These are the signals we display in both scenarios.

Figure 8 shows the result for the first scenario. It can be seen that a number of signal values are not resolved: instead of signal values, 'xxxxxxxx' is shown. In the second scenario, where the on-chip debugging feature is employed in the FPGA-based microprocessor chip, all signal values are displayed correctly. The result is depicted in Figure 9.

The difference in the signal display (even though the final result is the same, implying a correct model/design) might come from an overly tight timing specification, coarse timing resolution, multiply driven signals, different initial states, or simply a bug in the simulation software. The on-chip debugger, on the other hand, shows real data from the hardware (though still limited by the speed capability of the display system). While it is relatively straightforward (but not necessarily easy) to fix bugs on the simulation side, hardware reporting is precious and sometimes the only choice a designer has. The MIPS example shown here serves as a demonstration of this on-chip debugging technique.
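Of the possible causes listed above, differing initial states is the easiest to illustrate. In simulation, a `std_logic` signal without an explicit initial value starts as 'U', which propagates as unknown until it is first written, whereas FPGA storage elements power up in a defined state. The sketch below is a generic three-port register file, not the authors' code, and it gives the storage array an explicit initial value so that a simulator starts from zeros. The entity name `regfile_init` and the port names are assumptions. Whether this removes the 'xxxxxxxx' entries seen in Figure 8 depends on their actual cause, which the paper leaves open.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical register file with an explicit initial value (illustration only).
entity regfile_init is
  port (
    clk           : in  std_logic;
    we3           : in  std_logic;                      -- write enable
    ra1, ra2, wa3 : in  std_logic_vector(4 downto 0);   -- read/write addresses
    wd3           : in  std_logic_vector(31 downto 0);  -- write data
    rd1, rd2      : out std_logic_vector(31 downto 0)   -- read data
  );
end entity regfile_init;

architecture rtl of regfile_init is
  type ram_t is array (0 to 31) of std_logic_vector(31 downto 0);
  -- Without ":= ...", every bit would start as 'U' in simulation.
  signal mem : ram_t := (others => (others => '0'));
begin
  -- Synchronous write port.
  process (clk)
  begin
    if rising_edge(clk) then
      if we3 = '1' then
        mem(to_integer(unsigned(wa3))) <= wd3;
      end if;
    end if;
  end process;

  -- Asynchronous read ports; register 0 always reads as zero (MIPS convention).
  rd1 <= (others => '0') when unsigned(ra1) = 0
         else mem(to_integer(unsigned(ra1)));
  rd2 <= (others => '0') when unsigned(ra2) = 0
         else mem(to_integer(unsigned(ra2)));
end architecture rtl;
```

Support for signal initial values in synthesis is tool-dependent, so an explicit reset remains the more portable way to obtain the same start state in simulation and hardware.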
Figure 8. ModelSim output. Unresolved signals are indicated.

Figure 9. Output of the on-chip debugging technique.

5. CONCLUSION
This paper proposed a hardware-oriented approach to debugging a chip design in an FPGA, by tapping signals from any level in the hierarchy up to the top level. These signals of interest were then displayed through the VGA port provided by the board. The tapping was done using the VHDL 'record' type as a bus with which signals are bundled and transported up the hierarchy.

A microprocessor design challenge was used as the test case. The proposed method correctly displayed the internal signals, and in this test it showed higher fidelity than simulation. While software simulation of hardware design is indispensable and will continue to improve, a hardware-level debugger and reporting module is invaluable and sometimes the only option. This is even more true when the chip is already in the deployment stage. We hope that this paper will inspire other researchers and students alike to employ the same technique in their designs.

ACKNOWLEDGMENT
This research is partly funded by Universitas Muhammadiyah Surakarta's Doctoral Research Grant. The author would like to thank his students in the Programmable Logic Design and Computer Architecture classes.
REFERENCES
[1] C. Kellett, "A project-based learning approach to programmable logic design and computer architecture," Education, IEEE Transactions on, vol. 55, no. 3, pp. 378–383, 2012.
[2] J. H. Lee, S. E. Lee, H.-C. Yu, and T. Suh, "Pipelined CPU design with FPGA in teaching computer architecture," Education, IEEE Transactions on, vol. 55, no. 3, pp. 341–348, 2012.
[3] W. Richard, D. Taylor, and D. Zar, "A capstone computer engineering design course," Education, IEEE Transactions on, vol. 42, no. 4, pp. 288–294, 1999.
[4] F. Suryawan, "A project-based approach to FPGA-aided teaching of digital systems," in 4th International Conference on Electrical Engineering, Computer Science and Informatics (EECSI). IEEE, 2017, pp. 590–595.
[5] A. Fidalgo, M. G. Gericota, G. R. Alves, and J. M. Ferreira, "Real-time fault injection using enhanced on-chip debug infrastructures," Microprocessors and Microsystems, vol. 35, no. 4, pp. 441–452, 2011.
[6] M. Portela-Garcia, C. Lopez-Ongil, M. García-Valderas, and L. Entrena, "Fault injection in modern microprocessors using on-chip debugging infrastructures," IEEE Transactions on Dependable and Secure Computing, vol. 8, no. 2, pp. 308–314, 2011.
[7] K. D. Maier, "On-chip debug support for embedded systems-on-chip," in Circuits and Systems, 2003. ISCAS'03. Proceedings of the 2003 International Symposium on, vol. 5. IEEE, 2003, pp. V–V.
[8] H. Park, J. Xu, J. Park, J.-H. Ji, and G. Woo, "Design of on-chip debug system for embedded processor," in SoC Design Conference, 2008. ISOCC'08. International, vol. 3. IEEE, 2008, pp. III-11–III-12.
[9] P. Fezzardi, M. Lattuada, and F. Ferrandi, "Using efficient path profiling to optimize memory consumption of on-chip debugging for high-level synthesis," ACM Transactions on Embedded Computing Systems (TECS), vol. 16, no. 5s, p. 149, 2017.
[10] A.-S. Jamal, J. Goeders, and S. J. E. Wilton, "An FPGA overlay architecture supporting rapid implementation of functional changes during on-chip debug," in 2018 28th International Conference on Field Programmable Logic and Applications (FPL). IEEE, 2018.
[11] A.-S. Jamal, "An FPGA overlay architecture supporting software-like compile times during on-chip debug of high-level synthesis designs," Ph.D. dissertation, University of British Columbia, 2018.
[12] P. Mishra and F. Farahmandi, Post-Silicon Validation and Debug. Cham, Switzerland: Springer, 2019.
[13] H. Oh, T. Han, I. Choi, and S. Kang, "An on-chip error detection method to reduce the post-silicon debug time," IEEE Transactions on Computers, vol. 66, no. 1, pp. 38–44, Jan. 2017.
[14] H. Oh, I. Choi, and S. Kang, "DRAM-based error detection method to reduce the post-silicon debug time for multiple identical cores," IEEE Transactions on Computers, vol. 66, no. 9, pp. 1504–1517, Sep. 2017.
[15] Y. Cao, H. Palombo, S. Ray, and H. Zheng, "Enhancing observability for post-silicon debug with on-chip communication monitors," in 2018 IEEE Computer Society Annual Symposium on VLSI (ISVLSI), July 2018, pp. 602–607.
[16] S. Ray, "SoC instrumentations: Pre-silicon preparation for post-silicon readiness," in Post-Silicon Validation and Debug. Springer, 2019, pp. 19–32.
[17] R. Abdel-Khalek and V. Bertacco, "Post-silicon platform for the functional diagnosis and debug of networks-on-chip," ACM Transactions on Embedded Computing Systems (TECS), vol. 13, no. 3s, p. 112, 2014.
[18] M. Abramovici, "In-system silicon validation and debug," IEEE Design & Test of Computers, vol. 25, no. 3, pp. 216–223, May 2008.
[19] D. Holanda Noronha, R. Zhao, J. Goeders, W. Luk, and S. J. E. Wilton, "On-chip FPGA debug instrumentation for machine learning applications," in Proceedings of the 2019 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays. ACM, 2019, pp. 110–115.
[20] K. Rahmani and P. Mishra, "Feature-based signal selection for post-silicon debug using machine learning," IEEE Transactions on Emerging Topics in Computing, pp. 1–1, 2017.
[21] D. A. Patterson and J. L. Hennessy, Computer Organization and Design: the Hardware/Software Interface, 5th ed. Morgan Kaufmann, 2014.
[22] J. L. Hennessy and D. A. Patterson, Computer Architecture: a Quantitative Approach, 5th ed. Morgan Kaufmann, 2012.
[23] D. M. Harris and S. L. Harris, Digital Design and Computer Architecture, 2nd ed. Morgan Kaufmann, 2013.
[24] P. P. Chu, FPGA Prototyping by VHDL Examples. Wiley, 2008.
[25] D. A. Patterson and J. L. Hennessy, Computer Organization and Design: the Hardware/Software Interface, 3rd ed. Morgan Kaufmann, 2005.