Lecture notes: Microprocessor Engineering - Phạm Ngọc Nam
1. General introduction to microprocessor systems
2. The Intel 8088/8086 microprocessor
3. Assembly language programming for the 8086
4. Data input/output organization
5. Interrupts and interrupt handling
6. Direct memory access (DMA)
7. Microprocessors in practice
(Block diagram labels carried over from the preceding slide: ACU; I/O; fixed multiply 17x17->34; P-cache 24 KByte)
© DHBK 2005

Texas Instruments TMS320C8x (Fixed Point, Video)
• Series discontinued; typical applications: video phone, video conferencing, multimedia workstations
• Introduced: 1995, 50 MHz, 305 pin
• Multiprocessor-on-a-chip; sub-word SIMD for each DSP
Block diagram: DSP processors 1 to 4; general purpose RISC processor; transfer controller (data/address labels: 32, 64); video controller; 2 KByte RAM1 ... 2 KByte RAM16; 2 KByte I-cache1 ... 2 KByte I-cache4; 4 KByte D-cache; 2 KByte RAM; 4 KByte I-cache; X-bar.

Texas Instruments TMS320C6201 (High end Fixed Point)
• Series continued; typical applications: modems, multimedia
• 1997, 0.25 µm, 5ML, 352 pin, 200 MHz, 2.5 V, 1.9 W, $85
• Superscalar (8 instructions/cycle), 1600 MIPS
• VLIW: 256 bit instruction word
Block diagram: 2 x fixed MUL 16x16->32; 2 x fixed ALU 32+32->40; 2 x fixed ALU/branch 32+32->40; 2 x integer ACU 32+32; 4 x 16 KByte D-SRAM; 64 KByte P-SRAM/cache; JTAG / clock pump; 4 channel DMA; 2 serial ports; 2 timers; external memory interface (data/address labels: 17, 16); host interface (data/address labels: 23, 32); external memory.

Texas Instruments TMS320C6202 (High end Fixed Point)
• Series continued; typical applications: modems, multimedia
• 1999, 0.18 µm, 5ML, 352 pin, 250 MHz, 1.8 V, 1.9 W, $130
• Superscalar (8 instructions/cycle), 2000 MIPS; scales well up to 700 MHz (6000 MIPS)
• Optimum choice when all data fits in on-chip memory
Block diagram: 2 x fixed MUL 16x16->32; 2 x fixed ALU 32+32->40; 2 x fixed ALU/branch 32+32->40; 2 x integer ACU 32+32; 4 x (2x16 KByte D-RAM, shadow load); 2x128 KB P-RAM (shadow load); JTAG / clock pump; 4 channel DMA; 2 serial ports; 2 timers; external memory interface (data/address labels: 17?, 32); expansion bus (data/address labels: 23?, 32); external memory.

Texas Instruments TMS320C6203 (High end Fixed Point)
• Series continued; typical applications: base stations
• 2000, 0.15 µm, 5ML, 18 mm² package size, 300 MHz, 1.5 V, 1.5 W
• Superscalar (8 instructions/cycle), 2400 MIPS
• Optimum choice when all data fits in on-chip memory
Block diagram: 2 x fixed MUL 16x16->32; 2 x fixed ALU 32+32->40; 2 x fixed ALU/branch 32+32->40; 2 x integer ACU 32+32; 4 x (2x64 KByte D-RAM, shadow load); 256 KByte P-RAM; 128 KB P-cache/RAM; JTAG / clock pump; 4 channel DMA; 2 serial ports; 2 timers; external memory interface (data/address labels: 17?, 32); expansion bus (data/address labels: 23?, 32); external memory.

Texas Instruments TMS320C6211 (High end Fixed Point)
• Series continued; typical applications: modems, multimedia
• 1999, 0.18 µm, 5ML, 256 pin, 150 MHz, 1.8 V, 1.5 W, $25
• VLIW, 1.2 GIPS; cheap ($25 in '99, $5 in '01)
• Optimum for random access to a large memory space
• 80% of the performance of a C6x with infinite on-chip memory
Block diagram: 2 x fixed MUL 16x16->32; 2 x fixed ALU 32+32->40; 2 x fixed ALU/branch 32+32->40; 2 x integer ACU 32+32; 4 KByte L1 D-cache (2 way set assoc.); 4 KByte L1 P-cache (2 way set assoc.); 4x16 KByte L2 cache (direct map); JTAG / clock pump; 16 channel DMA; 2 serial ports; 2 timers; external memory interface (data/address labels: 17, 16); host port (data/address labels: 30, 32); external memory.

Texas Instruments TMS320C6416 (High end Fixed Point)
• Samples June 2001, 0.12 µm, 6 LM, 532 pin, 400 MHz-600 MHz, 1.2 V, starts at $95 in volume
• Superscalar (8 instructions/cycle), 3200-4800 MIPS
• Sub-word (8 bit or 16 bit) parallelism
• Specialized instructions: Galois field multiply, bit manipulation
Block diagram: 2 x fixed MUL 16x16->32; 2 x fixed ALU 32+32->40; 2 x fixed ALU/branch 32+32->40; 2 x integer ACU 32+32; JTAG / clock pump; 64 channel DMA; 3 serial ports; 3 timers; 16 KByte L1P (direct mapped); 16 KByte L1D (2 way, dual access); 1 MByte RAM/L2 (4 way); dual EMIF & HPI & PCI & Utopia (data/address labels: ?, 32); HPI (data/address labels: 30, 64); external memory (data/address labels: 30, 16); Viterbi decoder accelerator; Turbo decoder accelerator.

Texas Instruments TMS320C6701 (High end Floating Point)
• Series continued; typical applications: video compression
• Introduced: 1998, 0.18 µm, 5ML, 352 pin, 167 MHz, 1.8 V
• Superscalar (8 instructions/cycle); VLIW; 1 GFLOP
• Foreseen for '00: $50 (cf. C6211) and 3 GFLOP (cf. C6202)
Block diagram: 2 x fixed/float MUL 32x32/64x64; 2 x fixed/float ALU 32+32/64+64; 2 x fixed ALU/branch, float 1/x &x; 2 x integer ACU 32+32; 4 x 16 KByte D-SRAM; 64 KByte P-SRAM/cache; JTAG / clock pump; 4 channel DMA; serial interface; 2 timers; external memory interface (data/address labels: 17, 16); host interface (data/address labels: 23, 32); external memory.

Texas Instruments TMS320C6711 (High end Floating Point)
• Series continued; typical applications: video compression
• 2000, 0.18 µm, 5ML, 256 pin, 100 MHz, 1.8 V, 2 W, $20
• VLIW, 600 MFLOPS
• Optimum for random access to a large memory space
• 80% of the performance of a C6x with infinite on-chip memory
Block diagram: 2 x fixed/float MUL 32x32/64x64; 2 x fixed/float ALU 32+32/64+64; 2 x fixed ALU/branch, float 1/x &x; 2 x integer ACU 32+32; JTAG / clock pump; 4 channel DMA; serial interface; 2 timers; external memory interface (data/address labels: 17, 16); host interface (data/address labels: 23, 32); external memory; 4 KByte L1 D-cache (2 way set assoc.); 4 KByte L1 P-cache (2 way set assoc.); 4x16 KByte L2 cache (direct map).

Texas Instruments TMS320C541 (1995)

Texas Instruments TMS320C545 (1995)

Texas Instruments TMS320C80 (1994)

Chapter 7: Microprocessors in practice
• General purpose microprocessors: Intel 80x86; development trends
• Microcontrollers: Motorola microcontrollers; the 8051 microcontroller family; the AVR microcontroller family; PSOC; development trends
• Digital signal processors: Texas Instruments; Motorola; Philips; development trends

Motorola MC56xxx (Audio, Fixed Point)
• 24 bit for audio: 16 bit data + overflow
Block diagram: 16 or 24 bit integer CPU; loop controller; selection of peripherals: ADC, DAC, comm., timers, PIO, ...
Further block diagram labels: ACU (twice); PRAM; XRAM; YRAM; address/data bus labels: 24, 18.

Motorola MC56002

Motorola MC56166

Philips VSP-1 (Fixed Point, Video)
• 12 bit for video: 8 bit data + overflow
• Clock frequency: 27 MHz
• 1 instruction per sample period for HDTV, 2 instructions per sample period for TV
• 206K transistors
• 1.1 W dissipation
• 176 pin
• Introduced in 1991
Block diagram: 3 x 12 bit integer ALU; 2 x 512x12 bit memory element (ME); 10x18 cross-bar (bus labels: 12, 12, 10); inputs; outputs; output FIFOs.

Philips VSP-2 (Fixed Point, Video)
• 12 bit for video: 8 bit data + overflow
• Clock frequency: 54 MHz
• 2 instructions per sample period for HDTV, 4 instructions per sample period for TV
• 1.15 M transistors
• 5 W dissipation
• 208 pin
• Introduced in 1994
Block diagram: 22x50 cross-bar (bus labels: 22, 12, 12); 12 bit integer ALU1, ALU2, ... ALU12; 512x12 bit memory element1, element2, ... element4.

Sony Graphics Engine
• PlayStation 3
• Status: prototype in 2001
• 287.5 MTOR
• 256 Mbit on-chip embedded DRAM
• 2000-bit wide internal bus
• 462 mm²
• 180 nm CMOS
Trends for DSP processors
• No new generations that replace old generations, but multiple co-existing architecture lines
• Word length is application dependent:
    Automotive: 16-bit fixed point (e.g. C2x)
    Speech: 32-bit floating point (e.g. C30)
    Audio: 24-bit fixed point (e.g. MC56K)
    Telecommunications: 16-32 bit fixed point (e.g. C5x, C6x)
    Video: 12-32 bit fixed point (e.g. C8x)
• A single architecture line is a whole family: different memory and on-chip peripherals for embedded applications (cf. microcontrollers)

Trends for DSP processors (continued)
• Deterministic behavior:
    no caches, no virtual memory, but on-chip RAM banks
    no out-of-order execution
    delayed branch prediction
• Increasing address space: 12 -> 32
• Multiple functions on a single chip: CPU, FPU, multiple RAM banks, ACUs, loop controller, ADC, DAC, PWM, serial interfaces, …
• Often provisions for parallel processing
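
Two figures quoted on the TMS320C62x slides can be checked with a little arithmetic. The sketch below is illustrative only. The function names `peak_mips` and `mac` are hypothetical and are not part of any TI tool or library. It assumes that peak MIPS is simply clock frequency times instructions issued per cycle, and that an accumulator saturates at its limits; the slides state only the operand widths (16x16->32 multiply, 32+32->40 add), so the saturation behavior is an assumption of this example.

```python
# Illustrative sketch only: hypothetical helper names, not TI APIs.

def peak_mips(clock_mhz, instr_per_cycle=8):
    """Peak rate in MIPS, assumed to be clock (MHz) x instructions per cycle."""
    return clock_mhz * instr_per_cycle


def mac(samples, coeffs, acc_bits=40):
    """Sum 16x16 -> 32 bit products in a signed accumulator of acc_bits bits.

    Clamping at the accumulator limits is an assumption made for this sketch.
    """
    lo = -(1 << (acc_bits - 1))
    hi = (1 << (acc_bits - 1)) - 1
    acc = 0
    for x, c in zip(samples, coeffs):
        if not (-32768 <= x <= 32767 and -32768 <= c <= 32767):
            raise ValueError("operands must fit in signed 16 bits")
        # Each product fits in 32 bits; the wider accumulator adds guard bits.
        acc = max(lo, min(hi, acc + x * c))
    return acc


if __name__ == "__main__":
    # Clock rates as quoted on the slides.
    for name, mhz in [("C6201", 200), ("C6202", 250), ("C6203", 300), ("C6211", 150)]:
        print(name, peak_mips(mhz), "MIPS")

    worst = [-32768] * 4                   # four worst-case products of 2**30 each
    print(mac(worst, worst, acc_bits=40))  # fits thanks to the 8 guard bits
    print(mac(worst, worst, acc_bits=32))  # clamps at the 32 bit maximum
```

Under these assumptions the computed rates match the slides (1600, 2000 and 2400 MIPS, and 1200 MIPS for the 1.2 GIPS C6211). The second part shows why a 40 bit result is useful: four worst-case 16 bit products already sum to 2**32, which a 32 bit accumulator cannot hold.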