Lecture Notes: Microprocessor Engineering - Phạm Ngọc Nam

1. General introduction to microprocessor systems

2. The Intel 8088/8086 microprocessors

3. Assembly language programming for the 8086

4. Organizing data input/output

5. Interrupts and interrupt handling

6. Direct memory access (DMA)

7. Microprocessors in practice

PDF, 525 pages | Category: Computer Architecture
[Block diagram, continued from the preceding slide: ACU, I/O, fixed multiply (17x17->34), P-cache (24 KByte)]
© DHBK 2005
Texas Instruments TMS320C8x
Fixed Point Video
• Series discontinued; typical app.: video phone, video conferencing,
multimedia workstations
• Introduced: 1995, 50 MHz, 305 pin
• Multiprocessor-on-a-chip; sub-word SIMD for each DSP
Block diagram:
  Processors: DSP processors 1 to 4, general-purpose RISC processor
  Transfer controller (data and address buses, labeled 32 and 64)
  Video controller
  Crossbar (X-bar)
  On-chip memory: 2 KByte RAM blocks 1 to 16, 2 KByte I-caches 1 to 4, 4 KByte D-cache, 2 KByte RAM, 4 KByte I-cache
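Sub-word SIMD means that one wide datapath operates on several narrow values packed into a single word. The following is a minimal Python sketch of the idea only. It assumes four unsigned 8-bit lanes in a 32-bit word with wrap-around arithmetic; the lane layout and the helper names `pack4`, `unpack4` and `add4x8` are illustrative and are not taken from the C8x instruction set.

```python
LANE_MASK_LOW7 = 0x7F7F7F7F  # low 7 bits of every 8-bit lane
LANE_MASK_MSB = 0x80808080   # top bit of every 8-bit lane


def pack4(lanes):
    """Pack four 8-bit values (lane 0 in the low byte) into one 32-bit word."""
    word = 0
    for i, value in enumerate(lanes):
        word |= (value & 0xFF) << (8 * i)
    return word


def unpack4(word):
    """Split a 32-bit word back into its four 8-bit lanes."""
    return [(word >> (8 * i)) & 0xFF for i in range(4)]


def add4x8(a, b):
    """Add four 8-bit lanes at once; each lane wraps modulo 256.

    The low 7 bits of each lane are added first, so no carry can cross
    a lane boundary; the top bit of each lane is then fixed up with XOR.
    """
    low = (a & LANE_MASK_LOW7) + (b & LANE_MASK_LOW7)
    return (low ^ ((a ^ b) & LANE_MASK_MSB)) & 0xFFFFFFFF


a = pack4([200, 100, 255, 1])
b = pack4([100, 100, 1, 2])
print(unpack4(add4x8(a, b)))
```

A single 32-bit addition plus a few logic operations replaces four separate 8-bit additions, which is why sub-word parallelism suits pixel data.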
Texas Instruments TMS320C6201
High end Fixed Point
• Series continued; typical app.: modems, multimedia
• 1997, 0.25 µm, 5ML, 352 pin, 200 MHz, 2.5V, 1.9W, $85
• Super scalar (8 Instr./cycle), 1600 MIPS
• VLIW: 256 bit instruction word
Block diagram:
  Functional units: 2 x fixed MUL (16x16->32), 2 x fixed ALU (32+32->40), 2 x fixed ALU/branch (32+32->40), 2 x integer ACU (32+32)
  On-chip memory: 4 x 16 KByte D-SRAM, 64 KByte P-SRAM/cache
  Peripherals: JTAG / clock pump, 4 channel DMA, 2 serial ports, 2 timers
  Ext. memory interface (data and address buses, labeled 17 and 16)
  Host interface (data and address buses, labeled 23 and 32)
  External memory
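The unit widths in the diagram (16x16->32 multipliers next to 32+32->40 adders) point to a multiply-accumulate path whose accumulator has 8 guard bits above a 32-bit result. A minimal Python sketch of that arithmetic follows. It assumes signed 16-bit operands, a wrap-around 40-bit accumulator and saturation when the result is read out; those choices and the names `to_signed`, `mac40` and `saturate32` are assumptions made for illustration, not a model of the actual C62x datapath.

```python
def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def mac40(samples, coeffs):
    """Sum of 16x16->32 products, kept in a 40-bit accumulator."""
    acc = 0
    for x, c in zip(samples, coeffs):
        product = to_signed(x, 16) * to_signed(c, 16)  # always fits in 32 bits
        acc = to_signed(acc + product, 40)             # 32+32->40 style add
    return acc


def saturate32(acc):
    """Clamp an accumulator value to the signed 32-bit range."""
    return max(-(1 << 31), min((1 << 31) - 1, acc))


acc = mac40([32767] * 4, [32767] * 4)
print(acc, saturate32(acc))
```

The 8 guard bits let a sum of products temporarily exceed the 32-bit range without losing information; the clamp is applied once, at the end.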
Texas Instruments TMS320C6202
High end Fixed Point
• Series continued; typical app.: modems, multimedia
• 1999, 0.18 µm, 5ML, 352 pin, 250 MHz, 1.8V, 1.9W, $130
• Super scalar (8 Instr./cycle), 2000 MIPS, scales well till 700 MHz
(6000 MIPS)
• Optimum choice when all data fits in on-chip memory
Block diagram:
  Functional units: 2 x fixed MUL (16x16->32), 2 x fixed ALU (32+32->40), 2 x fixed ALU/branch (32+32->40), 2 x integer ACU (32+32)
  On-chip memory: 4 x (2x16 KByte D-RAM, shadow load), 2x128 KB P-RAM (shadow load)
  Peripherals: JTAG / clock pump, 4 channel DMA, 2 serial ports, 2 timers
  Ext. memory interface (data and address buses, labeled 17? and 32)
  Expansion bus (data and address buses, labeled 23? and 32)
  External memory
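The MIPS figures quoted on the C62x slides follow from the issue width and the clock frequency. A small Python sketch of that arithmetic is given here; the function name `peak_mips` is illustrative, and the result is a peak rate that assumes all 8 instruction slots are filled on every cycle.

```python
ISSUE_WIDTH = 8  # instructions per cycle, as stated on the slides


def peak_mips(clock_mhz, issue_width=ISSUE_WIDTH):
    """Peak instruction rate in MIPS for a clock frequency given in MHz."""
    return clock_mhz * issue_width


# Clock frequencies as quoted on the C6201, C6202 and C6203 slides.
clocks_mhz = {"C6201": 200, "C6202": 250, "C6203": 300}
for name, mhz in clocks_mhz.items():
    print(f"{name}: {mhz} MHz x {ISSUE_WIDTH} = {peak_mips(mhz)} MIPS")
```

Sustained throughput is lower whenever the compiler cannot fill every slot of the instruction word.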
Texas Instruments TMS320C6203
High end Fixed Point
• Series continued; typical app.: base stations
• 2000, 0.15 µm, 5ML, 18 mm² package size, 300 MHz, 1.5V, 1.5W
• Super scalar (8 Instr./cycle), 2400 MIPS
• Optimum choice when all data fits in on-chip memory
Block diagram:
  Functional units: 2 x fixed MUL (16x16->32), 2 x fixed ALU (32+32->40), 2 x fixed ALU/branch (32+32->40), 2 x integer ACU (32+32)
  On-chip memory: 4 x (2x64 KByte D-RAM, shadow load), 256 KByte P-RAM, 128 KB P-cache/RAM
  Peripherals: JTAG / clock pump, 4 channel DMA, 2 serial ports, 2 timers
  Ext. memory interface (data and address buses, labeled 17? and 32)
  Expansion bus (data and address buses, labeled 23? and 32)
  External memory
Texas Instruments TMS320C6211
High end Fixed Point
• Series continued; typical app.: modems, multimedia
• 1999, 0.18 µm, 5ML, 256 pin, 150 MHz, 1.8V, 1.5W, $25
• VLIW, 1.2 GIPS; cheap ($25 in '99, $5 in '01)
• Optimum for random access to large memory space
• 80% of performance of C6x with infinite on-chip memory
Block diagram:
  Functional units: 2 x fixed MUL (16x16->32), 2 x fixed ALU (32+32->40), 2 x fixed ALU/branch (32+32->40), 2 x integer ACU (32+32)
  On-chip memory: 4 KByte L1 Dcache (2 way set assoc.), 4 KByte L1 Pcache (2 way set assoc.), 4x16 KByte L2 cache (direct map)
  Peripherals: JTAG / clock pump, 16 channel DMA, 2 serial ports, 2 timers
  Ext. memory interface (data and address buses, labeled 17 and 16)
  Host port (data and address buses, labeled 30 and 32)
  External memory
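The diagram contrasts 2-way set-associative L1 caches with a direct-mapped L2. The Python sketch below shows how an address is split into tag, set index and byte offset for either organization. The 32-byte line size used in the example calls and the function names are assumptions for illustration; they are not taken from the C6211 data sheet.

```python
def number_of_sets(size_bytes, ways, line_bytes):
    """Number of sets in a cache of the given size, associativity and line size."""
    return size_bytes // (ways * line_bytes)


def split_address(addr, size_bytes, ways, line_bytes):
    """Split an address into (tag, set index, byte offset within the line)."""
    sets = number_of_sets(size_bytes, ways, line_bytes)
    offset = addr % line_bytes
    index = (addr // line_bytes) % sets
    tag = addr // (line_bytes * sets)
    return tag, index, offset


# 4 KByte, 2-way set associative (ways=2), assumed 32-byte lines.
print(split_address(0x0000, 4096, 2, 32))
print(split_address(0x0800, 4096, 2, 32))

# 16 KByte, direct mapped (ways=1), assumed 32-byte lines.
print(split_address(0x1234, 16384, 1, 32))
```

In the 2-way case the addresses 0x0000 and 0x0800 fall into the same set with different tags, so both lines can be resident at once; in a direct-mapped cache two addresses with the same index evict each other.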
Texas Instruments TMS320C6416
High end Fixed Point
• Samples June 2001, 0.12 µm, 6 LM, 532 pin, 400 MHz-600 MHz, 1.2V,
starts at $95 in volume
• Super scalar (8 Instr./cycle), 3200-4800 MIPS
• Sub-word (8bit or 16bit) parallelism
• Specialized instr.: Galois Field Mult, bit manipulation
Block diagram:
  Functional units: 2 x fixed MUL (16x16->32), 2 x fixed ALU (32+32->40), 2 x fixed ALU/branch (32+32->40), 2 x integer ACU (32+32)
  On-chip memory: 16 KByte L1P (direct mapped), 16 KByte L1D (2 way, dual access), 1 MByte RAM/L2 (4 way)
  Peripherals: JTAG / clock pump, 64 channel DMA, 3 serial ports, 3 timers
  Accelerators: Viterbi decoder accelerator, Turbo decoder accelerator
  Dual EMIF & HPI & PCI & Utopia (data and address buses, labeled ? and 32)
  HPI (data and address buses, labeled 30 and 64)
  External memory (data and address buses, labeled 30 and 16)
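Galois field multiplication is the core operation of Reed-Solomon style error-correcting codes, which is why a dedicated instruction helps in telecom workloads. The Python sketch below shows what such a multiply computes in GF(2^8). The default reduction polynomial 0x11D (x^8 + x^4 + x^3 + x^2 + 1) is an assumption chosen because it is common in Reed-Solomon codes; the slide does not say which field sizes or polynomials the C6416 instruction supports, and the function name is illustrative.

```python
def gf256_mul(a, b, poly=0x11D):
    """Multiply two GF(2^8) elements: carry-less multiply, reduced modulo poly."""
    result = 0
    while b:
        if b & 1:
            result ^= a   # addition in GF(2^n) is XOR, so there are no carries
        a <<= 1
        if a & 0x100:
            a ^= poly     # reduce as soon as the degree reaches 8
        b >>= 1
    return result


print(hex(gf256_mul(0x02, 0x80)))
```

In software this takes a loop of shifts and XORs per byte, whereas a hardware instruction can produce the product directly.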
Texas Instruments TMS320C6701
High end Floating Point
• Series continued; typical app.: video compression
• Introduced: 1998, 0.18 µm, 5ML, 352 pin, 167 MHz, 1.8V
• Super scalar (8 Instr./cycle); VLIW; 1 GFLOP
• Foreseen for '00: $50 (cf. C6211) & 3 GFLOP (cf. C6202)
Block diagram:
  Functional units: 2 x Fixed/Float MUL (32x32/64x64), 2 x Fixed/Float ALU (32+32/64+64), 2 x Fixed ALU/Branch (Float 1/x &x), 2 x integer ACU (32+32)
  On-chip memory: 4 x 16 KByte D-SRAM, 64 KByte P-SRAM/cache
  Peripherals: JTAG / clock pump, 4 channel DMA, serial interface, 2 timers
  Ext. memory interface (data and address buses, labeled 17 and 16)
  Host interface (data and address buses, labeled 23 and 32)
  External memory
Texas Instruments TMS320C6711
High end Floating Point
• Series continued; typical app.: video compression
• 2000, 0.18 µm, 5ML, 256 pin, 100 MHz, 1.8V, 2W, $20
• VLIW, 600 MFlops
• Optimum for random access to large memory space
• 80% of performance of C6x with infinite on-chip memory
Block diagram:
  Functional units: 2 x Fixed/Float MUL (32x32/64x64), 2 x Fixed/Float ALU (32+32/64+64), 2 x Fixed ALU/Branch (Float 1/x &x), 2 x integer ACU (32+32)
  On-chip memory: 4 KByte L1 Dcache (2 way set assoc.), 4 KByte L1 Pcache (2 way set assoc.), 4x16 KByte L2 cache (direct map)
  Peripherals: JTAG / clock pump, 4 channel DMA, serial interface, 2 timers
  Ext. memory interface (data and address buses, labeled 17 and 16)
  Host interface (data and address buses, labeled 23 and 32)
  External memory
Texas Instruments TMS320C541 (1995)
Texas Instruments TMS320C545 (1995)
Texas Instruments TMS320C80 (1994)
Chapter 7: Microprocessors in practice
• General purpose microprocessors
  Intel 80x86
  Development trends
• Microcontrollers
  Motorola microcontrollers
  The 8051 microcontroller family
  The AVR microcontroller family
  PSoC
  Development trends
• Digital signal processors
  Texas Instruments
  Motorola
  Philips
  Development trends
Motorola MC56xxx
Audio Fixed Point
• 24 bit for audio: 16 bit data + overflow
Block diagram:
  16 or 24 bit integer CPU
  Loop controller
  2 x ACU
  On-chip memory: PRAM, XRAM, YRAM
  Selection of peripherals: ADC, DAC, comm., timers, PIO, ...
  External address and data buses (labeled 24 and 18)
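The diagram shows separate X and Y data memories, two address computation units (ACUs) and a loop controller. One way such a layout can be used is to fetch a coefficient and a sample in the same step of a filter loop. The Python sketch below illustrates that for one FIR output. The circular (modulo) sample buffer, the pointer roles and the name `fir_output` are assumptions for illustration and are not stated on the slide.

```python
def fir_output(xram, yram, newest, taps):
    """One FIR output: coefficients in xram, circular sample buffer in yram.

    `newest` is the index in yram of the most recent sample.
    """
    x_ptr = 0        # first address unit: walks the coefficients linearly
    y_ptr = newest   # second address unit: walks the samples backwards, modulo the buffer
    acc = 0
    for _ in range(taps):                 # in hardware, the loop controller counts the taps
        acc += xram[x_ptr] * yram[y_ptr]  # one multiply-accumulate with two operand fetches
        x_ptr += 1
        y_ptr = (y_ptr - 1) % len(yram)
    return acc


coeffs = [1, 2, 3]
samples = [10, 20, 30, 40]   # circular buffer; index 1 holds the newest sample
print(fir_output(coeffs, samples, newest=1, taps=3))
```

Keeping coefficients and samples in different memories avoids two accesses competing for one memory port in the same cycle.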
Motorola MC56002
Motorola MC56166
Philips VSP-1
Fixed Point Video
• 12 bit for video: 8 bit data + overflow
• Clock Frequency: 27 MHz
• 1 instruction per sample period for HDTV,
2 instructions per sample period for TV
Block diagram:
  3 x 12 bit integer ALU
  2 x 512x12 bit memory element
  10x18 cross-bar (bus labels: 12, 12, 10)
Philips VSP-1
Fixed Point Video
[Diagram: ALU, ALU, ALU, ME, ME; inputs and outputs]
Philips VSP-1
Fixed Point Video
[Figure labels: ALU, Memory Element, Output FIFOs]
• 206K Transistors
• 1.1W dissipation
• 27 MHz clock
• 176 pin
• Introduced in 1991
Philips VSP-2
Fixed Point Video
• 12 bit for video: 8 bit data + overflow
• Clock Frequency: 54 MHz
• 2 instructions per sample period for HDTV,
4 instructions per sample period for TV
Block diagram:
  22x50 cross-bar (bus labels: 22, 12, 12)
  12 bit integer ALUs 1 to 12
  512x12 bit memory elements 1 to 4
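The instructions-per-sample figures on the VSP-1 and VSP-2 slides are the ratio of the processor clock to the video sample rate. The Python sketch below works through that ratio. The sample rates are assumptions: 13.5 MHz is the standard-definition luminance sampling rate of ITU-R BT.601, and 27 MHz for HDTV is inferred from the slide figures rather than stated there.

```python
TV_SAMPLE_MHZ = 13.5    # assumed: ITU-R BT.601 luminance sampling rate
HDTV_SAMPLE_MHZ = 27.0  # assumed: inferred from the figures on the slides


def instructions_per_sample(clock_mhz, sample_rate_mhz):
    """Instructions each processing element can issue per incoming sample."""
    return clock_mhz / sample_rate_mhz


for name, clock_mhz in (("VSP-1", 27.0), ("VSP-2", 54.0)):
    hdtv = instructions_per_sample(clock_mhz, HDTV_SAMPLE_MHZ)
    tv = instructions_per_sample(clock_mhz, TV_SAMPLE_MHZ)
    print(f"{name}: {hdtv:g} per HDTV sample, {tv:g} per TV sample")
```

With so few instructions available per sample, a video algorithm has to be spread across the parallel ALUs and memory elements rather than run sequentially on one unit.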
Philips VSP-2
Fixed Point Video
• 1.15 M Transistors
• 5W dissipation
• 54 MHz clock frequency
• 208 pin
• Introduced in 1994
Sony Graphics Engine
• PlayStation 3
  Status: prototype in 2001
  287.5 million transistors (MTOR)
  256 Mbit on-chip embedded DRAM
  2000-bit wide internal bus
  462 mm²
  180 nm CMOS
Trends for DSP processors
• No new generations that replace old ones; instead, multiple architecture lines co-exist
• Word length is application dependent
  Automotive: 16-bit fixed point (e.g. C2x)
  Speech: 32-bit floating point (e.g. C30)
  Audio: 24-bit fixed point (e.g. MC56K)
  Telecommunications: 16-32 bit fixed point (e.g. C5x, C6x)
  Video: 12-32 bit fixed point (e.g. C8x)
• A single architecture line is a whole family
  members differ in memory & on-chip peripherals
  for embedded applications (cf. microcontrollers)
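One reason word length follows the application is dynamic range: every extra bit doubles the ratio between full scale and the smallest step. The Python sketch below computes that ideal ratio in decibels for the fixed-point word lengths listed above. It is a rough guide only, under the assumption of an ideal integer format, and it ignores sign handling, noise and headroom; the function name is illustrative.

```python
import math


def dynamic_range_db(bits):
    """Ideal dynamic range of a `bits`-wide fixed-point word, in dB."""
    return 20 * math.log10(2 ** bits)


for bits in (12, 16, 24, 32):
    print(f"{bits}-bit: about {dynamic_range_db(bits):.1f} dB")
```

Each bit contributes roughly 6 dB, which fits the pattern on the slide: audio uses wider words than video.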
Trends for DSP processors
• Deterministic behavior
  no caches, no virtual memory, but on-chip RAM banks
  no out-of-order execution
  delayed branch prediction
• Increasing address space: 12 -> 32 bits
• Multiple functions on a single chip: CPU, FPU, multiple RAM banks, ACUs, loop controller, ADC, DAC, PWM, serial interfaces, …
• Often provisions for parallel processing
