CS 61C:
Great Ideas in Computer Architecture
Building Blocks for Datapaths
Instructor:
Randy H. Katz
http://inst.eecs.Berkeley.edu/~cs61c/fa13
6/27/2016
Fall 2013 -- Lecture #18
You are Here!
Software
• Parallel Requests
  Assigned to computer
  e.g., Search “Katz”
• Parallel Threads
  Assigned to core
  e.g., Lookup, Ads
• Parallel Instructions
  >1 instruction @ one time
  e.g., 5 pipelined instructions
• Parallel Data
  >1 data item @ one time
  e.g., Add of 4 pairs of words
• Hardware descriptions
  All gates @ one time
• Programming Languages
Hardware
• Warehouse Scale Computer, Smart Phone
• Computer: Core … Core, Memory (Cache), Input/Output
• Core: Instruction Unit(s), Functional Unit(s) (A0+B0 A1+B1 A2+B2 A3+B3), Cache Memory
• Logic Gates (Today)
Harness Parallelism & Achieve High Performance
Levels of Representation/Interpretation
High Level Language Program (e.g., C)
    temp = v[k];
    v[k] = v[k+1];
    v[k+1] = temp;
  ↓ Compiler
Assembly Language Program (e.g., MIPS)
    lw $t0, 0($2)
    lw $t1, 4($2)
    sw $t1, 0($2)
    sw $t0, 4($2)
  ↓ Assembler
Machine Language Program (MIPS)
    0000 1001 1100 0110 1010 1111 0101 1000
    1010 1111 0101 1000 0000 1001 1100 0110
    1100 0110 1010 1111 0101 1000 0000 1001
    0101 1000 0000 1001 1100 0110 1010 1111
  ↓ Machine Interpretation
Hardware Architecture Description (e.g., block diagrams)
  ↓ Architecture Implementation
Logic Circuit Description (Circuit Schematic Diagrams)
Anything can be represented as a number, i.e., data or instructions
Agenda
• Timing and State Machines
• Datapath Elements: Mux + ALU
• MIPS-lite Datapath
• CPU Timing
• MIPS-lite Control
• And, in Conclusion, …
Last Time: Summation Circuit
• Register is used to hold up the transfer of data to adder
• Square wave clock sets when things change
• Rounded Rectangle per clock means could be 1 or 0
• Xi must be ready before clock edge due to adder delay
[Figure: rough timing diagram of the summation circuit; each waveform alternates between High (1) and Low (0) against Time]
Register Internals
• n instances of a “Flip-Flop”
• Called a flip-flop because the output flips and flops between 0 and 1
• D is “data input”, Q is “data output”
• Also called “D-type Flip-Flop”
Camera Analogy Timing Terms
• Want to take a portrait – timing right before and after taking picture
• Setup time – don’t move since about to take picture (open camera shutter)
• Hold time – need to hold still after shutter opens until camera shutter closes
• Time click to data – time from open shutter until can see image on output (viewfinder)
Hardware Timing Terms
• Setup Time: how long the input must be stable before the edge of the CLK
• Hold Time: how long the input must be stable after the edge of the CLK
• “CLK-to-Q” Delay: how long it takes the output to change, measured from the edge of the CLK
FSM Maximum Clock Frequency
• What is the maximum frequency of this circuit?
Hint: Frequency = 1/Period
Max Delay = Setup Time + CLK-to-Q Delay + CL Delay
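A minimal sketch of the hint above: the shortest usable clock period equals the Max Delay, and the maximum frequency is its reciprocal. The function name and the delay values are assumptions chosen for illustration; the lecture gives no numbers for this circuit.

```python
def max_clock_frequency_hz(setup_s, clk_to_q_s, cl_delay_s):
    """Highest clock frequency for a register -> combinational logic -> register loop."""
    # Max Delay = Setup Time + CLK-to-Q Delay + CL Delay
    min_period_s = setup_s + clk_to_q_s + cl_delay_s
    return 1.0 / min_period_s  # Frequency = 1/Period


NS = 1e-9  # one nanosecond, in seconds

# Hypothetical delays (not from the lecture): 2 ns setup, 3 ns CLK-to-Q, 5 ns logic.
freq_hz = max_clock_frequency_hz(setup_s=2 * NS, clk_to_q_s=3 * NS, cl_delay_s=5 * NS)
print(f"max frequency = {freq_hz / 1e6:.1f} MHz")
```

With these made-up delays the period is 10 ns, so the clock can run at no more than about 100 MHz.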
Another Great (Theory) Idea: Finite State Machines (FSM)
• You may have seen FSMs in other classes (e.g., CS70)
• Same basic idea
• Function can be represented with a “state transition diagram”
• With combinational logic and registers, any FSM can be implemented in hardware
Example: 3 Ones FSM
FSM to detect the occurrence of 3 consecutive 1’s in the Input
Draw the FSM …
Assume state transitions are controlled by the clock: On each clock cycle the machine checks the inputs and moves to a new state and produces a new output …
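One way to check a drawn FSM is to simulate it. The sketch below uses the state encoding from the truth table given later in this lecture (states 00, 01, 10, written here as 0, 1, 2); the names TRANSITIONS and run_fsm are illustrative and not part of the lecture.

```python
# (present_state, input) -> (next_state, output)
# The state counts how many consecutive 1s have been seen so far: 0, 1, or 2.
TRANSITIONS = {
    (0, 0): (0, 0),
    (0, 1): (1, 0),
    (1, 0): (0, 0),
    (1, 1): (2, 0),
    (2, 0): (0, 0),
    (2, 1): (0, 1),  # third consecutive 1: output 1, then start counting again
}


def run_fsm(bits, state=0):
    """Feed one input bit per clock cycle and collect the output of each cycle."""
    outputs = []
    for bit in bits:
        state, out = TRANSITIONS[(state, bit)]
        outputs.append(out)
    return outputs


print(run_fsm([1, 1, 1, 0, 1, 1, 1, 1]))
```

Because this machine returns to state 00 after a detection, a run of four 1s produces a single output pulse on the third 1, not two.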
Hardware Implementation of FSM
• Register needed to hold a representation of the machine’s state. Unique bit pattern for each state.
• Combinational logic circuit is used to implement a function that maps from present state (PS) and input to next state (NS) and output.
• The register is used to break the feedback path between Next State (NS) and Present State (PS), controlled by the clock.
Hardware for FSM: Combinational Logic
Can look at its functional specification, truth table form
Truth table …
PS   Input | NS   Output
00   0     | 00   0
00   1     | 01   0
01   0     | 00   0
01   1     | 10   0
10   0     | 00   0
10   1     | 00   1
Hardware for FSM: Combinational Logic
Alternative Truth Table format: list only cases where value is a 1. Then restate as logic equations using PS1, PS0, Input
• NS bit 0 is 1 when PS = 00, Input = 1
  – NS0 = ~PS1 ~PS0 Input
• NS bit 1 is 1 when PS = 01, Input = 1
  – NS1 = ~PS1 PS0 Input
• Output is 1 when PS = 10, Input = 1
  – Output = PS1 ~PS0 Input
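The three equations can be checked against the six rows of the truth table. This is a small sketch with illustrative names; 1 - x stands in for the complement ~x on single bits.

```python
def next_state_and_output(ps1, ps0, inp):
    """Evaluate the logic equations for one clock cycle on single-bit values."""
    ns0 = (1 - ps1) & (1 - ps0) & inp  # NS0 = ~PS1 ~PS0 Input
    ns1 = (1 - ps1) & ps0 & inp        # NS1 = ~PS1 PS0 Input
    out = ps1 & (1 - ps0) & inp        # Output = PS1 ~PS0 Input
    return ns1, ns0, out


# (PS1, PS0, Input) -> (NS1, NS0, Output), copied from the truth table
TRUTH_TABLE = {
    (0, 0, 0): (0, 0, 0),
    (0, 0, 1): (0, 1, 0),
    (0, 1, 0): (0, 0, 0),
    (0, 1, 1): (1, 0, 0),
    (1, 0, 0): (0, 0, 0),
    (1, 0, 1): (0, 0, 1),
}

mismatches = [row for row, expected in TRUTH_TABLE.items()
              if next_state_and_output(*row) != expected]
print("mismatching rows:", mismatches)
```

State 11 is unused by this encoding, so the table says nothing about it; the equations happen to send it back to state 00 with output 0.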
Administrivia
• Project 3-2 due Sunday @ 11:59:59 PM
– This week’s lab very helpful for understanding
thread programming
Happy Halloween!
Rock-Scissors-Paper-Lizard-Spock
http://www.youtube.com/watch?v=cSLeBKT7-sM
Design Hierarchy
[Figure: design hierarchy tree. The system splits into control and datapath. Datapath: code registers, multiplexer, comparator, register, logic. Control: state registers, combinational logic. Registers and logic are built from switching networks.]
Conceptual MIPS Datapath
[Figure: conceptual MIPS datapath]
Data Multiplexer (e.g., 2-to-1 x n-bit-wide)
“mux”
N Instances of 1-bit-Wide Mux
How Do We Build a 1-bit-Wide Mux (in Logisim)?
4-to-1 Multiplexer
How many rows in TT?
Alternative Hierarchical Approach (in Logisim)
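The mux slides can be summarized in a short sketch. The select polarity (s = 0 picks the first input) and all function names are assumptions made for this example; the Logisim circuits in the lecture may be wired differently.

```python
from itertools import product


def mux2(s, a, b):
    """1-bit 2-to-1 mux: a when s == 0, b when s == 1 (assumed polarity)."""
    return ((1 - s) & a) | (s & b)


def mux2_wide(s, a_bits, b_bits):
    """n-bit-wide mux: n instances of the 1-bit mux sharing one select line."""
    return [mux2(s, a, b) for a, b in zip(a_bits, b_bits)]


def mux4(s1, s0, a, b, c, d):
    """4-to-1 mux built hierarchically from three 2-to-1 muxes."""
    low = mux2(s0, a, b)
    high = mux2(s0, c, d)
    return mux2(s1, low, high)


# A flat 4-to-1 mux has 4 data inputs + 2 select inputs = 6 input bits,
# so its truth table has 2**6 rows. Check every row against direct indexing.
rows = list(product([0, 1], repeat=6))
all_ok = all(mux4(s1, s0, a, b, c, d) == (a, b, c, d)[2 * s1 + s0]
             for s1, s0, a, b, c, d in rows)
print(len(rows), "rows checked, all match:", all_ok)
```

The hierarchical version avoids writing out the flat truth table: each 2-to-1 stage only has 8 rows.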
Arithmetic and Logic Unit
• Most processors contain a special logic block
called “Arithmetic and Logic Unit” (ALU)
• We’ll show you an easy one that does ADD,
SUB, bitwise AND, bitwise OR
Simple ALU
Adder/Subtractor: One-bit adder Least Significant Bit
Adder/Subtractor: One-bit adder (1/2)
…
Adder/Subtractor: One-bit Adder (2/2)
…
N x 1-bit Adders → 1 N-bit Adder
Connect Carry Out i-1 to Carry in i
Twos Complement Adder/Subtractor
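A sketch of the adder chain described above, assuming the usual two's complement trick for subtraction: XOR every bit of B with a SUB control bit and feed SUB in as the first carry. The function names and the 4-bit default width are illustrative choices, not taken from the slides.

```python
def full_adder(a, b, carry_in):
    """One-bit adder: returns (sum, carry_out)."""
    s = a ^ b ^ carry_in
    carry_out = (a & b) | (a & carry_in) | (b & carry_in)
    return s, carry_out


def add_sub(a, b, sub, n=4):
    """n-bit ripple-carry adder/subtractor on unsigned values below 2**n.

    sub = 0 computes a + b; sub = 1 computes a - b as a + ~b + 1.
    Returns (n-bit result, final carry out).
    """
    carry = sub  # the +1 of two's complement negation enters as carry in
    result = 0
    for i in range(n):
        a_i = (a >> i) & 1
        b_i = ((b >> i) & 1) ^ sub  # XOR with sub inverts b when subtracting
        s, carry = full_adder(a_i, b_i, carry)  # carry out i-1 feeds carry in i
        result |= s << i
    return result, carry


print(add_sub(5, 3, sub=0))
print(add_sub(5, 3, sub=1))
```

The carry ripples through all n stages one after another, which is why this chain is a natural candidate for the ALU's critical path.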
Critical Path
• When setting clock period in synchronous systems, must allow for worst case
• Path through combinational logic that is worst case is called the “critical path”
  – Can be estimated by number of “gate delays”: number of gates a signal must go through in the worst case
• Idea: it doesn’t matter if you speed up other paths if you don’t improve the critical path
• What might be the critical path of the ALU?
Processor Design Process
• Five steps to design a processor:
  1. Analyze instruction set → datapath requirements
  2. Select set of datapath components & establish clock methodology
  3. Assemble datapath meeting the requirements
  4. Analyze implementation of each instruction to determine setting of control points that effects the register transfer.
  5. Assemble the control logic
     • Formulate Logic Equations
     • Design Circuits
[Figure: Processor (Control + Datapath) connected to Memory, Input, and Output]
The MIPS-lite Subset
• ADDU and SUBU
  – addu rd,rs,rt
  – subu rd,rs,rt
  Bits: 31-26 op (6 bits) | 25-21 rs (5 bits) | 20-16 rt (5 bits) | 15-11 rd (5 bits) | 10-6 shamt (5 bits) | 5-0 funct (6 bits)
• OR Immediate:
  – ori rt,rs,imm16
  Bits: 31-26 op (6 bits) | 25-21 rs (5 bits) | 20-16 rt (5 bits) | 15-0 immediate (16 bits)
• LOAD and STORE Word
  – lw rt,rs,imm16
  – sw rt,rs,imm16
  Bits: 31-26 op (6 bits) | 25-21 rs (5 bits) | 20-16 rt (5 bits) | 15-0 immediate (16 bits)
• BRANCH:
  – beq rs,rt,imm16
  Bits: 31-26 op (6 bits) | 25-21 rs (5 bits) | 20-16 rt (5 bits) | 15-0 immediate (16 bits)
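The field boundaries above translate directly into shifts and masks. A minimal sketch follows; the decode function and its dictionary layout are illustrative, and the two example words were assembled by hand for this example using the standard MIPS register numbers ($t0 = 8, $t1 = 9, $t2 = 10).

```python
def decode(word):
    """Split a 32-bit MIPS instruction word into its fields.

    All fields are extracted unconditionally; which ones are meaningful
    depends on the instruction format (rd/shamt/funct vs. imm16).
    """
    return {
        "op":    (word >> 26) & 0x3F,  # bits 31-26
        "rs":    (word >> 21) & 0x1F,  # bits 25-21
        "rt":    (word >> 16) & 0x1F,  # bits 20-16
        "rd":    (word >> 11) & 0x1F,  # bits 15-11
        "shamt": (word >> 6) & 0x1F,   # bits 10-6
        "funct": word & 0x3F,          # bits 5-0
        "imm16": word & 0xFFFF,        # bits 15-0
    }


addu_fields = decode(0x012A4021)  # addu $t0, $t1, $t2
ori_fields = decode(0x352800FF)   # ori  $t0, $t1, 0xFF
print(addu_fields)
print(ori_fields)
```

In hardware no shifting happens at all: each field is just a fixed group of wires taken from the instruction word.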
Register Transfer Language (RTL)
• RTL gives the meaning of the instructions
  {op , rs , rt , rd , shamt , funct} ← MEM[ PC ]
  {op , rs , rt , Imm16} ← MEM[ PC ]
• All start by fetching the instruction
Inst    Register Transfers
ADDU    R[rd] ← R[rs] + R[rt]; PC ← PC + 4
SUBU    R[rd] ← R[rs] - R[rt]; PC ← PC + 4
ORI     R[rt] ← R[rs] | zero_ext(Imm16); PC ← PC + 4
LOAD    R[rt] ← MEM[ R[rs] + sign_ext(Imm16) ]; PC ← PC + 4
STORE   MEM[ R[rs] + sign_ext(Imm16) ] ← R[rt]; PC ← PC + 4
BEQ     if ( R[rs] == R[rt] )
        then PC ← PC + 4 + (sign_ext(Imm16) || 00)
        else PC ← PC + 4
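The RTL table can be read as a tiny interpreter. The sketch below applies one instruction's register transfers per call; the step function, the list R, and the dictionary MEM are modeling choices made for this example, and the instruction is passed in already decoded.

```python
MASK32 = 0xFFFFFFFF


def zero_ext(imm16):
    return imm16 & 0xFFFF


def sign_ext(imm16):
    """Interpret the low 16 bits as a two's complement value."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16


def step(inst, rs, rt, rd, imm16, R, MEM, pc):
    """Apply one instruction's register transfers and return the new PC.

    R is a list of 32 register values; MEM maps byte addresses to words.
    """
    next_pc = pc + 4
    if inst == "ADDU":
        R[rd] = (R[rs] + R[rt]) & MASK32
    elif inst == "SUBU":
        R[rd] = (R[rs] - R[rt]) & MASK32
    elif inst == "ORI":
        R[rt] = R[rs] | zero_ext(imm16)
    elif inst == "LOAD":
        R[rt] = MEM[(R[rs] + sign_ext(imm16)) & MASK32]
    elif inst == "STORE":
        MEM[(R[rs] + sign_ext(imm16)) & MASK32] = R[rt]
    elif inst == "BEQ":
        if R[rs] == R[rt]:
            # (sign_ext(Imm16) || 00) appends two zero bits: a shift left by 2
            next_pc = pc + 4 + (sign_ext(imm16) << 2)
    else:
        raise ValueError(f"unsupported instruction: {inst}")
    return next_pc & MASK32


R, MEM = [0] * 32, {}
R[1], R[2] = 7, 5
pc = step("ADDU", 1, 2, 3, 0, R, MEM, 0)          # R[3] = R[1] + R[2]
pc = step("STORE", 0, 3, 0, 0x10, R, MEM, pc)     # MEM[16] = R[3]
pc = step("LOAD", 0, 4, 0, 0x10, R, MEM, pc)      # R[4] = MEM[16]
pc = step("BEQ", 3, 4, 0, 0xFFFF, R, MEM, pc)     # taken, offset -1 word
print(R[3], R[4], MEM[0x10], pc)
```

A branch offset of -1 word cancels the +4, so the last call leaves the PC pointing at the branch itself.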
Step 1: Requirements of the
Instruction Set
• Memory (MEM)
– Instructions & data (will use one for each: really caches)
• Registers (R: 32 x 32)
– Read rs
– Read rt
– Write rt or rd
• PC
• Extender (sign/zero extend)
• Add/Sub/OR unit for operation on register(s) or extended
immediate
• Add 4 (+ maybe extended immediate) to PC
• Compare if registers equal?
Generic Steps of Datapath
1. Instruction Fetch
2. Decode/Register Read
3. Execute
4. Memory
5. Register Write
[Figure: PC (with +4 adder and mux) addresses instruction memory; the rd, rs, rt, and imm fields feed the registers; the ALU and data memory follow]
Step 2: Components of the Datapath
• Combinational Elements
• State Elements + Clocking Methodology
• Building Blocks
  – Adder: 32-bit inputs A and B plus CarryIn; 32-bit Sum plus CarryOut
  – Multiplexer (MUX): 32-bit inputs A and B plus Select; 32-bit output Y
  – ALU: 32-bit inputs A and B plus OP; 32-bit Result
ALU Needs for MIPS-lite + Rest of MIPS
• Addition, subtraction, logical OR, ==:
  ADDU  R[rd] = R[rs] + R[rt]; ...
  SUBU  R[rd] = R[rs] - R[rt]; ...
  ORI   R[rt] = R[rs] | zero_ext(Imm16)...
  BEQ   if ( R[rs] == R[rt] )...
• Test to see if output == 0 for any ALU operation gives == test. How?
• P&H also adds AND, Set Less Than (1 if A < B, 0 otherwise)
• ALU from Appendix C, section C.5
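A sketch of how a zero output gives the == test: subtract the operands and check whether the result is all zeros, as the branch slide later in this lecture does with R[rs] - R[rt]. The alu function and its string opcodes are illustrative stand-ins for the ALUctr encoding, which the slides do not specify.

```python
MASK32 = 0xFFFFFFFF


def alu(op, a, b):
    """Return (result, zero) for 32-bit operands a and b."""
    if op == "ADD":
        result = (a + b) & MASK32
    elif op == "SUB":
        result = (a - b) & MASK32
    elif op == "OR":
        result = a | b
    else:
        raise ValueError(f"unsupported ALU operation: {op}")
    zero = 1 if result == 0 else 0  # zero flag: every result bit is 0
    return result, zero


def registers_equal(a, b):
    """a == b exactly when a - b produces an all-zero result."""
    _, zero = alu("SUB", a, b)
    return zero == 1


print(registers_equal(5, 5), registers_equal(5, 6))
```

Reusing the subtractor this way means BEQ needs no separate comparator in the datapath.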
Storage Element: Idealized Memory
• Memory (idealized)
  – One input bus: Data In
  – One output bus: Data Out
• Memory word is found by:
  – Address selects the word to put on Data Out
  – Write Enable = 1: address selects the memory word to be written via the Data In bus
• Clock input (CLK)
  – CLK input is a factor ONLY during write operation
  – During read operation, behaves as a combinational logic block: Address valid → Data Out valid after “access time”
[Figure: memory block with Write Enable, Address, 32-bit Data In, Clk, and 32-bit DataOut]
Storage Element: Register (Building Block)
• Similar to D Flip Flop except
  – N-bit input and output
  – Write Enable input
• Write Enable:
  – Negated (or deasserted) (0): Data Out will not change
  – Asserted (1): Data Out will become Data In on rising edge of clock
[Figure: register with Write Enable, N-bit Data In, N-bit Data Out, and clk]
Storage Element: Register File
• Register File consists of 32 registers:
  – Two 32-bit output busses: busA and busB
  – One 32-bit input bus: busW
• Register is selected by:
  – RA (number) selects the register to put on busA (data)
  – RB (number) selects the register to put on busB (data)
  – RW (number) selects the register to be written via busW (data) when Write Enable is 1
• Clock input (clk)
  – Clk input is a factor ONLY during write operation
  – During read operation, behaves as a combinational logic block:
    • RA or RB valid → busA or busB valid after “access time.”
[Figure: 32 x 32-bit register file with Write Enable, 5-bit RW, RA, RB, 32-bit busW, Clk, and 32-bit busA and busB]
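The read and write behavior described above can be modeled with two methods: reads are plain lookups with no clock involved, and writes take effect only on a call that represents the rising clock edge. The class and method names are assumptions for this sketch, and it omits details the slide does not mention (such as register 0 being hardwired to zero in real MIPS).

```python
class RegisterFile:
    """Behavioral model of a 32 x 32-bit register file."""

    def __init__(self):
        self.regs = [0] * 32

    def read(self, ra, rb):
        """Combinational read: busA and busB simply follow RA and RB."""
        return self.regs[ra], self.regs[rb]

    def clock_edge(self, write_enable, rw, bus_w):
        """Call once per rising clock edge; writes only if Write Enable is 1."""
        if write_enable:
            self.regs[rw] = bus_w & 0xFFFFFFFF


rf = RegisterFile()
rf.clock_edge(write_enable=0, rw=5, bus_w=123)  # Write Enable is 0: ignored
before = rf.read(5, 0)
rf.clock_edge(write_enable=1, rw=5, bus_w=123)  # written on this edge
after = rf.read(5, 0)
print(before, after)
```

The access time from the slide has no counterpart here; a behavioral model like this captures what value appears, not when.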
Step 3: Assemble DataPath Meeting Requirements
• Register Transfer Requirements → Datapath Assembly
• Instruction Fetch
• Read Operands and Execute Operation
• Common RTL operations
  – Fetch the Instruction: mem[PC]
  – Update the program counter:
    • Sequential Code: PC ← PC + 4
    • Branch and Jump: PC ← “something else”
[Figure: PC (clocked by clk) with Next Address Logic; PC supplies the Address to Instruction Memory, which returns the 32-bit Instruction Word]
Step 3: Add & Subtract
• R[rd] = R[rs] op R[rt] (addu rd,rs,rt)
  – Ra, Rb, and Rw come from instruction’s Rs, Rt, and Rd fields
  Bits: 31-26 op (6 bits) | 25-21 rs (5 bits) | 20-16 rt (5 bits) | 15-11 rd (5 bits) | 10-6 shamt (5 bits) | 5-0 funct (6 bits)
  – ALUctr and RegWr: control logic after decoding the instruction
• … Already defined the register file & ALU
[Figure: register file (32 x 32-bit registers; Rw = rd, Ra = rs, Rb = rt; RegWr; clk) drives 32-bit busA and busB into the ALU (ALUctr); the 32-bit Result returns on busW]
Clocking Methodology
[Figure: clocking diagram driven by Clk]
• Storage elements clocked by same edge
• “Critical path” (longest path through logic) determines length of clock period
• Have to allow for Clock-to-Q and Setup Times too
• This lecture (and P&H sections 4.3-4.4) do whole instruction in 1 clock cycle for pedagogic reasons
  – Project 4 will do it in 2 clock cycles via simple pipelining
  – Soon explain pipelining and use 5 clock cycles per instruction
Register-Register Timing: One Complete Cycle
After the clock edge, each signal changes from Old Value to New Value in turn:
• PC: after Clk-to-Q
• Rs, Rt, Rd, Op, Func: after Instruction Memory Access Time
• ALUctr, RegWr: after Delay through Control Logic
• busA, busB: after Register File Access Time
• busW: after ALU Delay
• Setup Time must then pass before the next clock edge; Register Write Occurs Here
[Figure: register file (RegWr; Rw = Rd, Ra = Rs, Rb = Rt; clk) drives 32-bit busA and busB into the ALU (ALUctr); the result returns on 32-bit busW]
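The timing chain can be turned into a clock-period estimate. The sketch below assumes one reading of the diagram: the control logic and the register file both start once the instruction fields are valid, so only the slower of the two lies on the critical path. All delay values are hypothetical, since the lecture gives no numbers.

```python
# Hypothetical delays in picoseconds (not from the lecture).
DELAYS_PS = {
    "clk_to_q": 30,
    "instruction_memory_access": 250,
    "control_logic": 50,
    "register_file_access": 150,
    "alu": 200,
    "setup": 20,
}


def min_clock_period_ps(d):
    """Longest register-to-register path for an R-type instruction."""
    # Assumption: control logic and register file read proceed in parallel.
    decode_stage = max(d["control_logic"], d["register_file_access"])
    return (d["clk_to_q"] + d["instruction_memory_access"]
            + decode_stage + d["alu"] + d["setup"])


period_ps = min_clock_period_ps(DELAYS_PS)
print(f"min period = {period_ps} ps, max frequency = {1e6 / period_ps:.0f} MHz")
```

In a single-cycle design every instruction pays this worst-case period, including instructions whose own path is shorter.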
Logical Operations with Immediate
• R[rt] = R[rs] op ZeroExt[imm16]
  Bits: 31-26 op (6 bits) | 25-21 rs (5 bits) | 20-16 rt (5 bits) | 15-0 immediate (16 bits)
  ZeroExt[imm16]: bits 31-16 = 0000000000000000 (16 bits) | bits 15-0 = immediate (16 bits)
But we’re writing to Rt register??
And immediate ALU input??
[Figure: register-register datapath from before: register file (RegWr; Rw = Rd, Ra = Rs, Rb = Rt; clk), 32-bit busA and busB into the ALU (ALUctr), result on busW]
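Logical Operations with Immediate
• R[rt] = R[rs] op ZeroExt[imm16]
  Bits: 31-26 op (6 bits) | 25-21 rs (5 bits) | 20-16 rt (5 bits) | 15-0 immediate (16 bits)
  ZeroExt[imm16]: bits 31-16 = 0000000000000000 (16 bits) | bits 15-0 = immediate (16 bits)
• 2:1 multiplexor controlled by RegDst selects rt (0) or rd (1) as Rw
• ZeroExt widens imm16 from 16 to 32 bits; a second 2:1 multiplexor controlled by ALUSrc selects busB (0) or the extended immediate (1) as the ALU input
• Already defined 32-bit MUX; Zero Ext?
[Figure: register file (RegWr; Ra = rs, Rb = rt; clk), RegDst mux on Rw, ZeroExt and ALUSrc mux on the second ALU input, ALU (ALUctr)]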
Load Operations
• R[rt] = Mem[R[rs] + SignExt[imm16]]
  Example: lw rt,rs,imm16
  Bits: 31-26 op (6 bits) | 25-21 rs (5 bits) | 20-16 rt (5 bits) | 15-0 immediate (16 bits)
What sign extending??
And where is Mem??
[Figure: datapath from the previous slide: RegDst mux (rd/rt) on Rw, register file (RegWr; Ra = rs, Rb = rt; clk), ZeroExt on imm16, ALUSrc mux, ALU (ALUctr)]
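Load Operations
• R[rt] = Mem[R[rs] + SignExt[imm16]]
  Example: lw rt,rs,imm16
  Bits: 31-26 op (6 bits) | 25-21 rs (5 bits) | 20-16 rt (5 bits) | 15-0 immediate (16 bits)
• Extender (controlled by ExtOp) replaces ZeroExt on imm16
• Data Memory (WrEn = MemWr, Adr, Data In, clk) takes its address from the ALU output
• MemtoReg mux selects the ALU output (0) or the Data Memory output (1) for busW
[Figure: RegDst mux (rd/rt) on Rw, register file (RegWr; Ra = rs, Rb = rt; clk), Extender (ExtOp), ALUSrc mux, ALU (ALUctr), Data Memory (MemWr), MemtoReg mux onto busW]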
RTL: The Add Instruction
Bits: 31-26 op (6 bits) | 25-21 rs (5 bits) | 20-16 rt (5 bits) | 15-11 rd (5 bits) | 10-6 shamt (5 bits) | 5-0 funct (6 bits)
add rd, rs, rt
  – MEM[PC]: Fetch the instruction from memory
  – R[rd] = R[rs] + R[rt]: The actual operation
  – PC = PC + 4: Calculate the next instruction’s address
Instruction Fetch Unit at Beginning of Add
• Fetch the instruction from Instruction memory: Instruction = MEM[PC]
  – same for all instructions
[Figure: instruction fetch unit. PC (clk) supplies Inst Address to Inst Memory, which outputs Instruction<31:0>; one Adder computes PC + 4, a second Adder adds the PC Ext of imm16 (with 00 appended), and a Mux controlled by nPC_sel picks the next PC]
Single Cycle Datapath during Add
Format: op [31:26] | rs [25:21] | rt [20:16] | rd [15:11] | shamt [10:6] | funct [5:0]
• R[rd] = R[rs] + R[rt]
Control signals: nPC_sel=+4, RegDst=1, RegWr=1, ExtOp=x, ALUSrc=0, ALUctr=ADD, MemtoReg=0, MemWr=0
Instruction<31:0> fields: <21:25> = Rs, <16:20> = Rt, <11:15> = Rd, <0:15> = Imm16
[Figure: single-cycle datapath: instr fetch unit, RegDst mux (rd/rt) on Rw, register file (RegWr, clk), Extender (ExtOp), ALUSrc mux, ALU (ALUctr, zero), Data Memory (WrEn = MemWr, Adr, Data In, clk), MemtoReg mux onto busW]
Instruction Fetch Unit at End of Add
• PC = PC + 4
  – Same for all instructions except: Branch and Jump
[Figure: instruction fetch unit with nPC_sel=+4: the Mux selects the PC + 4 Adder output as the next PC]
Single Cycle Datapath during OR Immediate
Format: op [31:26] | rs [25:21] | rt [20:16] | immediate [15:0]
• R[rt] = R[rs] OR ZeroExt[Imm16]
Control signals to fill in (left blank on the slide): nPC_sel=, RegDst=, RegWr=, ExtOp=, ALUSrc=, ALUctr=, MemtoReg=, MemWr=
Instruction<31:0> fields: <21:25> = Rs, <16:20> = Rt, <11:15> = Rd, <0:15> = Imm16
[Figure: single-cycle datapath, same as for Add]
Single Cycle Datapath during Load
Format: op [31:26] | rs [25:21] | rt [20:16] | immediate [15:0]
• R[rt] = Data Memory {R[rs] + SignExt[imm16]}
Control signals to fill in (left blank on the slide): nPC_sel=, RegDst=, RegWr=, ExtOp=, ALUSrc=, ALUctr=, MemtoReg=, MemWr=
Instruction<31:0> fields: <21:25> = Rs, <16:20> = Rt, <11:15> = Rd, <0:15> = Imm16
[Figure: single-cycle datapath, same as for Add]
Single Cycle Datapath during Store
Format: op [31:26] | rs [25:21] | rt [20:16] | immediate [15:0]
• Data Memory {R[rs] + SignExt[imm16]} = R[rt]
Control signals to fill in (left blank on the slide): nPC_sel=, RegDst=, RegWr=, ExtOp=, ALUSrc=, ALUctr=, MemtoReg=, MemWr=
Instruction<31:0> fields: <21:25> = Rs, <16:20> = Rt, <11:15> = Rd, <0:15> = Imm16
[Figure: single-cycle datapath, same as for Add]
Single Cycle Datapath during Branch
Format: op [31:26] | rs [25:21] | rt [20:16] | immediate [15:0]
• if (R[rs] - R[rt] == 0) then Zero = 1 ; else Zero = 0
Control signals to fill in (left blank on the slide): nPC_sel=, RegDst=, RegWr=, ExtOp=, ALUSrc=, ALUctr=, MemtoReg=, MemWr=
Instruction<31:0> fields: <21:25> = Rs, <16:20> = Rt, <11:15> = Rd, <0:15> = Imm16
[Figure: single-cycle datapath, same as for Add]
Instruction Fetch Unit at the End of Branch
Format: op [31:26] | rs [25:21] | rt [20:16] | immediate [15:0]
• if (Zero == 1) then PC = PC + 4 + SignExt[imm16]*4 ; else PC = PC + 4
• What is encoding of nPC_sel?
  • Direct MUX select?
  • Branch inst. / not branch
• Let’s pick 2nd option
  nPC_sel  zero?  MUX
  0        x      0
  1        0      0
  1        1      1
Q: What logic gate?
[Figure: instruction fetch unit. nPC_sel and Zero feed the MUX ctrl logic; the Mux picks PC + 4 (0) or PC + 4 + PC Ext of imm16 with 00 appended (1) as the next PC; PC (clk) supplies Adr to Inst Memory, which outputs Instruction<31:0>]
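The three-row table above is the truth table of an AND gate: the mux selects the branch target only when nPC_sel and Zero are both 1. A sketch of the resulting next-PC logic follows, with illustrative function names.

```python
MASK32 = 0xFFFFFFFF


def sign_ext(imm16):
    """Interpret the low 16 bits as a two's complement value."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16


def next_pc(pc, imm16, npc_sel, zero):
    """Next PC with nPC_sel encoded as 1 = branch instruction, 0 = not branch."""
    mux_select = npc_sel & zero  # AND gate drives the mux
    pc_plus_4 = (pc + 4) & MASK32
    branch_target = (pc_plus_4 + (sign_ext(imm16) << 2)) & MASK32
    return branch_target if mux_select else pc_plus_4


print(next_pc(100, 3, npc_sel=1, zero=1))  # branch taken
print(next_pc(100, 3, npc_sel=1, zero=0))  # branch not taken
print(next_pc(100, 3, npc_sel=0, zero=1))  # not a branch; Zero is ignored
```

With this encoding the control logic never needs to look at Zero itself; the AND gate lives inside the fetch unit.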
Summary: Datapath’s Control Signals
• ExtOp: “zero”, “sign”
• ALUsrc: 0 → regB; 1 → immed
• ALUctr: “ADD”, “SUB”, “OR”
• MemWr: 1 → write memory
• MemtoReg: 0 → ALU; 1 → Mem
• RegDst: 0 → “rt”; 1 → “rd”
• RegWr: 1 → write register
[Figure: complete single-cycle datapath. Instruction fetch unit (PC, Inst Address, two Adders, PC Ext, nPC_sel Mux), RegDst mux (Rd/Rt) on Rw, register file (RegWr, clk), Extender (ExtOp), ALUSrc mux, ALU (ALUctr), Data Memory (WrEn = MemWr, Adr, Data In, clk), MemtoReg mux onto busW]
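One possible control table built from the signal definitions above. Only the addu row is taken from the slides (the Add datapath slide); every other row is derived here from the RTL of each instruction and is an assumption, not the lecture's answer. X marks a don't-care, and nPC_sel uses the branch / not-branch encoding (1 = branch instruction).

```python
X = None  # don't care: the signal has no effect for this instruction

CONTROL = {
    "addu": dict(nPC_sel=0, RegDst=1, RegWr=1, ExtOp=X, ALUSrc=0,
                 ALUctr="ADD", MemWr=0, MemtoReg=0),
    "subu": dict(nPC_sel=0, RegDst=1, RegWr=1, ExtOp=X, ALUSrc=0,
                 ALUctr="SUB", MemWr=0, MemtoReg=0),
    "ori":  dict(nPC_sel=0, RegDst=0, RegWr=1, ExtOp="zero", ALUSrc=1,
                 ALUctr="OR", MemWr=0, MemtoReg=0),
    "lw":   dict(nPC_sel=0, RegDst=0, RegWr=1, ExtOp="sign", ALUSrc=1,
                 ALUctr="ADD", MemWr=0, MemtoReg=1),
    "sw":   dict(nPC_sel=0, RegDst=X, RegWr=0, ExtOp="sign", ALUSrc=1,
                 ALUctr="ADD", MemWr=1, MemtoReg=X),
    "beq":  dict(nPC_sel=1, RegDst=X, RegWr=0, ExtOp=X, ALUSrc=0,
                 ALUctr="SUB", MemWr=0, MemtoReg=X),
}

# Signals that change state must never be don't-care.
for name, signals in CONTROL.items():
    assert signals["RegWr"] in (0, 1) and signals["MemWr"] in (0, 1), name

writers = sorted(name for name, s in CONTROL.items() if s["RegWr"] == 1)
print("instructions that write a register:", writers)
```

RegWr and MemWr are the two signals that can never be left as don't-cares, because asserting either one by accident would corrupt state.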
And, in Conclusion, …
Single-Cycle Processor
• Use muxes to select among inputs
  – S select bits choose among 2^S inputs
  – Each input can be n-bits wide, independent of S
• Can implement muxes hierarchically
• Arithmetic circuits are a kind of combinational logic
• Five steps to processor design:
  1. Analyze instruction set → datapath requirements
  2. Select set of datapath components & establish clock methodology
  3. Assemble datapath meeting the requirements
  4. Analyze implementation of each instruction to determine setting of control points that effects the register transfer.
  5. Assemble the control logic
     • Formulate Logic Equations
     • Design Circuits
[Figure: Processor (Control + Datapath) connected to Memory, Input, and Output]