Single-Cycle CPU

Author: Internet

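I. Design Approach

1. The Role of the CPU

The CPU is the core of a computer, because it is the unit that processes the computer's instructions.

Computer architecture has two aspects: the instruction set and the hardware implementation. The instruction set defines the instructions a computer can execute; by supporting it, the computer carries out computation and gives programmers something to program against. The hardware implementation is the concrete hardware that realizes the instruction set, and its core is the design of the CPU.

The CPU designed here is for a 32-bit machine: instructions and data are both 32 bits wide. It supports a simplified subset of the MIPS instruction set.

2. Designing the CPU

CPU design consists of datapath design and controller design. The datapath is the hardware needed to execute instructions (ALU, IM, DM, GRF, and so on). The controller generates control signals from the current instruction and uses them to steer that hardware, so that one datapath can support many instructions.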

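II. Implementation in Practice

0. Design Notes

CPU architecture design has few hard constraints. The basic requirement is that the design supports the instruction set, and different considerations lead to different designs. For example, the decision whether beq branches can reuse the ALU's subtraction, or it can come directly from an added comparator (CMP). Either approach is acceptable because both are functionally correct. Particular designs are chosen to raise throughput or to save cost, which shows clearly in the design of a pipelined CPU.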
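The two beq options mentioned above can be put side by side in a minimal sketch. The module names BEQ_VIA_ALU and BEQ_VIA_CMP and the port names are illustrative assumptions and do not appear in the design below; the sketch is only meant to show that both routes produce the same branch decision.

```verilog
// Sketch only: module and port names are illustrative assumptions.
module BEQ_VIA_ALU(
    input  [31:0] rs_val,
    input  [31:0] rt_val,
    output        taken
    );
    // Reuse a subtractor: the operands are equal exactly when the difference is zero
    wire [31:0] diff = rs_val - rt_val;
    assign taken = (diff == 32'b0);
endmodule

module BEQ_VIA_CMP(
    input  [31:0] rs_val,
    input  [31:0] rt_val,
    output        taken
    );
    // Dedicated comparator: the ALU is not involved
    assign taken = (rs_val == rt_val);
endmodule

module beq_demo_tb;
    reg  [31:0] x, y;
    wire        t_alu, t_cmp;

    BEQ_VIA_ALU u_alu (.rs_val(x), .rt_val(y), .taken(t_alu));
    BEQ_VIA_CMP u_cmp (.rs_val(x), .rt_val(y), .taken(t_cmp));

    initial begin
        x = 32'd7;        y = 32'd7; #1;
        $display("equal operands:   alu=%b cmp=%b", t_alu, t_cmp);
        if (t_alu !== t_cmp) $display("MISMATCH");
        x = 32'hFFFFFFFF; y = 32'd1; #1;
        $display("unequal operands: alu=%b cmp=%b", t_alu, t_cmp);
        if (t_alu !== t_cmp) $display("MISMATCH");
        $finish;
    end
endmodule
```

The design in this article takes the second route, a separate CMP module, which leaves the ALU free of branch-specific logic.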

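The concrete method is the sequence of steps I follow below: list the required instructions, write out the functional modules, connect the modules, build the controller, and wire everything together. The tables produced along the way matter a great deal for the final code. Because the code is fairly long, checking the tables first and then coding from them reduces bugs.

1. Supported Instructions

List the supported instructions and classify them: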

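| Class | Instructions |
|---|---|
| str | sw |
| ld | lw |
| cal_r | addu, subu, slt, sllv, srav, srlv |
| cal_i | ori, slti, addiu |
| lui | lui |
| b_type | beq |
| j | j |
| jr | jr |
| jal | jal |
| jalr | jalr |
| shamt | sll, sra, srl |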

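2. Functional Modules

First list the modules required by the lw instruction (lw passes through the most modules). Then check, one instruction at a time, whether the existing modules support each remaining instruction; where they do not, add the corresponding module.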

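| module | input | output | description |
|---|---|---|---|
| PC | D (port name; likewise below) | Q | Program counter |
| ADD4 | PC | PC4 | Adds 4 |
| IM | IA | IR | Instruction memory |
| RF | A1, A2 | RD1, RD2 | Read side of the register file |
| EXT | I16, EXTOP | EXTD | Selects SIMM, LIMM, or UIMM as its output |
| CMP | D1, D2 | RES | Compares two values to decide whether to branch |
| NPC | PC4, I26, NPCOP | NEXTPC | Computes BPC and JPC |
| ALU | A, B, SHAMT, ALUOP | AO | Performs the selected computation |
| DM | DA, WD, WM, RM | RD | Data memory; supports word writes and word reads |
| RF | A3, WD3, WR | | Write side of the register file |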

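Note: in the table above, SIMM is the 16-bit immediate sign-extended to 32 bits, and UIMM is the same immediate zero-extended. LIMM is designed specifically for lui: it moves the 16-bit immediate into the upper 16 bits of a 32-bit value and leaves the lower 16 bits zero. BPC is the branch target, computed from the instruction address (PC) by the branch address rule; JPC is the jump target, computed from the PC by the jump address rule.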
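To make these five quantities concrete, here is a small worked example written as a standalone testbench. The expressions mirror the EXT and NPC modules in the code section; the testbench name and the sample values (immediate 16'h8001, PC4 = 32'h00003004, and so on) are assumptions chosen only for illustration.

```verilog
// Worked example, not part of the CPU: names and sample values are assumptions.
module ext_npc_demo_tb;
    reg  [15:0] imm16;
    reg  [25:0] imm26;
    reg  [31:0] pc4;

    wire [31:0] simm = {{16{imm16[15]}}, imm16};            // sign extension
    wire [31:0] uimm = {16'b0, imm16};                      // zero extension
    wire [31:0] limm = {imm16, 16'b0};                      // lui: immediate in the upper half
    wire [31:0] bpc  = pc4 + {{14{imm16[15]}}, imm16, 2'b0}; // PC+4 plus the word offset times 4
    wire [31:0] jpc  = {pc4[31:28], imm26, 2'b0};            // top 4 bits of PC+4, then the target times 4

    initial begin
        pc4   = 32'h00003004;
        imm26 = 26'h000C00;
        imm16 = 16'h8001;
        #1;
        $display("SIMM=%h UIMM=%h LIMM=%h", simm, uimm, limm);
        // SIMM=ffff8001 UIMM=00008001 LIMM=80010000

        imm16 = 16'hFFFF;   // offset of -1 instruction
        #1;
        $display("BPC=%h JPC=%h", bpc, jpc);
        // BPC=00003000 JPC=00003000
        $finish;
    end
endmodule
```

With an offset of -1 the branch target is the branch instruction itself (32'h00003000), because the offset is counted in words relative to PC+4.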

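3. Datapath

With all the functional modules from the previous step in hand, consider how to connect them.

Write out the datapath of every instruction separately (that is, draw the wires between the functional modules), then merge these into the final datapath, which completes the connection of the modules.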

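| module | input | LD | STR | CAL_R | CAL_I | LUI | B_TYPE | J | JR | JAL | JALR | SHAMT |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| PC | | | | | | | | | | | | |
| ADD4 | PC | Q | Q | Q | Q | Q | Q | Q | Q | Q | Q | Q |
| IM | IA | Q | Q | Q | Q | Q | Q | Q | Q | Q | Q | Q |
| PC | D | PC4 | PC4 | PC4 | PC4 | PC4 | PC4 | PC4 | PC4 | PC4 | PC4 | PC4 |
| RF | A1 | IR[RS] | IR[RS] | IR[RS] | IR[RS] | | IR[RS] | | IR[RS] | | IR[RS] | |
| | A2 | | IR[RT] | IR[RT] | | | IR[RT] | | | | | IR[RT] |
| EXT | I16 | IR[I16] | IR[I16] | | IR[I16] | IR[I16] | | | | | | |
| CMP | D1 | | | | | | RD1 | | | | | |
| | D2 | | | | | | RD2 | | | | | |
| NPC | PC4 | | | | | | PC4 | PC4 | | PC4 | | |
| | I26 | | | | | | IR[I16] | IR[I26] | | IR[I26] | | |
| PC | D | | | | | | NEXTPC | NEXTPC | RD1 | NEXTPC | RD1 | |
| ALU | A | RD1 | RD1 | RD1 | RD1 | | | | | | | |
| | B | EXTD | EXTD | RD2 | EXTD | | | | | | | RD2 |
| | SHAMT | | | | | | | | | | | IR[SH] |
| DM | DA | AO | AO | | | | | | | | | |
| | WD | | RD2 | | | | | | | | | |
| RF | A3 | IR[RT] | | IR[RD] | IR[RT] | IR[RT] | | | | $31 | IR[RD] | IR[RD] |
| | WD3 | RD | | AO | AO | EXTD | | | | PC4 | PC4 | AO |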

Note: IR[RS] means the RS field of the data (the instruction) on the IR port.
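As a quick illustration of this field notation, the sketch below slices one instruction word into its fields. The bit ranges are the ones used by the `define lines in the code section; the testbench name and the sample word 32'h8D280004 (the encoding of lw $8, 4($9)) are assumptions picked for the example.

```verilog
// Illustration of IR[RS], IR[RT], ...: testbench name and sample word are assumptions.
module ir_fields_tb;
    reg  [31:0] ir;

    wire [5:0]  op    = ir[31:26];
    wire [4:0]  rs    = ir[25:21];
    wire [4:0]  rt    = ir[20:16];
    wire [4:0]  rd    = ir[15:11];
    wire [4:0]  sh    = ir[10:6];
    wire [5:0]  funct = ir[5:0];
    wire [15:0] i16   = ir[15:0];
    wire [25:0] i26   = ir[25:0];

    initial begin
        ir = 32'h8D280004;   // lw $8, 4($9)
        #1;
        $display("op=%b rs=%0d rt=%0d i16=%h", op, rs, rt, i16);
        // op=100011 rs=9 rt=8 i16=0004
        $finish;
    end
endmodule
```

The fields overlap on purpose: rd, sh and funct occupy the same bits as i16, and which interpretation applies depends on the instruction class.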

4. Controller Design

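OP and FUNCT are inputs; all other columns are outputs.

| INSTRUCTION | OP | FUNCT | PCSEL | EXTOP | NPCOP | ALUCTRL | ALUOP | BSEL | WM | RM | A3SEL | WD3SEL | WR |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| LW | 100011 | X | 0 | 0 | X | 0 | 0 (+) | 0 | 0 (no write) | 1 (read word) | 0 | 0 | 1 |
| SW | 101011 | X | 0 | 0 | X | 0 | 0 (+) | 0 | 1 (write word) | 0 | X | X | 0 |
| ADD | 000000 | 100000 | 0 | X | X | 15 | FUNCT | 1 | 0 | 0 | 1 | 1 | 1 |
| ADDU | 000000 | 100001 | 0 | X | X | 15 | FUNCT | 1 | 0 | 0 | 1 | 1 | 1 |
| SUB | 000000 | 100010 | 0 | X | X | 15 | FUNCT | 1 | 0 | 0 | 1 | 1 | 1 |
| SUBU | 000000 | 100011 | 0 | X | X | 15 | FUNCT | 1 | 0 | 0 | 1 | 1 | 1 |
| SHAMT | 000000 | FUNCT | 0 | X | X | 15 | FUNCT | 1 | 0 | 0 | 1 | 1 | 1 |
| ORI | 001101 | X | 0 | 2 | X | 2 | 2 | 0 | 0 | 0 | 0 | 1 | 1 |
| SLTI | 001010 | X | 0 | 0 | X | 3 | 3 | 0 | 0 | 0 | 0 | 1 | 1 |
| ADDIU | 001001 | X | 0 | 0 | X | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 |
| LUI | 001111 | X | 0 | 1 | X | X | X | X | 0 | 0 | 0 | 2 | 1 |
| BEQ | 000100 | X | RES | X | 0 | X | X | X | 0 | 0 | X | X | 0 |
| J | 000010 | X | 1 | X | 1 | X | X | X | 0 | 0 | X | X | 0 |
| JR | 000000 | 001000 | 2 | X | X | X | X | X | 0 | 0 | X | X | 0 |
| JAL | 000011 | X | 1 | X | 1 | X | X | X | 0 | 0 | 2 | 3 | 1 |
| JALR | 000000 | 001001 | 2 | X | X | X | X | X | 0 | 0 | 1 | 3 | 1 |
| NOP | 000000 | 000000 | 0 | X | X | X | X | X | 0 | 0 | X | X | 0 |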
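One way to keep code close to a table like this is to treat each row as a single packed control word. The sketch below does that for three rows (LW, SW, ORI). It is an alternative illustration rather than the decoder used in the code section. The module name CTRL_ROW_DEMO is an assumption, NPCOP is left out because it is a don't-care in these rows, and the don't-care cells are filled with 0.

```verilog
// Sketch: one table row = one 20-bit control word. Names are illustrative assumptions.
module CTRL_ROW_DEMO(
    input  [5:0] op,
    output [1:0] pcsel,
    output [1:0] extop,
    output [3:0] aluctrl,
    output       bsel,
    output [1:0] wm,
    output [2:0] rm,
    output [1:0] a3sel,
    output [2:0] wd3sel,
    output       wr
    );
    reg [19:0] row;   // 2+2+4+1+2+3+2+3+1 = 20 bits

    // Unpack the word into the individual control signals
    assign {pcsel, extop, aluctrl, bsel, wm, rm, a3sel, wd3sel, wr} = row;

    always @(*) begin
        case (op)
            //                pcsel extop aluctrl bsel  wm    rm    a3sel wd3sel wr
            6'b100011: row = {2'd0, 2'd0, 4'd0,   1'd0, 2'd0, 3'd1, 2'd0, 3'd0,  1'd1}; // lw
            6'b101011: row = {2'd0, 2'd0, 4'd0,   1'd0, 2'd1, 3'd0, 2'd0, 3'd0,  1'd0}; // sw
            6'b001101: row = {2'd0, 2'd2, 4'd2,   1'd0, 2'd0, 3'd0, 2'd0, 3'd1,  1'd1}; // ori
            default:   row = 20'd0;  // anything else: write neither memory nor registers
        endcase
    end
endmodule

module ctrl_row_demo_tb;
    reg  [5:0] op;
    wire [1:0] pcsel, extop, wm, a3sel;
    wire [3:0] aluctrl;
    wire [2:0] rm, wd3sel;
    wire       bsel, wr;

    CTRL_ROW_DEMO dut (
        .op(op), .pcsel(pcsel), .extop(extop), .aluctrl(aluctrl), .bsel(bsel),
        .wm(wm), .rm(rm), .a3sel(a3sel), .wd3sel(wd3sel), .wr(wr)
    );

    initial begin
        op = 6'b100011; #1;
        $display("lw : rm=%0d wm=%0d wr=%b", rm, wm, wr);   // lw : rm=1 wm=0 wr=1
        op = 6'b101011; #1;
        $display("sw : rm=%0d wm=%0d wr=%b", rm, wm, wr);   // sw : rm=0 wm=1 wr=0
        op = 6'b001101; #1;
        $display("ori: extop=%0d aluctrl=%0d wd3sel=%0d", extop, aluctrl, wd3sel);
        $finish;
    end
endmodule
```

Because every row assigns the whole word, no output is ever left unassigned, and a row in the code can be checked against the table column by column.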

III. Code

A single-cycle CPU written in Verilog:

module MFPC( // multiplexer that selects the next value of the PC
    input [31:0] pc4,
    input [31:0] nextpc,
    input [31:0] rd1,
    input [1:0] pcsel,
    output [31:0] d
    );
     assign d = (pcsel == 2'd0)? pc4:
                     (pcsel == 2'd1)? nextpc:
                     (pcsel == 2'd2)? rd1:
                     32'h00003000;

endmodule
module PC( //PC
    input clk,
    input reset,
    input enable,
    input [31:0] d,
    output [31:0] q
    );
     reg [31:0] pc;
     initial begin
         pc = 32'h00003000;
     end
     always@(posedge clk) begin
         if(enable) begin
             if(reset) begin
                 pc <= 32'h00003000;
             end
             else begin
                 pc <= d;
             end
         end
     end
     assign q = pc;

endmodule
module ADD4( // computes PC + 4
    input [31:0] pc,
    output [31:0] pc4
    );
     assign pc4 = pc + 4;

endmodule
module IM( // instruction memory
    input [31:0] ia,
     input reset,
     input clk,
    output [31:0] ir
    );
     parameter [31:0] num_instr = 1024;
     reg [31:0] im [num_instr-1:0];
     integer i;
     initial begin
         $readmemh("code.txt",im);
     end
     assign ir = im[ia[11:2]];
endmodule
module MFA3(  // multiplexer for the A3 (write address) port
    input [4:0] ir_rt,
    input [4:0] ir_rd,
    input [1:0] a3sel,
    output [4:0] a3
    );
     assign a3 = (a3sel == 0)? ir_rt:
                     (a3sel == 1)? ir_rd:
                     (a3sel == 2)? 5'd31:
                     5'd0;

endmodule
module MFWD3( // multiplexer for the WD3 (write data) port
    input [31:0] rd,
    input [31:0] ao,
    input [31:0] extd,
    input [31:0] pc4,
    input [2:0] wd3sel,
    output [31:0] wd3
    );
     assign wd3 = (wd3sel == 0)? rd:
                      (wd3sel == 1)? ao:
                      (wd3sel == 2)? extd:
                      (wd3sel == 3)? pc4:
                      0;

endmodule
module EXT( // immediate extender
    input [15:0] i16,
    input [1:0] extop,
    output [31:0] extd
    );
     assign extd = (extop == 0)? {{16{i16[15]}},i16}:
                        (extop == 1)? {i16,16'b0}:
                        (extop == 2)? {16'b0,i16}:
                        0;
endmodule

module CMP( // equality comparator used by beq
    input [31:0] d1,
    input [31:0] d2,
    output res
    );
     assign res = (d1 == d2);

endmodule
module NPC( // next-PC generator: branch target (npcop = 0) or jump target (npcop = 1)
    input [31:0] pc4,
    input [25:0] i26,
    input npcop,
    output [31:0] nextpc
    );
	 assign nextpc = (npcop == 0)? $signed(pc4) + $signed({{14{i26[15]}},i26[15:0],2'b0}): {pc4[31:28],i26,2'b0};

endmodule
module ALU( // arithmetic logic unit
    input [31:0] a,
    input [31:0] b,
    input [4:0] shamt,
    input [3:0] aluop,
    output [31:0] ao
    );
	 assign ao = (aluop == 0)? $signed(a) + $signed(b):
					 (aluop == 1)? $signed(a) - $signed(b):
					 (aluop == 2)? a | b:
					 (aluop == 3)? ($signed(a) < $signed(b)):
					 (aluop == 4)? (b << a[4:0]):
					 (aluop == 5)? $signed($signed(b) >>> a[4:0]):
					 (aluop == 6)? (b >> a[4:0]):
					 (aluop == 7)? (b << shamt):
					 (aluop == 8)? $signed($signed(b) >>> shamt):
					 (aluop == 9)? (b >> shamt):
					 32'b0;
	 
	

endmodule
module MFB( // multiplexer for the ALU B port
    input [31:0] extd,
    input [31:0] rd2,
    input bsel,
    output [31:0] b
    );
	 assign b = (bsel == 0)? extd:
					rd2;


endmodule
module RF( // register file
    input [4:0] a1,
    input [4:0] a2,
    input [4:0] a3,
    input [31:0] wd3,
    output [31:0] rd1,
    output [31:0] rd2,
    input reset,
    input clk,
    input wr
    );
	 reg [31:0] rf [31:1];
	 integer i;
	 initial begin
		 for(i = 1 ;i < 32; i = i + 1) begin
			 rf[i] = 32'b0;
		 end
	 end
	 always@(posedge clk) begin
		 if(reset) begin
			 for(i = 1;i < 32;i = i + 1) begin
				 rf[i] <= 32'b0; // nonblocking, like the write below
			 end
		 end
		 else if(wr && a3 != 5'b0) begin
			 rf[a3] <= wd3;
		 end
	 end
	 assign rd1 = (a1 == 5'd0)? 32'b0: 
					  rf[a1];
	 assign rd2 = (a2 == 5'd0)? 32'b0:
					  rf[a2];
	
endmodule
module DM( // data memory
    input [31:0] da,
    input [31:0] wd,
    input [1:0] wm,
    input [2:0] rm,
	 input clk,
	 input reset,
    output [31:0] rd
    );
	 parameter [31:0] num_data = 4096; // 1024words
	 reg [7:0] dm [num_data-1:0];
	 integer i;
	 wire [7:0] temp3,temp2,temp1,temp0;
	 initial begin
		 for(i = 0; i < num_data; i = i + 1)
		 dm[i] = 8'b0;
	 end
	 
	 always@(posedge clk) begin
		 if(reset) begin
			 for(i = 0; i < num_data; i = i + 1)
			 dm[i] <= 8'b0; // nonblocking, like the writes below
		 end
		 else if(wm != 0) begin
			 case(wm)
				 1: begin //sw
					 dm[da[11:0]] <= wd[7:0];
					 dm[da[11:0] + 12'd1] <= wd[15:8];
					 dm[da[11:0] + 12'd2] <= wd[23:16];
					 dm[da[11:0] + 12'd3] <= wd[31:24];
				 end
				 2: begin //sh
					 dm[da[11:0]] <= wd[7:0];
					 dm[da[11:0] + 12'd1] <= wd[15:8];
				 end
				 3: begin //sb
					 dm[da[11:0]] <= wd[7:0];
				 end
			 endcase
		 end
	 end
	 assign temp3 = dm[da[11:0] + 12'd3];
	 assign temp2 = dm[da[11:0] + 12'd2];
	 assign temp1 = dm[da[11:0] + 12'd1];
	 assign temp0 = dm[da[11:0]];
	 
	 assign rd = (rm == 1)? {temp3,temp2,temp1,temp0}: //lw
					 (rm == 2)? {16'b0,temp1,temp0}: //lhu
					 (rm == 3)? {{16{temp1[7]}},temp1,temp0}: //lh
					 (rm == 4)? {24'b0,temp0}: //lbu
					 (rm == 5)? {{24{temp0[7]}},temp0}: //lb
					 0;
					 
					 

endmodule
module MAINDECODER( // main decoder
    input [5:0] op,
	 input [5:0] funct,
	 input [4:0] ir_rd,
	 input res,
    output reg [1:0] pcsel,
    output reg [1:0] extop,
    output reg npcop,
    output reg [3:0] aluctrl,
    output reg bsel,
    output reg [1:0] wm,
    output reg [2:0] rm,
    output reg [1:0] a3sel,
    output reg [2:0] wd3sel,
    output reg wr
    );
	 
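	 always@(*) begin
		 // Defaults for every output, so that no branch below infers a latch
		 pcsel = 2'd0; extop = 2'd0; npcop = 1'd0; aluctrl = 4'd0; bsel = 1'd0;
		 wm = 2'd0; rm = 3'd0; a3sel = 2'd0; wd3sel = 3'd0; wr = 1'd0;
		 case(op)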
			 6'b0: begin
				 if(funct == 6'b001000) begin //jr
					 pcsel = 2'd2;
					 wm = 2'd0;
					 rm = 3'd0;
					 wr = 1'd0;
				 end
				 else if(funct == 6'b001001) begin //jalr
					 pcsel = 2'd2;
					 wm = 2'd0;
					 rm = 3'd0;
					 a3sel = 2'd1;
					 wd3sel = 3'd3;
					 wr = 1'd1;
				 end
				 else if(funct == 6'b0 && ir_rd == 5'b0) begin //nop
					 pcsel = 2'd0;
					 wm = 2'd0;
					 rm = 3'd0;
					 wr = 1'd0;
				 end
				 else begin //cal_r,shamt
					 pcsel = 2'd0;
					 aluctrl = 4'd15;
					 bsel = 1'd1;
					 wm = 2'd0;
					 rm = 3'd0;
					 a3sel = 2'd1;
					 wd3sel = 3'd1;
					 wr = 1'd1;
				 end
			 end
			 6'b100011: begin //lw
				 pcsel = 2'd0;
				 extop = 2'd0;
				 aluctrl = 4'd0;
				 bsel = 1'd0;
				 wm = 2'd0;
				 rm = 3'd1;
				 a3sel = 2'd0;
				 wd3sel = 3'd0;
				 wr = 1'd1;
			 end
			 6'b100101: begin //lhu
				 pcsel = 2'd0;
				 extop = 2'd0;
				 aluctrl = 4'd0;
				 bsel = 1'd0;
				 wm = 2'd0;
				 rm = 3'd2;
				 a3sel = 2'd0;
				 wd3sel = 3'd0;
				 wr = 1'd1;
			 end
			 6'b100001: begin //lh
				 pcsel = 2'd0;
				 extop = 2'd0;
				 aluctrl = 4'd0;
				 bsel = 1'd0;
				 wm = 2'd0;
				 rm = 3'd3;
				 a3sel = 2'd0;
				 wd3sel = 3'd0;
				 wr = 1'd1;
			 end
			 6'b100100: begin //lbu
				 pcsel = 2'd0;
				 extop = 2'd0;
				 aluctrl = 4'd0;
				 bsel = 1'd0;
				 wm = 2'd0;
				 rm = 3'd4;
				 a3sel = 2'd0;
				 wd3sel = 3'd0;
				 wr = 1'd1;
			 end
			 6'b100000: begin //lb
				 pcsel = 2'd0;
				 extop = 2'd0;
				 aluctrl = 4'd0;
				 bsel = 1'd0;
				 wm = 2'd0;
				 rm = 3'd5;
				 a3sel = 2'd0;
				 wd3sel = 3'd0;
				 wr = 1'd1;
			 end
			 6'b101011: begin //sw
				 pcsel = 2'd0;
				 extop = 2'd0;
				 aluctrl = 4'd0;
				 bsel = 1'd0;
				 wm = 2'd1;
				 rm = 3'd0;
				 wr = 1'd0;
			 end
			 6'b101001: begin //sh
				 pcsel = 2'd0;
				 extop = 2'd0;
				 aluctrl = 4'd0;
				 bsel = 1'd0;
				 wm = 2'd2;
				 rm = 3'd0;
				 wr = 1'd0;
			 end
			 6'b101000: begin //sb
				 pcsel = 2'd0;
				 extop = 2'd0;
				 aluctrl = 4'd0;
				 bsel = 1'd0;
				 wm = 2'd3;
				 rm = 3'd0;
				 wr = 1'd0;
			 end
			 6'b001111: begin //lui
				 pcsel = 2'd0;
				 extop = 2'd1;
				 wm = 2'd0;
				 rm = 3'd0;
				 a3sel = 2'd0;
				 wd3sel = 3'd2;
				 wr = 1'd1;
			 end
			 6'b000100: begin //beq
				 pcsel = (res == 0)? 2'd0: 2'd1;
				 npcop = 1'd0;
				 wm = 2'd0;
				 rm = 3'd0;
				 wr = 1'd0;
			 end
			 6'b000010: begin //j
				 pcsel = 2'd1;
				 npcop = 1'd1;
				 wm = 2'd0;
				 rm = 3'd0;
				 wr = 1'd0;
			 end
			 6'b000011: begin //jal
				 pcsel = 2'd1;
				 npcop = 1'd1;
				 wm = 2'd0;
				 rm = 3'd0;
				 a3sel = 2'd2;
				 wd3sel = 3'd3;
				 wr = 1'd1;
			 end
			 6'b001101: begin //ori
				 pcsel = 2'd0;
				 extop = 2'd2;
				 aluctrl = 4'd2;
				 bsel = 1'd0;
				 wm = 2'd0;
				 rm = 3'd0;
				 a3sel = 2'd0;
				 wd3sel = 3'd1;
				 wr = 1'd1;
			 end
			 6'b001010: begin //slti
				 pcsel = 2'd0;
				 extop = 2'd0;
				 aluctrl = 4'd3;
				 bsel = 1'd0;
				 wm = 2'd0;
				 rm = 3'd0;
				 a3sel = 2'd0;
				 wd3sel = 3'd1;
				 wr = 1'd1;
			 end
			 6'b001001: begin //addiu
				 pcsel = 2'd0;
				 extop = 2'd0;
				 aluctrl = 4'd0;
				 bsel = 1'd0;
				 wm = 2'd0;
				 rm = 3'd0;
				 a3sel = 2'd0;
				 wd3sel = 3'd1;
				 wr = 1'd1;
			 end
		 endcase
	 end


endmodule
module ALUDECODER( // ALU decoder
    input [3:0] aluctrl,
    input [5:0] funct,
    output reg [3:0] aluop
    );
	 
	 always@(*) begin
		  if(aluctrl != 4'd15) begin
			  aluop = aluctrl; // the main decoder chose the operation directly
		  end
		  else begin
			 case(funct) // R-type: decode the funct field
				 6'b100001: aluop = 0;//addu
				 6'b100011: aluop = 1;//subu
				 6'b101010: aluop = 3;//slt
				 6'b000100: aluop = 4;//sllv
				 6'b000111: aluop = 5;//srav
				 6'b000110: aluop = 6;//srlv
				 6'b000000: aluop = 7;//sll
				 6'b000011: aluop = 8;//sra
				 6'b000010: aluop = 9;//srl
				 default: aluop = 0; // unlisted funct: fall back to add instead of inferring a latch
			 endcase
		 end
	 end
	 

endmodule
module DECODER( // decoder: wraps the main decoder and the ALU decoder
    input [5:0] op,
    input [5:0] funct,
	 input [4:0] ir_rd,
    input res,
    output [1:0] pcsel,
    output [1:0] extop,
    output npcop,
    output [3:0] aluop,
    output bsel,
    output [1:0] wm,
    output [2:0] rm,
    output [1:0] a3sel,
    output [2:0] wd3sel,
    output wr
    );
	 wire [3:0] aluctrl;
	 
	 MAINDECODER maindecoder (
    .op(op), 
    .funct(funct), 
    .res(res), 
	 .ir_rd(ir_rd),
    .pcsel(pcsel), 
    .extop(extop), 
    .npcop(npcop), 
    .aluctrl(aluctrl), 
    .bsel(bsel), 
    .wm(wm), 
    .rm(rm), 
    .a3sel(a3sel), 
    .wd3sel(wd3sel), 
    .wr(wr)
    );
	 
	 ALUDECODER aludecoder (
    .aluctrl(aluctrl), 
    .funct(funct), 
    .aluop(aluop)
    );
	 

endmodule
`define rs 25:21
`define rt 20:16
`define rd 15:11
`define op 31:26
`define i16 15:0
`define sh 10:6
`define i26 25:0
`define funct 5:0

module mips( // top-level CPU
    input clk,
    input reset
    );
	 parameter enable = 1'd1;
	 wire [31:0] d,q,pc4,ir,wd3,rd1,rd2,extd,nextpc,ao,b,rd;
	 wire [4:0] a3;
	 wire res;
	 wire [1:0] pcsel;
    wire [1:0] extop;
    wire npcop;
    wire [3:0] aluop;
    wire bsel;
    wire [1:0] wm;
    wire [2:0] rm;
    wire [1:0] a3sel;
    wire [2:0] wd3sel;
    wire wr;
	 
	 MFPC mfpc (
    .pc4(pc4), 
    .nextpc(nextpc), 
    .rd1(rd1), 
    .pcsel(pcsel), 
    .d(d)
    );
	 
	 PC pc (
    .clk(clk), 
    .reset(reset), 
    .enable(enable), 
    .d(d), 
    .q(q)
    );
	 
	 ADD4 add4 (
    .pc(q), 
    .pc4(pc4)
    );
	 
	 IM im (
    .ia(q), 
    .reset(reset), 
    .clk(clk), 
    .ir(ir)
    );
	 
	 MFA3 mfa3 (
    .ir_rt(ir[`rt]), 
    .ir_rd(ir[`rd]), 
    .a3sel(a3sel), 
    .a3(a3)
    );
	 
	 MFWD3 mfwd3 (
    .rd(rd), 
    .ao(ao), 
    .extd(extd), 
    .pc4(pc4), 
    .wd3sel(wd3sel), 
    .wd3(wd3)
    );
	 
	 EXT ext (
    .i16(ir[`i16]), 
    .extop(extop), 
    .extd(extd)
    );
	 
	 CMP cmp (
    .d1(rd1), 
    .d2(rd2), 
    .res(res)
    );
	 
	 NPC npc (
    .pc4(pc4), 
    .i26(ir[`i26]), 
    .npcop(npcop), 
    .nextpc(nextpc)
    );
	 
	 ALU alu (
    .a(rd1), 
    .b(b), 
    .shamt(ir[`sh]), 
    .aluop(aluop), 
    .ao(ao)
    );
	 
	 MFB mfb (
    .extd(extd), 
    .rd2(rd2), 
    .bsel(bsel), 
    .b(b)
    );
	 
	 RF rf (
    .a1(ir[`rs]), 
    .a2(ir[`rt]), 
    .a3(a3), 
    .wd3(wd3), 
    .rd1(rd1), 
    .rd2(rd2), 
    .reset(reset), 
    .clk(clk), 
    .wr(wr)
    );
	 
	 DM dm (
    .da(ao), 
    .wd(rd2), 
    .wm(wm), 
    .rm(rm),
    .clk(clk), 
    .reset(reset), 
    .rd(rd)
    );
	 
	 DECODER decoder (
    .op(ir[`op]), 
    .funct(ir[`funct]), 
	 .ir_rd(ir[`rd]),
    .res(res), 
    .pcsel(pcsel), 
    .extop(extop), 
    .npcop(npcop), 
    .aluop(aluop), 
    .bsel(bsel), 
    .wm(wm),
	 .rm(rm),
    .a3sel(a3sel), 
    .wd3sel(wd3sel), 
    .wr(wr)
    );

	 always@(posedge clk) begin
		 if(reset == 0) begin
		    if(wr == 1'd1)
			 $display("@%h: $%d <= %h", q, a3, wd3);
		    if(wm != 2'd0)
			 $display("@%h: *%h <= %h", q, ao, rd2);
		 end
		 
	 end


endmodule

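Note: this code is for reference only and supports only the instructions declared above.

IV. Author's Reflections

Implementing the CPU was the first programming task at university with a substantial amount of code. What it gave me most was experience in facing complex programming: the more complex the task, the more you should abstract first and fill in the details afterwards. The pipelined CPU that follows makes this even clearer; the tables (the design) are essential to coding a CPU.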

Source: https://blog.csdn.net/de_pool/article/details/118500669