Minimum clock period for Xilinx designs keeps varying as the input is changed

I have designed a single-cycle MIPS processor in VHDL using the Xilinx tools. The design is based on the theory in the Patterson and Hennessy book. After completing the design I ran a few assembly programs to check its functioning, and it gave the desired results. My problem is with the "TIMING SUMMARY" in the synthesis report (".syr" file). Every time I change the assembly code stored in the instruction memory (which is my ROM), the minimum clock period for the single-cycle processor changes, and I don't understand why.

Timing Summary:
---------------
Speed Grade: -4

   Minimum period: 17.561ns (Maximum Frequency: 56.945MHz)
   Minimum input arrival time before clock: No path found
   Maximum output required time after clock: 16.296ns
   Maximum combinational path delay: No path found

Timing Detail:
--------------
All values displayed in nanoseconds (ns)

=========================================================================
Timing constraint: Default period analysis for Clock 'clk'
  Clock period: 17.561ns (frequency: 56.945MHz)
  Total number of paths / destination ports: 6965792 / 616
-------------------------------------------------------------------------
Delay:               17.561ns (Levels of Logic = 22)
  Source:            MIPS_processor_unit/Datapath_comp/PC_reg/q_5_1 (FF)
  Destination:       MIPS_processor_unit/Datapath_comp/RegF/memory_0_0 (FF)
  Source Clock:      clk rising
  Destination Clock: clk rising

  Data Path: MIPS_processor_unit/Datapath_comp/PC_reg/q_5_1 to MIPS_processor_unit/Datapath_comp/RegF/memory_0_0
                                Gate     Net
    Cell:in->out      fanout   Delay   Delay  Logical Name (Net Name)
    ----------------------------------------  ------------
     FDCE:C->Q             2   0.591   0.622  MIPS_processor_unit/Datapath_comp/PC_reg/q_5_1 (MIPS_processor_unit/Datapath_comp/PC_reg/q_5_1)
     LUT2_L:I0->LO         1   0.704   0.104  Instruction_memory_unit/Mrom_Instruction_out391220_SW0 (N1361)
     LUT4:I3->O            3   0.704   0.535  Instruction_memory_unit/Mrom_Instruction_out391236_SW0 (N141)
     LUT4:I3->O           17   0.704   1.051  Instruction_memory_unit/Mrom_Instruction_out391236 (Instruction_tl_s)
     MUXF5:S->O            2   0.739   0.526  MIPS_processor_unit/Datapath_comp/RegF/mux8_8_f5 (MIPS_processor_unit/Datapath_comp/RegF/mux8_8_f5)
     LUT4:I1->O            1   0.704   0.000  MIPS_processor_unit/Datapath_comp/ALUSrc_mux/y1_F (N276)
     MUXF5:I0->O           3   0.321   0.610  MIPS_processor_unit/Datapath_comp/ALUSrc_mux/y1 (MIPS_processor_unit/Datapath_comp/ALU_2nd_input_s)
     LUT2:I1->O            1   0.704   0.000  MIPS_processor_unit/Datapath_comp/ALU_comp/Msub_y_sig_addsub0001_lut (MIPS_processor_unit/Datapath_comp/ALU_comp/Msub_y_sig_addsub0001_lut)
     MUXCY:S->O            1   0.464   0.000  MIPS_processor_unit/Datapath_comp/ALU_comp/Msub_y_sig_addsub0001_cy (MIPS_processor_unit/Datapath_comp/ALU_comp/Msub_y_sig_addsub0001_cy)
     MUXCY:CI->O           1   0.059   0.000  MIPS_processor_unit/Datapath_comp/ALU_comp/Msub_y_sig_addsub0001_cy (MIPS_processor_unit/Datapath_comp/ALU_comp/Msub_y_sig_addsub0001_cy)
     MUXCY:CI->O           1   0.059   0.000  MIPS_processor_unit/Datapath_comp/ALU_comp/Msub_y_sig_addsub0001_cy (MIPS_processor_unit/Datapath_comp/ALU_comp/Msub_y_sig_addsub0001_cy)
     MUXCY:CI->O           1   0.059   0.000  MIPS_processor_unit/Datapath_comp/ALU_comp/Msub_y_sig_addsub0001_cy (MIPS_processor_unit/Datapath_comp/ALU_comp/Msub_y_sig_addsub0001_cy)
     MUXCY:CI->O           1   0.059   0.000  MIPS_processor_unit/Datapath_comp/ALU_comp/Msub_y_sig_addsub0001_cy (MIPS_processor_unit/Datapath_comp/ALU_comp/Msub_y_sig_addsub0001_cy)
     MUXCY:CI->O           1   0.059   0.000  MIPS_processor_unit/Datapath_comp/ALU_comp/Msub_y_sig_addsub0001_cy (MIPS_processor_unit/Datapath_comp/ALU_comp/Msub_y_sig_addsub0001_cy)
     MUXCY:CI->O           0   0.059   0.000  MIPS_processor_unit/Datapath_comp/ALU_comp/Msub_y_sig_addsub0001_cy (MIPS_processor_unit/Datapath_comp/ALU_comp/Msub_y_sig_addsub0001_cy)
     XORCY:CI->O           1   0.804   0.424  MIPS_processor_unit/Datapath_comp/ALU_comp/Msub_y_sig_addsub0001_xor (MIPS_processor_unit/Datapath_comp/ALU_comp/y_sig_addsub0001)
     LUT4:I3->O            1   0.704   0.000  MIPS_processor_unit/Datapath_comp/ALU_comp/y_sig_mux0000_f5_G (N237)
     MUXF5:I1->O         259   0.321   1.334  MIPS_processor_unit/Datapath_comp/ALU_comp/y_sig_mux0000_f5 (Output_address_0_OBUF)
     RAM32X1S:A0->O        1   1.025   0.499  Data_memory_unit/Mram_data_mem1 (N10)
     LUT3:I1->O            1   0.704   0.000  inst_LPM_MUX_6 (inst_LPM_MUX_6)
     MUXF5:I0->O           1   0.321   0.000  inst_LPM_MUX_4_f5 (inst_LPM_MUX_4_f5)
     MUXF6:I0->O           1   0.521   0.455  inst_LPM_MUX_2_f6 (Read_data_tl_s)
     LUT3:I2->O            8   0.704   0.000  MIPS_processor_unit/Datapath_comp/WB_mux/y1 (MIPS_processor_unit/Datapath_comp/write_data_s)
     FDCE:D                    0.308          MIPS_processor_unit/Datapath_comp/RegF/memory_0_0
    ----------------------------------------
    Total                     17.561ns (11.401ns logic, 6.160ns route)
                                       (64.9% logic, 35.1% route)

=========================================================================


Timing Summary:
---------------
Speed Grade: -4

   Minimum period: 13.551ns (Maximum Frequency: 73.798MHz)
   Minimum input arrival time before clock: No path found
   Maximum output required time after clock: 14.466ns
   Maximum combinational path delay: No path found

Timing Detail:
--------------
All values displayed in nanoseconds (ns)

=========================================================================
Timing constraint: Default period analysis for Clock 'clk'
  Clock period: 13.551ns (frequency: 73.798MHz)
  Total number of paths / destination ports: 256927 / 278
-------------------------------------------------------------------------
Delay:               13.551ns (Levels of Logic = 13)
  Source:            MIPS_processor_unit/Datapath_comp/PC_reg/q_6 (FF)
  Destination:       MIPS_processor_unit/Datapath_comp/PC_reg/q_2 (FF)
  Source Clock:      clk rising
  Destination Clock: clk rising

  Data Path: MIPS_processor_unit/Datapath_comp/PC_reg/q_6 to MIPS_processor_unit/Datapath_comp/PC_reg/q_2
                                Gate     Net
    Cell:in->out      fanout   Delay   Delay  Logical Name (Net Name)
    ----------------------------------------  ------------
     FDCE:C->Q            71   0.591   1.354  MIPS_processor_unit/Datapath_comp/PC_reg/q_6 (MIPS_processor_unit/Datapath_comp/PC_reg/q_6)
     LUT3_D:I1->O          8   0.704   0.761  Instruction_memory_unit/Mrom_Instruction_out4711110 (N91)
     LUT4:I3->O           17   0.704   1.051  Instruction_memory_unit/Mrom_Instruction_out43111_2 (Instruction_memory_unit/Mrom_Instruction_out43111_1)
     MUXF5:S->O            1   0.739   0.000  MIPS_processor_unit/Datapath_comp/RegF/mux3_7_f5_0 (MIPS_processor_unit/Datapath_comp/RegF/mux3_7_f51)
     MUXF6:I0->O           1   0.521   0.424  MIPS_processor_unit/Datapath_comp/RegF/mux3_5_f6_0 (MIPS_processor_unit/Datapath_comp/RegF/mux3_5_f61)
     LUT4:I3->O            1   0.704   0.424  MIPS_processor_unit/Datapath_comp/RegF/read_data_11 (MIPS_processor_unit/Datapath_comp/read_data_1_s)
     LUT4:I3->O            1   0.704   0.000  MIPS_processor_unit/Datapath_comp/ALU_comp/Maddsub_y_sig_addsub0000_lut (MIPS_processor_unit/Datapath_comp/ALU_comp/Maddsub_y_sig_addsub0000_lut)
     MUXCY:S->O            1   0.464   0.000  MIPS_processor_unit/Datapath_comp/ALU_comp/Maddsub_y_sig_addsub0000_cy (MIPS_processor_unit/Datapath_comp/ALU_comp/Maddsub_y_sig_addsub0000_cy)
     MUXCY:CI->O           1   0.059   0.000  MIPS_processor_unit/Datapath_comp/ALU_comp/Maddsub_y_sig_addsub0000_cy (MIPS_processor_unit/Datapath_comp/ALU_comp/Maddsub_y_sig_addsub0000_cy)
     MUXCY:CI->O           1   0.059   0.000  MIPS_processor_unit/Datapath_comp/ALU_comp/Maddsub_y_sig_addsub0000_cy (MIPS_processor_unit/Datapath_comp/ALU_comp/Maddsub_y_sig_addsub0000_cy)
     MUXCY:CI->O           0   0.059   0.000  MIPS_processor_unit/Datapath_comp/ALU_comp/Maddsub_y_sig_addsub0000_cy (MIPS_processor_unit/Datapath_comp/ALU_comp/Maddsub_y_sig_addsub0000_cy)
     XORCY:CI->O          18   0.804   1.072  MIPS_processor_unit/Datapath_comp/ALU_comp/Maddsub_y_sig_addsub0000_xor (MIPS_processor_unit/Datapath_comp/write_data_s)
     LUT4_D:I3->O          5   0.704   0.637  MIPS_processor_unit/Controller_comp/PCSrc9 (MIPS_processor_unit/Controller_comp/PCSrc9)
     LUT4:I3->O            1   0.704   0.000  MIPS_processor_unit/Datapath_comp/Jump_mux/y1 (MIPS_processor_unit/Datapath_comp/Next_PC_1_s)
     FDCE:D                    0.308          MIPS_processor_unit/Datapath_comp/PC_reg/q_6
    ----------------------------------------
    Total                     13.551ns (7.828ns logic, 5.723ns route)
                                       (57.8% logic, 42.2% route)

=========================================================================

As can be seen, I gave my Instruction_memory_unit two different assembly programs, and the minimum period for the single-cycle processor changed. These are my questions:

1) Every time I change my assembly code, does Xilinx evaluate the critical path on the basis of the instructions in that code? If yes, how do I get a general minimum period for my design?

2) RegF is my register file, which is basically a RAM containing the 32 registers of a MIPS processor. What I can't understand is why the 'Gate Delay + Net Delay' through it differs between the two timing reports. Theoretically, shouldn't the register file, being a memory, have a fixed read time?

Tapojyoti Mandal asked Nov 27 '25 12:11


2 Answers

It may be synthesising your ROM down into gates, LUTs or SRL16s. Check the device usage (just before the timing report in the .syr file) to see whether it is using block memory for the ROM; it may not be.

In fact that does appear to be the problem, according to the timing report: there are a lot of LUTs in there and no sign of a BRAM.
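As a rough way to check this from the report itself, here is a small Python sketch that counts the primitive types on the rows belonging to the instruction memory. The rows are copied from the first report in the question; the helper name `cells_in` and the assumption that block RAM primitives show up with names starting with `RAMB` are mine, so treat it as illustrative only.

```python
from collections import Counter

# A few data-path rows copied from the first timing report in the question.
PATH_ROWS = """\
FDCE:C->Q             2   0.591   0.622  MIPS_processor_unit/Datapath_comp/PC_reg/q_5_1
LUT2_L:I0->LO         1   0.704   0.104  Instruction_memory_unit/Mrom_Instruction_out391220_SW0 (N1361)
LUT4:I3->O            3   0.704   0.535  Instruction_memory_unit/Mrom_Instruction_out391236_SW0 (N141)
LUT4:I3->O           17   0.704   1.051  Instruction_memory_unit/Mrom_Instruction_out391236 (Instruction_tl_s)
MUXF5:S->O            2   0.739   0.526  MIPS_processor_unit/Datapath_comp/RegF/mux8_8_f5
"""

def cells_in(rows, instance_prefix):
    """Count primitive types on rows whose logical name starts with instance_prefix."""
    counts = Counter()
    for row in rows.splitlines():
        fields = row.split()
        if len(fields) < 5:
            continue  # not a full "cell fanout gate net name" row
        cell = fields[0].split(":")[0]  # "LUT4:I3->O" -> "LUT4"
        logical_name = fields[4]
        if logical_name.startswith(instance_prefix):
            counts[cell] += 1
    return counts

rom_cells = cells_in(PATH_ROWS, "Instruction_memory_unit/")
# Assumption: a block-RAM implementation would appear as a RAMB* primitive.
uses_block_ram = any(cell.startswith("RAMB") for cell in rom_cells)

print(dict(rom_cells))
print("block RAM on the ROM path:", uses_block_ram)
```

On these rows the instruction memory contributes only LUT primitives, which is what you would expect if the ROM has been flattened into logic.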

If that's the problem, look up the "rom_style" and "ram_style" attributes in the Xilinx constraints guide (a value of "block" should request block RAM, but I may have the spelling/syntax slightly wrong). If you apply that to the array containing your ROM you may be able to overcome this. Once the data is in a real memory, timings should be more stable.

Note that block RAMs are synchronous: you present the address in one clock cycle and get the contents a cycle later. If that doesn't fit your pipeline model, you will have to rethink it so that synthesis can implement the ROM in block memory.
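To make the one-cycle latency concrete, here is a minimal behavioural sketch in Python (not VHDL, and not a model of any particular device). The ROM words are arbitrary placeholder values and the names `async_read` and `SyncRom` are made up for the illustration.

```python
# Arbitrary placeholder words; they do not correspond to the asker's program.
ROM = [0x11111111, 0x22222222, 0x33333333, 0x44444444]

def async_read(rom, addr):
    """Combinational (LUT-style) ROM: data is valid in the same cycle as the address."""
    return rom[addr]

class SyncRom:
    """Block-RAM-style ROM: the output only changes on a clock edge."""

    def __init__(self, rom):
        self.rom = rom
        self.data_out = None  # nothing valid before the first clock edge

    def clock(self, addr):
        """Rising edge: capture rom[addr]; it is visible during the next cycle."""
        self.data_out = self.rom[addr]

sync = SyncRom(ROM)
trace = []
for cycle, pc in enumerate([0, 1, 2, 3]):
    # What the rest of the datapath sees during this cycle.
    trace.append((cycle, pc, async_read(ROM, pc), sync.data_out))
    sync.clock(pc)  # edge at the end of the cycle captures the current address

for cycle, pc, async_data, sync_data in trace:
    shown = "None" if sync_data is None else hex(sync_data)
    print(f"cycle {cycle}: pc={pc} async={hex(async_data)} sync={shown}")
```

In every cycle the synchronous output is the word for the previous cycle's address, which is why a strictly single-cycle fetch does not map directly onto block RAM.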

Every time you implement your design, even without any logic changes, the timing results may be different. In some cases, where there is routing congestion, many levels of logic in many paths, or many "difficult" paths, you may encounter wildly different results from run to run.

As an experiment, change nothing in your design and run implementation 2 or 3 times. I bet you will get at least some variation in the runs.

There are some handles that you can play with to minimize this variability, but I don't recommend it (for example: using a fixed seed to the implementation process). Likely something else is going on here.

Other possible factors:

  1. Are all of your IO fixed to specific IO locations? If not, the tools could be randomly selecting IO pins for the IO of your design, which will greatly affect timing.
  2. Have you tried placing constraints on your design (on your clock, for example)? This will indicate to the tools "how hard they should try" in order to improve your design to meet a certain goal. If you have some performance in mind (e.g. 66MHz, 100MHz... etc) you can provide that as a constraint to the tools and they will attempt to meet that constraint.
  3. Look into how your ROM/RAM is actually implemented. The tools may be taking liberties to make optimizations depending on the contents of the ROM, which may be simplifying the design in some cases (based on the contents). In short, it may be implementing your design as LUTs instead of a RAM-type primitive. This could be helping you out, and it might just be an artifact of how you have things implemented at this time (coding style, resets, etc). If in the future the ROM becomes more generic (e.g. some run-time loading process), the tools won't be able to take the same optimization liberties and will have similar performance run-to-run.
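To illustrate point 3, here is a toy Python sketch of why a ROM built from logic can shrink or grow with its contents. Each output bit of a constant ROM is just a Boolean function of the address bits; if a bit is the same in every word it needs no logic at all, and otherwise it only needs the address bits it really depends on. The two 4-word, 8-bit "programs" are invented values, and this is only a sketch of the idea, not how XST actually optimizes.

```python
def output_bit(rom, addr, bit):
    """Value of one output bit of the ROM at a given address."""
    return (rom[addr] >> bit) & 1

def support(rom, bit, addr_bits):
    """Address bits that the given output bit actually depends on."""
    deps = set()
    for a in range(addr_bits):
        for addr in range(len(rom)):
            # Flipping address bit `a` changes the output for some address.
            if output_bit(rom, addr, bit) != output_bit(rom, addr ^ (1 << a), bit):
                deps.add(a)
                break
    return deps

def logic_summary(rom, width, addr_bits):
    """Map each non-constant output bit to the sorted address bits it needs."""
    assert len(rom) == 1 << addr_bits, "ROM must be fully populated"
    summary = {}
    for bit in range(width):
        deps = support(rom, bit, addr_bits)
        if deps:  # constant bits need no logic and are left out
            summary[bit] = sorted(deps)
    return summary

# Hypothetical ROM contents, chosen only to show the effect.
PROGRAM_A = [0x10, 0x10, 0x11, 0x11]
PROGRAM_B = [0x12, 0x34, 0x56, 0x78]

print("program A:", logic_summary(PROGRAM_A, width=8, addr_bits=2))
print("program B:", logic_summary(PROGRAM_B, width=8, addr_bits=2))
```

Program A needs logic for a single output bit, while program B needs it for five, so the same "ROM" turns into different amounts of logic, and plausibly a different critical path, depending on what is stored in it.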

In summary, I don't think changing the contents of your ROM/RAM is the culprit in the timing changes you are seeing, but some other factor.

Josh answered Nov 30 '25 06:11