Chapter

Section

For each function synthesised, BOOG synthesis mostly uses the smallest cell and then the strongest drive strength. This causes a problem for the subsequent LOON synthesis. LOON optimises a netlist for speed and/or area, and does this mainly by increasing the size or drive strength of a cell. LOON is less likely to find the solution when it involves reducing a cell's drive strength.

This can happen when a critical path is loaded by a non-critical path that starts with a higher drive strength cell. This cell has a higher pin capacitance, so the critical path can be improved by replacing it with a cell that has a lower pin capacitance. However, LOON is unlikely to find this solution.

In the example, the critical path (in red) to the r 3 output shares gates rtlgen_0_ins and not_aux0_ins with the path to r 2. Gate not_aux0_ins is loaded by a 2-input NAND gate, na2p_y_2_ins, on the path to the non-critical output r 2.

Gate na2p_y_2_ins is a high drive x2 NAND gate which loads the critical path unnecessarily. If it is reduced to an x1 drive strength, the delay to r 2 is increased by 55 ps but the critical path to r 3 is reduced by 60 ps.

LOON however is unable to find this optimisation.
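The trade-off can be checked with a little shell arithmetic. This is only a sketch using the delays (in ps) from the timing table in this section; the variable names are illustrative.

```shell
# Delays in ps, taken from the timing table in this section.
r3_x2=3833   # r 3 delay with na2p_y_2_ins as the x2 cell na2p_y
r3_x1=3773   # r 3 delay with na2p_y_2_ins as the x1 cell na2_y
r2_x2=3457   # r 2 delay with the x2 cell
r2_x1=3512   # r 2 delay with the x1 cell

r3_gain=$(( r3_x2 - r3_x1 ))    # improvement on the critical path
r2_cost=$(( r2_x1 - r2_x2 ))    # penalty on the non-critical path
r2_margin=$(( r3_x1 - r2_x1 ))  # how much faster r 2 still is than r 3

echo "r3 gain: ${r3_gain} ps, r2 cost: ${r2_cost} ps, r2 margin: ${r2_margin} ps"
```

Even after downsizing, the path to r 2 remains well ahead of the path to r 3, so the 55 ps penalty does not create a new critical path.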

We will synthesise the initial netlist with BOOG using a restricted library that only contains the minimum drive strength for each function except the inverter.

The netlist created by BOOG is very dependent on the drive strength of the smallest inverter it finds. In the sclib as supplied by Alliance, an error in the area parameters of all inverters except the largest makes the largest inverter appear to be the smallest, which means it gets chosen by BOOG.

The solution proposed here is to create 4 directories each containing the minimum drive strength cells and either the x1, x2, x4 or x8 drive strength inverters. These directories will be called sclib100_0_min_x1, sclib100_0_min_x2, sclib100_0_min_x4 and sclib100_0_min_x8. We will try the synthesis with each of these directories and choose the best result.
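One way to build such a directory is sketched below. It assumes that each cell is represented by files named after the cell (for example na2_y.vbe) and that the list of minimum drive cells is supplied by the caller; the function name make_min_lib is hypothetical, and any other per-library files would need to be copied separately.

```shell
#!/bin/sh
# make_min_lib SRC DST INVERTER CELL...
# Copies the files of one inverter and of each listed minimum drive cell
# from the full library SRC into the restricted library DST.
make_min_lib() {
    src=$1; dst=$2; inv=$3
    shift 3
    mkdir -p "$dst" || return 1
    for cell in "$inv" "$@"; do
        # Copy every file named after the cell, whatever its extension.
        for f in "$src/$cell".*; do
            if [ -e "$f" ]; then
                cp "$f" "$dst/"
            fi
        done
    done
    return 0
}
```

For the x4 directory, the call would take sclib100_0 as SRC, sclib100_0_min_x4 as DST, ndrv_y as INVERTER and then the names of the minimum drive cells. The same call with ndrvp_y and sclib100_0_min_x8 gives the x8 directory.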

Again we synthesise across the whole range of BOOG and LOON opt levels and compare to the previous result. We do this for both synthesis flows to see which one is best, and for each we choose the fastest result from the 4 BOOG synthesis directories.

We set up 4 directories for BOOG synthesis which contain the minimum drive strengths for each function except inverters. The 4 directories each contain one inverter drive strength and we can run BOOG synthesis four times and choose the best results.
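The four runs could be scripted rather than typed by hand. The sketch below only prints the command sequence for each directory, reusing the boog and loon invocations from this section; the output names with a drive strength suffix (multi8_o_x4, multi8_x4 and so on) are an assumption, added so that the four results do not overwrite each other.

```shell
#!/bin/sh
# Print the BOOG then LOON command sequence for each restricted library.
# Pipe the output to "sh" to run it.
gen_sweep() {
    echo 'ALLIANCE_VBE=$ALLIANCE_MOS/vbe'
    for drive in x1 x2 x4 x8; do
        # BOOG maps onto the restricted library; LOON uses the full one.
        cat <<EOF
export MBK_TARGET_LIB=\$ALLIANCE_VBE/sclib100_0_min_$drive
boog -l loon_0000_300_4 multi8 multi8_o_$drive
export MBK_TARGET_LIB=\$ALLIANCE_VBE/sclib100_0
loon -l loon_1500_300_2 multi8_o_$drive multi8_$drive
EOF
    done
}

gen_sweep
```

Printing the commands first makes it easy to inspect the sequence before committing to four full synthesis runs.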

The example here uses the two step synthesis flow with the min_x4 library (inverter ndrv_y) for BOOG synthesis. This gives a critical path delay of 19754 ps, down 260 ps from the 20014 ps obtained previously.

A critical path (in red) loaded by a larger than required gate to a non-critical path. (Screen shot from XSCH.)

Not downsizing for critical path improvement (delays in ps)

                  na2p_y_2_ins at x2    na2p_y_2_ins at x1
Instance          Cell      Delay       Cell      Delay       Change

Critical path to r 3 output
rtlgen_0_ins      a2p_y     1037
not_aux0_ins      na2p_y    1635        na2p_y    1575
nao4_y_ins        nao4_y    2438        nao4_y    2378
r_3_ins           xr2_y     3833        xr2_y     3773        −60

Critical path to r 2 output
na2p_y_2_ins      na2p_y    2058        na2_y     2113
r_2_ins           xr2_y     3457        xr2_y     3512        +55

$ ALLIANCE_VBE=$ALLIANCE_MOS/vbe
$ MBK_TARGET_LIB=$ALLIANCE_VBE/sclib100_0_min_x8
$ boog -l loon_0000_300_4 multi8 multi8_o
$ MBK_TARGET_LIB=$ALLIANCE_VBE/sclib100_0
$ loon -l loon_0000_300_4 multi8_o multi8_1
$ loon -l loon_1500_300_0 multi8_1 multi8

The critical path delays are shown below. The fastest result of 20014 ps is obtained from a BOOG synthesis using the x8 drive inverter, ndrvp_y.

Critical Path Delay (ps)

                  BOOG opt level
LOON opt level    0        1        2        4
0                 28715    27701    27911    22397
1                 28113    27143    27352    22041
2                 23788    23715    23697    20014
4                 24413    22909    23689    20175

Colour coding key (BOOG inverter): n1, np1, ndrv, ndrvp
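Picking the fastest entry out of such a table is easy to automate. The following is a small sketch, assuming the delays have been collected into rows of the form "LOON level, then one delay per BOOG level 0 1 2 4"; the function name fastest is illustrative.

```shell
#!/bin/sh
# Read rows "LOON_LEVEL d0 d1 d2 d4" (delays in ps for BOOG opt levels
# 0 1 2 4) and report the smallest delay with the levels that produced it.
fastest() {
    awk 'BEGIN { split("0 1 2 4", boog, " ") }
         { for (i = 2; i <= NF; i++)
               if (best == "" || $i + 0 < best) {
                   best = $i + 0; l = $1; b = boog[i - 1]
               } }
         END { printf "%d ps at LOON %s, BOOG %s\n", best, l, b }'
}

fastest <<'EOF'
0 28715 27701 27911 22397
1 28113 27143 27352 22041
2 23788 23715 23697 20014
4 24413 22909 23689 20175
EOF
```

With the delays from the table above this reports 20014 ps at LOON opt level 2 and BOOG opt level 4.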

The difference in the critical paths compared to synthesising BOOG with the highest drive inverter is shown below.

Critical Path Delay Differences (ps)

                  BOOG opt level
LOON opt level    0        1        2        4
0                 −1885    −2337    −2127
1                 −1710    −1479    −1270
2                 −1703    −936     −954
4                 −1536    −1792    −1012    −128

When the fastest netlist comes from a BOOG synthesis that uses the min drive strength library with the largest inverter, ndrvp_y, then the performance is at least as good as using the full library (when BOOG chooses the ndrvp_y because it is wrongly coded with the smallest area).

In many cases though, the fastest netlist comes from a BOOG synthesis that uses a different min drive strength library, and in these cases there can be a significant performance improvement.

$ ALLIANCE_VBE=$ALLIANCE_MOS/vbe
$ MBK_TARGET_LIB=$ALLIANCE_VBE/sclib100_0_min_x4
$ boog -l loon_0000_300_4 multi8 multi8_o
$ MBK_TARGET_LIB=$ALLIANCE_VBE/sclib100_0
$ loon -l loon_1500_300_2 multi8_o multi8

 
The critical path delays are shown below. The fastest result of 19754 ps is obtained from a BOOG synthesis using the x4 drive inverter, ndrv_y.

Critical Path Delay (ps)

                  BOOG opt level
LOON opt level    0        1        2        4
0                 28715    27701    27911    22397
1                 28113    27143    27352    22055
2                 24093    24130    23775    19754
4                 23808    23454    23222    19824

Colour coding key (BOOG inverter): n1, np1, ndrv, ndrvp

The difference in the critical paths compared to synthesising BOOG with the highest drive inverter is shown below.

Critical Path Delay Differences (ps)

                  BOOG opt level
LOON opt level    0        1        2        4
0                 −1885    −2337    −2127
1                 −1719    −1479    −1270
2                 −1829    −520     −875     −566
4                 −1605    −1247    −1479    −508