The macro VBE files are added to the directory named by MBK_TARGET_LIB. LOON uses every VBE file in MBK_TARGET_LIB for its synthesis optimisation. The VST files are placed on the search path defined by MBK_CATA_LIB. For this example, MBK_CATA_LIB is

$ echo $MBK_CATA_LIB
.:$ALLIANCE_MOS/vbe/sclib100_0:$ALLIANCE_MOS/vbe/sclib_netlist

When LOON reads a netlist, it flattens the hierarchy down to the cells in the catalog. A macro that is not in the catalog is therefore flattened to its underlying netlist.

By catalog, we mean the cells listed in the file named by MBK_CATAL_NAME and found on the search path MBK_CATA_LIB.

$ echo $MBK_CATAL_NAME
CATAL
$ ls $ALLIANCE_MOS/vbe/sclib100_0 | grep "$MBK_CATAL_NAME"
CATAL

The default name is CATAL, and the file used is the one supplied with the library, which does not list any macros. This means that one LOON run can use a macro, and a successive LOON run reading that netlist then flattens the macro to its real underlying cells.
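The flattening rule can be illustrated with a small Python sketch. This is only a model of the behaviour described above, not LOON's implementation. The function name flatten_to_catalog, the netlists dictionary and the leaf cell names nand2_x2 and inv_x4 are placeholders I have made up for the example; only the macro name a2drv_y and its 2-NAND plus INV composition come from this paper.

```python
def flatten_to_catalog(cell, catalog, netlists):
    """Expand `cell` until every remaining cell is listed in the catalog."""
    if cell in catalog:
        return [cell]  # catalog cells are leaves and are kept as they are
    if cell not in netlists:
        raise KeyError(f"{cell} is not in the catalog and has no netlist")
    leaves = []
    for child in netlists[cell]:
        leaves.extend(flatten_to_catalog(child, catalog, netlists))
    return leaves


# Hypothetical netlist for the 2-AND macro: an x2 2-NAND driving an x4 INV.
netlists = {"a2drv_y": ["nand2_x2", "inv_x4"]}
library_catalog = {"nand2_x2", "inv_x4"}  # stands in for the supplied CATAL

# Macro absent from the catalog: it is flattened to its underlying cells.
flattened = flatten_to_catalog("a2drv_y", library_catalog, netlists)
# Macro name added to the catalog: it is kept as a single cell.
kept = flatten_to_catalog("a2drv_y", library_catalog | {"a2drv_y"}, netlists)

print(flattened)  # ['nand2_x2', 'inv_x4']
print(kept)       # ['a2drv_y']
```

The second call corresponds to adding the macro names to the catalog file so that the macros stay visible in the netlist.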

I have created ten macros for this paper. If we label as x2 the drive strength of a full size unfolded inverter, then the composition and drive strengths of the macros are

macro   drive  −−−− made from −−−−    name
2-AND   x4     x2 2-NAND  x4 INV      a2drv_y
2-OR    x4     x1 2-NOR   x4 INV      o2drv_y
BUFFER  x8     x2 INV     x8 INV      drvp_y
2-XOR   x4     x1 2-XNOR  x4 INV      xr2drv_y
2-XNOR  x4     x1 2-XOR   x4 INV      nxr2drv_y
AON22   x4     x1 AOI22   x4 INV      mx2drv_y
AON222  x2     x2 2-NAND  x2 3-NAND   mx3p_y
OAI21   x2     x1 2-OR    x2 2-NAND   noa3p_y
OAI31   x2     x1 3-OR    x2 2-NAND   noa4p_y
CARRY   x4     x1 CARRY   x4 BUFFER   crydrv_y

More macros can be created, but the process is slow and manual. Two of the macros, the 2-OR and the 2-XNOR, were not used in the netlist synthesis.

The principle is that every weak cell has a stronger macro available, with a higher Prop delay but a lower Ramp delay. As the load driven by the weak cell increases, its Ramp delay grows until the macro becomes the faster choice and is selected.
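A rough numerical sketch of this trade-off, in Python, is given here. It assumes a simple linear model in which the total delay is the Prop delay plus a Ramp coefficient multiplied by the load; that model, the function names and all of the numbers are illustrative assumptions and are not taken from the library files.

```python
def total_delay(prop, ramp, load):
    """Assumed linear model: fixed Prop delay plus a load-dependent Ramp delay."""
    return prop + ramp * load


def crossover_load(weak, macro):
    """Load at which the weak cell and the macro have equal total delay."""
    (prop_w, ramp_w), (prop_m, ramp_m) = weak, macro
    return (prop_m - prop_w) / (ramp_w - ramp_m)


# Hypothetical (Prop in ps, Ramp in ps per unit load) pairs.
weak = (300.0, 100.0)   # low Prop delay, high Ramp delay
macro = (500.0, 50.0)   # higher Prop delay, lower Ramp delay

print(crossover_load(weak, macro))  # 4.0
for load in (2, 6):
    d_weak = total_delay(*weak, load)
    d_macro = total_delay(*macro, load)
    choice = "macro" if d_macro < d_weak else "weak cell"
    print(load, d_weak, d_macro, choice)
```

Below the crossover load the weak cell is faster; above it the macro wins, which is the point at which the text expects the macro to be selected.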

We synthesise the 2-step and 3-step flows across the whole range of BOOG and LOON opt levels and compare to the results without macros, using the results from the best BOOG synthesis directory.

The critical path of the fastest netlist is shown below. Many of these cells are macros which have been flattened into their netlists. They can generally be recognised by their instance name which ends with "_f" or "_t".

#     x 2       2                  297
#  1  ndrvp_y  51  i->f     1145   848   not_x_2_ins             not_x_2                 
#  2  op2_y     2  i1->t    1920   775   o2_y_28_ins             o2_y_28_sig
#  3  mx3_y     3  i0->t    2998  1078   not_rtlcarry_12_3_ins_t not_rtlcarry_12_3
#  4  mx3_y     3  l0->t    4065  1067   not_rtlcarry_12_4_ins_t not_rtlcarry_12_4
#  5  cry_y     2  si->f    5042   977   rtlcarry_12_5_ins       rtlcarry_12_5
#  6  p1_y      4  i->t     6029   987   mbk_buf_rtlcarry_12_5_2 mbk_buf_rtlcarry_12_5_2
#  7  nxr2_y    2  i1->f    6775   746   nxr2_y_35_ins           nxr2_y_35_sig
#  8  mx3_y     3  i0->t    7853  1078   not_rtlcarry_6_6_ins_t  not_rtlcarry_6_6
#  9  cry_y     3  si->f    8987  1134   rtlcarry_6_7_ins        rtlcarry_6_7
# 10  p1_y      2  i->t     9699   712   mbk_buf_rtlcarry_6_7    mbk_buf_rtlcarry_6_7
# 11  nxr2_y    2  i1->f   10445   746   nxr2_y_12_ins           nxr2_y_12_sig
# 12  mx3_y     2  l1->t   11459  1014   cryb_y_3_ins_t          cryb_y_3_sig
# 13  mx3_y     2  i0->t   12473  1014   cryb_y_2_ins_t          cryb_y_2_sig
# 14  mx3_y     2  i0->t   13488  1015   cryb_y_ins_t            cryb_y_sig
# 15  mx3_y     2  i0->t   14529  1041   not_rtlcarry_0_11_ins_t not_rtlcarry_0_11
# 16  np1_y     1  i->f    14827   298   ndrvp_y_ins             ndrvp_y_sig
# 17  na2p_y    1  i0->f   15185   358   noa3_y_ins_f            noa3_y_sig
# 18  nao3_y    2  i1->f   15909   724   rtlcarry_0_13_ins       rtlcarry_0_13
# 19  na2_y     1  i0->f   16339   430   na2_y_73_ins            na2_y_73_sig
# 20  no2_y     1  i0->f   16905   566   no2_y_381_ins           no2_y_381_sig
# 21  nxr2_y    1  i1->f   17712   807   r_15_ins_f              r_15_ins_f
# 22  ndrv_y    0  i->f    18119   407   r_15_ins_t              r(15)
#     r 15
#
# 1st critical path is r 15 at 18119
# 2nd critical path is r 11 at 18008
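The suffix rule can be applied mechanically. The Python sketch below checks a handful of instance names copied from the listing above; the function name is my own, and since the rule only holds "generally" the result should be read as a heuristic rather than a definitive count.

```python
def is_flattened_macro_instance(name):
    """Heuristic from the text: flattened macro instances end in _f or _t."""
    return name.endswith(("_f", "_t"))


# A sample of instance names taken from the critical path listing.
instances = [
    "not_x_2_ins",
    "o2_y_28_ins",
    "not_rtlcarry_12_3_ins_t",
    "rtlcarry_12_5_ins",
    "noa3_y_ins_f",
    "r_15_ins_f",
    "r_15_ins_t",
]

flattened = [name for name in instances if is_flattened_macro_instance(name)]
print(len(flattened), "of", len(instances))  # 4 of 7
```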

If, instead of flattening the macros, we add their names to the catalog file, we can view the critical path with the macros included; they are the rows marked M in the first column. The buffer insertion by LOON is slightly different, which changes the timing slightly, but we can see how many macros occur along the critical path.

#     x 2       2                  297
#  1  ndrvp_y  51  i->f     1145   848   not_x_2_ins             not_x_2
#  2  o2_y      1  i1->t    1980   835   o2_y_47_ins             o2_y_47_sig
#  3  cry_y     2  pi->f    3104  1124   rtlcarry_12_3_ins       rtlcarry_12_3
#  4  p1_y      3  i->t     3993   889   mbk_buf_rtlcarry_12_3   mbk_buf_rtlcarry_12_3   
#  5  nxr2_y    1  i1->f    4739   746   nxr2_y_38_ins           nxr2_y_38_sig
M  6  cryb_y    1  si->t    5730   991   cryb_y_6_ins            cryb_y_6_sig
M  7  cryb_y    2  pi->t    6802  1072   not_rtlcarry_6_5_ins    not_rtlcarry_6_5
M  8  cryb_y    2  si->t    7856  1054   not_rtlcarry_6_6_ins    not_rtlcarry_6_6
M  9  cry_y     3  si->f    8956  1100   rtlcarry_6_7_ins        rtlcarry_6_7
# 10  p1_y      2  i->t     9668   712   mbk_buf_rtlcarry_6_7    mbk_buf_rtlcarry_6_7
# 11  nxr2_y    1  i1->f   10414   746   nxr2_y_12_ins           nxr2_y_12_sig
M 12  cryb_y    1  si->t   11405   991   cryb_y_3_ins            cryb_y_3_sig
M 13  cryb_y    1  pi->t   12413  1008   cryb_y_2_ins            cryb_y_2_sig
M 14  cryb_y    1  pi->t   13422  1009   cryb_y_ins              cryb_y_sig
M 15  cryb_y    2  pi->t   14457  1035   not_rtlcarry_0_11_ins   not_rtlcarry_0_11
# 16  np1_y     1  i->f    14755   298   ndrvp_y_ins             ndrvp_y_sig
M 17  noa3p_y   1  i2->f   15113   358   noa3_y_ins              noa3_y_sig
# 18  nao3_y    2  i1->f   15837   724   rtlcarry_0_13_ins       rtlcarry_0_13
# 19  na2_y     1  i0->f   16267   430   na2_y_73_ins            na2_y_73_sig
# 20  no2_y     1  i0->f   16833   566   no2_y_381_ins           no2_y_381_sig
M 21  xr2drv_y  0  i1->t   18047  1214   r_15_ins                r(15)
#     r 15
#
# 1st critical path is r 15 at 18047
# 2nd critical path is r 11 at 17977
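To put a number on how many macros lie along this path, the listing can be parsed with a few lines of Python. The sketch assumes that an M in the first column marks a macro row and that the seventh field is the delay of that stage in ps; the rows are copied from the listing above with the instance and signal names dropped.

```python
listing = """\
#  1  ndrvp_y  51  i->f     1145   848
#  2  o2_y      1  i1->t    1980   835
#  3  cry_y     2  pi->f    3104  1124
#  4  p1_y      3  i->t     3993   889
#  5  nxr2_y    1  i1->f    4739   746
M  6  cryb_y    1  si->t    5730   991
M  7  cryb_y    2  pi->t    6802  1072
M  8  cryb_y    2  si->t    7856  1054
M  9  cry_y     3  si->f    8956  1100
# 10  p1_y      2  i->t     9668   712
# 11  nxr2_y    1  i1->f   10414   746
M 12  cryb_y    1  si->t   11405   991
M 13  cryb_y    1  pi->t   12413  1008
M 14  cryb_y    1  pi->t   13422  1009
M 15  cryb_y    2  pi->t   14457  1035
# 16  np1_y     1  i->f    14755   298
M 17  noa3p_y   1  i2->f   15113   358
# 18  nao3_y    2  i1->f   15837   724
# 19  na2_y     1  i0->f   16267   430
# 20  no2_y     1  i0->f   16833   566
M 21  xr2drv_y  0  i1->t   18047  1214
"""

# Each row: marker, index, cell, fanout, arc, arrival time, stage delay.
rows = [line.split() for line in listing.splitlines()]
macro_rows = [fields for fields in rows if fields[0] == "M"]

macro_count = len(macro_rows)
macro_delay = sum(int(fields[6]) for fields in macro_rows)
total_delay = sum(int(fields[6]) for fields in rows)

print(macro_count, "of", len(rows), "stages are macros")
print(macro_delay, "of", total_delay, "ps is spent in macro stages")
```

Under those assumptions, 10 of the 21 stages are macros and they account for 9832 ps of the 17750 ps accumulated along the path.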

The 3-step flow is one BOOG run followed by two LOON runs:

$ ALLIANCE_VBE=$ALLIANCE_MOS/vbe
$ MBK_TARGET_LIB=$ALLIANCE_VBE/sclib100_0_min_x8
$ boog -l loon_0000_300_4 multi8 multi8_o
$ MBK_TARGET_LIB=$ALLIANCE_VBE/sclib100_0
$ loon -l loon_0000_300_4 multi8_o multi8_1
$ loon -l loon_1500_300_0 multi8_1 multi8

The critical path delays are shown below. The fastest result of 18361 ps is obtained from a BOOG synthesis using the x8 drive inverter, ndrvp_y.

Critical Path Delay (ps)
Opt level          BOOG                  colour coding
LOON       0       1       2       4
0      28715   27701   27911   20793     n1
1      25981   25696   25678   20451     np1
2      23539   23064   23015   18817     ndrv
4      23524   22607   23257   18361     ndrvp

The difference the macros make to the critical path delays is shown below.

Critical Path Delay Differences (ps)
Opt level          BOOG
LOON       0       1       2       4
0          0       0       0   −1604
1      −2132   −1447   −1465   −1590
2       −249    −651    −682   −1197
4       −889    −302    −432   −1814
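The two tables above can be combined to recover the delays without macros. The Python sketch below assumes that each difference is the delay with macros minus the delay without macros, so that a negative entry is a speed-up; the variable names are mine and the numbers are copied from the two tables.

```python
boog_levels = [0, 1, 2, 4]

# Rows are keyed by LOON opt level; columns follow boog_levels.
with_macros = {
    0: [28715, 27701, 27911, 20793],
    1: [25981, 25696, 25678, 20451],
    2: [23539, 23064, 23015, 18817],
    4: [23524, 22607, 23257, 18361],
}
difference = {
    0: [0, 0, 0, -1604],
    1: [-2132, -1447, -1465, -1590],
    2: [-249, -651, -682, -1197],
    4: [-889, -302, -432, -1814],
}

# Assumed relation: difference = with_macros - without_macros.
without_macros = {
    loon: [w - d for w, d in zip(with_macros[loon], difference[loon])]
    for loon in with_macros
}

best_with = min(min(row) for row in with_macros.values())
best_without = min(min(row) for row in without_macros.values())

for loon, row in without_macros.items():
    print(loon, row)
print("best with macros:", best_with)        # 18361
print("best without macros:", best_without)  # 20014
```

On that assumption the best result without macros is 20014 ps, at LOON opt level 2 and BOOG opt level 4, against 18361 ps with macros.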

There is no change in the netlists for BOOG opt levels 0-2 with LOON opt level 0: BOOG does not select the cryb_y non-inverting carry macro, and LOON does not select any macros because they are larger. In every other case the BOOG and LOON macros give a general critical path speed-up.

The 2-step flow is one BOOG run followed by a single LOON run:

$ ALLIANCE_VBE=$ALLIANCE_MOS/vbe
$ MBK_TARGET_LIB=$ALLIANCE_VBE/sclib100_0_min_x8
$ boog -l loon_0000_300_4 multi8 multi8_o
$ MBK_TARGET_LIB=$ALLIANCE_VBE/sclib100_0
$ loon -l loon_1500_300_2 multi8_o multi8

The critical path delays are shown below. The fastest result of 18119 ps is obtained from a BOOG synthesis using the x8 drive inverter, ndrvp_y, and is 12 ps faster than a synthesis using only the BOOG macro cryb_y.

Critical Path Delay (ps)
Opt level          BOOG                  colour coding
LOON       0       1       2       4
0      28715   27701   27911   20793     n1
1      25981   25671   25678   20451     np1
2      23790   22637   22643   18119     ndrv
4      23353   22515   23005   18239     ndrvp

The difference in the critical paths from the presence of the macros is shown below.

Critical Path Delay Differences (ps)
Opt level          BOOG
LOON       0       1       2       4
0          0       0       0   −1604
1      −2132   −1472   −1674   −1604
2       −303   −1493   −1132   −1635
4       −455    −939    −217   −1585
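The two flows can also be compared with each other, cell by cell. The Python sketch below subtracts the critical path delays of the flow with two LOON passes from those of the flow with one LOON pass, using the two delay tables in this section; the variable names are mine and the comparison is only simple arithmetic on the published numbers.

```python
from collections import Counter

# Rows are keyed by LOON opt level; columns are BOOG opt levels 0, 1, 2, 4.
two_loon_passes = {
    0: [28715, 27701, 27911, 20793],
    1: [25981, 25696, 25678, 20451],
    2: [23539, 23064, 23015, 18817],
    4: [23524, 22607, 23257, 18361],
}
one_loon_pass = {
    0: [28715, 27701, 27911, 20793],
    1: [25981, 25671, 25678, 20451],
    2: [23790, 22637, 22643, 18119],
    4: [23353, 22515, 23005, 18239],
}

# Negative entries mean the single LOON pass gave the shorter critical path.
delta = {
    loon: [a - b for a, b in zip(one_loon_pass[loon], two_loon_passes[loon])]
    for loon in one_loon_pass
}

tally = Counter()
for row in delta.values():
    for d in row:
        tally["faster" if d < 0 else "slower" if d > 0 else "equal"] += 1

best_gap = min(min(r) for r in one_loon_pass.values()) - min(
    min(r) for r in two_loon_passes.values()
)

for loon, row in delta.items():
    print(loon, row)
print(dict(tally))
print("best-to-best gap (ps):", best_gap)  # -242
```

With these tables the single LOON pass is faster in 8 of the 16 cells, equal in 7 and slower in 1, and its best result is 242 ps ahead.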