Patent-6


6. Inserting buffers

Initial circuit

Scope of patent claims and buffer insertion

The patent discusses buffer insertion, but only in the context of reducing area. This is also covered by claims 10 and 11 of the patent:
"10. The automated method of claim 1 further comprising: inserting a buffer in one of the plurality of wires of step (a) to reduce an area of a cell coupled to the wire.
11. The automated method of claim 10 wherein the step of inserting the buffer comprises: determining a wire in the digital circuit of step (a) into which the buffer can be inserted to reduce the area of the cell coupled to the wire by a specific area size; and inserting the buffer into the wire if the area of the buffer is less than the specific area size, prior to the step (b) of determining the initial intended area of each of the selected plurality of cells."

This buffer insertion is done before the gates are retimed by compression or stretching. It applies to large-fanout nodes, which are not present in the 4-bit adder.

Fixed timing after buffer insertion on inputs a(2) and b(2)

To meet the design requirement of a 35fF maximum input capacitance for the 4-bit adder, buffers must be inserted on those inputs whose capacitance is too high. Unlike the buffer insertion above, these inputs are identified only after the gates have been retimed and an accurate estimate of the input capacitance has been made. The buffers are then inserted in the circuit with fixed delays, and the gates are retimed a second time.

Step	Action	Figure
1	constant timing with stage effort f=3.6	Fig 4a
2	retime gates to meet critical path and reduce area	Fig 5a
3	map to library timing for real capacitances	Fig 5c
4	insert buffers on heavily loaded inputs of the Fig 4a circuit	Fig 6a
5	retime gates to meet critical path including input buffer delay	Fig 6b
6	map to vsclib and calculate library timing	Fig 6c
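Step 4 of this flow depends on finding the inputs whose capacitance exceeds the 35fF limit. A minimal sketch of that check follows. The function name, the pin names and the capacitance values are all hypothetical placeholders; they are not read from Fig 5c.

```python
# Hypothetical sketch: flag primary inputs whose capacitance exceeds a limit.
MAX_INPUT_CAP_FF = 35.0  # design limit quoted in the text


def overloaded_inputs(input_caps_ff, limit_ff=MAX_INPUT_CAP_FF):
    """Return the sorted names of inputs whose capacitance exceeds limit_ff."""
    return sorted(name for name, cap in input_caps_ff.items() if cap > limit_ff)


# Placeholder values for illustration only, NOT taken from Fig 5c.
example_caps_ff = {"p0": 48.0, "p1": 22.5, "p2": 35.0, "p3": 41.0}
print(overloaded_inputs(example_caps_ff))
```

An input that sits exactly at the limit (p2 here) is treated as acceptable; whether the real flow uses a strict comparison is an assumption.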

This kind of buffer insertion to improve the timing characteristics of the circuit is not described in the patent.

Fig 5c shows that the input capacitance on pins a(0), b(0), a(2) and b(2) is too high. Two solutions are used: one for inputs a(0) and b(0), and another for inputs a(2) and b(2).

A buffer is inserted on inputs a(2) and b(2) to reduce the input capacitance. All the loads of pins a(2) and b(2) are off the critical path.

An inverter is inserted for the logic off the critical path of inputs a(0) and b(0), and the subsequent logic reworked for the inverted sense of the inputs. Fig 6a shows the circuit with the initial timing using a fixed stage effort f=3.6.
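As a rough illustration of what a fixed stage effort implies for gate sizing, the sketch below uses the standard logical-effort relation f = g * C_load / C_in, where g is the gate's logical effort. Reading "stage effort" this way is an assumption. The 6fF-per-fanout wireload comes from the Fig 6a caption, while the function names and the example load are hypothetical.

```python
# Sketch under the logical-effort assumption f = g * C_load / C_in.
STAGE_EFFORT = 3.6             # fixed stage effort f from the text
WIRE_CAP_PER_FANOUT_FF = 6.0   # wireload per fanout from the Fig 6a caption


def load_cap_ff(fanout_pin_caps_ff):
    """Total load: the driven pin capacitances plus wire capacitance per fanout."""
    return sum(fanout_pin_caps_ff) + WIRE_CAP_PER_FANOUT_FF * len(fanout_pin_caps_ff)


def input_cap_for_stage_effort(logical_effort, fanout_pin_caps_ff, f=STAGE_EFFORT):
    """Input capacitance that gives the gate a stage effort of f for this load."""
    return logical_effort * load_cap_ff(fanout_pin_caps_ff) / f


# Hypothetical example: an inverter (g = 1) driving two 12fF pins.
print(input_cap_for_stage_effort(1.0, [12.0, 12.0]))
```

For the hypothetical load shown, the total is 24fF of pins plus 12fF of wire, so the inverter would need about 10fF of input capacitance.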

Fixed timing after compressing and stretching gate delays to meet 350ps critical path

Now each gate's timing is compressed or stretched so that the critical path to each output is 350ps, as shown in the second schematic on the right (Fig 6b).
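The idea of compressing or stretching can be sketched for a single isolated path by scaling every gate delay by the same factor. Uniform scaling is an assumption made here for simplicity; the text does not say how the adjustment is shared among gates, and gates that sit on several paths need more care than this sketch takes. The delay values are hypothetical.

```python
# Sketch: uniformly scale the gate delays on one path to hit a target.
TARGET_PATH_PS = 350.0  # critical path target from the text


def retime_path(gate_delays_ps, target_ps=TARGET_PATH_PS):
    """Return gate delays scaled so that they sum to target_ps."""
    total = sum(gate_delays_ps)
    if total <= 0:
        raise ValueError("path delay must be positive")
    scale = target_ps / total  # < 1 compresses, > 1 stretches
    return [d * scale for d in gate_delays_ps]


# Hypothetical path of three gates totalling 400ps, compressed to 350ps.
print(retime_path([100.0, 150.0, 150.0]))
```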

4-bit adder delays mapped to the vsclib

The 4-bit adder timing when mapped to the vsclib is shown in the third schematic on the right (Fig 6c). The critical path is a false one from b(0) to s(0). The maximum input capacitance is 35fF, so the only design issue remaining is the excessive delay to the s(0) output.

This delay is caused by the b0n inverter. The ideal area for this inverter is 1.33 tracks (Fig 6b), and the two inverters which could be chosen are the iv1v0x1, with an estimated area of 1.16 tracks, and the iv1v0x2, with an estimated area of 1.80 tracks, as shown in the graph below. The red curve is the actual drive strength and the blue curve is the estimated drive strength.

iv1 area match
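One plausible reading of the mapping step is that the library cell with the estimated area nearest to the ideal area is selected. The text does not state the rule, so this is an assumption, and the function name is hypothetical. The areas are the ones quoted above.

```python
# Sketch: pick the cell whose estimated area is nearest the ideal area.
IDEAL_AREA_TRACKS = 1.33  # ideal b0n inverter area from Fig 6b

# Estimated areas in tracks, as quoted in the text.
candidate_areas = {"iv1v0x1": 1.16, "iv1v0x2": 1.80}


def nearest_by_area(ideal_area, cells):
    """Return the name of the cell with the smallest area mismatch."""
    return min(cells, key=lambda name: abs(cells[name] - ideal_area))


print(nearest_by_area(IDEAL_AREA_TRACKS, candidate_areas))  # iv1v0x1
```

The mismatch is 0.17 tracks for the iv1v0x1 and 0.47 tracks for the iv1v0x2, so a nearest-area rule selects the smaller cell.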

The iv1v0x1 is chosen, but this is the wrong choice. The graph shows that a single area coefficient provides a poor fit, especially for the weaker drive strengths near the graph origin. A better algorithm is needed for mapping to the standard cell library. The one that will be tested (this is not in the patent) is an area mapping using two coefficients instead of one.
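The difference between one and two coefficients can be sketched with ordinary least squares. The form of the two-coefficient mapping is not given here, so an offset plus a slope (area = c0 + c1 * drive) is assumed, against a single coefficient through the origin (area = c * drive). The data points and function names are hypothetical and are not taken from the vsclib.

```python
# Sketch: compare a one-coefficient fit through the origin with an
# assumed two-coefficient (offset + slope) fit, using least squares.

def fit_one_coeff(xs, ys):
    """Least-squares c for y = c * x."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


def fit_two_coeff(xs, ys):
    """Least-squares (c0, c1) for y = c0 + c1 * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    c1 = sxy / sxx
    return mean_y - c1 * mean_x, c1


def sum_sq_error(predicted, actual):
    """Sum of squared differences between predictions and actual values."""
    return sum((p - a) ** 2 for p, a in zip(predicted, actual))


# Hypothetical (drive strength, area) points with a non-zero offset.
drives = [1.0, 2.0, 4.0, 8.0]
areas = [1.2, 1.7, 2.9, 5.1]

c = fit_one_coeff(drives, areas)
c0, c1 = fit_two_coeff(drives, areas)
err_one = sum_sq_error([c * x for x in drives], areas)
err_two = sum_sq_error([c0 + c1 * x for x in drives], areas)
print(err_one, err_two)
```

Because the two-coefficient model contains the one-coefficient model as the special case c0 = 0, its least-squares error can never be larger. The gain is greatest when the real cells have a fixed area overhead, which shows up most at the weak drive strengths.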

Fig 6a. Buffered adder delays with (i) fixed stage effort of f=3.6; (ii) wireload of 6fF per fanout; (iii) single area coefficient C_S to map drive strength; (iv) timing using library averages.
ideal stage effort of 3.6 buffered adder


Fig 6b. Buffered adder delays with (i) fixed stage effort of f=3.6 used for initial timing; (ii) each gate delay compressed or stretched to meet critical path; (iii) wireload of 6fF per fanout; (iv) single area coefficient C_S to map drive strength; (v) timing using library averages.
ideal matched critical paths

Fig 6c. Buffered adder delays with (i) fixed stage effort of f=3.6 used for initial timing; (ii) each gate delay compressed or stretched to meet critical path; (iii) wireload of 6fF per fanout; (iv) gain limit of 5 used for non-inverting gates; (v) single area coefficient C_S to map drive strength; (vi) timing from vsclib cells.
vsclib matched critical paths


5-AUG-05