Abstract

Multi-operand adders are important arithmetic building blocks, especially for adding the partial products of hardware multipliers. In this paper, a current-mode circuit is proposed for the addition of seven inputs, namely a (7,3) counter. The input operands are added up using current-mode multi-valued signals, producing a three-bit output. The circuit is designed in a 0.18 μm technology and simulated using HSPICE. The results show that our design consumes less power at high frequencies than a standard CMOS implementation. The current consumption of the circuit is constant; as a result, an analog-friendly design is achieved, since the circuit does not produce significant switching noise. The circuit can be used in multipliers, FIR filters and similar signal processing blocks. The design is especially advantageous in mixed-signal design environments.

1. Introduction

Standard CMOS arithmetic circuits operate robustly; however, they generate excessive noise, especially at high frequencies. In principle, they draw a large amount of power while switching. Mixed-signal systems are composed of both analog and digital components, and the performance of the analog blocks is usually degraded by the switching of the digital circuits.

To overcome this issue, analog-friendly circuit techniques exist. One of the most promising is source coupled logic (SCL) for CMOS design, whose equivalent in bipolar technology is emitter coupled logic (ECL). SCL circuits provide analog-friendly characteristics and a high switching frequency [1]; however, they require more area than standard CMOS circuits and, in addition, they consume static power regardless of the switching frequency.

In this paper, an alternative current-mode arithmetic circuit is proposed. It is similar to source coupled logic and has equivalent input and output characteristics: the inputs and outputs of the circuit are complementary, as in SCL. Our circuit requires fewer transistors than its standard CMOS equivalent (Section 4). The designed circuit is useful for adding up multiple digital inputs.

Multi-operand adders are required for building multipliers, where multiple partial products must be added up to obtain the multiplication result. Since most of the power is consumed in the arithmetic blocks, especially in multipliers, lower-power devices with lower switching noise are desirable. Current-mode multi-valued logic and arithmetic circuits have been studied extensively; some recent works appear in [2-4]. In general, these circuits require complex design techniques, suffer from low noise margins, and have not been standardized because they are difficult to design. Research on building reliable circuits is still ongoing. In this paper, a multi-valued circuit is designed that is suitable for constant and variable coefficient multiplier design and can be employed in mixed-signal circuits where low switching noise is required. Our design has a differential structure, which is important for noise immunity. The small swing of the differential data provides fast switching. Since there is no standardized multi-valued counter circuit, our circuit is compared in Section 4 only to its standard CMOS equivalent, in terms of power and transistor count.

2. Counter design

Basically, a counter circuit is used to reduce partial products in multiplication. The simplest counter circuit is a (3,2) counter, which is a full-adder. It reduces three input operands to two. The operation of a (3,2) counter where the input operands are x, y, z and outputs are s (sum) and c (carry) can be described by:

\[ x + y + z = 2c + s \]  \( (1) \)
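
Eq. (1) can be checked exhaustively with a small behavioral sketch. The function name `full_adder` and the gate-level expressions (XOR for the sum, majority for the carry) are illustrative choices, not taken from the paper's circuits.

```python
from itertools import product

def full_adder(x, y, z):
    """Behavioral (3,2) counter: return (carry, sum) for three one-bit inputs."""
    s = x ^ y ^ z                      # sum bit: parity of the inputs
    c = (x & y) | (x & z) | (y & z)    # carry bit: majority of the inputs
    return c, s

# Check Eq. (1), x + y + z = 2c + s, for all eight input combinations.
for x, y, z in product((0, 1), repeat=3):
    c, s = full_adder(x, y, z)
    assert x + y + z == 2 * c + s
```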

As an example, a three-operand four-bit redundant adder, namely a carry-save adder (CSA), can be built using four full adders, i.e. one full adder for each bit. Fig. 1.a shows the redundant addition scheme, where the four-bit inputs X, Y and Z are added up to give the result S + C. In the figure, the rectangular blocks represent full adders (FA). Redundant structures require a completion adder to obtain the final result. Fig. 1.b shows a four-operand adder built from a two-stage redundant adder (two-stage CSA) with a carry propagate adder (CPA) at the third stage [5]. Here, the four-bit numbers X, Y, Z and W are added up to give a six-bit output S. Counters can also be built for more than three input operands. The most commonly used counter circuits are the (3,2), (7,3) and (15,4) counters, where the first number is the number of inputs and the second is the number of outputs. Larger counters can be built by combining smaller ones. An \((m,k)\) counter can be defined as:

\[ \sum_{j=0}^{k-1} s_j 2^j = \sum_{i=0}^{m-1} x_i \]  \( (2) \)

In a (7,3) counter, k=3 and m=7; the inputs are \(x_0 \ldots x_6\) and the outputs are \((s_2 s_1 s_0)\). Fig. 2 shows the construction of a (7,3) counter using (3,2) counters. The FA notation in the figure stands for full adder, i.e. a (3,2) counter; the two terms can be used interchangeably. When the linear structure, which is simpler to wire, is used, the critical path is four adder delays; when the tree structure is used, the critical path reduces to three adder delays. Both require four full adders. The two structures are functionally equivalent; however, the tree structure is faster. Adder structures are explored extensively in [6]. Fig. 3 shows an example of the addition of seven operands using a series of (7,3) counters. In the first stage, each digit position is fed into a (7,3) counter, which produces a three-digit output. The three-digit outputs are then aligned according to their bit weights and fed into (3,2) counters, which reduce them to only two operands. If the regular representation of the sum is required, this carry-save representation can be fed into a ripple carry adder, as in Fig. 1.b, to obtain the conventional non-redundant result. There are various partial product reduction techniques; carry-save based implementations are especially important, since all the products are reduced to two in the end. In general, counter and compressor circuits are used to build carry-save based algorithms. Compressor circuits are a special case of counter circuits, built by specially tiling the carry-save operators.
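
As an illustration of the tree construction, the sketch below wires four behavioral full adders into a (7,3) counter with a critical path of three adder levels. The wiring is one plausible arrangement and is an assumption; it is not necessarily the exact connection drawn in Fig. 2. The names `full_adder` and `counter_7_3` are hypothetical.

```python
from itertools import product

def full_adder(x, y, z):
    """Behavioral (3,2) counter: return (carry, sum)."""
    return (x & y) | (x & z) | (y & z), x ^ y ^ z

def counter_7_3(bits):
    """Tree-style (7,3) counter from four full adders (assumed wiring)."""
    x0, x1, x2, x3, x4, x5, x6 = bits
    c_a, s_a = full_adder(x0, x1, x2)    # level 1
    c_b, s_b = full_adder(x3, x4, x5)    # level 1, in parallel with the above
    c_c, s0 = full_adder(s_a, s_b, x6)   # level 2: combines the weight-1 bits
    s2, s1 = full_adder(c_a, c_b, c_c)   # level 3: combines the weight-2 carries
    return s2, s1, s0

# The three output bits must encode the number of ones among the inputs.
for bits in product((0, 1), repeat=7):
    s2, s1, s0 = counter_7_3(bits)
    assert 4 * s2 + 2 * s1 + s0 == sum(bits)
```

The longest path runs through one level-1 adder, the level-2 adder and the level-3 adder, which matches the three adder delays quoted for the tree structure.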

In general, the partial products of a multiplier are added up using carry-propagation-free redundant circuits such as carry-save adders, compressors, counter circuits or signed-digit adder arrays until two operands are left. The advantage of these redundant addition schemes is that the circuit delay is independent of the bit length of the input operands. At the end of the redundant adder tree, the two remaining operands are added with a ripple carry adder, or with a fast two-operand adder such as a carry look-ahead adder or another parallel prefix adder, to obtain the result quickly.
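
The reduction scheme described above can be sketched on whole integers: carry-save levels are applied until two operands remain, and a single carry-propagating addition finishes the job. This is a word-level behavioral model for non-negative integers; the function names and the order in which operands are grouped are assumptions made for illustration.

```python
def carry_save_add(x, y, z):
    """One CSA level: three operands in, two out, with no carry propagation."""
    s = x ^ y ^ z                             # bitwise sums, one full adder per bit
    c = ((x & y) | (x & z) | (y & z)) << 1    # carries, shifted to the next weight
    return s, c

def multi_operand_add(operands):
    """Reduce the operands with CSA levels, then finish with one final addition."""
    ops = list(operands)
    while len(ops) > 2:
        s, c = carry_save_add(ops[0], ops[1], ops[2])
        ops = ops[3:] + [s, c]                # three operands replaced by two
    return sum(ops)                           # stands in for the final two-operand adder
```

Each pass through the loop lowers the operand count by one, so seven operands need five carry-save additions before the final adder.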

3. Multi-valued (7,3) counter

In this section, the working principle of the novel current-mode multi-valued (7,3) counter circuit is introduced. The system is composed of an input block, a carry generation circuit, a comparator circuit and output stages. The system has complementary inputs and outputs, i.e. each input or output literal and its complement appear simultaneously. All stages are biased with constant currents, so the power consumption is constant over the whole frequency range of the circuit and under any switching condition. The supply voltage \(V_{DD}\) is 1.8 V at all stages. As the (7,3) counter definition implies, the circuit has seven input operands and three outputs, each accompanied by its complement.

The input stage of the multi-valued counter circuit can be seen in Fig. 4. For the multi-valued input stage, a constant current is set up for each logic input level. In the example configuration, the input stage unit current $I_0$ is selected to be 1.5 $\mu$A for each bit; it can be adjusted for various performance requirements. In principle, the unit current defines the performance characteristics of the circuit; however, changing it may affect the circuit biasing, and the transistor aspect ratios then need to be adjusted for better performance. Here, $x_0$ ... $x_6$ form a bit slice of the seven binary inputs to be added up, whereas $x_0'$ ... $x_6'$ are the complementary inputs. $I_0$ is the current of the simple current source connected to each differential pair, set to 1.5 $\mu$A as mentioned. Resistors $R_1$ and $R_2$ carry the input stage summation currents $I_{sum}$ and $I_{sum'}$, respectively, which are complements of each other. Basically, $I_{sum}$ and $I_{sum'}$ represent the sum of the seven inputs $x_0$ ... $x_6$. The circuit operates in fully differential mode. The corresponding current values for the summation are listed in Table I. As can be seen in Table I, $I_{sum}$ increases in 1.5 $\mu$A steps, whereas $I_{sum'}$ decreases in 1.5 $\mu$A steps, as the logic level increases.

**Table I.** Logic levels of $I_{sum}$ and $I_{sum'}$ currents at the input stage

<table>
<thead>
<tr>
<th>Logic Level</th>
<th>0</th>
<th>1</th>
<th>2</th>
<th>3</th>
<th>4</th>
<th>5</th>
<th>6</th>
<th>7</th>
</tr>
</thead>
<tbody>
<tr>
<td>$I_{sum}$ (μA)</td>
<td>0</td>
<td>1.5</td>
<td>3</td>
<td>4.5</td>
<td>6</td>
<td>7.5</td>
<td>9</td>
<td>10.5</td>
</tr>
<tr>
<td>$I_{sum'}$ (μA)</td>
<td>10.5</td>
<td>9</td>
<td>7.5</td>
<td>6</td>
<td>4.5</td>
<td>3</td>
<td>1.5</td>
<td>0</td>
</tr>
</tbody>
</table>
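
The entries of Table I follow from the unit current alone, as the idealized sketch below shows. It assumes that each differential pair steers its full unit current to one branch or the other, ignoring device non-idealities; the names `I0_UA` and `input_stage_currents` are hypothetical.

```python
I0_UA = 1.5  # unit current per input bit in microamperes (value from the text)

def input_stage_currents(bits):
    """Return (I_sum, I_sum') in microamperes for a tuple of input bits."""
    level = sum(bits)                        # logic level: number of inputs at 1
    i_sum = level * I0_UA                    # pairs steering current to one branch
    i_sum_c = (len(bits) - level) * I0_UA    # remaining pairs steer to the other
    return i_sum, i_sum_c
```

For example, three inputs at logic 1 give `(4.5, 6.0)`, the logic level 3 column of Table I, and the two currents always add up to 10.5 μA.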

In the counter circuit, all the input values are added up to obtain the result. Since there are seven inputs and three outputs, the outputs $(s_2s_1s_0)$ vary between (000) and (111) in binary representation, i.e. between 0 and 7 in decimal. The input currents are accumulated over the resistors $R_1$ and $R_2$, where the addition is computed. The input stage can also be regarded as a simple digital-to-analog converter, with the multi-valued currents $I_{sum}$ and $I_{sum'}$ as its analog outputs. Fig. 5 shows the $I_{sum}$ and $I_{sum'}$ characteristics (first plot) and the voltage levels generated across $R_1$ and $R_2$ (second plot), named $V_i$ and $V_i'$, at the input stage in Fig. 4. As can be seen from the output characteristics, the input currents add up linearly: the logic levels of the $(x_0 ... x_6)$ tuple are switched through (0000000), (1000000), (1100000), ... (1111111), so that the sum of the inputs is swept from 0 to 7, and the currents $I_{sum}$ and $I_{sum'}$ follow the values in Table I. At the same time, the voltage levels generated across $R_1$ and $R_2$ show nonlinear characteristics once $I_{sum}$ and $I_{sum'}$ exceed a threshold. This property provides suitable biasing of the differential input transistors, namely $M_1$ ... $M_{14}$ in Fig. 4, and provides sufficient voltage headroom for the multi-valued signals for the seven total logic levels. The nonlinear resistors $R_1$ and $R_2$ are implemented using bulk-to-drain connected PMOS pairs, which provide high resistance with nonlinear characteristics [7], as seen in Fig 6.a. The transistor implementation is shown inside the dotted square, where the transistor bulk is connected to the drain; the rest of the figure is the test circuit of the resistor. The current-voltage characteristics of the nonlinear resistor can be seen in Fig 6.b.

Since the circuit adds up seven bits and the result is three bits, the third digit of the sum switches to one whenever the sum exceeds three. From Eq. (2):

\[ \sum_{i=0}^{6} x_i > 3 \;\Rightarrow\; s_2 = 1 \]  \( (3) \)

In Table I, whenever the logic level is greater than 3, \( I_{sum} \) is greater than \( I_{sum}' \) and \( V_i \) is greater than \( V_i' \), which can also be seen in Fig. 5. This transition can be fed into a source coupled comparator block to be sensed as a logic level. The source coupled comparator block can be seen in Fig. 7. Here, transistors M1 and M2 are diode connected; therefore, a diode voltage drop appears at the drains, which provides better biasing for the next stages. The outputs \( V_{O_1} \) and \( V_{O_1'} \) represent the most significant digit as the complementary outputs \( s_2 \) and \( s_2' \).

In addition to \( s_2 \), the other logic levels must be generated, so further comparator stages are required to sense them. Only comparator stages for the logic levels 1, 2 and 3 are required, where \( V_i \) and \( V_i' \) are fed into the comparators in a multiplexed fashion depending on whether \( V_{O_1} \) is asserted. The first comparator output \( V_{O_1} \) of Fig. 7 is fed into the comparators in Fig. 8 as a logic input. In Fig 8, \( V(1) \), \( V(2) \) and \( V(3) \) are constant comparison values, with \( V'(1) = V(3) \), \( V'(2) = V(2) \) and \( V'(3) = V(1) \). The summation values of the inputs, the corresponding comparator outputs and the required circuit outputs can be seen in Table II. Using this table, the values of \( s_0 \), \( s_1 \) and \( s_2 \) can be extracted using Boolean algebra:

\[
\begin{aligned}
    s_0 &= V_{out1} \cdot V_{out2}' + V_{out3} && (4.a) \\
    s_1 &= V_{out2} && (4.b) \\
    s_2 &= V_{O_1} && (4.c)
\end{aligned}
\]

**Table II.** Summation values, comparator outputs and circuit outputs

<table>
<thead>
<tr>
<th>Sum</th>
<th>\( V_{O_1} \)</th>
<th>\( V_{out1} \)</th>
<th>\( V_{out2} \)</th>
<th>\( V_{out3} \)</th>
<th>\( (s_2 s_1 s_0) \)</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>000</td>
</tr>
<tr>
<td>1</td>
<td>0</td>
<td>1</td>
<td>0</td>
<td>0</td>
<td>001</td>
</tr>
<tr>
<td>2</td>
<td>0</td>
<td>1</td>
<td>1</td>
<td>0</td>
<td>010</td>
</tr>
<tr>
<td>3</td>
<td>0</td>
<td>1</td>
<td>1</td>
<td>1</td>
<td>011</td>
</tr>
<tr>
<td>4</td>
<td>1</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>100</td>
</tr>
<tr>
<td>5</td>
<td>1</td>
<td>1</td>
<td>0</td>
<td>0</td>
<td>101</td>
</tr>
<tr>
<td>6</td>
<td>1</td>
<td>1</td>
<td>1</td>
<td>0</td>
<td>110</td>
</tr>
<tr>
<td>7</td>
<td>1</td>
<td>1</td>
<td>1</td>
<td>1</td>
<td>111</td>
</tr>
</tbody>
</table>
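
The comparator columns of Table II and the decoding of Eq. 4 can be cross-checked with a purely logical model. The sketch below treats the comparators as ideal threshold detectors acting on the sum (an assumption; the real circuit compares the voltages \( V_i \) and \( V_i' \)), and the function names are hypothetical.

```python
def comparator_outputs(total):
    """Ideal comparator outputs (V_O1, V_out1, V_out2, V_out3) for a sum 0..7."""
    v_o1 = int(total > 3)                     # Eq. (3): set when the sum exceeds 3
    level = total - 4 if v_o1 else total      # residual level 0..3 seen by Fig. 8
    return v_o1, int(level >= 1), int(level >= 2), int(level >= 3)

def decode(v_o1, v_out1, v_out2, v_out3):
    """Apply Eq. 4.a to 4.c and return (s2, s1, s0)."""
    s0 = (v_out1 & (1 - v_out2)) | v_out3     # Eq. 4.a: AND-OR with inverted V_out2
    s1 = v_out2                               # Eq. 4.b
    s2 = v_o1                                 # Eq. 4.c
    return s2, s1, s0

# The decoded bits must be the binary representation of the sum.
for total in range(8):
    s2, s1, s0 = decode(*comparator_outputs(total))
    assert 4 * s2 + 2 * s1 + s0 == total
```

The three lower comparator outputs form a thermometer code of the residual level, which is why a single AND-OR term is enough to recover \( s_0 \).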

To obtain the summation results, according to Eq. 4.a only an AND-OR circuit is required for \( s_0 \); it is implemented in SCL as seen in Fig 9.a and computes the function \( f = a + b \cdot c' \). The other output bits \( s_1 \) and \( s_2 \) appear directly at the comparator outputs, according to Eq. 4.b and 4.c. All the outputs are buffered by the SCL buffer seen in Fig 9.b for higher driving strength at the output. The HSPICE simulations of the outputs can be seen in Fig. 10, where the summation value is swept from 0 to 7 in the transient analysis. The complementary outputs appear in three subplots: the first plot contains \( s_0 \) and \( s_0' \), the second contains \( s_1 \) and \( s_1' \), and the third contains \( s_2 \) and \( s_2' \). The output swing is set to 0.2 V for the complementary outputs. The system outputs can be fed into other source coupled logic (SCL) systems seamlessly.

![Fig. 6. Drain to bulk connected PMOS transistor used as a nonlinear resistor (a) electrical connection; (b) current-voltage characteristics](image)

![Fig. 7. Comparator and output generator for \( s_2 \)](image)

![Fig. 8. Comparator stages for sensing the logic levels](image)

4. Results and Conclusion

In our new configuration for the (7,3) counter, the transistor count is reduced. The input stage requires 23 transistors; the first comparator circuit at Fig. requires 7; the three output comparators require \( 3 \times 9 = 27 \) transistors; the AND-OR circuit for \( s_0 \) generation seen in Fig 9.a requires 9 transistors; and the output buffer of Fig. 9.b is used for each output, requiring \( 3 \times 5 = 15 \) transistors. In total, 79 transistors are needed. A binary (7,3) counter needs four full adders with 28 transistors each [8], for a total of \( 4 \times 28 = 112 \) transistors. The transistor count is thus reduced by 29 %.

Moreover, the current-mode design provides better power control and reduces switching power. The power consumption of the circuit depends on its current sources, which are fixed to 1.5 \( \mu \)A at the input stage and 5 \( \mu \)A at all other stages. The circuit delay is measured as 1.7 ns and the current consumption is constant at 50 \( \mu \)A. The standard CMOS implementation of the (7,3) counter designed in the same technology consumes 93 \( \mu \)A at 400 MHz, as simulated by applying random input patterns to the circuit. At that frequency, a power reduction of approximately 46 \% is achieved. Since the power consumption of standard CMOS circuits is linear in frequency, our circuit provides better power characteristics above approximately 200 MHz.

In conclusion, an analog-friendly current-mode multi-operand addition cell is designed. The circuit is quite power efficient at high frequencies and can be used without any power penalty above a switching frequency of 200 MHz.
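
The quoted 46 % reduction and the break-even frequency of roughly 200 MHz can be reproduced from the two current figures. The sketch below assumes that the CMOS supply current is directly proportional to frequency, with no static component, and that both circuits use the same supply voltage, so that the current ratio equals the power ratio. These assumptions and all names are illustrative.

```python
I_PROPOSED_UA = 50.0   # constant current of the proposed circuit (from the text)
I_CMOS_UA = 93.0       # standard CMOS current at the reference frequency
F_REF_MHZ = 400.0      # reference frequency of the CMOS measurement

def cmos_current_ua(f_mhz):
    """CMOS supply current, assumed proportional to switching frequency."""
    return I_CMOS_UA * f_mhz / F_REF_MHZ

def power_reduction(f_mhz):
    """Fractional saving of the proposed circuit relative to CMOS at f_mhz."""
    return 1.0 - I_PROPOSED_UA / cmos_current_ua(f_mhz)

def break_even_mhz():
    """Frequency at which both circuits draw the same current."""
    return F_REF_MHZ * I_PROPOSED_UA / I_CMOS_UA

print(f"reduction at 400 MHz: {power_reduction(400.0):.1%}")
print(f"break-even frequency: {break_even_mhz():.0f} MHz")
```

Under these assumptions the break-even point lands near 215 MHz, consistent with the approximate 200 MHz figure given above.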

5. References