

Received 20 July 2023, accepted 19 August 2023, date of publication 24 August 2023, date of current version 30 August 2023. *Digital Object Identifier* 10.1109/ACCESS.2023.3308447

# **RESEARCH ARTICLE**

# An Improved Strong Arm Comparator With Integrated Static Preamplifier

# VALERIO SPINOGATTI, RICCARDO DELLA SALA, CRISTIAN BOCCIARELLI, FRANCESCO CENTURELLI, (Senior Member, IEEE), AND ALESSANDRO TRIFILETTI

Department of Information Engineering, Electronics and Telecommunications, Sapienza University of Rome, 00184 Rome, Italy

Corresponding author: Francesco Centurelli (francesco.centurelli@uniroma1.it)

**ABSTRACT** This paper presents a novel Strong Arm comparator in which the input pair is reused as a static amplifier to preamplify the input signal during the precharge phase. The proposed approach relaxes the main trade-offs that characterize the Strong Arm latch: compared to the conventional topology, the enhanced comparator achieves better input-referred noise and offset, without penalizing delay or power consumption. In fact, the proposed topology is even more efficient than its conventional counterpart, as it exhibits lower power consumption when the two circuits are sized to have the same delay. The operation of the new topology is analyzed in detail through a comprehensive theoretical analysis, providing useful design criteria. The enhanced Strong Arm comparator is validated by means of post-layout simulations in a 55 nm CMOS technology with 1 V supply. The simulations show that the proposed approach improves noise, offset and energy-delay product (EDP) by 28.5%, 33.8% and 5.24%, respectively, compared to the conventional Strong Arm latch.

**INDEX TERMS** Strong Arm, high speed, dynamic comparator, analog-to-digital conversion (ADC).

#### **I. INTRODUCTION**

Due to the inherent robustness of digital processing, modern communication systems are usually designed to perform most of the processing in the digital domain. This trend, combined with the demand for systems that are capable of handling high bit rates, makes the development of fast, power efficient mixed-signal building blocks a research topic of fundamental importance.

Dynamic comparators are an essential part of most mixed-signal applications, such as analog-to-digital converters (ADC) and digital low drop-out regulators (DLDO), and thus enhancing the figures of merit of these components is crucial for improving the performance of the systems they belong to [1], [2], [3].

Among dynamic comparators, the Strong Arm latch is one of the most popular topologies in high-speed applications at medium-to-high supply voltages because of its many attractive features, namely low power consumption, limited input-referred offset, rail-to-rail output swing and minimal circuit complexity [4]. Owing to this popularity, numerous efforts have been made in the literature to further improve the performance of this topology. Many proposals focus on relaxing the trade-off between speed and power consumption: in [1] and [5], for instance, forward body-bias (FBB) is exploited to improve speed with a negligible increase in power consumption. The authors of [6] propose to modify the topology so that a pair of reset devices can be removed, which results in smaller delay and power consumption. In [3], instead, an improvement in comparison speed is achieved by creating several additional current paths during the evaluation phase; such paths asymmetrically discharge the output nodes according to the input differential signal, thus helping improve the regeneration time. Similarly, [2] proposes to ease regeneration by unbalancing the outputs. Unlike [3], however, this result is achieved by creating static current paths during the comparator's precharge phase.

The associate editor coordinating the review of this manuscript and approving it for publication was Jiann-Jong Chen.

Another significant part of the literature focuses on developing strategies for increasing the preamplification gain in order to improve the performance in terms of input-referred offset and noise. The technique known as dynamic bias, for instance, boosts the $g_m/I_d$ ratio of the transistors involved in dynamic preamplification by limiting their gate-source voltages [7], [8], [9]. The main limitation of dynamic bias is that it significantly increases the delay of the comparator because it lengthens the preamplification phase [7]. Charge-pump based dynamic bias [10], [11] addresses this issue by coupling a negative voltage step to the source node of the dynamic preamplifier, which temporarily increases the overdrive voltage of the differential pair and speeds up preamplification. However, this comes at the cost of an increase in power consumption. There are also alternative approaches that investigate the use of static preamplifiers to relax the comparator's specifications. The traditional approach consists in cascading one or more static amplifiers, usually implemented as simple differential pairs, before the dynamic comparator [12]. In this way, noise and offset are suppressed and the delay improves, but power consumption and area occupation increase significantly. In [13], the input differential pair is removed and the positive feedback loop is unbalanced in the proper direction by driving the body terminals of the latch's devices. This minimizes the number of devices in the dynamic comparator. A CMOS static preamplifier with common-mode feedback (CMFB) is added to compensate for the reduction in preamplification gain, which is caused by the fact that $g_{mb}$ (the body transconductance) is several times lower than $g_m$. As a result, the comparator achieves very high speed operation without impairing preamplification, as demonstrated by the low input-referred noise.

This paper introduces an improved Strong Arm comparator where the preamplification gain is boosted by amplifying the input difference during the precharge phase. This is achieved by adding a clocked resistive load to the drain nodes of the input differential pair and a static tail generator biased by a current mirror. This results in a significant improvement in terms of noise and input-referred offset with no penalty on delay or power consumption. Indeed, our simulations show that the energy-delay product (EDP) improves as well, albeit to a smaller extent. Therefore, the proposed technique relaxes the trade-offs that affect the Strong Arm latch at the sole cost of a slight increase in area and layout complexity. In addition, a detailed theoretical analysis is provided, showing that the performance of the proposed comparator can be optimized by varying the explicit load resistance and the static bias current. The theoretical analysis is then validated through circuit-level simulations.

This work is organized as follows: section II briefly recalls the operating principle and the properties of the conventional Strong Arm latch. Section III introduces the proposed topology and describes its operating principle. In section IV, a theoretical analysis of the proposed topology is carried out. Section V describes the sizing that has been adopted to simulate the proposed and the conventional Strong Arm comparator. In section VI the theoretical analysis is validated by means of pre-layout simulations and the performance of the proposed comparator is evaluated through post-layout simulations. A comparison with the recent state of the art is also provided. Finally, section VII concludes the paper.

#### **II. CONVENTIONAL STRONG ARM LATCH**

Figure 1 shows the schematic of the conventional Strong Arm latch. The operation of the comparator is as follows:

- *Precharge* (*CLK* = *GND*): $M_7$ is in off state, while devices $S_1$ through $S_4$ precharge the output and intermediate nodes to $V_{DD}$. This eliminates the signal-dependent offset as the asymmetry that had been created by the previous decision is canceled.
- *Evaluation* (*CLK* = $V_{DD}$): $M_7$ turns on and $S_1$ through $S_4$ turn off. $M_1$ and $M_2$ discharge asymmetrically nodes P and Q. When $V_{PQc} \triangleq (V_P + V_Q)/2$ drops to $\approx V_{DD} - V_{th,n}$ ($V_{th,n}$ being the threshold voltage of the NMOS transistors) $M_3$ and $M_4$ turn on and start discharging asymmetrically the output nodes. Even though $M_3$-$M_4$ form a positive feedback loop, they provide little regeneration in this phase because their drain terminals are connected to the comparator's load capacitance. When $V_{oc} \triangleq (V_{op} + V_{on})/2$ reaches $\approx V_{DD} - V_{th,p}$ ($V_{th,p}$ being the threshold voltage of the PMOS devices) the pair of cross-coupled inverters formed by $M_3$ through $M_6$ starts to regenerate the signal to full swing.

The main strengths of the Strong Arm comparator are summarized below:

- Delay and power consumption are low because of the limited number of transistors.
- Static power consumption is virtually zero thanks to the CMOS configuration used in the latch.
- The input-referred offset is limited as it mainly depends on the mismatch of the input differential pair. In fact, the offset contributions from transistors  $M_3$  through  $M_6$  are attenuated because these devices turn on when the signal has been already partially amplified.
- The pair of cross-coupled inverters regenerates the signal to full swing: therefore, the comparator's outputs can be connected to CMOS digital blocks without the need for interface blocks.

# A. STRONG ARM LATCH WITH PREAMPLIFIER

The delay, noise and input-referred offset of the Strong Arm latch (and, in general, of a dynamic comparator) can be improved by adding a static preamplifier before the comparator's input, as shown in Figure 2.

The preamplifier usually consists of a simple differential pair biased at constant current. If a resistive load is used (as shown in the figure) no common-mode feedback (CMFB) is required. To a first approximation, a static preamplifier with gain equal to $A_v$ reduces the comparator's noise and offset contribution by a factor equal to $A_v$. Delay, on the other hand, scales logarithmically with $V_{id}$: this implies that the delay $t_d^{preamp}$ of the comparator with a preamplifier can be computed as $t_d - \ln(A_v)$, where $t_d$ is the delay that the comparator would exhibit if no preamplifier was used. Clearly, these improvements come at the expense of an increase in power consumption. Indeed, the minimum bias current of the preamplifier is bounded below by the specification on the bandwidth that, in turn, depends on $f_{CLK}$.

FIGURE 1. Schematic of the conventional strong arm latch.

FIGURE 2. Schematic of the conventional strong arm latch with preamplifier.

#### **III. PROPOSED TOPOLOGY**

The power consumption and area overhead that come with the addition of a static preamplifier can be minimized by reusing the input pair transistors  $M_1$ - $M_2$  to preamplify the input difference during the precharge phase, as shown in Figure 3. More specifically, there are three key points to this topology:

- During the precharge phase, the tail current for  $M_1$ - $M_2$  is provided by an additional static generator that can be implemented as a current mirror. This way, the signal is preamplified with limited power consumption because  $I_b$  is set by a reference branch.
- The preamplifier's load is represented by a pair of clocked devices connected in series to resistors $R_1$-$R_2$. Hereafter we will assume that $R_1 = R_2 = R$. Adding a resistive load boosts the preamplification gain because the load resistance seen by $M_1$-$M_2$ increases.
- The output nodes are reset with a modified scheme that combines the approach used in the original Strong Arm latch [14] with the pull-up based configuration employed in the Razavi Strong Arm [4]. Thanks to this configuration, only $S_5$ is responsible for eliminating the signal-dependent offset. Transistors $S_3$ and $S_4$, on the other hand, can be made much smaller. Note that $S_3$ and $S_4$ cannot be eliminated completely: indeed, the comparator performs better if the outputs are precharged close to $V_{DD}$, because the circuit has more time to preamplify the signal before the latch takes over. Moreover, these devices ensure that $S_5$ turns on completely. The combination of $S_3$-$S_4$-$S_5$ helps reduce power consumption without increasing the reset time because the overall parasitic capacitance is smaller.

It should be noted that a similar approach, based on using the comparator's devices for preamplification, has been already devised in [2]. However, our comparator achieves better performance in terms of power consumption thanks to the additional tail generator that limits current consumption during the reset phase. Furthermore, the clocked resistive load causes higher differential voltages to build up during the reset phase, which improves noise and offset.

The operation of the proposed comparator is as follows:

- Precharge/preamplification (CLK = GND): During precharge, the output nodes are equalized by the reset transistor $S_5$ while $S_3$-$S_4$ pull them up to $\approx V_{DD}$. The clocked tail transistor $M_7$ is off. At the same time $M_1$ and $M_2$, biased by the static tail generator, preamplify the input difference thanks to the clocked load ($S_1$-$S_2$-$R_1$-$R_2$). It should be noted that the total load seen by $M_1$ and $M_2$ is given by the parallel combination of the clocked load resistance and the source resistance of $M_3$-$M_4$. The key aspect is that the explicit load resistors $R_1$-$R_2$ can act as a weak pull-up network that limits the $V_{gs}$ of $M_3$-$M_4$ and, consequently, their transconductance. Hence, a proper choice of R can prevent the active devices from limiting the preamplifier's load resistance. This idea is discussed in greater detail in section IV.
- Evaluation (CLK = $V_{DD}$): $M_7$ is on while $S_1$ through $S_5$ are off. The comparator works like a conventional Strong Arm, except for the fact that $V_P$ and $V_Q$ have been already unbalanced according to the input difference during precharge. This increases the differential voltage that builds up at nodes P and Q until $M_3$ and $M_4$ turn on, and ultimately results in improved performance in terms of delay, offset, and noise. Note that there is no need to switch off the static generator during evaluation, because the CMOS latch formed by $M_3$-$M_4$-$M_5$-$M_6$ cuts the path between $V_{DD}$ and GND as soon as the outputs saturate. Moreover, the static generator helps improve delay because it increases the total tail current.

It is important to remark that the proposed topology constrains the setup time of the input signal. Specifically, if the input difference does not settle sufficiently in advance with respect to the rising clock edge the comparator might produce a wrong output because preamplification is impaired. However, this issue also exists when a static preamplifier is added before the comparator's input; therefore the proposed preamplification technique does not add constraints with respect to the conventional one.

# **IV. THEORETICAL ANALYSIS**

# A. ANALYSIS OF THE LOAD RESISTANCE

#### 1) EXISTENCE OF THE OPTIMUM

The load seen by  $M_1$ - $M_2$  during the reset/preamplification phase depends on the nonlinear interaction between the load resistance R and  $M_3$  (or  $M_4$ ). In this subsection we show that there exists an optimal choice of R that maximizes the total small-signal load resistance.

During the reset phase $S_5$ short-circuits the output nodes and $M_5$-$M_6$ turn off, so that $M_3$-$M_4$ become diode-connected. Therefore, the behavior of the total load resistance as a function of R can be investigated by considering the equivalent half-circuit depicted in Figure 4a. In the circuit $I_h$ represents the bias current and $R_D$ is the explicit load resistance in parallel to $M_3$. The series resistance associated with the switches $S_1$-$S_2$-$S_3$-$S_4$ is neglected, which implies $R_D = R$. Moreover, a small-signal test current $i_T$ is added to compute the load resistance. The conductance associated with each resistance will be denoted by the letter G (with the same subscript). Now let $V_{gs} = V_{gs_Q} + v_{gs}$, where $V_{gs_Q}$ denotes the bias point and $v_{gs}$ represents the small-signal component. Then, we can write

$$i_T = -(g_{m3,4} + G_D)v_{gs} = -\frac{v_{gs}}{R_{PQ}} \tag{1}$$

where $g_{m3,4}$ is the small-signal transconductance of $M_3$ and $R_{PQ} \triangleq 1/(g_{m3,4} + G_D)$. The drain-source conductance $g_{ds3}$ is neglected. Now we note that $g_{m3,4} = g_{m3,4}(V_{gs_Q}(R_D))$. To prove that $R_{PQ}$ has a maximum, it is sufficient to demonstrate that $V_{gs_Q}$ is a strictly increasing function of $R_D$. Indeed, if this is true, $G_{PQ} = 1/R_{PQ}$ is the sum of a strictly decreasing function that goes to infinity as $R_D \rightarrow 0$ and tends to 0 as $R_D \rightarrow \infty$ (i.e., the function $G_D = 1/R_D$), and a strictly increasing function (i.e., $g_{m3,4}$). This means that $G_{PQ}$ has a (unique) minimum point.



FIGURE 3. Schematic of the proposed enhanced strong arm latch.



**FIGURE 4.** Equivalent circuits for analyzing the load resistance as a function of  $R_D$  a) and the evolution of  $V_{PQd}$  as a function of time b).

The relationship between  $V_{gs_Q}$  and  $R_D$  can be studied by setting  $i_T = 0$  (which implies  $v_{gs} = 0$ ) and writing Kirchhoff's law at node P:

$$G_D V_{gs_Q} + I_{d3,4}(V_{gs_Q}) = I_h \tag{2}$$

where $I_{d3,4}$ represents the drain current of $M_3$. In this analysis, we adopt the exponential model to describe $I_{d3,4}$ in the subthreshold region and the square law to describe its behavior when $V_{gs_Q} > V_{th}$:

$$\begin{cases} I_{d3,4}(V_{gs_Q}) = I_{d0}e^{\frac{V_{gs_Q} - V_{th}}{nU_T}}(1 - e^{\frac{-V_{gs_Q}}{U_T}}) & V_{gs_Q} \le V_{th} \\ I_{d3,4}(V_{gs_Q}) = k(V_{gs_Q} - V_{th})^2 & V_{gs_Q} > V_{th} \end{cases} \tag{3}$$

where $U_T$ is the thermal voltage, $V_{th}$ is the transistor's threshold voltage and $k \triangleq \frac{1}{2}\mu C_{ox}\frac{W}{L}$ is the transconductance factor of the device. It should be noted that equation (2) does not have a closed-form solution when $V_{gs_Q} < V_{th}$. This is not an issue, however, because we can prove that $\partial g_{m3,4}/\partial R_D > 0$ without solving equation (2) in the unknown $V_{gs_Q}$. First, we solve the equation for $R_D$, obtaining

$$R_D(V_{gs_Q}) = \frac{V_{gs_Q}}{I_h - I_{d3,4}(V_{gs_Q})} \tag{4}$$

We then take the first derivative of $R_D$ with respect to $V_{gs_Q}$:

$$\frac{\partial R_D}{\partial V_{gs_Q}} = \frac{I_h - I_{d3,4}(V_{gs_Q}) + V_{gs_Q}g_{m3,4}(V_{gs_Q})}{\left(I_h - I_{d3,4}(V_{gs_Q})\right)^2} \tag{5}$$

Now we observe that  $I_h = I_{d3,4} + G_D V_{gs_Q} \ge I_{d3,4}$  and that  $g_{m3,4}(V_{gs_Q}) > 0$ . Hence, the right side of equation (5) must be positive as well. But  $R_D(V_{gs_Q})$  is the inverse function of  $V_{gs_Q}(R_D)$ , so its derivative must have the same sign as  $\partial V_{gs_Q}/\partial R_D$ . This means  $g_{m3,4}(V_{gs_Q}(R_D))$  is a strictly increasing function, because it is the composition of two strictly increasing functions. We have thus demonstrated that  $R_{PQ}(R_D)$  has a maximum point. We will denote by  $R_D^*$  the value of  $R_D$  that maximizes  $R_{PQ}$ .

# 2) EVALUATION OF THE OPTIMUM

In order to compute analytically the value of  $R_D$  that maximizes  $R_{PQ}$ , we make an assumption on the transistor's operating region and solve the equation

$$\frac{\partial G_D(V_{gs_Q})}{\partial V_{gs_Q}} + \frac{\partial g_{m3,4}(V_{gs_Q})}{\partial V_{gs_Q}} = 0 \tag{6}$$

The solution of equation (6), which we call  $V^*$ , represents the voltage drop across the load when  $R_D = R_D^*$ . Once the analytical expression of  $V^*$  is known,  $R_D^*$  may be computed by letting  $V_{gs_Q} = V^*$  in equation (4).

Assume that  $V^* > V_{th}$ , i.e., the transistor is biased above the threshold when  $R_D = R_D^*$ . Then, using equations (3) and (4), equation (6) can be rewritten as

$$2k - \frac{2k(V_{gs_Q} - V_{th})V_{gs_Q} + I_h - k(V_{gs_Q} - V_{th})^2}{V_{gs_Q}^2} = 0 \tag{7}$$

By solving for $V_{gs_Q}$ we get

$$V^* = \sqrt{\frac{I_h - kV_{th}^2}{k}} = \sqrt{\frac{I_h - \tilde{I}}{k}} \tag{8}$$

where  $\tilde{I}$  represents the drain current of the diode-connected transistor  $M_3$  when  $V_{gs_Q} = 2V_{th}$ . Note that  $\tilde{I}$  is an intrinsic property of the device. Equation (8) is useful in two ways:

- It provides an analytical expression of $V^*$ in the case $V^* > V_{th}$.
- It provides a criterion to establish whether $V^* > V_{th}$ or $V^* < V_{th}$. Indeed, the solution we found has a physical meaning only if $\sqrt{\frac{I_h - \tilde{I}}{k}} > V_{th}$, which is equivalent to $I_h > 2\tilde{I}$. If $I_h < 2\tilde{I}$, there is no meaningful solution to equation (7), which means that $M_3$ must be in the subthreshold region when $R_D = R_D^*$.

By substituting  $V^*$  into equation (4) and simplifying the resulting expression we obtain

$$R_D^* = \frac{1}{2kV_{th}} \tag{9}$$
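In more detail, the simplification behind equation (9) can be sketched as follows. This is a reconstruction based only on equations (3), (4) and (8) under the square-law assumption, not an additional result. From equation (8) we have $I_h = kV^{*2} + kV_{th}^2$, so that

$$I_h - I_{d3,4}(V^*) = kV^{*2} + kV_{th}^2 - k(V^* - V_{th})^2 = 2kV^*V_{th}$$

and therefore, from equation (4),

$$R_D^* = \frac{V^*}{I_h - I_{d3,4}(V^*)} = \frac{V^*}{2kV^*V_{th}} = \frac{1}{2kV_{th}}$$

With the square law, $g_{m3,4}(V^*) = 2k(V^* - V_{th})$, so the total conductance at the optimum is $g_{m3,4}(V^*) + 1/R_D^* = 2kV^*$.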

Recalling that  $G_{PQ} = g_{m3,4}(R_D) + 1/R_D$  and using equation (9), we obtain

$$R_{PQ}^{*} = \frac{1}{2\sqrt{k(I_{h} - \tilde{I})}} \tag{10}$$
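As a sanity check, the closed-form optimum of equations (8)-(10) can be compared against a brute-force sweep of the bias point. The sketch below is only illustrative: the function names and the numerical values of `K`, `V_TH` and `I_H` are hypothetical and are chosen solely so that $I_h > 2\tilde{I}$; they are not the device parameters used in this work. The sweep is parametrized by $V_{gs_Q}$ rather than by $R_D$, which avoids solving equation (2) numerically.

```python
import math


def closed_form_optimum(k, v_th, i_h):
    """Return (V*, R_D*, R_PQ*) from equations (8)-(10).

    Valid only when I_h > 2*I_tilde, i.e. M3 is above threshold at the optimum.
    """
    i_tilde = k * v_th ** 2  # drain current of M3 at V_gs = 2*V_th
    if i_h <= 2.0 * i_tilde:
        raise ValueError("closed form requires I_h > 2*I_tilde")
    v_star = math.sqrt((i_h - i_tilde) / k)
    r_d_star = 1.0 / (2.0 * k * v_th)
    r_pq_star = 1.0 / (2.0 * math.sqrt(k * (i_h - i_tilde)))
    return v_star, r_d_star, r_pq_star


def r_pq_from_bias(v_gs, k, v_th, i_h):
    """Small-signal load resistance for a bias point V_gs > V_th (square law)."""
    i_d = k * (v_gs - v_th) ** 2   # equation (3), above threshold
    g_d = (i_h - i_d) / v_gs       # equation (4) rewritten as G_D = 1/R_D
    g_m = 2.0 * k * (v_gs - v_th)  # square-law transconductance
    return 1.0 / (g_m + g_d)


def swept_optimum(k, v_th, i_h, points=20000):
    """Grid search of the bias point that maximizes R_PQ (brute-force check)."""
    v_max = v_th + math.sqrt(i_h / k)  # bias at which M3 carries all of I_h
    best_v, best_r = v_th, 0.0
    for j in range(1, points):
        v = v_th + (v_max - v_th) * j / points
        r = r_pq_from_bias(v, k, v_th, i_h)
        if r > best_r:
            best_v, best_r = v, r
    return best_v, best_r


# Hypothetical device values (assumptions), chosen only so that I_h > 2*I_tilde.
K, V_TH, I_H = 1e-3, 0.4, 500e-6  # A/V^2, V, A
v_star, r_d_star, r_pq_star = closed_form_optimum(K, V_TH, I_H)
v_num, r_num = swept_optimum(K, V_TH, I_H)
print(f"V*    closed form {v_star:.4f} V, sweep {v_num:.4f} V")
print(f"R_PQ* closed form {r_pq_star:.1f} ohm, sweep {r_num:.1f} ohm")
print(f"R_D*  closed form {r_d_star:.1f} ohm")
```

Under the square-law model the two estimates should agree to within the resolution of the sweep; a real device would of course require a simulator-based sweep of $R_D$.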

If  $V^* < V_{th}$ , the drain current follows an exponential law and equation (6) does not have a closed-form solution. However, the analysis we performed in the previous subsection guarantees that  $R_{PQ}(R_D)$  has a unique maximum point. In addition, we can carry out an approximated analysis by making the following assumptions:

- The drain current of the active device is small when  $R_D = R_D^*$ , i.e., we have  $I_{d3,4}(V^*) \ll I_h$ .
- The exponential behavior of $I_{d3,4}(V_{gs})$ is approximated as an on-off behavior: more precisely, we suppose that $M_3$ does not conduct any current for $V_{gs} \leq V^*$ and that it turns on abruptly when $V_{gs} > V^*$.

If these hypotheses are verified the equation

$$V_{gs} \approx R_D I_h \tag{11}$$

represents a reasonable approximation for  $V_{gs} \leq V^*$ . It follows that

$$R_{PQ}^* \approx R_D^* \approx \frac{V^*}{I_h} \tag{12}$$

because we assume that  $R_{PQ}$  starts to decrease as soon as  $V_{gs}$  exceeds  $V^*$ . Clearly, the analysis we just presented is limited in that it does not provide an expression for  $V^*$ ; nonetheless equation (12) is useful because it describes with reasonable accuracy the behavior of  $R_{PQ}^*$  as a function of  $I_h$ . This point will be clarified in subsection IV-C, where we study the compromise between power consumption and static preamplification gain in the proposed topology.

# B. ANALYSIS OF THE PREAMPLIFICATION GAIN

In this subsection we derive an expression for the preamplification gain of the proposed comparator. In the analysis that follows we denote by $V_{PQd}$ and $V_{PQc}$ the output differential mode voltage and the output common mode voltage of the preamplifier, respectively. The duration of the reset phase will be denoted by $t_{rst} \triangleq t_{ready} + t_s$, with $t_{ready} \ge 0$ being the worst-case settling time of the input signal (measured from the beginning of the reset phase) and $t_s > 0$ being the time available for $V_{PQd}$ to settle. Furthermore, we denote by $t_d$ the duration of the dynamic preamplification phase, which starts with the rising edge of the clock. Finally, we assume that $V_{id}$ is a step function:

$$\begin{cases} V_{id} = 0, & t < t_{step} \\ V_{id} = V_s, & t \ge t_{step} \end{cases} \tag{13}$$

where  $V_s$  is a small constant voltage. For the sake of convenience, the origin of the time axis is set at the instant in which  $V_{id}$  changes its value, which implies  $t_{step} = 0$ . With these hypotheses, the falling clock edge that marks the beginning of the reset phase occurs at  $t = -t_{ready}$  and the rising clock edge that marks the end of the reset phase occurs at  $t = t_s$ .

As a starting point, we observe that during the interval  $[0, t_s]$  the input differential voltage  $V_s$  is amplified by the subcircuit consisting of the active devices  $M_1-M_2-M_3-M_4$ , the resistors  $R_1-R_2$  and the static generator  $I_b$ . This creates a non-zero initial condition for the subsequent phase, that takes place during the interval  $[t_s, t_s + t_d]$ . During this phase, nodes P and Q are discharged asymmetrically and their voltages drop by different amounts, which we call  $\Delta V_P$  and  $\Delta V_Q$ , respectively. We can thus write

$$V_P(t_s + t_d) = V_{PQc}(t_s) + \frac{V_{PQd}(t_s)}{2} - \Delta V_P \tag{14}$$

$$V_Q(t_s + t_d) = V_{PQc}(t_s) - \frac{V_{PQd}(t_s)}{2} - \Delta V_Q \tag{15}$$

Subtracting equation (15) from equation (14) we obtain

$$V_{PQd}(t_s + t_d) = V_{PQd}(t_s) - (\Delta V_P - \Delta V_Q) \tag{16}$$

Now we will derive analytical expressions for  $V_{PQd}(t_s)$ ,  $\Delta V_P$  and  $\Delta V_Q$ .

We start by computing  $V_{PQd}(t_s)$ . To this end, we linearize the circuit, based on the following observations:

- The differential input signal  $V_s$  is assumed to be small.
- The differential signal starts to be preamplified  $t_{ready}$  seconds after the beginning of the reset phase. If  $t_{ready}$  is greater than zero and represents a significant fraction of  $t_{rst}$  (e.g. at least 1/3 of the reset time), we can assume  $V_{PQc}$  to have settled at least partially by the time  $V_{PQd}$  begins to settle. This hypothesis is verified in several applications, such as SAR ADCs, and it allows us to neglect the variation of the small signal parameters caused by the settling of  $V_{PQc}$ . If the condition we just described does not hold, the circuit may still be analyzed by using small signal parameters averaged over the preamplifier's output common mode swing. In this case, of course, the error associated to the small signal approximation will be higher.

Within the linear approximation, the gain error can be computed by referring to the equivalent small-signal circuit depicted in Figure 4b. By analyzing the circuit in the Laplace domain, it is straightforward to show that

$$V_{PQd}(t_s) = -g_{m1,2}R_{PQ}(1 - e^{-\frac{t_s}{\tau}})V_s \tag{17}$$

where  $\tau = R_{PQ}C_{PQ}$ . Note that the quantity  $g_{m1,2}R_{PQ}$  represents the steady state gain  $A_{vd}^{\infty}$ .

In order to compute  $\Delta V_P$  and  $\Delta V_Q$  we have to take into account that  $M_7$  is on during the interval  $[t_s, t_s + t_d]$ . This implies that:

- The total tail current is given by $I_{tail} = I_{d7} + I_b$, with $I_{d7} \gg I_b$.
- The transconductance of $M_1$-$M_2$ increases significantly because their drain current increases. We denote by $g'_{m1,2}$ the transconductance of the input pair during dynamic preamplification.

For the sake of simplicity, $I_{d7}$ is assumed to be constant: this is obviously an approximation, as $M_7$ enters the triode region shortly after turning on and its current varies significantly during dynamic preamplification. With this hypothesis, $I_{d1}$ and $I_{d2}$ are also constant and we may write

$$\Delta V_P = \frac{I_{d1} t_d}{C_{PQ}} \tag{18}$$

$$\Delta V_Q = \frac{I_{d2} t_d}{C_{PQ}} \tag{19}$$

Subtracting the two equations we obtain

$$\Delta V_P - \Delta V_Q = \frac{(I_{d1} - I_{d2})t_d}{C_{PQ}} = \frac{g'_{m1,2}t_d V_s}{C_{PQ}} \tag{20}$$

If, as is usually done, we assume that the end of the preamplification phase coincides with the time instant in which $V_{PQc}$ falls below $V_{DD} - V_{th}$ [4], we may express $t_d$ as

$$t_d = \frac{C_{PQ}(V_{PQc}(t_s) - V_{th})}{I_{tail}} \tag{21}$$

Then, equation (20) can be rewritten as

$$\Delta V_P - \Delta V_Q = \frac{g'_{m1,2}(V_{PQc}(t_s) - V_{th})V_s}{I_{tail}} \tag{22}$$

The preamplification gain is defined as

$$A_{pre} \triangleq \frac{V_{PQd}(t_s + t_d)}{V_s} \tag{23}$$

Hence, we have

$$A_{pre} = A_{pre,s} + A_{pre,d} \tag{24}$$

where

$$\begin{cases} A_{pre,s} = -g_{m1,2}R_{PQ}(1 - e^{-\frac{t_s}{\tau}}) \\ A_{pre,d} = -\frac{g'_{m1,2}(V_{PQc}(t_s) - V_{th})}{I_{tail}} \end{cases} \tag{25}$$

We now recall the expression of the preamplification gain of the conventional Strong Arm [4]:

$$A_{pre}^{conv} = -\frac{g'_{m1,2}(V_{DD} - V_{th})}{I_{tail}} \tag{26}$$

Comparing equations (24) and (26), we notice that the proposed topology improves the preamplification gain with the addition of the term  $A_{pre,s}$ , but this comes at the expense of a decrease in the magnitude of the dynamic term  $A_{pre,d}$  because  $V_{PQc}(t_s)$  is less than  $V_{DD}$ . Despite this, the total gain can be expected to improve compared to the conventional Strong Arm because the static preamplifier can achieve a higher gain compared to a purely dynamic one. This is confirmed by the simulations, which show that the enhanced Strong Arm has better performance in terms of input-referred noise and offset with respect to its conventional counterpart.
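The comparison between equations (24)-(25) and equation (26) can be made concrete with a short numerical sketch. All values below are hypothetical and serve only to illustrate how the two gain terms combine; they are not the sizing adopted in this work. In particular, using the same $g'_{m1,2}$ and $I_{tail}$ for both topologies is a simplifying assumption.

```python
import math


def static_gain(gm_static, r_pq, c_pq, t_s):
    """A_pre,s from equation (25): partially settled static gain."""
    tau = r_pq * c_pq
    return -gm_static * r_pq * (1.0 - math.exp(-t_s / tau))


def dynamic_gain(gm_dynamic, v_start, v_th, i_tail):
    """A_pre,d from equation (25); with v_start = V_DD it yields equation (26)."""
    return -gm_dynamic * (v_start - v_th) / i_tail


# Hypothetical values for illustration only (assumptions, not the paper's sizing).
GM_STATIC = 1e-3       # S, input-pair transconductance during precharge
GM_DYNAMIC = 5e-3      # S, input-pair transconductance during evaluation
R_PQ = 2e3             # ohm, total load resistance at nodes P and Q
C_PQ = 20e-15          # F, capacitance at nodes P and Q
T_S = 200e-12          # s, time available for V_PQd to settle
V_DD, V_TH = 1.0, 0.4  # V
V_PQC = 0.8            # V, common-mode voltage of P and Q at the end of reset
I_TAIL = 1e-3          # A, total tail current during evaluation

a_s = static_gain(GM_STATIC, R_PQ, C_PQ, T_S)
a_d = dynamic_gain(GM_DYNAMIC, V_PQC, V_TH, I_TAIL)
a_conv = dynamic_gain(GM_DYNAMIC, V_DD, V_TH, I_TAIL)

print(f"proposed:     A_pre,s = {a_s:.3f}, A_pre,d = {a_d:.3f}, total = {a_s + a_d:.3f}")
print(f"conventional: A_pre   = {a_conv:.3f}")
```

With these assumed numbers the loss in the dynamic term is more than recovered by the static term, which is the qualitative behavior described above; the actual balance depends on the sizing and must be checked by simulation.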

# C. TRADE-OFF BETWEEN SETTLING ERROR AND ENERGY EFFICIENCY

#### 1) ENERGY-SETTLING TRADE-OFF AS $t_{rst}$ CHANGES

The presence of the static current generator  $I_b$  gives rise to a trade-off between the energy efficiency and the settling error affecting the static preamplification gain. Assume that it is possible to change the duration of the reset phase  $t_{rst}$ . A shorter reset time results in smaller power consumption, at the expense of the static preamplification gain  $A_{pre,s}$ . If, instead, we increase  $t_{rst}$  the gain improves because the preamplifier is given more time to settle. However, power consumption increases because the tail generator draws more charge from the supply. After a certain point, increasing the preamplification time becomes counterproductive because the gain saturates. To formalize this reasoning we propose the following reward function:

$$\mathcal{R}(t_{rst}) = \frac{\left|A_{pre,s}(t_{rst})\right|}{A_{pre,s}^{\infty}E} = \frac{1 - e^{-\frac{t_{rst} - t_{ready}}{\tau}}}{E_{dyn} + V_{DD}I_{b}t_{rst}} \tag{27}$$

where $E_{dyn}$ is the energy absorbed to charge and discharge the parasitic capacitances during a clock cycle and $V_{DD}I_bt_{rst}$ is the energy absorbed by the static tail generator during a clock cycle. The quantity $E(t_{rst}) \triangleq E_{dyn} + V_{DD}I_bt_{rst}$ represents the total energy absorbed by the circuit to perform a comparison (reset phase included). Hence, $\mathcal{R}(t_{rst})$ is the ratio between the (normalized) static preamplification gain and the energy per comparison. The following observations must be made before carrying on with the analysis:

- The dynamic term is assumed to be constant because we hypothesize that $t_{rst}$ is long enough for $V_{PQc}$ to achieve at least a rough settling and, in general, that the clock period is enough for all the main charge/discharge transients to settle. The transient of $V_{PQd}$ is not included in this hypothesis because the energy consumption associated with it is negligible (recall that $V_{id}$ is small).
- $A_{pre,s}$ has been rewritten as a function of $t_{rst}$ by using the relation $t_s = t_{rst} - t_{ready}$.

We now wish to study the behavior of  $\mathcal{R}(t_{rst})$  in the interval  $[t_{ready}, \infty)$ . To this end, we take its first derivative:

$$\frac{\partial \mathcal{R}}{\partial t_{rst}} = \frac{\frac{1}{\tau}e^{-\frac{t_{rst}-t_{ready}}{\tau}}E(t_{rst}) - V_{DD}I_b(1 - e^{-\frac{t_{rst}-t_{ready}}{\tau}})}{E(t_{rst})^2} \tag{28}$$

In order to maximize $\mathcal{R}(t_{rst})$ we should solve the transcendental equation $\partial \mathcal{R}(t_{rst})/\partial t_{rst} = 0$. Although this equation does not have a closed-form solution, we can prove that $\mathcal{R}(t_{rst})$ has a maximum. Indeed, by inspecting the numerator of equation (28) it is easy to recognize that it is the sum of a strictly positive term and a negative term. The positive term, that is, $\frac{1}{\tau}e^{-\frac{t_{rst}-t_{ready}}{\tau}} E(t_{rst})$, vanishes as $t_{rst} \to \infty$. The negative term, that is, $-V_{DD}I_b(1-e^{-\frac{t_{rst}-t_{ready}}{\tau}})$, is null when $t_{rst} = t_{ready}$ and tends to a finite negative value as $t_{rst} \to \infty$. The derivative is therefore positive at $t_{rst} = t_{ready}$ and negative for sufficiently large $t_{rst}$, so $\mathcal{R}(t_{rst})$ must have at least one maximum point in the interval $[t_{ready}, \infty)$ because it is a continuous function. It is also possible to prove that the maximum is unique; however, the proof is omitted.

At this point, it should be remarked that usually the designer is not allowed to choose $t_{rst}$, because the frequency and duty cycle of the clock signal are both fixed. In this case, we can rewrite $\mathcal{R}$ as a function of $I_b$ and $R_{PQ}$ and formulate a constrained optimization problem in which $R_{PQ} = R_{PQ}^*$. This requires us to distinguish between the case in which $V^* > V_{th}$ and the case in which $V^* < V_{th}$.
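For the case in which $t_{rst}$ can be chosen freely, the maximizer of equation (27) can be located numerically even though no closed-form solution exists. The following sketch uses a plain grid search; the function names and all numerical values are hypothetical and are meant only to illustrate the shape of the trade-off.

```python
import math


def reward(t_rst, t_ready, tau, e_dyn, v_dd, i_b):
    """Reward function of equation (27), defined for t_rst >= t_ready."""
    settled = 1.0 - math.exp(-(t_rst - t_ready) / tau)  # normalized static gain
    energy = e_dyn + v_dd * i_b * t_rst                 # energy per comparison
    return settled / energy


def best_reset_time(t_ready, tau, e_dyn, v_dd, i_b, t_max, points=100000):
    """Grid search of the reset time that maximizes the reward on [t_ready, t_max]."""
    best_t, best_r = t_ready, 0.0
    for j in range(points + 1):
        t = t_ready + (t_max - t_ready) * j / points
        r = reward(t, t_ready, tau, e_dyn, v_dd, i_b)
        if r > best_r:
            best_t, best_r = t, r
    return best_t, best_r


# Hypothetical values for illustration only (assumptions).
T_READY = 100e-12  # s, worst-case input settling time
TAU = 40e-12       # s, time constant R_PQ * C_PQ
E_DYN = 20e-15     # J, dynamic energy per clock cycle
V_DD = 1.0         # V
I_B = 50e-6        # A, static tail current
T_MAX = 2e-9       # s, upper bound of the search interval

t_opt, r_opt = best_reset_time(T_READY, TAU, E_DYN, V_DD, I_B, T_MAX)
print(f"reward is maximized near t_rst = {t_opt * 1e12:.1f} ps")
```

A grid search is used here for transparency; any one-dimensional maximizer would do, since the reward is smooth on the search interval.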

# 2) ENERGY-SETTLING TRADE-OFF AS $I_b$ CHANGES AND $V^* > V_{th}$

As shown in subsection IV-A, the condition $V^* > V_{th}$ implies $R_{PQ}^* = 1/\left(2\sqrt{k(I_h - \tilde{I})}\right)$, where $I_h = I_b/2$. By substituting this relationship into the expression of $\mathcal{R}$ we obtain

$$\mathcal{R}(I_b) = \frac{1 - e^{-\frac{2t_s}{C_{PQ}}\sqrt{k(\frac{I_b}{2} - \tilde{I})}}}{E_{dyn} + V_{DD}I_b t_{rst}} \quad (29)$$

where we used the fact that  $t_s = t_{rst} - t_{ready}$  to simplify the expression. Similarly to the previous paragraph, the equation  $\partial \mathcal{R}(I_b)/\partial I_b = 0$  does not have a closed-form solution, but we can prove the existence of at least one maximum point. Indeed, the numerator of the derivative  $\partial \mathcal{R}(I_b)/\partial I_b$  consists of two terms, which we denote by  $A(I_b)$  and  $B(I_b)$ , whose expressions are reported below:

$$A(I_b) = \frac{2kt_{pre}e^{-\frac{t_s}{2C_{PQ}}\sqrt{k(\frac{I_b}{2}-\tilde{I})}}}{C_{PQ}\sqrt{\frac{I_b}{2}-\tilde{I}}}(E_{dyn}+V_{DD}t_{rst}(\frac{I_b}{2}-\tilde{I}))$$
(30)

$$B(I_b) = -\frac{V_{DD}t_{rst}}{2} (1 - e^{-\frac{t_s}{2C_{PQ}}\sqrt{k(\frac{I_b}{2} - \tilde{I})}})$$
(31)

Clearly,  $A(I_b) > 0$  for every  $I_b \ge 0$  and  $\lim_{I_b \to \infty} A(I_b) = 0$ . Moreover, we have that  $B(I_b) < 0$  for every  $I_b > 0$ ,  $B(I_b = 0) = 0$  and  $\lim_{I_b \to \infty} B(I_b) < 0$ . Hence,  $\mathcal{R}(I_b)$  has at least one maximizer.

# 3) ENERGY-SETTLING TRADE-OFF AS $I_b$ CHANGES AND $V^* < V_{th}$

When  $V^* < V_{th}$  there is no closed-form expression for  $V^*$ . However, as explained in subsection IV-A, we may carry out an approximated analysis if we hypothesize that  $I_{d3,4}(V^*) \ll$  $I_h$  and that  $I_{d3,4}$  starts to increase abruptly when  $V_{gs} >$  $V^*$ . With these assumptions, we have that  $R^*_{PQ} \approx 2V^*/I_b$ . In addition, we approximate  $V^*$  as independent from  $I_b$ . This assumption is not backed by mathematical considerations, but it works well in practice, as we shall demonstrate in section VI. By substituting the relationship  $R^*_{PQ} = 2V^*/I_b$  in the expression of  $\mathcal{R}$  we obtain

$$\mathcal{R}(I_b) = \frac{1 - e^{-\frac{I_b t_s}{V_{th} C_{PQ}}}}{E_{dyn} + V_{DD} I_b t_{rst}}$$
(32)

In this case, the function $\mathcal{R}(I_b)$ has the same form as $\mathcal{R}(t_{rst})$, which means the considerations we made for $\mathcal{R}(t_{rst})$ also hold for $\mathcal{R}(I_b)$. Specifically, for a given $t_{rst}$ and a given $t_{ready}$ there exists an optimum choice of the tail current, which we denote as $I_b^*$, that maximizes the objective function.
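For the subthreshold case, the optimum tail current $I_b^*$ can be located numerically. The sketch below assumes $\tau = R^*_{PQ} C_{PQ}$ with $R^*_{PQ} = 2V^*/I_b$ and $V^*$ constant, so the exponent used here is $I_b t_s/(2V^*C_{PQ})$. It also relies on $\mathcal{R}(I_b)$ being unimodal, which ternary search requires. Function names and numerical values are illustrative assumptions.

```python
import math


def reward_ib(i_b, t_s, t_rst, v_star, c_pq, e_dyn, v_dd):
    """Assumed reward versus tail current, with R_PQ ~= 2 * V* / I_b."""
    tau = (2.0 * v_star / i_b) * c_pq      # settling time constant
    settled = 1.0 - math.exp(-t_s / tau)
    energy = e_dyn + v_dd * i_b * t_rst    # dynamic term plus static tail energy
    return settled / energy


def ternary_argmax(f, lo, hi, iterations=200):
    """Maximize a unimodal function on [lo, hi] by ternary search."""
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if f(m1) < f(m2):
            lo = m1   # the maximum lies to the right of m1
        else:
            hi = m2   # the maximum lies to the left of m2
    return 0.5 * (lo + hi)


# Illustrative values (assumptions).
T_RST = 250e-12     # s
T_S = 125e-12       # s, t_rst - t_ready
V_STAR = 0.28       # V
C_PQ = 6e-15        # F
E_DYN = 11.5e-15    # J
V_DD = 1.0          # V


def r_of_ib(i_b):
    return reward_ib(i_b, T_S, T_RST, V_STAR, C_PQ, E_DYN, V_DD)


i_opt = ternary_argmax(r_of_ib, 1e-6, 500e-6)
print(f"I_b* = {i_opt * 1e6:.1f} uA")
```

The lower bound of the search interval is kept strictly positive because the assumed $R^*_{PQ}$ diverges as $I_b \to 0$.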

# V. SIZING

To assess the effectiveness of the proposed approach, the enhanced Strong Arm comparator was designed and laid out in Cadence Virtuoso in a 55 nm CMOS technology with 1 V supply. A standard Strong Arm comparator was also designed and laid out in the same technology to provide a reliable benchmark. Both layouts were carefully optimized so as to maximize symmetry, in order to limit the systematic input-referred offset, and are shown in Figure 5. The area occupation of the conventional Strong Arm latch is about 12.38 $\mu$m × 9.44 $\mu$m $\approx$ 116.9 $\mu$m<sup>2</sup>; the enhanced comparator, instead, occupies an area of approximately 14.2 $\mu$m × 10.26 $\mu$m $\approx$ 145.7 $\mu$m<sup>2</sup>. All the results reported in section VI have been obtained from post-layout simulations, except where otherwise stated. Table 1 shows the sizing of the active devices in the conventional and in the proposed comparator. The two circuits were sized so as to have the same delay. The channel widths of the active devices are almost identical for the two circuits, the only exceptions being $S_1$, $S_2$, $S_3$, $S_4$ and $M_7$. In the proposed topology the width of $M_7$ is smaller because $V_{PQc}$ is precharged to $V_{DD} - V_{PQc}(t_s)$, which causes the latch to turn on earlier. For this reason, the tail current is reduced in order to slow down the discharge of the intermediate nodes and ensure that the two comparators have the same delay.

A small aspect ratio is chosen for  $S_1$ - $S_2$  in order to use their equivalent resistance as part of the load; in other words, we have  $R_D = R_{switch} + R$ , where  $R_{switch}$  is the on resistance of the switches. This helps reduce the area and the parasitic capacitance associated to the resistive load because the same value of  $R_D$  can be achieved with a smaller R. The aspect ratio of  $S_3$ - $S_4$  is also small because in the proposed comparator these transistors are not responsible for equalizing the outputs.

The proposed topology also features additional components, namely $S_5$, $R_1$ and $R_2$. The value of $R_1$ and $R_2$, which is not reported in the table, is 10 k$\Omega$ for both resistors. As will be shown in section VI, this value is suboptimal compared to the one that would maximize $R_{PQ}$. This choice was made because it led to slightly better performance in terms of power consumption and settling error compared to $R_D^*$. Nonetheless, $R_D^*$ is generally worth estimating because it represents a good starting point from which the designer can iterate. The static tail generator is implemented as a MOS device with $W = 1 \ \mu m$ and $L = 60 \ nm$, biased by a current mirror. The static tail current $I_b$ is 40 $\mu$A.

It is worth mentioning that the conventional Strong Arm latch with a preamplifier (as shown in figure 2) was not simulated because the proposed topology has a smaller EDP compared to the conventional Strong Arm. Therefore, if a stand-alone static preamplifier were added to both topologies, the proposed comparator would still perform better.

**TABLE 1.** Sizing of the conventional and proposed comparators. All transistors have minimum channel length, that is, L = 60 nm.

| Device         | W [µm], conventional | W [µm], proposed |
|----------------|----------------------|------------------|
| $M_1$          | 8                    | 8                |
| $M_2$          | 8                    | 8                |
| $M_3$          | 2                    | 2                |
| $M_4$          | 2                    | 2                |
| $M_5$          | 2                    | 2                |
| $M_6$          | 2                    | 2                |
| $M_7$          | 16                   | 6                |
| $S_1, S_2$     | 0.5                  | 0.2              |
| $S_{3}, S_{4}$ | 0.5                  | 0.15             |
| $S_5$          | -                    | 0.4              |

#### **VI. SIMULATIONS**

# A. VALIDATION OF THE THEORETICAL ANALYSIS

#### 1) LOAD RESISTANCE OPTIMIZATION

This section contains a validation of the analysis developed in subsection IV-A. The validation is based on pre-layout simulations because it requires a sweep on the value of R. As already explained in section V, in our sizing $R_D$ does not coincide with R because the on resistance of $S_1$-$S_2$ is not negligible; specifically we have $R_D = R + R_{switch}$, where $R_{switch}$ lies between approximately 6 k$\Omega$ and 9 k$\Omega$. In the discussion that follows we assume $R_{switch} \approx 7.5 \text{ k}\Omega$. Because $\tilde{I}$ is several hundreds of $\mu A$ and $I_h = 20 \ \mu A$ we have $I_h < 2\tilde{I}$, which implies that $M_3$-$M_4$ are biased in the subthreshold region for $R_D = R_D^*$.

Figure 6 shows the simulated behavior of the small-signal resistance of the preamplifier load as a function of *R*. In accordance with the theory, $R_{PQ}$ has a unique maximum point at $R = R^* \triangleq 8 \ k\Omega$, which implies $R_D^* \approx 15 \ k\Omega$. It should be noted that $R_D^*$ does not have an analytical expression because the optimum falls in the subthreshold region. Despite this, we can compare the results of the simulation to the approximated analysis developed in section IV-A because $I_{d3,4}(R_D^*) \approx 0.62 \ \mu A \ll I_h$. The simulation yields $V^* \approx 280 \ mV$, which implies $R_{PQ}^* \approx R_D^* \approx V^*/I_h = 14 \ k\Omega$ and, consequently, $R^* \approx 6.5 \ k\Omega$. It is also interesting to notice that for $R < R^*$ the curve $R_{PQ}(R)$ is well approximated by a line whose slope is 1. The curve is shifted upwards because of the series resistance of the switch ($R_{PQ}(R = 0) = 6 \ k\Omega$, which coincides with the switch resistance).
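The estimate above can be reproduced with a few lines of arithmetic. The Python sketch below uses only the values quoted in this section ($V^* \approx 280$ mV, $I_b = 40\ \mu$A, $R_{switch} \approx 7.5$ k$\Omega$); the variable names are illustrative.

```python
V_STAR = 0.280      # V, simulated V*
I_B = 40e-6         # A, static tail current
I_H = I_B / 2       # A, current in each branch
R_SWITCH = 7.5e3    # ohm, assumed on resistance of the load switches

r_pq_est = V_STAR / I_H        # approximated optimum small-signal load resistance
r_est = r_pq_est - R_SWITCH    # corresponding value of the physical resistor R

print(f"R_PQ* ~= {r_pq_est / 1e3:.1f} kOhm, R* ~= {r_est / 1e3:.1f} kOhm")
```

The approximated values (14 k$\Omega$ and 6.5 k$\Omega$) can then be compared with the simulated optimum of Figure 6 ($R_D^* \approx 15$ k$\Omega$ and $R^* = 8$ k$\Omega$).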

#### 2) ENERGY-SETTLING TRADE-OFF

This section validates the theory developed in subsection IV-C regarding the trade-off between energy efficiency and settling error affecting the preamplification gain. Figures 7 and 8 compare the theoretical and simulated behavior of the reward function $\mathcal{R}$ as $t_{rst}$ and $I_b$ vary, respectively. In both cases the simulated curves have been obtained from pre-layout simulations. In order to plot the theoretical behavior of $\mathcal{R}$, the constants $E_{dyn}$ and $C_{PQ}$ were extracted from the simulator. Their values are 11.5 fJ and 6 fF, respectively.

The figures confirm the existence of an optimum when $\mathcal{R}$ is varied both as a function of $t_{rst}$ and as a function of $I_b$. The theoretical curve in Figure 8 was obtained by using the approximation $R_{PQ} \approx V^*/I_h$ and assuming $V^*$ to be independent from $I_b$. The agreement between theoretical and simulated curves demonstrates that this hypothesis works well in practice, at least for $I_{d3,4}(V^*) \ll I_h$.

Table 2 summarizes the optimum values obtained from the theoretical and simulated curves shown in Figures 7 and 8, again highlighting good agreement between theory and simulations. It is worth noticing that the function $\mathcal{R}$ does not represent an actual physical quantity and has been defined somewhat arbitrarily. Its relevance lies especially in the fact that it allows visualizing the trade-off between settling error and energy and it provides a useful criterion to size the circuit. Finally, in order to showcase the importance of optimizing the energy-settling trade-off, we report in Figure 9 the behavior of noise, offset and energy consumption as the duty cycle (and hence $t_{rst}$) varies. When the duty cycle of the comparator is increased, $E_{tot}$ decreases linearly at the expense of the input-referred noise and offset which, to a first approximation, are inversely proportional to the preamplification gain. Conversely, the energy absorption can be traded to improve noise and offset by giving the preamplifier more time to settle. The proposed reward function provides an immediate and intuitive way to manage this compromise. By using a simulation-aided approach to maximize $\mathcal{R}$ as a function of either $t_{rst}$ or $I_b$, the designer can easily find an optimal (or near-optimal) sizing.

**FIGURE 5.** Layout of the proposed a) and conventional b) comparator.

**FIGURE 6.** Simulated (pre-layout) small-signal load resistance $R_{PQ}$ as a function of R.

**FIGURE 7.** Theoretical and simulated (pre-layout) behavior of $\mathcal{R}$ as a function of $t_{rst}$, with $I_b = 40 \ \mu$A.

**TABLE 2.** Optimum values of $t_{rst}$ and $I_b$ based on the theoretical and simulated reward function $\mathcal{R}$.

|                  | Theoretical | Simulated |
|------------------|-------------|-----------|
| $t_{rst}^*$ [ps] | 275         | 263       |
| $I_b^*$ [µA]     | 40          | 45        |

**FIGURE 8.** Theoretical and simulated (pre-layout) behavior of $\mathcal{R}$ as a function of $I_b$, with $t_{rst} = T_{CLK}/2 = 250$ ps.

**FIGURE 9.** Noise, offset and energy absorption of the proposed comparator versus duty cycle at $f_{CLK} = 2$ GHz (post-layout). A higher duty cycle implies a shorter $t_{rst}$.

### **B. PERFORMANCE OF THE COMPARATOR**

The comparator was tested at 2 GHz clock frequency by applying an input differential voltage $V_{id}$ such that $|V_{id}| = 1$ mV. An input common mode of $V_{DD}/2 = 0.5$ V was used for all the simulations. The sign of $V_{id}$ toggled every 2 cycles during the central part of the reset/preamplification phase. By using the notation introduced in section IV, this means that $t_{ready} = T_{ck}/4$. A static differential voltage was superimposed to the signal to compensate the residual systematic offset caused by parasitics, which was estimated to be $\approx 1$ mV in the proposed topology and $\approx 0.6$ mV in the conventional Strong Arm latch. Each output of the comparator was loaded with a 2 fF capacitor.

**FIGURE 10.** Transient behavior of the output signals of the proposed and conventional Strong Arm comparator.

Figure 10 compares the transient behavior of the output voltages in the proposed and in the conventional Strong Arm latch. The figure highlights that, as expected, the delays of the two comparators are almost identical. Specifically, the simulated delay is about 80.0 ps for the proposed comparator and 79.4 ps for the conventional comparator. It should be noted that the choice of sizing the comparators to have the same delay is arbitrary, and that the designer could choose to obtain an advantage in terms of speed at the expense of power consumption, for instance by increasing the aspect ratio of $M_7$. Figure 10 also shows that $V_P$ and $V_Q$ diverge slightly during the reset phase. It is important to highlight that this phenomenon, which is caused by the capacitive feedthrough of the outputs' reset transient, can ease preamplification or penalize it depending on the sign of the input differential voltage. In order to clarify this point, suppose that at a given clock cycle $V_{id} > 0$. If, at the subsequent clock cycle, the sign of the input differential voltage remains unchanged, preamplification starts from an unfavorable condition because the capacitive coupling causes $V_{PQd}$ to be positive at the beginning of the reset phase (note that $V_{id} > 0$ implies that $V_{PQd}$ should be negative). If, instead, the sign of $V_{id}$ toggles with respect to the previous decision, preamplification is favored because the capacitive feedthrough causes $V_{PQd}$ to be already negative when the reset phase begins. All the results reported below have been obtained by considering the less favorable case.



**FIGURE 11.** Transient behavior of the output differential voltage in the proposed and in the conventional comparator during the reset phase.



**FIGURE 12.** Transient behavior of $V_{PQd}$ in the proposed comparator.

Figure 11 compares the output differential voltages in the proposed and in the conventional comparator during the reset phase. The combination of equalization and pull-up transistors allows the proposed comparator to cancel the memory effect faster than the conventional topology. At the same time, the total channel width of $S_3$-$S_4$-$S_5$ in the proposed topology is only 0.7 $\mu$m, while in the conventional Strong Arm the total width of $S_3$-$S_4$ amounts to 1 $\mu$m. Because all transistors have the same channel length, the total parasitic capacitance associated with the reset switches is smaller in the proposed topology, which in turn results in lower power consumption.

Figure 12 shows a detail of the transient behavior of $V_P$ and $V_Q$ during static preamplification. It should be noted that this plot has been obtained from the pre-layout version of the comparator because in the post-layout version the presence of systematic offset makes it more difficult to interpret the behavior of the two voltages. The differential signal $V_{PQd}$ is initially positive due to capacitive feedthrough. Around t = 0.56 ns the sign toggles because the input differential voltage is positive, which implies that $V_{PQd}$ should converge to a negative value. The maximum value reached by $|V_{PQd}|$ before the beginning of the evaluation phase is approximately 2.4 mV, which corresponds to a gain of about 7.6 dB. The discontinuity in the slope of the two curves around t = 0.64 ns is due to clock feedthrough (caused by the rising edge of the clock).
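As a quick check of the quoted gain, the ratio between the peak $|V_{PQd}|$ and the input amplitude $|V_{id}| = 1$ mV can be converted to decibels. The snippet below is a minimal Python sketch with illustrative variable names.

```python
import math

V_ID = 1e-3      # V, input differential amplitude
V_PQD = 2.4e-3   # V, peak |V_PQd| before the evaluation phase

gain = V_PQD / V_ID                  # linear preamplification gain
gain_db = 20.0 * math.log10(gain)    # voltage gain in dB

print(f"{gain:.1f} V/V = {gain_db:.1f} dB")
```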

**TABLE 3.** Performance comparison between the conventional Strong Arm comparator and proposed topology.

|                        | Conventional | Proposed | Improvement |
|------------------------|--------------|----------|-------------|
| EDP [fJ/GHz]           | 3.53         | 3.38     | 4.25%       |
| $\sigma_{offset}$ [mV] | 8.09         | 5.36     | 33.8%       |
| $\sigma_{IRN}$ [mV]    | 1.13         | 0.808    | 28.5%       |

Table 3 compares the performance of the proposed comparator to that of the conventional Strong Arm. Overall, noise and offset experience the largest improvement: this is in accordance with the theory because noise and offset suppression are enhanced as the preamplification gain increases. The EDP also decreases, which means the proposed topology relaxes all the main trade-offs despite the additional power consumed by the static tail generator. We summarize below the three aspects that contribute to improving the energy efficiency:

- Nodes P and Q are precharged only partially, because during the reset phase $V_{PQc}$ settles to $V_{DD} - V_{PQc}(t_s) < V_{DD}$. This reduces dynamic power consumption.
- If the comparator is sized to achieve the same delay as the conventional Strong Arm latch, the aspect ratio of the clocked tail transistor $M_7$ can be reduced because the partial precharge of nodes P and Q causes the latch to turn on earlier. This reduces the parasitic capacitance at the source node and thus power consumption.
- The reset technique implemented at the output nodes helps save power because the total parasitic capacitance of the reset switches can be smaller.

The only cost of such improvements is a small increase in area occupation due to the addition of  $R_1$ ,  $R_2$  and  $S_5$ .
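The improvement figures of Table 3 can be recomputed as relative reductions with respect to the conventional comparator. The Python sketch below uses the rounded table entries, so the last digit may differ slightly from the values reported in the table; the helper name is illustrative.

```python
def improvement_percent(conventional, proposed):
    """Relative reduction of a metric with respect to the conventional design."""
    return 100.0 * (conventional - proposed) / conventional


# (conventional, proposed) pairs taken from Table 3.
table3 = {
    "EDP [fJ/GHz]": (3.53, 3.38),
    "sigma_offset [mV]": (8.09, 5.36),
    "sigma_IRN [mV]": (1.13, 0.808),
}

improvements = {
    name: improvement_percent(conv, prop) for name, (conv, prop) in table3.items()
}

for name, value in improvements.items():
    print(f"{name}: {value:.2f} %")
```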

**TABLE 4.** Performance summary of the proposed Strong Arm comparator with integrated preamplifier under PVT variations. In the table  $V_{DD}^{max} = V_{DD} + 10\% V_{DD}$ ,  $V_{DD}^{min} = V_{DD} - 10\% V_{DD}$ ,  $T^{max} = 80$  °C and  $T^{min} = 0$  °C.

|                     | $V_{DD}^{min}$ | $V_{DD}^{max}$ | $T^{min}$ | $T^{max}$ | FF   | SS    | FS   | SF   |
|---------------------|----------------|----------------|-----------|-----------|------|-------|------|------|
| $P_{avg}$ [µW]      | 65.5           | 106.1          | 82.1      | 89.4      | 88.8 | 81.7  | 84.3 | 84.6 |
| $t_d$ [ps]          | 105.8          | 65.3           | 80.3      | 81.0      | 69.0 | 113.9 | 77.5 | 87.7 |
| EDP [fJ/GHz]        | 3.46           | 3.47           | 3.30      | 3.62      | 3.06 | 4.65  | 3.29 | 3.71 |
| $\sigma_{IRN}$ [mV] | 0.84           | 0.79           | 0.75      | 0.92      | 0.84 | 0.80  | 0.90 | 0.73 |

Table 4 shows the performance of the enhanced Strong Arm comparator under process, voltage and temperature (PVT) variations. The comparator shows good robustness under a wide range of operating conditions. The offset of the proposed comparator was evaluated over 200 Monte Carlo mismatch iterations. Figure 13 shows the resulting histogram. As already reported in table 3, the input-referred offset of the proposed comparator exhibits a standard deviation of 5.36 mV. The histogram shows that the offset remains within the [-15 mV, 15 mV] range. The offset distribution of the conventional Strong Arm comparator, on the other hand, has higher variance and covers the [-20 mV, 20 mV] range.

**FIGURE 13.** Histogram of the input-referred offset of the proposed comparator under mismatch variations.
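To relate the reported standard deviations to the observed offset ranges, one can estimate the fraction of samples expected inside each range. The sketch below assumes a zero-mean Gaussian offset distribution, which is an assumption made here for illustration and is not stated in the text; the standard deviations are those of Table 3 and the function name is hypothetical.

```python
import math


def fraction_within(limit_mv, sigma_mv):
    """P(|x| < limit) for a zero-mean Gaussian with standard deviation sigma."""
    return math.erf(limit_mv / (sigma_mv * math.sqrt(2.0)))


N_RUNS = 200  # number of Monte Carlo iterations

p_prop = fraction_within(15.0, 5.36)   # proposed comparator, +/-15 mV range
p_conv = fraction_within(20.0, 8.09)   # conventional comparator, +/-20 mV range

print(f"proposed: {100 * p_prop:.2f} % inside, "
      f"about {N_RUNS * (1 - p_prop):.1f} of {N_RUNS} runs expected outside")
print(f"conventional: {100 * p_conv:.2f} % inside, "
      f"about {N_RUNS * (1 - p_conv):.1f} of {N_RUNS} runs expected outside")
```

Under this assumption both ranges correspond to roughly 2.5 to 2.8 standard deviations, so only a few of the 200 runs would be expected to fall outside them.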

#### C. COMPARISON WITH STATE OF THE ART

Table 5 compares the performance of the proposed comparator to the recent state of the art. As the table shows, our topology achieves the best performance in terms of input-referred noise and energy per comparison, the only exceptions being [13], which has better energy efficiency, and [10], which exhibits lower input-referred noise. The input-referred offset of the enhanced Strong Arm comparator is the highest among those reported in the literature columns of the table; however, the comparison with the conventional Strong Arm latch demonstrates that the proposed approach is also beneficial in terms of offset, as already highlighted in the previous subsection. Overall, the proposed technique allows for a considerable improvement in terms of noise and input-referred offset compared to the conventional Strong Arm latch. This is even more interesting if we consider that these benefits do not come at the expense of EDP, which on the contrary also improves.

 
**TABLE 5.** Comparison between recent literature and the topologies that have been simulated in this work.

|                        | This work (Prop.) | This work (Conv.) | [15] | [13] | [2]   | [3]  | [10]  |
|------------------------|-------------------|-------------------|------|------|-------|------|-------|
| Year                   | 2023              | 2023              | 2019 | 2021 | 2020  | 2020 | 2022  |
| Technology [nm]        | 55                | 55                | 65   | 22   | 90    | 65   | 22    |
| $V_{DD}$ [V]           | 1                 | 1                 | 1.2  | 0.8  | 1.2   | 1    | 0.8   |
| $V_{id}$ [mV]          | 1                 | 1                 | 10   | 1    | 0.03  | 2    | 0.174 |
| $f_{ck}$ [GHz]         | 2                 | 2                 | 6    | 4    | 2     | 1    | 1     |
| $P_{avg}$ [µW]         | 84.4              | 89.1              | 381  | 144  | 106   | 157  | 75    |
| $t_d$ [ps]             | 80.0              | 79.4              | 57.7 | 36   | 190   | 167  | 280   |
| EDP [fJ/GHz]           | 3.38              | 3.53              | 2.16 | 1.40 | 10.07 | 26.2 | 21    |
| E [fJ/comp]            | 42.2              | 44.6              | 63.5 | 36.0 | 53.0  | 78.5 | 75    |
| $\sigma_{IRN}$ [mV]    | 0.808             | 1.13              | -    | 1.4  | -     | -    | 0.174 |
| $\sigma_{offset}$ [mV] | 5.36              | 8.28              | 3.87 | -    | -     | 2.05 | -     |
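The energy and EDP entries for the two comparators simulated in this work can be reproduced from $P_{avg}$, $f_{ck}$ and $t_d$. The Python sketch below assumes that the energy per comparison is $P_{avg}/f_{ck}$ and that the EDP is that energy multiplied by the delay; these definitions are inferred from the table values rather than stated explicitly, and the function names are illustrative.

```python
def energy_per_comparison_fj(p_avg_uw, f_ck_ghz):
    """Energy per comparison in fJ (1 uW / 1 GHz = 1 fJ)."""
    return p_avg_uw / f_ck_ghz


def edp_fj_per_ghz(p_avg_uw, f_ck_ghz, t_d_ps):
    """Energy-delay product in fJ/GHz, i.e. fJ*ns (1 fJ*ps = 1e-3 fJ*ns)."""
    return energy_per_comparison_fj(p_avg_uw, f_ck_ghz) * t_d_ps * 1e-3


# Values for the two comparators simulated in this work (Table 5).
e_prop = energy_per_comparison_fj(84.4, 2.0)
e_conv = energy_per_comparison_fj(89.1, 2.0)
edp_prop = edp_fj_per_ghz(84.4, 2.0, 80.0)
edp_conv = edp_fj_per_ghz(89.1, 2.0, 79.4)

print(f"proposed:     E = {e_prop:.2f} fJ, EDP = {edp_prop:.2f} fJ/GHz")
print(f"conventional: E = {e_conv:.2f} fJ, EDP = {edp_conv:.2f} fJ/GHz")
```

Small differences in the last digit with respect to the table are expected because the inputs are rounded.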

#### **VII. CONCLUSION**

In this paper a new enhanced Strong Arm latch is introduced and validated. The proposed topology exploits the input differential pair to preamplify the input difference during the reset phase. An additional tail transistor, biased by a current mirror, limits static power consumption while allowing for current to flow through the switched load resistors. This benefits the comparator's performance in two ways. Firstly, the input differential signal is amplified at the comparator's intermediate nodes, improving noise, offset and, to a lesser extent, EDP with respect to the conventional Strong Arm latch. Secondly, the intermediate nodes are never fully charged during preamplification. This characteristic, combined with the fact that the clocked tail device can be made smaller, causes power consumption to improve with respect to the conventional topology. Additionally, we showed that resetting the outputs with a combination of pull-up and equalizing devices allows the total area of the reset devices to be reduced with respect to the conventional Strong Arm topology, which instead only uses pull-up devices. This results in a further improvement in power consumption. In order to provide a benchmark, a conventional Strong Arm latch was sized and simulated. Our simulations show that the proposed approach relaxes all the main trade-offs that characterize dynamic comparators: EDP, offset and noise are reduced respectively by about 5.24%, 33.8% and 28.5% compared to the conventional Strong Arm latch.

#### REFERENCES

- [1] A. Alshehri, M. Al-Qadasi, A. S. Almansouri, T. Al-Attar, and H. Fariborzi, "StrongARM latch comparator performance enhancement by implementing clocked forward body biasing," in *Proc. 25th IEEE Int. Conf. Electron., Circuits Syst. (ICECS)*, Dec. 2018, pp. 229–232.
- [2] X. Zhang, S. Li, R. Siferd, and S. Ren, "High-sensitivity high-speed dynamic comparator with parallel input clocked switches," *AEU-Int. J. Electron. Commun.*, vol. 122, Jul. 2020, Art. no. 153236.
- [3] R. K. Siddharth, Y. J. Satyanarayana, Y. B. N. Kumar, M. H. Vasantha, and E. Bonizzoni, "A 1-V, 3-GHz strong-arm latch voltage comparator for high speed applications," *IEEE Trans. Circuits Syst. II, Exp. Briefs*, vol. 67, no. 12, pp. 2918–2922, Dec. 2020.
- [4] B. Razavi, "The StrongARM latch [a circuit for all seasons]," *IEEE Solid-State Circuits Mag.*, vol. 7, no. 2, pp. 12–17, Spring 2015.
- [5] V. Spinogatti, C. Bocciarelli, F. Centurelli, R. D. Sala, and A. Trifiletti, "Robust body biasing techniques for dynamic comparators," in *Proc. 18th Conf. Ph.D Res. Microelectron. Electron. (PRIME)*, Jun. 2023, pp. 25–28.
- [6] A. Almansouri, A. Alturki, A. Alshehri, T. Al-Attar, and H. Fariborzi, "Improved StrongARM latch comparator: Design, analysis and performance evaluation," in *Proc. 13th Conf. Ph.D. Res. Microelectron. Electron.* (*PRIME*), Jun. 2017, pp. 89–92.
- [7] H. S. Bindra, C. E. Lokin, A.-J. Annema, and B. Nauta, "A 30 fJ/comparison dynamic bias comparator," in *Proc. 43rd IEEE Eur. Solid State Circuits Conf.*, Sep. 2017, pp. 71–74.
- [8] H. S. Bindra, C. E. Lokin, D. Schinkel, A.-J. Annema, and B. Nauta, "A 1.2-V dynamic bias latch-type comparator in 65-nm CMOS with 0.4-mV input noise," *IEEE J. Solid-State Circuits*, vol. 53, no. 7, pp. 1902–1912, Jul. 2018.
- [9] X. Tang, L. Shen, B. Kasap, X. Yang, W. Shi, A. Mukherjee, D. Z. Pan, and N. Sun, "An energy-efficient comparator with dynamic floating inverter amplifier," *IEEE J. Solid-State Circuits*, vol. 55, no. 4, pp. 1011–1022, Apr. 2020.
- [10] H. S. Bindra, J. Ponte, and B. Nauta, "A 174 μVRMS input noise, 1 GS/s comparator in 22 nm FDSOI with a dynamic-bias preamplifier using tail charge pump and capacitive neutralization across the latch," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, vol. 65, Feb. 2022, pp. 1–3.
- [11] C. Bocciarelli, F. Centurelli, R. D. Sala, V. Spinogatti, and A. Trifiletti, "A 2.5 GHz, 0.6 V body driven dynamic comparator exploiting charge pump based dynamic biasing," in *Proc. 18th Conf. Ph.D Res. Microelectron. Electron. (PRIME)*, Jun. 2023, pp. 37–40.
- [12] B. P. Ginsburg and A. P. Chandrakasan, "Dual time-interleaved successive approximation register ADCs for an ultra-wideband receiver," *IEEE J. Solid-State Circuits*, vol. 42, no. 2, pp. 247–257, Feb. 2007.

- [13] H. Zhuang, H. Tang, X. Peng, and Y. Li, "A back-gate-input clocked comparator with improved speed and reduced noise in 22-nm SOI CMOS," in *Proc. IEEE Int. Symp. Circuits Syst. (ISCAS)*, May 2021, pp. 1–5.
- [14] T. Kobayashi, K. Nogami, T. Shirotori, and Y. Fujimoto, "A current-controlled latch sense amplifier and a static power-saving input buffer for low-power architecture," *IEEE J. Solid-State Circuits*, vol. 28, no. 4, pp. 523–527, Apr. 1993.
- [15] H. Ghasemian, R. Ghasemi, E. Abiri, and M. R. Salehi, "A novel high-speed low-power dynamic comparator with complementary differential input in 65 nm CMOS technology," *Microelectron. J.*, vol. 92, Oct. 2019, Art. no. 104603.



**VALERIO SPINOGATTI** was born in Rome, Italy, in 1997. He received the master's degree in electronic engineering from the Sapienza University of Rome, Rome, in 2021, where he is currently pursuing the Ph.D. degree with the Department of Information Engineering, Electronics, and Telecommunications. He is also working in the field of hardware security, where he is focusing on side channel attacks on implementations of cryptographic primitives. His research interests include dynamic comparators, high-speed analog-to-digital converters, digital calibration techniques based on adaptive filters for communication systems and fast digitizers, and analog wideband filters.



**RICCARDO DELLA SALA** was born in April 1996. He received the bachelor's and M.S. degrees (summa cum laude) in electronics engineering from the Sapienza University of Rome, Italy, in 2018 and 2020, respectively. His main research interests include the design and development of PUFs and TRNGs for hardware security. Furthermore, in the context of analog design, his research activity is focused on ultra-low voltage ultralow power topology for the IoT and biomedical applications, such as OTAs and comparators.



**CRISTIAN BOCCIARELLI** was born in Narni, Italy, in February 1998. He received the bachelor's and M.S. degrees (summa cum laude) in electronics engineering from the Sapienza University of Rome, Italy, in 2019 and 2021, respectively, where he is currently pursuing the Ph.D. degree in electronic engineering with the Department of Information Engineering, Electronics, and Telecommunications. His research interests include the design and development of high-speed ADCs, dynamic comparators, wide band filters, and digital calibration techniques of analog circuits. Furthermore, in the context of hardware security, his research activity is focused on SCA on cryptographic primitives.



**FRANCESCO CENTURELLI** (Senior Member, IEEE) was born in Rome, Italy, in 1971. He received the Laurea and Ph.D. degrees (cum laude) in electronic engineering from the Sapienza University of Rome, Rome, in 1995 and 2000, respectively. In 2006, he became an Assistant Professor with the DIET Department, Sapienza University of Rome. He has also been involved in research and development activities held in collaboration between the Sapienza University of Rome and some industrial partners. He has published more than 100 papers in international journals and refereed conferences. His research interests include system-level analysis and design of clock recovery circuits and high-speed analog integrated circuits, and now concern the design of analog-to-digital converters and very low-voltage circuits for analog and RF applications.



**ALESSANDRO TRIFILETTI** was born in Rome, Italy, in October 1959. In 1991, he joined the Electronic Engineering Department, Sapienza University of Rome, as a Research Assistant, where he was involved in research activities dealing with analogue, RF, and microwave ICs design. In 2001, he became an Assistant Professor. In 2005, he was an Associate Professor. In 2019, he was a Full Professor with the Engineering Faculty, Sapienza University of Rome. He has worked in the field of microelectronics, both from the point of view of design methodologies and circuit topologies. On these subjects, he has (co)authored over 210 publications, of which about 80 published in international journals, the others published in the proceedings of major international conferences (a large part of these sponsored by the IEEE). In the last 20 years, he has been engaged in the coordination of research teams from DIET (previously DIE) in the framework of national and international programs, involving both industrial and academic partners. From an industrial perspective, his expertise covers topics about analogue and RF microelectronics, radar and ESM systems, high-speed communication systems, security issues in cryptographic algorithms implementation, and embedded system design. He is currently a Reviewer of some IEE and IEEE reviews, among them: IEEE TRANSACTIONS ON MICROWAVE THEORY AND TECHNIQUES, IEEE TRANSACTIONS ON CIRCUIT AND SYSTEMS—I: REGULAR PAPERS, IEEE TRANSACTIONS ON CIRCUIT AND SYSTEMS—II: EXPRESS BRIEFS, IEE Proceedings on Circuits, Devices and Systems, and IEE Electronic Letters.


Open Access funding provided by 'Università degli Studi di Roma "La Sapienza" 2' within the CRUI CARE Agreement