# A Very Low Voltage Frequency Divider in Folded MOS Current Mode Logic with Complementary n- and p-Type Flip-Flops 

Francesco Centurelli ${ }^{1}$, Member, IEEE, Giuseppe Scotti ${ }^{1}$, Senior Member, IEEE, and Gaetano Palumbo ${ }^{2}$, Fellow, IEEE


#### Abstract

In this paper, a static frequency divider based on Folded MOS Current Mode Logic (FMCML) is presented. The design alternates FMCML Flip-Flops with complementary PMOS and NMOS input differential pairs, since common-mode problems arise when only one type of FMCML Flip-Flop is used.

The design follows a detailed theoretical model and analysis of the delay versus the Flip-Flop bias current, which allows optimized design strategies to be defined for maximum speed or minimum power-delay product (PDP). The frequency divider architecture and design strategies are validated in a commercial 28 nm FDSOI CMOS technology. Post-layout simulations of a divide-by-16 circuit show a maximum frequency of about 12 GHz with $74 \mu \mathrm{~W}$ power consumption for the high-speed design and a maximum frequency of 10 GHz with $53 \mu \mathrm{~W}$ power consumption for the minimum PDP design.


Index Terms: Current Mode Logic, frequency divider, logic design, nanometer CMOS, delay model.

## I. INTRODUCTION

Many high-speed analog/RF and digital applications require frequency dividers as key building blocks when subharmonic signals must be generated from a high-frequency source. Examples include PLL-based frequency synthesizers, clock generators, high-speed SerDes subsystems and time-interleaved analog-to-digital converters [1-6].
Several architectures are adopted in the literature for high-speed frequency dividers, such as the static frequency divider (SFD) [7], the regenerative frequency divider (RFD) [8], and the injection-locked frequency divider (ILFD) [9]. Among them, the SFD offers a very wide frequency range (from dc to very high frequencies) and a structure that only uses standard digital blocks. This simplifies the design and allows design reuse and application in reconfigurable systems, making the SFD the most common frequency divider architecture unless extremely high frequencies are required.

Most of the above applications involve mixed-signal integrated circuits, which set additional requirements on the frequency divider block. In addition to a frequency range suited to the specific application, low phase noise, and a small area footprint to ease integration, the divider needs low sensitivity to noise (e.g., from substrate and supply rails) and low di/dt noise (so as not to disturb sensitive analog blocks). Furthermore, minimizing power consumption is a fundamental issue for such systems, to enable a very high level of integration, ease portability, and simplify the design of packaging and heat dissipation. Among the available techniques to address this issue, the supply voltage can be reduced, a choice also driven by the lower breakdown voltage of scaled MOS devices.

In fact, the scaling of CMOS technology now provides devices with high frequency performance up to tens of GHz (transition frequencies up to $350 / 200 \mathrm{GHz}$ for NMOS and PMOS devices [10]), that require low supply voltages around 1 V or less, and provide a lower power consumption with respect to their bipolar counterparts. For high frequency applications, these devices are used to build logic families based on current steering in a differential approach, to exploit the benefits of fast switching, low sensitivity to common-mode noise and disturbances, and low power supply switching noise, that eases integration of analog and digital blocks in mixed-signal integrated circuits. The reference logic family is therefore the MOS Current Mode Logic (MCML) [11]-[12], that allows higher maximum speed than standard CMOS logic, and could even provide lower power consumption at frequencies that are still suitable for CMOS [13].

The MCML exploits series gating to implement logic functions, and typically just two stacked levels are used to limit the supply voltage. This allows implementing And-Or-Inverter (AOI) gates, which can be used to build any complex combinatorial function, as well as XOR, MUX and D-type latch gates. Even for a 2-level gate, the minimum supply voltage cannot be lower than

$$
V_{DD,\min} = 2 V_{TH} + 3 V_{ov} + V_{R} \tag{1}
$$

where $V_{TH}$ is the threshold voltage of the devices, $V_{ov}=V_{GS}-V_{TH}$ is the overdrive voltage and $V_{R}$ is the dc voltage drop across the load resistor, whose value is constrained by the need to fully switch the differential pairs.[^0]

Several solutions [14]-[16] have been proposed in the literature that modify the standard MCML to reduce the minimum supply voltage of current mode logic gates, thus allowing sub-1 V operation and application in a very low-voltage environment. Among them, the Folded MCML (FMCML) approach seems particularly promising [17]-[19]. The FMCML exploits the complementary nature of CMOS technology, using a PMOS differential pair for the lower level of the stack, and a current mirror to connect it to the upper level NMOS differential pairs. This reduces the minimum supply voltage to

$$
V_{DD,\min} = V_{TH} + 2 V_{ov} + V_{R} \tag{2}
$$

which is equal to that of an MCML gate with one fewer stacked level. Moreover, an approach named Multi-Folded (MF) MCML, which generalizes the FMCML topology and thus allows a minimum supply voltage equal to that of a single-level MCML gate (i.e., the MCML inverter) regardless of the number of inputs, was also proposed [20].
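As a rough illustration of the two supply-voltage bounds, the following sketch evaluates both expressions for one set of values. The function names and the numbers ($V_{TH}=0.35$ V, $V_{ov}=50$ mV, $V_{R}=150$ mV) are assumptions chosen only for illustration, not design values.

```python
def vdd_min_mcml(v_th, v_ov, v_r):
    """Minimum supply of a two-level MCML gate: 2*V_TH + 3*V_ov + V_R."""
    return 2 * v_th + 3 * v_ov + v_r


def vdd_min_fmcml(v_th, v_ov, v_r):
    """Minimum supply of a folded (FMCML) gate: V_TH + 2*V_ov + V_R."""
    return v_th + 2 * v_ov + v_r


# Assumed illustrative values in volts (hypothetical, not from a design).
V_TH, V_OV, V_R = 0.35, 0.05, 0.15

mcml = vdd_min_mcml(V_TH, V_OV, V_R)
fmcml = vdd_min_fmcml(V_TH, V_OV, V_R)
# Folding removes one threshold voltage and one overdrive voltage.
saving = mcml - fmcml
print(f"MCML: {mcml:.2f} V, FMCML: {fmcml:.2f} V, saving: {saving:.2f} V")
```

Under these assumed values the saving equals $V_{TH}+V_{ov}$, which is the contribution of the stack level removed by folding.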
In this paper we present a static frequency divider architecture realized in FMCML, which exploits the complementary nature of CMOS technology and, thanks to dedicated design criteria derived here, achieves high performance at very low supply voltage. Note also that the MCML frequency divider design approaches previously treated in the literature [21]-[22] are not suited to the proposed architecture. Indeed, design procedures based on the conventional MCML style do not take into account the peculiarities of FMCML. In particular, as will be shown in the remainder of the paper, unlike conventional MCML, the FMCML logic style has a delay that depends only weakly on the bias current, so very different design criteria arise.

The paper is structured as described in the following. In Section II we describe the proposed frequency divider architecture which exploits the FMCML D-latch as basic building block. In Section III we present a complete analysis of the clock-to-output propagation delay of the basic FMCML divide-by-2 (DIV2) cell, which is then exploited in Section IV to derive design guidelines for multistage frequency dividers. Validation of the proposed models and design case studies referring to a 28 nm FDSOI CMOS technology are reported in Section V. Finally, some remarks and the conclusions are drawn in Section VI.

## II. The frequency divider architecture

Static frequency dividers are realized as a cascade of DIV2 stages, implemented as Toggle Flip-Flops (TFFs) with the input $T$ set to one (see Fig. 1a), in order to toggle at every rising clock edge. In the MCML logic style, this behavior is easily obtained by using a D Flip-Flop (DFF) and feeding the inverted output back to the input, as shown in Fig. 1b.

In an FMCML implementation, the DFF is based on a Master-Slave configuration (i.e., a topology that cascades two D-latches with counter-phase clock signals), and the schematic of a single D-latch, which is the main building block, is reported in Fig. 2. In a divide-by-$2^{N}$ circuit, $N$ DFFs are used, with the output of each DFF connected to the clock input of the next one. With the schematic in Fig. 2, this would require an infeasible connection between the output of an NMOS differential pair and the input of a PMOS differential pair (the clock input).
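The cascade described above can be sketched behaviorally as follows. This is an idealized, delay-free model written only to illustrate the divide-by-$2^{N}$ function; the function names and the sampled-clock representation are assumptions and do not model the FMCML circuit or its common-mode levels.

```python
def ripple_divider(clock, n_stages):
    """Pass a sampled 0/1 clock through n_stages cascaded divide-by-2 cells.

    Each cell toggles its output on every rising edge of its input, which is
    the behavior of a DFF whose inverted output is fed back to its D input.
    """
    signal = list(clock)
    for _ in range(n_stages):
        out, q, prev = [], 0, 0
        for level in signal:
            if prev == 0 and level == 1:  # rising edge at this stage's clock input
                q ^= 1
            out.append(q)
            prev = level
        signal = out  # the output of this stage clocks the next one
    return signal


def rising_edges(signal):
    """Count 0 -> 1 transitions, assuming the signal starts from 0."""
    return sum(1 for a, b in zip([0] + signal[:-1], signal) if a == 0 and b == 1)


clock = [0, 1] * 64                      # 64 input clock periods
divided = ripple_divider(clock, n_stages=4)
print(rising_edges(clock), rising_edges(divided))
```

With four stages, the number of rising edges at the output is one sixteenth of that at the input, as expected for a divide-by-16 circuit.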


Fig. 1. Static frequency divider: a) based on $T F F$; b) based on $D F F$.


Fig. 2. FMCML D-latch with PMOS input at the lower level (nType).

Indeed, considering the output common-mode voltage, $V_{CM,o}$, of the D-latch in Fig. 2 (equal to the output common-mode voltage of a DFF realized with this D-latch), we can write

$$
V_{CM,o,nType} = V_{DD} - \frac{V_{SW}}{4}, \tag{3}
$$

where the voltage swing is defined as

$$
V_{SW} = 2 \Delta V = 2 R_{D} I_{SS} \tag{4}
$$

where $I_{SS}$ is the tail current (see Fig. 2) and $R_{D}$ is the equivalent resistance of the triode PMOS load $M_{D}$.

The maximum input common-mode level that has to be guaranteed for the PMOS differential pair of the latch is

$$
V_{CM,imax,nType} = V_{DD} - \left|V_{DSsat}\right| - \left|V_{GS}\right| = V_{DD} - \left|V_{TH}\right| - 2 V_{ov} \tag{5}
$$

where $V_{TH}$ and $V_{ov}$ are the MOS threshold voltage and the overdrive voltage, $V_{ov}=V_{DSsat}=\left|V_{GS}\right|-\left|V_{TH}\right|$, respectively. The value from (5) is therefore usually significantly lower than the one provided by (3). For example, in a deep submicron CMOS technology where $\left|V_{TH}\right|$ is typically lower than 0.35 V, the minimum $V_{ov}$ is about 50 mV, and a suitable value of $V_{SW}$ is about 600 mV, we get a $V_{CM,imax,nType}$ at least 300 mV lower than $V_{CM,o,nType}$.
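The numerical example above can be reproduced with a short calculation. The sketch below subtracts (5) from (3), so that $V_{DD}$ cancels, and evaluates the gap at the quoted values; the function name is hypothetical.

```python
def cm_gap_ntype(v_th, v_ov, v_sw):
    """Output common mode minus maximum input common mode of the nType latch.

    From (3) and (5): (V_DD - V_SW/4) - (V_DD - |V_TH| - 2*V_ov),
    so the supply voltage cancels out.
    """
    return abs(v_th) + 2 * v_ov - v_sw / 4


# Values quoted in the text: |V_TH| = 0.35 V, V_ov = 50 mV, V_SW = 600 mV.
gap = cm_gap_ntype(v_th=0.35, v_ov=0.05, v_sw=0.6)
print(f"common-mode gap = {gap * 1e3:.0f} mV")
```

At these values the gap evaluates to 300 mV; a larger threshold or overdrive voltage widens it further.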


Fig. 3. Topology of a nType (a) and pType (b) D Flip-Flop in Folded MCML logic style.


Fig. 4. Proposed architecture for the implementation of a frequency divider by 16.

The typical solution to this problem is to use a source follower as a level shifter between the DFFs; this however increases the power consumption, and, when designing for maximum speed, the source follower can account for a significant fraction of the overall power consumption. In this paper we propose a different approach: the input and output common-mode levels of each divide-by-2 (DIV2) block are made compatible by alternating complementary FMCML DFF stages, thus avoiding any additional stage in between. In fact, by considering the dual of the D-latch in Fig. 2, designed using complementary devices, the minimum input common-mode voltage is

$$
V_{CM,imin,pType} = V_{GS} + V_{DSsat} = V_{TH} + 2 V_{ov} \tag{6}
$$

which is fully compatible with (3), and similarly the output common-mode voltage is now

$$
V_{CM,o,pType} = \frac{V_{SW}}{4} \tag{7}
$$

which is suited to drive the next D-latch with PMOS input.

In the following, nType (pType) denotes the DFF whose output is provided by an NMOS (PMOS) differential pair.
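A minimal sketch of the resulting compatibility check is given below. It combines (3), (5), (6) and (7), using $V_{DD}=0.8$ V and $\Delta V=0.3$ V from Table II and the threshold voltages from Table I; the overdrive voltage of 50 mV and the function name are assumptions made for illustration.

```python
def interfaces_compatible(v_dd, v_sw, v_tn, v_tp, v_ov):
    """Check both nType->pType and pType->nType common-mode interfaces."""
    n_out = v_dd - v_sw / 4                    # (3): nType output common mode
    p_in_min = v_tn + 2 * v_ov                 # (6): pType minimum input common mode
    p_out = v_sw / 4                           # (7): pType output common mode
    n_in_max = v_dd - abs(v_tp) - 2 * v_ov     # (5): nType maximum input common mode
    return n_out >= p_in_min and p_out <= n_in_max


# V_DD and swing from Table II, thresholds from Table I; V_ov is assumed.
ok = interfaces_compatible(v_dd=0.8, v_sw=0.6, v_tn=0.30, v_tp=0.38, v_ov=0.05)
print("alternating stages compatible:", ok)
```

With these values the nType output sits at 650 mV and the pType output at 150 mV, both inside the input range of the following stage.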

The schematics of the nType and pType FMCML DFFs are reported in Fig. 3a and Fig. 3b, respectively. By combining these building blocks as shown in Fig. 4 for the example of a divide-by-16 circuit, we can realize a generic divide-by-$2^{N}$ static frequency divider.

## III. Delay model of the FMCML

Usually, the speed performance of a static frequency divider is set by the TFF maximum toggle frequency [21], [23], which is imposed by the clock-to-output propagation delay[^1], $t_{CKQ}$, of the latch used to realize the DFF. In fact, referring to a generic master-slave DFF such as the one reported in Fig. 3a or in Fig. 3b, the $t_{CKQ}$ of the whole DFF is equal to the $t_{CKQ}$ of the slave latch. Furthermore, since our divider core is based on a unitary feedback DFF as shown in Fig. 1b, the minimum period of the input clock signal has to be greater than $2 t_{CKQ}$ for the basic divider cell to operate properly. In fact, starting from one clock edge, we have to allow the $t_{CKQ}$ of the slave latch for the stable intermediate output (the output of the master latch) to become the output of the DFF and, at the same time, the new input of the master latch (due to the unitary feedback). From this instant, an additional $t_{CKQ}$ of the master latch is required to have a stable signal at the output of the master latch (the intermediate output) before the next clock edge.

In the following, the clock-to-output propagation delay of the basic FMCML frequency divider by 2 (DIV2) cell is derived and used to estimate the speed response of the proposed frequency divider architecture.

## A. FMCML time constants

To evaluate the $t_{C K Q}$ of the FMCML $D F F$, we have to evaluate the propagation delay from the clock input node to the output of the FMCML $D F F$, which can be calculated, as shown in [16], by using the open-circuit time-constant method on the linearized circuit model.

The small-signal differential half-circuit model for the evaluation of the clock-to-Q delay of the FMCML $D F F$ s in Fig. 3 is reported in Fig. 5 (this model applies to both $D F F$ s in Fig. 3a and in Fig. 3b).

Referring to Fig. 5, the signal path is divided into three main sections:

- the clock input section from $v_{i}$ to $v_{D}$: it includes the differential pair $M_{1}-M_{2}$, whose parameters are denoted with the suffix $CK$, loaded by the diode-connected devices $M_{7}-M_{8}$, whose parameters are denoted with the suffix $CM$; the admittance $Y_{diode}=g_{mCM}+s\left(C_{gsCM}+C_{dbCM}\right)$ is shown in Fig. 5;
- the folding from $v_{D}$ to $v_{S}$, given by the unity-gain current mirror (hence parameters are denoted with the suffix $C M$ ); in particular, we consider the current mirror output towards the slave latch $M_{9 B}-M_{10 B}$;
- the output section from $v_{S}$ to $v_{o}$, implemented by the track differential pair of the slave latch $M_{3B}-M_{4B}$, whose parameters are denoted with the suffix $DP$, loaded by the triode devices $M_{D}$. For the differential half-circuit model, the loading effect of $M_{4B}$ capacitances on the source node of $M_{3B}$ has been taken into account through the capacitances $C_{gsDP}$ and $C_{sbDP}$ in the dashed box denoted as "Load at source of $M_{3B}$."

The loading admittance $Y_{iCM}$ in Fig. 5 represents the loading effect of the current mirror branch towards the master latch $\left(M_{9A}-M_{10A}\right)$, whereas $Y_{LATCH}$ accounts for the loading effect of the hold differential pair $\left(M_{5B}-M_{6B}\right)$ of the slave latch.


Fig. 5. Small-signal equivalent circuit of the FMCML DFFs in Fig. 3.

According to this modeling strategy, the $t_{CKQ}$ of the DFFs in Fig. 3a and in Fig. 3b can be expressed as follows:

$$
\begin{aligned}
t_{CKQ,nType} &= \ln 2\left(\tau_{1,nType}+\tau_{2,nType}+\tau_{3,nType}\right) \\
t_{CKQ,pType} &= \ln 2\left(\tau_{1,pType}+\tau_{2,pType}+\tau_{3,pType}\right)
\end{aligned} \tag{8}
$$

where the three time constants $\tau_{1,nType}$, $\tau_{2,nType}$, $\tau_{3,nType}$ (and $\tau_{1,pType}$, $\tau_{2,pType}$, $\tau_{3,pType}$) are related to the three main sections along the $CK$-to-$Q$ signal path of the slave latch.

Without loss of generality, considering the nType DFF in Fig. 3a and assuming unity-gain current mirrors, the three time constants can be written as:
$$
\tau_{1,nType} = \frac{C_{gdCK}+C_{dbCK}+C_{dbCM}+3 C_{gsCM}+3 C_{gdCM}}{g_{mCM}} \tag{9}
$$

$$
\tau_{2,nType} = \frac{C_{gdCM}+C_{dbCM}+2 C_{gsDP}+2 C_{sbDP}}{g_{mDP}+g_{mbDP}} \tag{10}
$$

$$
\tau_{3,nType} = R_{D}\left(C_{gdDP}+C_{dbDP}+C_{LATCH}+C_{RD}+C_{L}\right) \tag{11}
$$
where $C_{R D}$ is the parasitic capacitance of the triode transistor $M_{D}$ which provides the equivalent resistive load $R_{D}$ [8], $C_{L}$ is the load capacitance, and $C_{L A T C H}$ accounts for the load effect of the hold differential pair:
$$
C_{LATCH} = C_{gDP} + C_{sbDP} \tag{12}
$$
where $C_{g D P}$ is the capacitance seen at the gates of the latch differential pair with the source at ground. Finally, the other parameters have the usual meaning of MOS small-signal parameters.
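To make the structure of the delay model concrete, the sketch below evaluates (9)-(11) and the resulting $t_{CKQ}$ for one set of parameters. All capacitances and transconductances are hypothetical placeholders chosen for easy arithmetic; only $R_{D}=60\ \mathrm{k}\Omega$ is taken from Table II. The function and dictionary names are also assumptions.

```python
import math


def t_ckq(caps, g_m1, g_m_dp, g_mb_dp, r_d):
    """Clock-to-Q delay as ln(2) times the sum of the time constants (9)-(11).

    caps maps capacitance names to values in farads; g_m1 is the
    transconductance in the denominator of (9).
    """
    tau1 = (caps["gdCK"] + caps["dbCK"] + caps["dbCM"]
            + 3 * caps["gsCM"] + 3 * caps["gdCM"]) / g_m1
    tau2 = (caps["gdCM"] + caps["dbCM"]
            + 2 * caps["gsDP"] + 2 * caps["sbDP"]) / (g_m_dp + g_mb_dp)
    tau3 = r_d * (caps["gdDP"] + caps["dbDP"] + caps["LATCH"]
                  + caps["RD"] + caps["L"])
    return math.log(2) * (tau1 + tau2 + tau3)


# Hypothetical parasitics: 0.1 fF each, 0.2 fF for C_LATCH and C_L.
FF = 1e-15
EXAMPLE_CAPS = {name: 0.1 * FF for name in
                ("gdCK", "dbCK", "dbCM", "gsCM", "gdCM",
                 "gsDP", "sbDP", "gdDP", "dbDP", "RD")}
EXAMPLE_CAPS.update({"LATCH": 0.2 * FF, "L": 0.2 * FF})

# Hypothetical transconductances; R_D = 60 kOhm as in Table II (nType).
delay = t_ckq(EXAMPLE_CAPS, g_m1=50e-6, g_m_dp=50e-6, g_mb_dp=10e-6, r_d=60e3)
print(f"t_CKQ = {delay * 1e12:.1f} ps")
```

With these placeholder values the three time constants are 18 ps, 10 ps and 42 ps, so the output node dominates; the actual split depends on the extracted parasitics.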

## B. Clock-to-Q delay versus bias current

Following the transistor sizing strategy in [11], we start by setting the voltage swing

$$
V_{swing} = 2 \Delta V = 2 R_{D} I_{SS,nType}, \tag{13}
$$

and the required noise margin, which can be expressed as follows:

$$
NM = \Delta V\left(1-\frac{\beta}{A_{V}}\right), \tag{14}
$$

where $A_{V}$ is the small-signal gain of the gate:

$$
A_{V} = g_{mCK} R_{D} \tag{15}
$$

and $\beta$ is a factor ranging from $\sqrt{2}$, for the quadratic MOS model, to 1 for a submicron linear MOS model. From (13)-(15), considering the $\alpha$-power MOS model [24], we can express the gate width of the transistors of the input clock stage as:

$$
W_{CK} = \frac{2^{\alpha-1}}{K_{CK}}\left(\frac{A_{V}}{\alpha \Delta V}\right)^{\alpha} I_{SS,nType}, \tag{16}
$$

where $\alpha$ and $K_{CK}$ are technology parameters that tend to 1 and $v_{sat} C_{ox}$, respectively, in a short-channel device, but are equal to 2 and $\mu_{p} C_{ox} / 2 L_{CK}$ for a long-channel device, and $A_{V}$ can be derived from (14).
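As a hedged numerical illustration of (14) and (16), the sketch below evaluates the noise margin and the clock-pair width in the long-channel limit ($\alpha=2$, $K_{CK}=\mu_{p} C_{ox}/2L_{CK}$). The gain $A_{V}=5$ is an assumed value, $\mu_{p} C_{ox}$ and $L_{CK}$ are taken from Table I, and the long-channel limit itself is only an approximation for 28 nm devices; the function names are hypothetical.

```python
import math


def noise_margin(delta_v, a_v, beta):
    """Noise margin from (14): NM = dV * (1 - beta / A_V)."""
    return delta_v * (1 - beta / a_v)


def clock_pair_width(i_ss, a_v, delta_v, k_ck, alpha):
    """Gate width of the clock input pair from (16)."""
    return (2 ** (alpha - 1) / k_ck) * (a_v / (alpha * delta_v)) ** alpha * i_ss


MU_P_COX = 78e-6                 # A/V^2, from Table I
L_CK = 28e-9                     # m, minimum channel length
K_LONG = MU_P_COX / (2 * L_CK)   # long-channel value of K_CK

# Assumed gain A_V = 5; swing and bias current as in Table II (nType).
width = clock_pair_width(i_ss=5e-6, a_v=5.0, delta_v=0.3, k_ck=K_LONG, alpha=2)
margin = noise_margin(delta_v=0.3, a_v=5.0, beta=math.sqrt(2))
print(f"W_CK = {width * 1e9:.1f} nm, NM = {margin * 1e3:.0f} mV")
```

Because $W_{CK}$ in (16) is linear in the tail current, doubling $I_{SS,nType}$ at fixed gain and swing doubles the width, which is the property used in the following analysis.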
Again, setting a suitable overdrive voltage and considering all the NMOS transistors with equal aspect ratios (i.e., transistors with suffix $C M$ equal to the ones with suffix $D P$ ), the resulting gate width is
$$
W_{n} = \frac{I_{SS,nType}}{2 K_{n} V_{ov}^{\alpha}}. \tag{17}
$$
Except for $C_{RD}$ and $C_{L}$, all the capacitances in (9)-(11) are proportional to the device width. Considering the dependence of (16) and (17) on the bias current and substituting them into (9) and (10), both numerator and denominator are directly proportional to the current, so the time constants $\tau_{1,nType}$ and $\tau_{2,nType}$ can be assumed to be constant with respect to bias current variations. Regarding the third time constant, $\tau_{3,nType}$, we can consider it as composed of three terms:

$$
\tau_{3,nType} = \tau_{3MOS,nType} + \tau_{RD} + R_{D} C_{L} \tag{18}
$$

with

$$
\tau_{3MOS,nType} = R_{D}\left(C_{gdDP}+C_{dbDP}+C_{LATCH}\right) \tag{19}
$$

and

$$
\tau_{RD} = R_{D} C_{RD}. \tag{20}
$$

In particular, the term $\tau_{3MOS,nType}$, like $\tau_{1,nType}$ and $\tau_{2,nType}$, is independent of $I_{SS,nType}$, since $R_{D}$ scales as $1/I_{SS,nType}$ through (13) while the capacitances scale with it. The behavior of $\tau_{RD}$ as a function of the bias tail current instead depends on the implementation of the load, a MOS in triode region or a true resistor [25]. Focusing on VLSI applications, where area minimization is mandatory, a MOS triode load is considered and, except for very low tail currents, we can again assume $\tau_{RD}$ to be constant[^2].
Regarding the last term in (18), we have to estimate the value of the load capacitance $C_{L}$. In this specific application, in which a DIV2 is implemented, the DFF has a unitary feedback and is also loaded by another DFF, whose input differential pair is made up of transistors of the complementary type with respect to the driver gate. Hence, for the example under consideration, with an nType DIV2 as driving cell and a pType DIV2 as load cell, we can express the load as the sum of two contributions. The first contribution is the input capacitance of the track NMOS differential pair of the slave latch within the driving DIV2 stage, which we denote as $C_{in,n,nType}$. The second contribution is the input capacitance at the $CK$ input of the load pType DIV2 stage (which is an NMOS differential pair), denoted as $C_{in,n,pType}$. Note, however, that the capacitive contribution $C_{in,n,pType}$ of the loading DIV2 stage depends on its bias current $I_{SS,pType}$.

In conclusion, by expressing $C_{i n, n, n T y p e}$ and $C_{i n, n, p T y p e}$ as follows:

$$
\begin{align*}
C_{\text {in,n,nType }} & =c_{\text {in,n,nType }} \cdot I_{S S, n T y p e}  \tag{21a}\\
C_{\text {in,n,pType }} & =c_{\text {in,n,pType }} \cdot I_{S S, p T y p e} \tag{21b}
\end{align*}
$$

to show the bias current dependence of the input capacitances, we can express the nType clock-to-Q delay as:

$$
t_{CKQ,nType} = \ln 2\left(\tau_{int,nType} + c_{in,n,pType} \frac{I_{SS,pType}}{I_{SS,nType}} \Delta V\right), \tag{22}
$$

where

$$
\tau_{int,nType} = \tau_{1,nType}+\tau_{2,nType}+\tau_{3MOS,nType}+\tau_{RD}+R_{D} C_{in,n,nType} \tag{23}
$$

includes all the effects independent of the DFF bias current, $I_{SS,nType}$, and the last term in the brackets depends on the ratio of the bias currents of the loading and driving DIV2 stages, which are of complementary type. By exchanging nType and pType, the same equations hold for the CK-to-Q delay of the pType DIV2 stage loaded by an nType DIV2.

## IV. FMCML $T F F$ AND DIVIDER DESIGN

In the following, starting from the analysis and considerations carried out above, we focus on the design strategies for the frequency divider, and in particular we start with the design guidelines of the single TFF (DIV2 stage).

## A. TFF design guidelines

The proposed frequency divider architecture shown in Fig. 4 is based on alternating complementary FMCML TFF stages. Since the first TFF has to provide the minimum propagation delay $t_{C K Q}$, it has to be implemented through the nType topology. In fact, assuming the same bias current, a pType TFF stage will surely be slower, due to the lower transition frequency of PMOS devices. However, considering that the second TFF stage works with the divided-by-2 signal, in order to guarantee that the speed performance is set by the propagation delay $t_{C K Q, n T y p e}$ of the nType stage, the propagation delay $t_{C K Q, p T y p e}$ of the pType $T F F$ has to fulfill the following condition:
$$
t_{CKQ,pType} \leq 2 t_{CKQ,nType}. \tag{25}
$$
Otherwise, the divider speed performance will be limited by the second divide by 2 stage (i.e., the pType TFF).

By inspection of (22), in order to provide the minimum TFF propagation delay, we have to set $I_{SS,nType}$ sufficiently higher than $I_{SS,pType}$. As we will show in the next section, the contribution of $\tau_{int,nType}$ to the propagation delay is about $75 \%$ of the whole $t_{CKQ,nType}$ when $I_{SS,nType}=I_{SS,pType}$. Thus, a current $I_{SS,nType}$ two or three times higher than $I_{SS,pType}$ allows a $t_{CKQ,nType}$ value very close to the ideal asymptotic minimum value.

On the other hand, a different goal is to minimize the power-delay product ($PDP$), which is given by:

$$
\begin{equation*}
P D P=3 \cdot I_{S S, n T y p e} \cdot t_{C K Q, n T y p e} . \tag{26}
\end{equation*}
$$

In this case, we have to use the minimum allowable $I_{SS,nType}$ value, since (22) and (26) show that, due to the constant term $\tau_{int,nType}$, the delay decreases with the current much more slowly than the power consumption increases.
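The two design goals can be compared with a normalized version of (22) and (26). The sketch below assumes that $\tau_{int,nType}$ accounts for 75% of the delay at unitary current ratio, as stated above, and that $I_{SS,pType}$ is held fixed while $I_{SS,nType}$ is scaled; the function names and the normalization to the unitary-ratio design are assumptions made for illustration.

```python
def normalized_delay(current_ratio, const_fraction=0.75):
    """t_CKQ,nType relative to its value at I_SS,nType = I_SS,pType.

    current_ratio is I_SS,nType / I_SS,pType. From (22), the load term
    scales as 1/current_ratio while the tau_int term stays fixed.
    """
    return const_fraction + (1.0 - const_fraction) / current_ratio


def normalized_pdp(current_ratio, const_fraction=0.75):
    """PDP from (26) relative to the unitary-ratio design."""
    return current_ratio * normalized_delay(current_ratio, const_fraction)


for ratio in (1, 2, 3):
    print(f"ratio {ratio}: delay x{normalized_delay(ratio):.3f}, "
          f"PDP x{normalized_pdp(ratio):.3f}")
```

In this simplified model the delay can never drop below 75% of its unitary-ratio value, whereas the PDP grows almost linearly with the current, which is why the minimum-PDP design uses the smallest allowable bias current.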


Fig. 6. Detail of the improved current mirror adopted for the nType TFF.

TABLE I. MAIN PROCESS PARAMETERS OF THE 28 NM FDSOI CMOS

| $\mu_{n} C_{o x}$ | $210 \frac{\mu A}{V^{2}}$ |
| :---: | :---: |
| $\mu_{p} C_{o x}$ | $78 \frac{\mu A}{V^{2}}$ |
| $V_{T N}^{*}$ | 0.3 V |
| $\left\|V_{T P}^{*}\right\|$ | 0.38 V |
| $W_{\min }$ | 80 nm |
| $L_{\min }$ | 28 nm |

*In FDSOI processes $V_{T N}$ and $\left|V_{T P}\right|$ can be adjusted by means of body bias. In our design the body of NMOS and PMOS devices has been connected to ground and $V_{D D}$ respectively.

TABLE II. Design Parameters for the TFF in Fig. 3 at minimum PDP.

| Parameter | nType TFF | pType TFF |
| :---: | :---: | :---: |
| $L$ | 28 nm | 28 nm |
| $V_{DD}$ | 800 mV | 800 mV |
| $V_{CM,D}$ | 650 mV | 150 mV |
| $V_{CM,CK}$ | 150 mV | 650 mV |
| $V_{CM,Q}$ | 650 mV | 150 mV |
| $\Delta V$ | 300 mV | 300 mV |
| $I_{SS}$ | $5 \mu A$ | $7 \mu A$ |
| $R_{D}$ | $60 \mathrm{k} \Omega$ | $43 \mathrm{k} \Omega$ |
| $W_{D} / L_{D} / V_{G}$ | $80 \mathrm{~nm} / 45 \mathrm{~nm} / 100 \mathrm{mV}$ | $98 \mathrm{~nm} / 60 \mathrm{~nm} / 500 \mathrm{mV}$ |
| $W_{1,2}$ | 500 nm | 700 nm |
| $W_{3A,4A,5A,6A}$ | 250 nm | 700 nm |
| $W_{3B,4B,5B,6B}$ | 250 nm | 700 nm |
| $W_{7A,8A,7B,8B}$ | 125 nm | 700 nm |
| $W_{9A,10A,9B,10B}$ | 125 nm | 700 nm |

## B. Divider design strategies

Considering the divide-by-$2^{N}$ frequency divider implemented through the cascade of $N$ TFF building blocks as shown in Fig. 4, we can follow a minimum PDP design strategy: according to the analysis in the previous subsection, we design and use nType and pType TFFs with minimum bias current, provided that relationship (25) is satisfied. To be more precise, as shown in [25], the optimum bias current to minimize the $PDP$ is the current that corresponds to minimum-size triode load devices: for lower currents, the resistance of the triode load is scaled by varying its gate voltage, thus making the time constant $\tau_{RD}$ inversely proportional to the current.

In the case we want to maximize the divider speed performance, we can simply modify the first nType TFF by increasing its bias current. The bias current of the second stage will be chosen as the minimum current which fulfils (25), whereas all the other stages will be biased with the minimum current.

## V. Case studies and simulation results

To validate the analysis and the proposed design strategies, we consider the commercial 28nm FDSOI CMOS technology from STMicroelectronics [26], whose main technology parameters are reported in Table I. However, we choose not to exploit the specific features of FDSOI technologies, to present more general results.

## A. TFF simulations and model validation

In order to minimize the channel length modulation effect, and thus improve the accuracy of the current mirrors involved in the clock switching part of the TFF, the topology shown in Fig. 6 [19] has been used. According to Fig. 6, transistors $M_{7}$ and $M_{8}$ in Fig. 3a are replaced by transistors $M_{7A}$, $M_{7B}$, $M_{8A}$ and $M_{8B}$. Furthermore, transistors $M_{7B}$ and $M_{8B}$ are sized equal to $M_{3}$ and $M_{4}$; thus, by setting the bias voltage $V_{B}$ equal to the common-mode voltage of the $D$ signals, the drain-source voltages $V_{DS}$ of $M_{7A}$, $M_{9A}$, $M_{9B}$, $M_{8A}$, $M_{10A}$ and $M_{10B}$ are equalized. The complementary improved current mirror has been adopted for the pType TFF in Fig. 3b.

From preliminary simulations on the 28 nm CMOS process, we have found that the minimum allowable tail current of the nType cell $I_{SS,nType}$ to keep all the devices in the strong inversion region is $5 \mu \mathrm{~A}$. The divider could certainly be operated with the devices in the subthreshold region, but we chose to avoid this since our model was not derived for that condition. A current of $5 \mu \mathrm{~A}$ also corresponds to the minimum width for the triode PMOS load device.

Regarding the pType cell, the current $I_{S S, p \text { Type }}$ that corresponds to the minimum width for the NMOS triode load is $7 \mu \mathrm{~A}$ : for higher currents, the value of $R_{D}$ is changed by acting on the width of the load transistor, whereas using lower currents requires acting on the gate voltage [25] and results in an increased propagation delay. Also in this case, bias currents below $5 \mu \mathrm{~A}$ result in subthreshold operation.

Using the minimum gate length for all the devices except $M_{D}$ to minimize parasitic capacitances, and setting the gate widths according to the required noise margin and gate-source voltages, we get the transistor dimensions summarized in Table II for the case of optimum currents (those minimizing PDP), corresponding to minimum width for the load devices. Transistor widths are scaled with bias current to keep current densities constant, and the number of gate fingers is modified accordingly. The width of the load devices $M_{D}$ is increased when increasing the bias current[^3].

The behavior of the propagation delay of the nType TFF versus the bias current $I_{SS,nType}$ is reported in Fig. 7a for different pType bias currents of the next stage and in Fig. 7b for different ratios between pType and nType bias currents. Fig. 7 shows that the weight of the second term in (22) increases with the pType bias current, giving an nType propagation delay that shows some dependence on the nType bias current. Such dependence is however limited, except for very large pType-to-nType current ratios.

The propagation delay of the pType TFF versus the bias current $I_{S S, p T y p e}$ is shown in Fig. 8a, for different bias currents of the nType load, and in Fig. 8b, for different nType-to-pType current ratios. Similar considerations apply to the complementary case, but now the dependence of propagation delay on bias current is even weaker, due to the increased weight of the constant term in (22) for the pType cell, that can be attributed to the larger time constant of the PMOS current mirror. A sharp increase of the delay below $7 \mu \mathrm{~A}$ is observed, due to the effect of the triode NMOS load.


Fig. 7. Propagation delay of the nType $T F F$ versus bias current: (a) for different bias current of the next pType TFF ( $I_{p T y p e}=0$ means unloaded nType TFF); (b) for different current ratios between the next stage and current stage bias currents.

By using the data in Fig. 7 and Fig. 8, some considerations on the performance of nType and pType cells and on the design guidelines discussed in the previous section can be drawn. First of all, by comparing the delays of the loaded and unloaded cases, we can estimate the weight of the constant term in (22) as about $70 \%$ and $87 \%$ of the overall delay at unitary current ratio for the nType and pType cells, respectively. The higher value for the pType cell is due to the lower speed of PMOS devices in the current mirror, and this justifies a weaker dependence of the propagation delay on the load. A comparison of the curves in Fig. 7 and Fig. 8 also shows that the pType cell is about $30 \%$ slower than the nType one, as expected, thus implying that the first DIV2 block has to be of nType.

A more detailed analysis of the propagation delays shows that, for $I_{SS,pType}$ of at least $7 \mu \mathrm{~A}$, the delay of the pType cell, even when heavily loaded, is always less than twice the delay of the nType cell driving it, thus satisfying (25). As an example, the delay of a pType cell biased at $7 \mu \mathrm{~A}$ is between 116 ps (unloaded case) and 175 ps (when loaded by an nType cell biased at $30 \mu \mathrm{~A}$). The delay of an nType cell loaded by the $7 \mu \mathrm{~A}$ pType cell is, instead, between 91 ps (when the nType cell is biased at $30 \mu \mathrm{~A}$) and 125 ps (nType cell at the minimum $5 \mu \mathrm{~A}$ current). Furthermore, in the case of a pType cell loaded by an nType cell (e.g., second and third stage of the divider), the dual of (25) is always satisfied.
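The condition (25) can be checked directly against the delays quoted above. The sketch below pairs the slowest pType delay with the fastest nType delay, which is a deliberately conservative pairing assumed here for illustration; the function and variable names are hypothetical.

```python
def satisfies_condition_25(t_ptype_ps, t_ntype_ps):
    """Condition (25): the pType delay must not exceed twice the nType delay."""
    return t_ptype_ps <= 2 * t_ntype_ps


# Delay ranges quoted in the text, in picoseconds.
P_DELAYS_PS = (116, 175)   # pType cell at 7 uA: unloaded, heavily loaded
N_DELAYS_PS = (91, 125)    # nType cell driving it: at 30 uA, at 5 uA

worst_case_ok = satisfies_condition_25(max(P_DELAYS_PS), min(N_DELAYS_PS))
margin_ps = 2 * min(N_DELAYS_PS) - max(P_DELAYS_PS)
print(f"worst case satisfied: {worst_case_ok}, margin: {margin_ps} ps")
```

Even in this conservative pairing the condition holds, although the margin is only a few picoseconds.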


Fig. 8. Propagation delay of the pType TFF versus bias current: (a) for different bias current of the next nType TFF ( $I_{n T y p e}=0$ means unloaded pType TFF); (b) for different current ratios between the next stage and current stage bias currents.

## B. PVT variations, mismatches, and supply voltage scaling

To analyze the sensitivity of the delay to process, supply voltage and temperature (PVT) variations, we have considered the clock-to-Q propagation delay of both nType and pType cells biased at $10 \mu \mathrm{~A}$ and loaded by the cell of the opposite type. Simulations have been performed using a suitable loop to properly bias the gates of the triode loads to keep the voltage swing approximately constant [27].

[^4] for the pType cell below $7 \mu \mathrm{~A}$.

Table III reports the values of $t_{CKQ}$ for the different process corners and for a $\pm 10 \%$ variation of the nominal 800 mV supply voltage, and Fig. 9 shows the dependence of the delay on the temperature. These results highlight the robustness of the proposed architecture to PVT variations. We have also considered the effect of mismatches between devices: a Monte Carlo analysis has revealed a standard deviation to mean value ratio of $11 \%$ and $12 \%$ respectively for the delay of nType and pType cells.

TABLE III. Clock-to-Q delay variation due to process corner and supply voltage

| Process Corner | nType | pType |
| :--- | :--- | :--- |
| TT | 110.53 ps | 130.14 ps |
| FF | 107.56 ps | 128.94 ps |
| FS | 111.66 ps | 115.86 ps |
| SF | 109.47 ps | 139.76 ps |
| SS | 113.00 ps | 129.16 ps |
| Supply Voltage | nType | pType |
| $720 \mathrm{mV}(-10 \%)$ | 109.65 ps | 130.74 ps |
| $880 \mathrm{mV}(+10 \%)$ | 111.00 ps | 126.14 ps |
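
To put the entries of Tab. III in perspective, the short Python sketch below expresses each delay as a percentage deviation from a reference value. It assumes that the TT corner row is the nominal case (800 mV supply) for both the corner and the supply voltage rows; the dictionary and function names are illustrative.

```python
# Clock-to-Q delays in ps, copied from Tab. III.
# ASSUMPTION: the TT row is the nominal reference for every other row.
NOMINAL_PS = {"nType": 110.53, "pType": 130.14}
CASES_PS = {
    "FF": {"nType": 107.56, "pType": 128.94},
    "FS": {"nType": 111.66, "pType": 115.86},
    "SF": {"nType": 109.47, "pType": 139.76},
    "SS": {"nType": 113.00, "pType": 129.16},
    "720 mV": {"nType": 109.65, "pType": 130.74},
    "880 mV": {"nType": 111.00, "pType": 126.14},
}


def deviation_percent(value_ps: float, nominal_ps: float) -> float:
    """Signed deviation of a delay from its nominal value, in percent."""
    return 100.0 * (value_ps - nominal_ps) / nominal_ps


def worst_case(cell: str) -> tuple[str, float]:
    """Return (case name, signed deviation in %) with the largest magnitude."""
    deviations = {
        name: deviation_percent(delays[cell], NOMINAL_PS[cell])
        for name, delays in CASES_PS.items()
    }
    name = max(deviations, key=lambda k: abs(deviations[k]))
    return name, deviations[name]


for cell in ("nType", "pType"):
    name, dev = worst_case(cell)
    print(f"{cell}: largest deviation in case {name}: {dev:+.1f} %")
```

Under this assumption the nType delay stays within about 3 % of the reference in all listed cases, whereas the pType cell shows its largest deviation (about 11 %) in the FS corner.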



Fig. 9. Temperature dependence of the $t_{C K Q}$ for nType and pType cells.

Since the FMCML approach is aimed at very low-voltage applications, we have also evaluated the behavior of the divider cells for lower supply voltages. Simulations have shown that the nType and pType cells can operate with supply voltages as low as 475 mV and 550 mV respectively, without significant variation of the propagation delay.

## C. Dividers design and validation

According to the design strategies presented in the previous Section and the results on the FMCML TFFs reported in Fig. 7 and Fig. 8, the frequency divider architecture reported in Fig. 4 can be implemented by sizing the first TFF to achieve the minimum $P D P$ with $5 \mu \mathrm{~A}$ bias current. The second stage of the divider has to be implemented through a pType TFF biased with the minimum bias current that satisfies (25): this current has been found to be $7 \mu \mathrm{~A}$. Since the third (nType) and fourth (pType) stages of the divider operate at lower frequencies, they can both be implemented with a $5 \mu \mathrm{~A}$ bias current, which is the minimum value that avoids subthreshold operation.

It has to be noted that the frequency divider can also be designed to achieve minimum power consumption, by using the minimum bias current of $5 \mu \mathrm{~A}$ in all the stages. In this case, the maximum operating frequency of the divider is strongly reduced with respect to the minimum $P D P$ design, since its speed is limited by the pType TFF (the second stage of the divider), and the maximum frequency is bounded by:
$f_{\text {max }} \leq \min \left\{\frac{1}{2 t_{C K Q, n T y p e}} ; \frac{1}{2 t_{C K Q, p T y p e}}\right\}=\frac{1}{2 t_{C K Q, p T y p e}}$
If a divider that maximizes the speed performance is required, according to the results in the previous sub-section, we can set the current $I_{S S, n T y p e}$ of the first TFF to be about two times higher than $I_{S S, p T y p e}$, so that $t_{C K Q, n T y p e}$ is very close to its asymptotic minimum value. The maximum speed design is therefore obtained by setting $I_{S S, n T y p e}$ to $14 \mu \mathrm{~A}$ in the first stage and keeping $I_{S S, p T y p e}$ at $7 \mu \mathrm{~A}$ in the second stage. The third and fourth stages can still be biased with the minimum current.

Considering the divider power consumption, and denoting with $P_{D I V 2, i}$ the power consumption of the $i$-th DIV2 block in the cascade of $N$ stages, all the TFFs after the first nType and the first pType stages are equally sized in either design case. We can therefore express the total power consumption $P_{T O T}$ for both design strategies as follows:
$P_{T O T}=P_{D I V 2,1}+P_{D I V 2,2}+(N-2) P_{D I V 2,3}$.
Thus, the divider power consumption increases linearly with $N$, and the additional power required by the maximum speed design with respect to the minimum $P D P$ design may not be very significant, since its relative weight decreases as $N$ increases.
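
The expression for $P_{T O T}$ can be evaluated with a short Python sketch. It relies on an assumption that is not stated in this section: each DIV2 block is taken to draw three times its $I_{S S}$ from the 0.8 V supply. This factor was chosen only because it reproduces the power values of Table IV, and all function names are illustrative.

```python
V_DD = 0.8            # nominal supply voltage in volts
CURRENT_FACTOR = 3.0  # ASSUMPTION: a DIV2 block draws 3 * I_SS in total


def div2_power_uw(i_ss_ua: float) -> float:
    """Power of one DIV2 block in uW for a bias current given in uA."""
    return CURRENT_FACTOR * V_DD * i_ss_ua


def total_power_uw(i_first_ua: float, i_second_ua: float,
                   i_rest_ua: float, n_stages: int = 4) -> float:
    """P_TOT = P_DIV2,1 + P_DIV2,2 + (N - 2) * P_DIV2,3."""
    if n_stages < 2:
        raise ValueError("the expression assumes at least two DIV2 stages")
    return (div2_power_uw(i_first_ua)
            + div2_power_uw(i_second_ua)
            + (n_stages - 2) * div2_power_uw(i_rest_ua))


def speed_overhead_percent(n_stages: int) -> float:
    """Extra power of the maximum speed design over the minimum PDP design."""
    min_pdp = total_power_uw(5.0, 7.0, 5.0, n_stages)
    max_speed = total_power_uw(14.0, 7.0, 5.0, n_stages)
    return 100.0 * (max_speed - min_pdp) / min_pdp


for n in (4, 6, 8):
    p_min_pdp = total_power_uw(5.0, 7.0, 5.0, n)
    print(f"N = {n}: minimum PDP design {p_min_pdp:.1f} uW, "
          f"maximum speed overhead {speed_overhead_percent(n):.1f} %")
```

Under the stated assumption, the relative overhead of the maximum speed design shrinks as stages are added, because only the first stage uses a higher bias current.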

| TFF1 <br> $n T y p e$ <br> $I_{S S}(\mu A)$ | TFF2 <br> $p T y p e$ <br> $I_{S S}(\mu A)$ | TFF3 <br> $n T y p e$ <br> $I_{S S}(\mu A)$ | TFF4 <br> $p T y p e$ <br> $I_{S S}(\mu A)$ | Power <br> $(\mu W)$ | $t_{C K, M I N}$ <br> $(p s)$ | $f_{C K, \max }$ <br> $(G H z)$ |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| 5 | 7 | 5 | 5 | 52.8 | 95 | 10.5 |
| 14 | 7 | 5 | 5 | 74.4 | 82 | 12.2 |
| 5 | 5 | 5 | 5 | 48 | $>200$ | $<5$ |



Fig. 10. Output waveforms of the FMCML static frequency divider designed for minimum PDP and simulated at the maximum speed with an input $t_{C K}$ of 95 ps.

All the cases we have designed and analyzed are summarized in Table IV. The output waveforms at the maximum operating frequency of the DIV2 stages of the frequency divider designed for the minimum $P D P$ are reported in Fig. 10 for an input clock signal with a period $t_{C K}$ equal to 95 ps and an amplitude of 0.3 V. Simulated phase noise is about $-107.5 \mathrm{dBc} / \mathrm{Hz}$ at 100 kHz offset after the first DIV2 stage, and 3 dB higher for the divided-by-16 output.
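
The last two columns of Table IV are related by $f_{C K, \max }=1 / t_{C K, M I N}$. A minimal Python sketch of this conversion is given below; the function name is illustrative.

```python
def max_clock_frequency_ghz(t_ck_min_ps: float) -> float:
    """Convert a minimum clock period in ps into a maximum frequency in GHz."""
    if t_ck_min_ps <= 0.0:
        raise ValueError("the clock period must be positive")
    return 1e3 / t_ck_min_ps  # 1 / (1 ps) = 1000 GHz


# Minimum clock periods reported in Table IV (in ps).
for label, t_ck_min_ps in (("minimum PDP", 95.0), ("maximum speed", 82.0)):
    f_max = max_clock_frequency_ghz(t_ck_min_ps)
    print(f"{label}: {f_max:.1f} GHz")
```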

## D. Design of the layout

The results reported in Tab. IV refer to post-layout simulations, since at high frequencies it is important to take the effect of layout parasitics into account. A modular layout approach based on the divide-by-4 (the cascade of an nType and a pType cell) has been adopted, and Fig. 11 shows the layout of the first divide-by-4 for the minimum PDP case (i.e. $I_{S S, n T y p e}=5 \mu \mathrm{~A}$ and $I_{S S, p T y p e}=7 \mu \mathrm{~A}$).


Fig. 11. Layout of a divide-by-4 divider (nType cell biased at $5 \mu \mathrm{~A}$ and pType cell biased at $7 \mu \mathrm{~A}$).

The layout has been optimized to minimize the length of interconnections and to maintain the symmetries of the original structure [28], according to the design practices of analog high frequency applications, resulting in an area footprint of $23.73 \times 3.5 \mu \mathrm{~m}^{2}$ for the divide-by-4 (the total area is approximately doubled for the full divide-by-16, also taking into account the bias generators).

## VI. Conclusion

A novel architecture, in which complementary nType and pType low-voltage FMCML DFFs are exploited to implement high-speed and power-efficient static frequency dividers, has been presented in this work.

Thanks to a detailed analysis of the propagation delay of the FMCML Flip-Flops, two design strategies have also been presented: the first achieves the minimum $P D P$, whereas the second pursues the maximum speed performance.
The two strategies have been validated by designing divide-by-16 circuits in a 28 nm FDSOI CMOS technology. The results reported in Table IV, which are in good agreement with the theoretical analysis, confirm the effectiveness of the proposed architecture and design strategies. In particular, a maximum operating frequency higher than 12 GHz is achieved with a power consumption as low as $74.4 \mu \mathrm{~W}$. The power consumption can be further reduced (by about $30 \%$) by adopting the minimum $P D P$ strategy, with a reduction in the maximum speed performance of about $15 \%$.
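
The trade-off between the two strategies can be recomputed directly from the Table IV entries. The Python sketch below uses an illustrative helper name; it shows that the rounded figures quoted above correspond to roughly 29 % less power and 14 % less speed.

```python
def relative_reduction_percent(reference: float, value: float) -> float:
    """Reduction of `value` with respect to `reference`, in percent."""
    return 100.0 * (reference - value) / reference


# Table IV entries: maximum speed design versus minimum PDP design.
power_saving = relative_reduction_percent(74.4, 52.8)  # power in uW
speed_loss = relative_reduction_percent(12.2, 10.5)    # frequency in GHz

print(f"power saving: {power_saving:.1f} %, speed loss: {speed_loss:.1f} %")
```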

Tab. V compares the performance of the proposed divider with CMOS frequency dividers operating in the multi-GHz range. A figure of merit (FOM) that takes into account the maximum operating frequency, the total power consumption and the division factor has been computed as:
$F O M=\frac{f_{M A X}}{P_{T O T} / \log _{2} N_{D I V}}$.

| Ref. | Arch. | Tech. (nm) | $V_{DD}$ (V) | $P_{TOT}$ (mW) | $f_{MAX}$ (GHz) | $N_{DIV}$ | FOM (GHz/mW) |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| [29] | TSPC | 90 | 0.5 | 0.25 | 7.2 | 2 | 28.8 |
| [30] | DFD | 32 | 1 | 4.8 | 70 | 4 | 29.2 |
| [8] | RFD | 65 | 0.4 | 1.6 | 64.2 | 2 | 40.1 |
| [7] | MCML | 65 | 1 | 6.25 | 67 | 4 | 21.4 |
| [31] | ILFD | 65 | 0.42 | 1.2 | 62 | 2 | 51.7 |
| [32] | MCML | 65 | 1.3 | 0.78 | 21.5 | 2 | 27.6 |
| [33] | TSPC | 22 | 0.9 | 0.35 | 70 | 2 | 195 |
| [33] | TSPC | 22 | 0.4 | 0.0244 | 25.7 | 2 | 1058 |
| This work $^{4}$ | FMCML | 28 | 0.8 | 0.0528 | 10.5 | 16 | 795 |
| This work $^{4}$ | FMCML | 28 | 0.8 | 0.0744 | 12.2 | 16 | 656 |

TSPC: SFD exploiting true single-phase clock logic style
DFD: dynamic frequency divider
RFD: regenerative frequency divider
MCML: SFD exploiting MCML logic style
ILFD: injection-locked frequency divider
FMCML: SFD exploiting folded MCML logic style

Tab. V shows a very high efficiency for the proposed approach, as highlighted by the FOM. Comparable or even higher FOM values are achieved by SFDs based on the true single-phase clock (TSPC) logic style, which, however, exhibits a higher sensitivity to noise, due to its single-ended nature, and a much higher power supply switching noise that could disturb other blocks on the same chip.
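
As an illustration, the FOM entries of Tab. V for the proposed divider can be recomputed with the following Python sketch, which reads the FOM as $f_{M A X} \cdot \log _{2} N_{D I V} / P_{T O T}$ with the frequency in GHz and the power in mW. The function name is illustrative.

```python
import math


def figure_of_merit(f_max_ghz: float, p_tot_mw: float, n_div: int) -> float:
    """FOM = f_MAX / (P_TOT / log2(N_DIV)), expressed in GHz/mW."""
    if n_div < 2:
        raise ValueError("the division factor must be at least 2")
    return f_max_ghz * math.log2(n_div) / p_tot_mw


# Rows of Tab. V for this work: (f_MAX in GHz, P_TOT in mW, N_DIV).
designs = {
    "minimum PDP": (10.5, 0.0528, 16),
    "maximum speed": (12.2, 0.0744, 16),
}

for label, (f_max, p_tot, n_div) in designs.items():
    print(f"{label}: FOM = {figure_of_merit(f_max, p_tot, n_div):.0f} GHz/mW")
```

The logarithm of the division factor counts the equivalent number of divide-by-2 stages, so that dividers with different division ratios can be compared on a per-stage basis.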

## REFERENCES

[1] C. Feng, X.P. Yu, W.M. Lim, and K.S. Yeo, 'A 40 GHz 65 nm phase-locked loop with optimized shunt-peaked buffer,' IEEE Microwave Wireless Comp. Lett., vol. 25, no. 1, pp. 34-36, Jan. 2015.
[2] G. S. Jeong, W. Kim, J. Park, T. Kim, H. Park, and D.-K. Jeong, 'A 0.015 mm$^{2}$ inductorless 32-GHz clock generator with wide frequency-tuning range in 28-nm CMOS technology,' IEEE Trans. Circuits and Systems Part II, vol. 64, no. 6, pp. 655-659, Jun. 2017.
[3] G. Shu, W. S. Choi, S. Saxena, M. Talegaonkar, T. Anand, A. Elkholy, A. Elshazly, and P. K. Hanumolu, 'A 4-to-10.5 Gb/s continuous-rate digital clock and data recovery with automatic frequency acquisition,' IEEE J. Solid-State Circuits, vol. 51, no. 2, pp. 428-439, Feb. 2016.
[4] J. Lee, P. Chiang, P. Peng, L. Chen, and C. Weng, 'Design of 56 Gb/s NRZ and PAM4 SerDes transceivers in CMOS technologies,' IEEE J. Solid-State Circ., vol. 50, no. 9, pp. 2061-2073, Sept. 2015.
[5] A. I. Hussein, S. Vasadi, and J. Paramesh, 'A 450 fs 65-nm millimeter-wave time-to-digital converter using statistical element selection for all-digital PLLs,' IEEE J. Solid-State Circ., vol. 53, no. 2, pp. 357-374, Feb. 2018.
[6] L. Kull, D. Luu, C. Menolfi, M. Brändli, P. A. Francese, T. Morf, M. Kossel, A. Cevrero, I. Ozkaya, and T. Toifl, 'A 24-72-GS/s 8-b time-interleaved SAR ADC with 2.0-3.3-pJ/conversion and >30 dB SNDR at Nyquist in 14-nm CMOS FinFET,' IEEE J. Solid-State Circ., vol. 53, no. 12, pp. 3508-3516, Dec. 2018.
[7] A. I. Hussein and J. Paramesh, 'Design and self-calibration techniques for inductor-less millimeter-wave frequency dividers,' IEEE J. Solid-State Circ., vol. 52, no. 6, pp. 1521-1541, Jun. 2017.
[8] Y.-H. Lin and H. Wang, 'A 35.7-64.2 GHz low power Miller divider with weak inversion mixer in 65 nm CMOS,' IEEE Microwave Wireless Components Lett., vol. 26, no. 11, pp. 948-950, Nov. 2016.
[9] S.-L. Jang, W.-C. Lai, G.-Z. Li, and Y.-W. Chen, 'High even-modulus injection-locked frequency dividers,' IEEE Trans. Microw. Theory Techn., vol. 67, no. 12, pp. 5069-5079, Dec. 2019.
[10] R. L. Schmid, A. Ç. Ulusoy, S. Zeinolabedinzadeh, and J. D. Cressler, 'A comparison of the degradation in RF performance due to device interconnects in advanced SiGe HBT and CMOS technologies,' IEEE Trans. Electron Devices, vol. 62, no. 6, pp. 1803-1810, Jun. 2015.

[11] M. Alioto and G. Palumbo, Model and Design of Bipolar and MOS Current Mode Logic: CML, ECL and SCL Digital Circuits, Springer, 2005.
[12] M. Alioto and G. Palumbo, 'Power-aware design techniques for nanometer MOS current-mode logic gates: a design framework,' IEEE Circuits and Systems Mag., vol. 61, no. 4, pp. 40-59, Sep. 2006.
[13] Y. Bai, Y. Song, M. N. Bojnordi, A. Shapiro, E. G. Friedman, and E. Ipek, 'Back to the future: current-mode processor in the era of deeply scaled CMOS,' IEEE Trans. VLSI Systems, vol. 24, no. 4, pp. 1266-1279, Apr. 2016.
[14] B. Razavi, 'The role of PLLs in future wireline transmitters,' IEEE Trans. Circuits and Systems Part I, vol. 56, no. 8, pp. 1786-1793, Aug. 2009.
[15] K. Gupta, N. Pandey, and M. Gupta, 'Analysis and design of MOS current mode logic exclusive-OR gate using triple-tail cells,' Microelectron. J., vol. 44, no. 6, pp. 561-567, 2013.
[16] K. P. Sai Pradeep and S. Suresh Kumar, 'Design and development of high performance MOS current mode logic (MCML) processor for fast and power efficient computing,' Cluster Computing, vol. 22, pp. 13387-13395, 2019.
[17] G. Scotti, D. Bellizia, A. Trifiletti, and G. Palumbo, 'Design of low-voltage high-speed CML D-latches in nanometer CMOS technologies,' IEEE Trans. VLSI Systems, vol. 25, no. 12, pp. 3509-3520, Dec. 2017.
[18] D. Bellizia, G. Palumbo, G. Scotti, and A. Trifiletti, 'A novel very low voltage topology to implement MCML XOR gates,' PRIME 18 IEEE Conf. on PhD Research in Microelectronics and Electronics, pp. 157-160, Prague 2018.
[19] G. Scotti, A. Trifiletti, and G. Palumbo, 'A novel 0.5V MCML D-flip-flop topology exploiting forward body bias threshold lowering,' IEEE Trans. Circuits and Systems Part II, vol. 67, no. 3, pp. 560-564, Mar. 2020.
[20] G. Palumbo and G. Scotti, 'A multi-folded MCML for ultra-low-voltage high-performance in deeply scaled CMOS,' IEEE Trans. Circuits and Systems Part I, vol. 67, no. 12, pp. 4696-4706, Dec. 2020.
[21] M. Alioto, R. Mita, and G. Palumbo, 'Design of high-speed power-efficient MOS current-mode logic frequency dividers,' IEEE Trans. Circuits and Systems Part II, vol. 53, no. 11, pp. 1165-1169, Nov. 2006.
[22] R. Nonis, E. Palumbo, P. Palestri, and L. Selmi, 'A design methodology for MOS current-mode logic frequency dividers,' IEEE Trans. Circuits and Systems Part I, vol. 54, no. 2, pp. 245-254, Feb. 2007.
[23] W. Fang, A. Brunnschweiler, and P. Ashburn, 'An analytical maximum toggle frequency expression and its application to optimizing high-speed ECL frequency dividers,' IEEE J. Solid-State Circ., vol. 25, no. 4, pp. 920-931, Aug. 1990.
[24] T. Sakurai and A. R. Newton, 'Alpha-power law MOSFET model and its application to CMOS inverter delay and other formulas,' IEEE J. Solid-State Circ., vol. 25, no. 2, pp. 584-594, Apr. 1990.
[25] F. Centurelli, G. Scotti, A. Trifiletti, and G. Palumbo, 'Delay models and design guidelines for MCML gates with resistor or PMOS load,' Microelectron. J., vol. 99, paper 104755, May 2020.
[26] D. Golanski, P. Fonteneau, C. Fenouillet-Beranger, A. Cros, F. Monsieur, N. Guillard, C.-A. Legrand, A. Dray, C. Richier, H. Beckrich, P. Mora, G. Bidal, O. Weber, O. Savod, J.-R. Manouvrier, P. Galy, N. Planes, and F. Arnaud, 'First demonstration of a full 28 nm high-k/metal gate circuit transfer from Bulk to UTBB FDSOI technology through hybrid integration,' VLSI 13 IEEE Symp. VLSI Circuits, pp. 1-24-125, Jun. 2013.
[27] J. M. Musicer and J. Rabaey, 'MOS current mode logic for low power, low noise CORDIC computation in mixed-signal environment,' ISLPED 00 Int. Symp. Low-Power Electronics and Design, Rapallo (Italy) 26-27 Jul. 2000, pp. 102-107.
[28] F. Centurelli, P. Monsurrò, G. Scotti, P. Tommasino, and A. Trifiletti, 'A power efficient frequency divider with 55 GHz self-oscillating frequency in SiGe BiCMOS,' MDPI Electronics, vol. 9, no. 11, paper 1968, Nov. 2020.
[29] W. Deng, K. Okada, and A. Matsuzawa, 'A 0.5-V, 0.005-to-3.2 GHz, 4.1-to-6.4 GHz LC-VCO using E-TSPC frequency divider with forward body bias for sub-picosecond-jitter clock generation,' ASSCC 10 IEEE Asian Solid-State Circ. Conf., 2010.
[30] A. Ghilioni, A. Mazzanti, and F. Svelto, 'Analysis and design of mm-wave frequency dividers based on dynamic latches with load modulation,' IEEE J. Solid-State Circ., vol. 48, no. 8, pp. 1842-1850, Aug. 2013.
[31] J. Zhang, Y. Cheng, C. Zhao, Y. Wu, and K. Kang, 'Analysis and design of ultra-wideband mm-wave injection-locked frequency dividers using transformer-based high-order resonators,' IEEE J. Solid-State Circ., vol. 53, no. 8, pp. 2177-2189, Aug. 2018.
[32] Y. Zhang, Z. Wen, and X. Hou, 'A 0.78 mW inductor-less 21 GHz CML frequency divider in 65 nm CMOS,' ITNEC 19 IEEE Information Technology, Networking, Electronic and Automation Control Conf., pp. 1395-1399, 2019.
[33] Z. Tibenszky, C. Carta, and F. Ellinger, 'A 0.35 mW 70 GHz divide-by-4 TSPC frequency divider on 22 nm FD-SOI CMOS technology,' RFIC 20 IEEE Radio Frequency Integrated Circuits Symp., pp. 243-246, 2020.


Francesco Centurelli was born in Roma in 1971. He received the laurea degree (cum laude) and the Ph.D. degree in Electronic Engineering from the University of Roma "La Sapienza", Roma, Italy, in 1995 and 2000 respectively.
In 2006 he became an Assistant Professor at the DIET department of the University of Roma La Sapienza.
His research interests were initially focused on system-level analysis and design of clock recovery circuits and high-speed analog integrated circuits, and now concern the design of analog-to-digital converters and very low-voltage circuits for analog and RF applications.
He has published more than 100 papers in international journals and refereed conferences, and has also been involved in R\&D activities carried out in collaboration between Università "La Sapienza" and some industrial partners.


Giuseppe Scotti was born in Cagliari, Italy, in 1975. He received the M.S. and Ph.D. degrees in electronic engineering from the University of Rome "La Sapienza", Rome, Italy, in 1999 and 2003, respectively. In 2010, he became a Researcher (Assistant Professor) at the DIET department of the University of Rome "La Sapienza" and in 2015 he was appointed Associate Professor in the same department. He teaches undergraduate and graduate courses on basic electronics and microelectronics. His research activity was mainly concerned with integrated circuits design and focused on design methodologies able to guarantee robustness with respect to parameter variations in both analog circuits and digital VLSI circuits. In the context of analog design his research activity was concerned with circuit topologies for the realization of low-voltage analog building blocks using ultra-short channel CMOS technology, whereas in the context of cryptographic hardware his focus has been on novel PAAs methodologies and countermeasures. He has also been involved in R\&D activities carried out in collaboration between "La Sapienza" University and some industrial partners, which led, between 2000 and 2015, to the implementation of 13 ASICs. He has coauthored more than 45 publications in international journals and about 70 contributions in conference proceedings, and is the coinventor of 2 international patents.


Gaetano Palumbo (F'07) was born in Catania, Italy, in 1964. He received the Laurea degree in Electrical Engineering in 1988 and the Ph.D. degree in 1993 from the University of Catania. In 1994 he joined the University of Catania, where he is full professor. His primary research interests are in analog and digital circuits.
He was co-author of four books by Kluwer Academic Publishers and Springer, in 1999, 2001, 2005, 2014 respectively, and a textbook on electronic devices in 2005. He is the author of more than 440 scientific papers in refereed international journals (190+) and in conferences. Moreover, he has co-authored several patents.
He served as an Associate Editor of the IEEE Transactions on Circuits and Systems part I in 1999-2001, 2004-2005 and 2008-2011, and of the IEEE Transactions on Circuits and Systems part II in 2006-2007.
In the period 2011-2013 he served as a member of the Board of Governors of the IEEE CAS Society.
In 2005 he was one of the 12 panelists in the scientific-disciplinary area 09 industrial and information engineering of the CIVR (Committee for Italian Research Assessment). In 2003 he received the Darlington Award.
In 2015 he was a panelist of GEV (Group of Evaluation Experts) in the scientific area 09 - industrial and information engineering of the ANVUR for the Assessment of Italian Research Quality in 2011-2014.


[^0]:    Copyright (c) 2008 IEEE. Personal use of this material is permitted. However, permission to use this material for any other purposes must be obtained from the IEEE by sending an email to pubs-permissions@ieee.org.

[^1]:    ${ }^{1}$ The propagation delay is defined as the time taken by the output to reach $50 \%$ starting from the point in which the input has reached its $50 \%$ variation.

[^2]:    ${ }^{2}$ In practical cases also with a resistive load, since $\tau_{R D}$ is inversely proportional to $I_{T A I L}^{2}$, the contribution of $\tau_{R D}$ to the overall propagation delay

[^3]:    ${ }^{3}$ For lower currents the transistor width is kept constant, and the resistance of the triode device is changed by acting on the gate voltage. In this case, $\tau_{R D}$

[^4]:    depends on the bias current, resulting in an increase of $t_{C K Q}$ as shown in Fig. 8 for the pType cell below $7 \mu \mathrm{~A}$.

[^5]:    ${ }^{4}$ Simulated results.

