

# Nonlinear models and algorithms for RF systems digital calibration

Department of Information Engineering, Electronics and Telecommunications

Doctor of Philosophy in Information and Communication Technologies, Cycle XXX

Candidate: Felice Rosato, ID number 799026

Thesis Advisor: Prof. Alessandro Trifiletti

A thesis submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in ICT

Thesis defended on 22 February 2018.

This thesis was evaluated by the two following external referees:

Prof. Paolo Colantonio, University of Rome "Tor Vergata", Rome, Italy

Prof. Gian Carlo Cardarilli, University of Rome "Tor Vergata", Rome, Italy

The time and effort of the external referees in evaluating this thesis, as well as their valuable and constructive suggestions, are very much appreciated and gratefully acknowledged.

Nonlinear models and algorithms for RF systems digital calibration. Ph.D. thesis, Sapienza – University of Rome

© 2018 Felice Rosato. All rights reserved

This thesis has been typeset with LaTeX and the Sapthesis class.

Version: February 16, 2018

Author's email: felice.rosato@uniroma1.it

Dedicated to Ilenia

#### Abstract

Focusing on the receiving side of a communication system, the current trend of pushing the digital domain ever closer to the antenna places heavy constraints on the accuracy and linearity of the analog front-end and of the conversion devices. Moreover, mixed-signal implementations of Systems-on-Chip in nanoscale CMOS processes result in poorer overall analog performance and reduced yield. To cope with the impairments of the low-performance analog section in this "dirty RF" scenario, two solutions exist: designing more complex analog processing architectures, or identifying the errors and correcting them in the digital domain using DSP algorithms. In the latter case, the precision constraints on the analog circuits can be offloaded to a digital signal processor.

This thesis aims to develop a methodology for the analysis, modeling and compensation of the analog impairments arising in different stages of a receiving chain using digital calibration techniques. Both single- and multiple-channel architectures are addressed, exploiting the capability of the calibration algorithm to homogenize the responses of all the channels of a multi-channel system in addition to compensating the nonlinearities in each response. The systems targeted for digital post-compensation are a pipeline ADC, a digital-IF sub-sampling receiver and a 4-channel TI-ADC.

The research focuses on post-distortion methods that use nonlinear dynamic models to approximate the post-inverse of the nonlinear system and to correct the distortions arising from static and dynamic errors. The Volterra model is used because of its general approximation capabilities for the compensation of nonlinear systems with memory. Digital calibration is applied to a Sample and Hold and to a pipeline ADC simulated in a 45 nm CMOS process, demonstrating a large linearity improvement even in the presence of incomplete settling errors, which enables the use of faster clock speeds. An extended model based on the baseband Volterra series is proposed and applied to the compensation of a digital-IF sub-sampling receiver. In this architecture, frequency selectivity is carried out at IF by an active band-pass CMOS filter, which causes in-band and out-of-band nonlinear distortions. The improved performance of the proposed model is demonstrated with circuit simulations of a 10th-order band-pass filter, realized as a five-stage $G_m$-C biquad cascade, and validated using out-of-sample sinusoidal and QAM signals. The same technique is extended to an array receiver with mismatched channel responses, showing that digital calibration can compensate for the loss of directivity and enhance the overall system SFDR. Iterative backward pruning is applied to the Volterra models, showing that complexity can be reduced without impacting linearity and obtaining state-of-the-art accuracy/complexity performance.
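As a rough illustration of the direct post-inverse estimation idea, the sketch below fits a small memory-polynomial model (only the diagonal Volterra terms, i.e. a heavily pruned Volterra model) by linear least squares. The forward system `forward_system`, the memory depth, the polynomial order and all signal parameters are assumptions made for this example and are not taken from the thesis.

```python
import numpy as np

def volterra_features(y, memory=3, order=3):
    """Diagonal (memory-polynomial) Volterra regressors: y[n-m]**p."""
    n = len(y)
    cols = []
    for m in range(memory):
        delayed = np.concatenate([np.zeros(m), y[:n - m]])
        for p in range(1, order + 1):
            cols.append(delayed ** p)
    return np.column_stack(cols)

def forward_system(x):
    """Hypothetical weakly nonlinear system with memory (illustration only)."""
    lin = x + 0.2 * np.concatenate([[0.0], x[:-1]])   # short FIR memory
    return lin + 0.05 * lin ** 2 - 0.1 * lin ** 3      # static polynomial distortion

def rms(v):
    return float(np.sqrt(np.mean(v ** 2)))

rng = np.random.default_rng(0)

# Training data: known reference input and the distorted output.
x_train = rng.uniform(-1.0, 1.0, 4000)
y_train = forward_system(x_train)

# Direct post-inverse estimation: regress the known input on the
# regressors built from the distorted output (linear in the parameters).
theta, *_ = np.linalg.lstsq(volterra_features(y_train), x_train, rcond=None)

# Out-of-sample check: apply the estimated post-inverse to new data.
x_test = rng.uniform(-1.0, 1.0, 2000)
y_test = forward_system(x_test)
z_test = volterra_features(y_test) @ theta

err_before = rms(y_test - x_test)   # distortion without calibration
err_after = rms(z_test - x_test)    # residual after post-compensation
print(f"RMS error before: {err_before:.4f}, after: {err_after:.4f}")
```

Because the model is linear in its parameters, the estimate reduces to an ordinary least-squares problem, which is what makes this kind of post-inverse convenient to identify from input/output records.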

Calibration of Time-Interleaved ADCs, widely used in RF-to-digital wideband receivers, is carried out by developing ad hoc models, because the steep discontinuities generated by the imperfect cancellation of aliasing would require a huge number of terms in a polynomial approximation. A closed-form solution is derived for a 4-channel TI-ADC affected by gain errors and timing skews by solving the perfect reconstruction equations. A background calibration technique based on a cyclostationary filter bank architecture is presented. The convergence speed and accuracy of the recursive algorithm are discussed, and complexity reduction techniques are applied.
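To make the effect of gain errors and timing skews concrete, the following sketch simulates a 4-channel TI-ADC sampling a low-frequency sinusoid and then corrects it assuming the errors are known. It is not the closed-form reconstruction filter bank derived in the thesis: the correction is a simple gain division followed by a first-order Taylor skew correction, with the derivative approximated by a central difference. The gain and skew values, the input frequency and all variable names are hypothetical.

```python
import numpy as np

M = 4                                   # number of interleaved channels
n = np.arange(4096)                     # sample index at the full rate
f0 = 0.05                               # input frequency in cycles/sample (assumed low)
gains = np.array([1.00, 1.01, 0.99, 1.005])    # hypothetical per-channel gains
skews = np.array([0.0, 0.01, -0.008, 0.004])   # hypothetical skews, in sample periods

ch = n % M                              # channel that acquires sample n
ideal = np.sin(2 * np.pi * f0 * n)
# Mismatched TI-ADC output: each channel has its own gain and sampling instant.
raw = gains[ch] * np.sin(2 * np.pi * f0 * (n + skews[ch]))

# Step 1: undo the gain error of each channel.
y = raw / gains[ch]

# Step 2: first-order Taylor skew correction, x(n) ~= y[n] - tau * x'(n),
# with the derivative approximated by a central difference on the samples.
deriv = np.gradient(y)
corrected = y - skews[ch] * deriv

def rms(v):
    return float(np.sqrt(np.mean(v ** 2)))

err_before = rms(raw - ideal)
err_after = rms(corrected - ideal)
print(f"RMS error before: {err_before:.2e}, after: {err_after:.2e}")
```

The central-difference derivative is only accurate well below the Nyquist frequency, which is one reason a wideband receiver needs properly designed correction filters rather than this low-frequency shortcut.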

#### Acknowledgements

First of all, I thank my advisor, Prof. Alessandro Trifiletti, for providing me with constant stimulation and for pushing me to always give my best. I thank Professors Francesco Centurelli, Pasquale Tommasino and Giuseppe Scotti for the (unfortunately little) time spent at the Centro Studi G. Barzilai, rich in discussions, reflections and criticism that helped me define problems and refine solutions. A special mention goes to Pietro Monsurrò, for his concrete support throughout the whole journey, for being an inexhaustible source of ideas and for being an example of excellence to strive for.

Heartfelt thanks go to the Professors of the Advisory Board, Aurelio Uncini, Debora Pastina and Fabrizio Palma, for their availability and for their valuable suggestions. In particular, I thank Prof. Uncini for pointing me to sound references in the boundless literature on digital signal processing and for his constructive criticism, drawn from his many years of experience in the field.

I thank my fellow PhD students Danilo, Davide and Gaetano for the good times spent together and for lightening the days that started off on the wrong foot. We supported each other, strengthening our bond of mutual esteem and friendship.

I also want to thank Giuseppe Tomasicchio for making available the instruments and equipment in the laboratories of Thales Alenia Space Italia, and Guglielmo Lulli for the many lessons on the analysis and design of RF systems and digital processing architectures. I would also like to thank my colleague and friend Marco, whom I pestered with chatter and whiteboard plots for three years.

I thank my family, mum, dad and my grandparents, for always being a solid point of reference, and in particular my brother Angelo for the Friday aperitifs. I am proud of them.

Energy, patience and tenacity have been precious resources over these three years, split between research and work. I must thank the person who fueled them and shared every single day (and night) with me: Ilenia. To her I dedicate this thesis, the fruit of many sacrifices.

## Contents

- 1 Introduction
    - 1.1 Scenario and Motivation
    - 1.2 Digital calibration background
    - 1.3 Research framework – Scope of the Thesis
    - 1.4 Thesis Outline
- 2 Nonlinear system modeling and estimation
    - 2.1 Nonlinear System Modeling
        - 2.1.1 Volterra series
        - 2.1.2 $p$-th order inverse Volterra model
    - 2.2 Compensation techniques for nonlinear dynamic systems
    - 2.3 System Identification elements
        - 2.3.1 Linear Least Squares estimator
        - 2.3.2 Recursive Least Squares
        - 2.3.3 Input excitation design
    - 2.4 Practical Volterra post-inverse system estimation
- 3 ADC digital calibration using post compensation
    - 3.1 A/D Converters theory of operation
        - 3.1.1 Performance Metrics
    - 3.2 ADC Architectures overview
        - 3.2.1 Flash ADC
        - 3.2.2 Pipeline ADC
        - 3.2.3 Successive Approximation ADC
        - 3.2.4 $\Sigma$-$\Delta$ ADC
    - 3.3 Pipeline ADC with 1.5-bit stages
    - 3.4 Post compensation methods for ADCs
        - 3.4.1 Dithering
        - 3.4.2 Look Up Table based methods
        - 3.4.3 Post inversion methods
    - 3.5 ADC calibration techniques
        - 3.5.1 Foreground calibration
        - 3.5.2 Background calibration
    - 3.6 S/H digital calibration in 45nm CMOS process
        - 3.6.1 Switched Capacitors S/H
        - 3.6.2 Volterra parameters estimation
        - 3.6.3 Simulation results
    - 3.7 ADC pipeline digital calibration in 45nm CMOS process
        - 3.7.1 Switched Capacitors Pipeline ADC
        - 3.7.2 Simulation results
    - 3.8 Conclusions
- 4 Time-Interleaved ADC calibration using filter banks
    - 4.1 TI-ADC Architecture
    - 4.2 2-channel TI-ADC
        - 4.2.1 Ideal reconstruction
        - 4.2.2 Introducing time skew on branches: non integer delay
        - 4.2.3 Non-ideal reconstruction
        - 4.2.4 Reconstruction without equalization
        - 4.2.5 Reconstruction with equalization
    - 4.3 4-channel TI-ADC Perfect Reconstruction Filters
    - 4.4 First-Order Taylor Approximation
        - 4.4.1 Compact expression synthesis
    - 4.5 Behavioral Simulations and Results
    - 4.6 Background calibration using cyclostationary filter banks
        - 4.6.1 Expression of the filter base
        - 4.6.2 Complexity reduction
        - 4.6.3 Behavioral simulations and results
    - 4.7 Conclusions and future work
- 5 Digital-IF receiver nonlinear calibration
    - 5.1 Receiver Target Architecture
    - 5.2 Baseband Volterra models for bandpass systems
        - 5.2.1 Classic baseband model derivation
        - 5.2.2 Model extension considering finite out-of-band attenuation
        - 5.2.3 Baseband modeling of sub-sampled bandpass systems
        - 5.2.4 Behavioral simulations
    - 5.3 System operating point impact on digital calibration performance
    - 5.4 40nm CMOS AAF calibration
        - 5.4.1 Anti-Aliasing Filter design
        - 5.4.2 Digital calibration results
    - 5.5 Conclusions and future work
- 6 RF Array Receiver Calibration
    - 6.1 Introduction
    - 6.2 Wideband Volterra calibration architecture
    - 6.3 System setup for parameters estimation
        - 6.3.1 DBFN simulation model
    - 6.4 Simulation Results
    - 6.5 Implementation complexity and parallelism
    - 6.6 Conclusions
- 7 Conclusions
    - 7.1 Summary of the research contributions
    - 7.2 Future works
- List of Publications
- Bibliography

## List of Figures

| Figure | Caption |
|--------|---------|
| 2.1 | Wiener model |
| 2.2 | Hammerstein model |
| 2.3 | Post-inverse system $S^{-1}$ in series with the forward system $S$ realizes the condition $z[n] = x[n]$ |
| 2.4 | Two ways to approximate a post-inverse system in an offline compensation: mathematical inversion of the estimation of the forward system (a) and direct estimation of post-inverse system (b) |
| 2.5 | Direct estimation of post-inverse system. Ideal (a) architecture and practical implementation (b) that uses the reference signal $r[n]$ considering the unavailability of the real input data $u[n]$ |
| 3.1 | Concept scheme of an ideal A/D converter |
| 3.2 | Transfer function of an ideal 3 bit A/D converter. The linear fit of the mid points of the quantization regions $\mathfrak{S}_i$ is a perfect straight line. |
| 3.3 | Transfer function of a non-ideal 3 bit A/D converter. The DNL is positive when the actual bin width is greater than the ideal one, negative when it is smaller |
| 3.4 | Transfer function of a non-ideal 3 bit A/D converter. The INL is positive when the actual transition precedes the ideal one, negative when it follows. |
| 3.5 | Flash ADC Architecture |
| 3.6 | First stage of a pipeline ADC |
| 3.7 | Pipeline ADC architecture with digital correction logic |
| 3.8 | Successive Approximation ADC Architecture |
| 3.9 | Sigma-Delta ADC Architecture |
| 3.10 | 1-bit MDAC transfer characteristic (a) and corresponding 2 stages cascade (b) normalized to $V_R$ |
| 3.11 | 1-bit MDAC transfer characteristic affected by offset (a) and corresponding 2 stages cascade considering the second stage ideal (b) normalized to $V_R$ |
| 3.12 | 1.5-bit MDAC architecture |
| 3.13 | 1.5-bit MDAC transfer characteristic (a) and corresponding 2 stages cascade (b) normalized to $V_R$ |
| 3.14 | Reference channel based background calibration architecture |
| 3.15 | Flipped-around fully differential Sample and Hold scheme (a) and associated clocking scheme (b) |

| Figure | Caption |
|--------|---------|
| 3.16 | Comparison between pre and post calibration SNDR and Gain, using a $[20, 2, 2, 2]$ lag configuration |
| 3.17 | Comparison between pre and post calibration SNDR and Gain, using a $[15, 4, 2, 2]$ lag configuration |
| 3.18 | SNDR improvement and gain variations with pruning, starting from a $[20, 2, 2, 2]$ lag configuration |
| 3.19 | SNDR improvement and gain variations with pruning, starting from a $[20, 4, 2, 2]$ lag configuration |
| 3.20 | Comparison between post calibration SNDR in case of supply voltage variations of $\pm 1\%$ using a 54 parameters pruned model |
| 3.21 | Latch comparator circuit |
| 3.22 | Circuit that implements subtraction and multiplication of the input and the $D$ signals |
| 3.23 | SNDR improvement against nominal ADC resolution using a lag structure $[30, 4, 2, 2, 1, 1, 1, 0, 0, 0]$. |
| 3.24 | SNDR improvement against number of coefficients using a lag structure $[30, 4, 2, 2, 1, 1, 1, 0, 0, 0]$ |
| 4.1 | $M$-Channel TI-ADC architecture |
| 4.2 | $M$-Channel TI-ADC clocking scheme |
| 4.3 | 2 Channel TI-ADC ideal signal reconstruction |
| 4.4 | Phase response of delay element transfer function |
| 4.5 | 2 Channel TI-ADC non-ideal signal reconstruction |
| 4.6 | Phase response of $H(e^{j\omega-j\pi})$ transfer function |
| 4.7 | 4 Channel TI-ADC non-ideal signal reconstruction |
| 4.8 | Phase of $H_i\left[e^{j\left(\omega-\frac{\pi}{2}\right)}\right]$ (a), $H_i\left[e^{j\left(\omega-\pi\right)}\right]$ (b) and $H_i\left[e^{j\left(\omega-\frac{3\pi}{2}\right)}\right]$ (c) |
| 4.9 | R function (a), Z function (b) and Q function (c) |
| 4.10 | $F_{\mu}(\omega)$ filter magnitude and phase responses |
| 4.11 | $F_{a}(\omega)$ filter magnitude and phase responses |
| 4.12 | Comparison between output signal spectra before and after calibration using the exact and the approximated reconstruction filters for a known combination of timing skews and gain errors. |
| 4.13 | 4-Channel TI-ADC error model |
| 4.14 | 4-Channel TI-ADC calibration architecture (a) and corresponding clocking scheme (b) |
| 4.15 | Approximation of the desired frequency response with a filter base. The calibrated output $Z(e^{j\omega})$ is computed from the uncalibrated output $Y(e^{j\omega})$ given the weights $\alpha_{ij}$ estimated by the adaptive loop (offset correction not shown) |
| 4.16 | Correction filters shape (errors have 1% standard deviation, mean bandwidth is $4f_S$, one random realization is shown for the four channels). |
| 4.17 | Approximation error of using (4.70) to fit the filters in Fig. 4.16 |
| 4.18 | Complexity / accuracy trade-off for several correction filter models. |
| 4.19 | Complexity / accuracy trade-off after optimization. |
| 4.20 | Convergence speed and average accuracy of the models. With $N_L$ in [23] chosen to achieve comparable steady-state accuracy to (4.70) and (4.71), respectively |

| Figure | Caption |
|--------|---------|
| 5.1 | Digital-IF Receiver Architecture with digital I/Q extraction |
| 5.2 | IQ Receiver Architecture |
| 5.3 | Nonlinear contributions to output spectrum. Even orders kernels don't produce components around $f_c$ |
| 5.4 | Spectrum of the Anti-Aliasing Filter output with $f_s = 320$ MS/s |
| 5.5 | Spectrum of the sub-sampled Anti-Aliasing Filter output with $f_s = 40$ MS/s |
| 5.6 | Real analog spectrum before the sampling process composed by the useful signal and the IMD3 |
| 5.7 | Real digital spectrum after the sampling process |
| 5.8 | Comparison between the BF spectrum obtained after sub-sampling and the one generated directly at baseband |
| 5.9 | Complex digital spectrum with the addition of the shifted conjugated replica of the IMD components and low-pass filtered |
| 5.10 | Comparison between post calibration spectra using RFVM and RFeVM using an out-of-sample detached two-tone signal |
| 5.11 | Comparison between post calibration spectra using RFVM and RFeVM using an out-of-sample two-tone signal |
| 5.12 | Comparison between post correction SFDR of the two models using out-of-sample two-tone signals |
| 5.13 | Comparison between the post calibration spectra using a wideband notched multisine useful for the computation of the Noise to Power Ratio (NPR) |
| 5.14 | Ideal dynamic range improvement |
| 5.15 | Dynamic range improvement limited by compression |
| 5.16 | Calibration test-bed using behavioral and circuital simulation environments |
| 5.17 | BQ architecture |
| 5.18 | OTA architecture |
| 5.19 | CMFB circuit |
| 5.20 | Anti-Aliasing Filter AC response |
| 5.21 | Comparison between the SFDR of the system without calibration and the one obtained after calibrating with classic RF Volterra model (RFVM) and the proposed RF extended Volterra model (RFeVM) |
| 5.22 | Comparison between the SFDR of the system without calibration and the one obtained after calibrating with a 31 parameters RFVM and a 27 parameters RFeVM |
| 5.23 | Constellation plot of a 10 MHz 16QAM waveform with 0.47 V peak amplitude, without calibration, after a 20 taps linear equalizer and after 27 taps RFeVM filtering using the estimated parameters. The EVM for the three constellations is respectively $-29$ dB, $-48$ dB and $-55$ dB |
| 5.24 | Constellation plot of a 5 MHz 0.04 $\mathrm{V}_\mathrm{p}$ 16<br>QAM waveform in the                         |
|      | presence of in-band sinusoidal IMD3 product of a 0.45 $\mathrm{V_p}$ 2-tone,                                 |
|      | without calibration, after a 20 taps linear equalizer and after 27 taps                                      |
|      | RFeVM filtering using the estimated parameters. The EVM for the                                              |
|      | three constellations is respectively $-29 \mathrm{dB}, -39.2 \mathrm{dB}$ and $-55 \mathrm{dB}$ 98           |

| 5.25 | SFDR and EVM versus the number of parameters using an iterative<br>pruning technique that discards the parameter that impacts SFDR |     |
|------|------------------------------------------------------------------------------------------------------------------------------------|-----|
|      | the least.                                                                                                                         | 99  |
| 6.1  | DBFN wideband calibration architecture                                                                                             | 102 |
| 6.2  | Uniform Linear Array scheme                                                                                                        | 103 |
| 6.3  | DBFN simulation model                                                                                                              | 105 |
| 6.4  | Pre and post calibration directivity comparison                                                                                    | 105 |
| 6.5  | Comparison between TX, RX and CAL chirp waveforms                                                                                  | 106 |
| 6.6  | Comparison between TX, RX and CAL 3-tone signal spectra                                                                            | 107 |
| 6.7  | Received weak 5MHz 16QAM constellation processed together with a                                                                   |     |
|      | strong 2-tone signal using linear and nonlinear calibration.                                                                       | 107 |

## List of Tables

- Output bit mapping of a 1.5-bit MDAC
- Output digital redundancy of the cascade of two 1.5-bit MDACs. $D_1$ and $D_2$ are the output states of the first and second stage respectively
- Linearity improvement and complexity for various models
- Indices permutation of the $\Delta_i$ and $H_i$ used to find the other three determinants. The first column represents the references and the other three contain the values with which the reference is modified
- Computational costs of the models after optimization
- Baseband polynomial distortion components up to fifth order
- Values of $f_0$ and $Q$ of each biquadratic cell
- Design parameters for the OTA in Fig. 5.18
- Design parameters for the CMFB in Fig. 5.19
- Capacitance values and number of OTAs used for the implementation of the multistage anti-aliasing filter

## Chapter 1

## Introduction

#### 1.1 Scenario and Motivation

Thanks to the progressive technology scaling described by Moore's law, modern CMOS processes have reached 10 nm gate lengths. These ultra-scaled nodes are better suited for digital logic circuits, which exploit device shrinking to obtain increased speed ($f_T$) and power efficiency ($g_m/I_D$).

The increasing demand for Systems on Chip (SoC) in consumer and telecommunication markets has produced a strong demand for mixed-signal circuits able to process both analog and digital signals. Using a BiCMOS Silicon-Germanium (SiGe) process is the best design choice to boost RF/MW performance (the most advanced commercial BiCMOS SiGe 55 nm process by STMicroelectronics exhibits $f_T > 300$ GHz [1]). Compared to bulk CMOS, BiCMOS allows a much higher cut-off frequency at a given technology node. To reach similar frequencies, bulk CMOS designs have to use much smaller process nodes, forcing design compromises and leading most of the time to lower overall performance.

However, there are several reasons why it is often preferable to implement mixed-signal systems in CMOS technology. The most substantial one is cost: BiCMOS requires a higher number of masking layers, which makes its fabrication process more complex and more expensive. So, to reduce system cost and to enhance power efficiency, CMOS implementations of RF circuits are very attractive, since most complex RF functions have been demonstrated on CMOS processes.

While the scaling process (today in its 3D power scaling phase [4]) enables higher integration levels leading to billions of transistors on a single chip, it also makes these silicon ICs poorer in analog processing performance and more susceptible to variations. Some of these variations are caused by the manufacturing process itself, particularly due to the stringent dimensional tolerances associated with the lithographic steps in modern processes [28]. The intra and inter-die variations in modern VLSI realizations translate into the degradation of important figures of merit, such as the standard deviation of devices' nominal parameters and the mismatch between different devices.

Due to these issues, each circuit realization shows deviations in the values of its nominal parameters that introduce non-idealities in the analog processing chain. The effect of these errors is to lower the overall process gain and dynamic range of the system. Furthermore, they can badly compromise the correct operation of systems that rely on the consistency between different processing channels (e.g. Time-Interleaved and Quadrature Mirror Filter bank ADCs, beamforming using antenna arrays, monopulse radar). In this scenario, designers must face a heavy reduction in the yield and performance of RF CMOS implementations.

To cope with the impairments of the poorly performing analog section in this "dirty RF" [37] scenario, two opposite approaches exist to compensate the resulting errors: designing more complex analog processing architectures, or implementing the correction in the digital domain using digital calibration algorithms.

Increasing precision, namely matching and linearity, by *analog design* is an expensive task in terms of power dissipation. To first order, matching accuracy is inversely proportional to component area. Therefore, additional precision requires larger components with larger capacitance and a resulting net increase in power dissipation [76] (higher transconductances to keep speed constant). Most of the time linearity is achieved using electronic feedback, obtaining precision in return for high gain. Achieving sufficient gain usually necessitates the use of complex amplifiers that tend to be suboptimal in terms of speed and noise [76].

In the paradigm of *digital calibration*, the demanding requirements on the precision of analog circuits can be traded for complexity in the digital processing schemes. This solution, typically applied to A/D converters, is becoming more and more attractive thanks to the large digital processing power made available by the ever increasing integration density of modern VLSI CMOS implementations (3 Mgate/mm$^2$ for a 28 nm process and over 100 Mgate/mm$^2$ for Intel's 10 nm process [3]). From a communication theory point of view, the most common term for post-correction methods is "equalization". From an electronics engineering point of view, the act of making a device or a system work as close as possible to its nominal performance is more appropriately called "calibration".

#### 1.2 Digital calibration background

Digital calibration procedures consist of identifying a behavioral model of the device, estimating the model parameters that describe the unwanted device behavior, and then compensating the device response. The aim of calibration is to maximize one or more figures of merit, for example process gain or linearity in terms of SFDR or SNDR, as in a multivariate optimization problem in which the variables are the model parameters. A number of research activities on the compensation of errors caused by analog circuits' non-idealities using digital post processing are reported in the literature. Most of them have been applied to A/D converters and address the compensation of static and dynamic nonlinearities. While static errors are easier to model and to correct, frequency-dependent distortions require more complex models. Different techniques exist to compensate dynamic nonlinear systems; the most common are based on Look-Up Tables (LUT) [47, 98], on Feed-Forward [43, 65] and Feed-Back [46, 50] architectures, and on the inverse system cascade [87, 13].

The growing demand for high-speed ADCs is directly linked to the current trend of pushing the digital domain ever closer to the antenna in the receiver chain. Variations in circuit parameters and transistor nonlinearities degrade the THD and thus reduce the converter's effective resolution. Some methods are based on the modeling of specific ADC error sources and nonlinear contributions, such as the nonlinearity of the amplifier's open-loop gain, comparator offsets and capacitor mismatch [36, 21, 51]. The application of digital calibration in [21] shows a linearity improvement of more than 30 dB in SFDR, with resolution going from 7 to 12 bits, and a substantial reduction in the standard deviation of the nominal performance after the calibration process, as assessed with Monte Carlo simulations.
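As a rough order-of-magnitude check on these figures, and assuming that the quoted resolution is an effective number of bits (an assumption made here, not a statement taken from [21]), the standard relation between SNDR and ENOB for a full-scale sine wave links a 5-bit gain to about 30 dB. The relation is defined on SNDR rather than SFDR, so it only indicates that the two quoted improvements are of comparable size:

$$\mathrm{ENOB} = \frac{\mathrm{SNDR}_{\mathrm{dB}} - 1.76}{6.02} \quad\Longrightarrow\quad \Delta\mathrm{SNDR}_{\mathrm{dB}} \approx 6.02 \cdot (12 - 7) \approx 30.1\ \mathrm{dB}$$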

A large part of ADC calibration algorithms is based on more general models, for both direct and inverse system modeling. The most common is the Volterra series [96], valued for its general approximation capability for nonlinear systems and at the same time infamous for its implementation complexity. The Volterra model and its subsets are widespread in pre-distortion applications that enhance the efficiency of power amplifiers in transmission chains. Applications at the receiver side are less common, in part because the adopted model is less general, since it depends on the receiver architecture. In [93] a nonlinear equalizer (NLEQ) processor able to compensate the nonlinear behavior of commercial-off-the-shelf (COTS) low-pass systems has been designed. The processor uses a sparse Volterra representation to keep the complexity to a minimum even for higher memory lags.

Results suggest that digital calibration techniques can be used together with relaxed analog design specifications when the cost of better analog circuits, even though their design is feasible, is higher than the cost of the additional digital resources required by calibration [69]. On the other hand, digital calibration can also be used to keep performance at a target level while lowering the power consumption of the analog circuits.

#### 1.3 Research framework – Scope of the Thesis

The research activity carried out in this thesis is focused on the identification and compensation of nonlinearities arising in the analog front-end and in the A/D converters of receiver chains, paying particular attention to low-complexity digital implementations. For this reason the study is devoted to post-distortion methods, which use an approximate nonlinear inverse of the nonlinear system under calibration. This line of research has led me to deal with three different target architectures and calibration methods, closely related to the aforementioned issues:

- Pipeline ADC calibration using the Volterra model
- Digital-IF sub-sampling receiver modeling and compensation using an extended band-pass Volterra model
- Time-Interleaved ADC calibration using filter banks

The first two target architectures share the same methodology for the identification of the unknown parameters of linear-in-the-parameters (LIP) models. The difference lies in the models: the low-pass Volterra model is adopted for the pipeline calibration, while a new pass-band Volterra model is proposed to calibrate the sub-sampling receiver, taking into account the harmonic distortions generated by the active anti-aliasing filter and the aliasing of inter-modulation distortions in critical sampling applications. The aim of this thesis is twofold: on one hand, the analysis and formalization of the nonlinear behavior of the target architectures, in both single and multiple channel implementations; on the other hand, the development of calibration techniques able to compensate these analog impairments in the digital domain. Given the huge cost difference between commercial devices whose dynamic range specifications differ by only a few dB, the design of calibration techniques that improve the linearity of ADCs or entire receiver chains and increase the overall system dynamic range, even by only a few dB, is worth pursuing.

To achieve the scope of this thesis, different problems are addressed, including:

- Complexity reduction of the Volterra model without sacrificing generality
- Design of efficient input data sets to be used in resource consuming transient simulations for the evaluation of post calibration system performance
- Modeling and compensation of sub-sampling digital-IF receiver distortions
- Closed-form multi-rate signal processing calculations in TI-ADCs

Demonstrating overall system calibration performance on a specific technology node enables the inclusion of the digital process gain in the design phase, so that the digital processing block becomes an embedded part of the system itself.

#### 1.4 Thesis Outline

The rest of this thesis is organized as follows:

- **Chapter 2** aims to give an overview of the methods and the theoretical basis of nonlinear system modeling and identification used throughout the thesis, in particular an analysis limited to the identification of linear-in-the-parameters (LIP) models using deterministic input-output data sets. The Volterra series is presented along with an overview of its most commonly used subsets, and the definition of the p-th order inverse is given. Compensation techniques for nonlinear dynamic systems are described, focusing on post-distortion using the inverse system cascade. The methodology to identify a post-inverse LIP system is outlined, giving the basic mathematical tools needed for offline and online estimation of linear parameters, i.e. the Least Squares and Recursive Least Squares methods, and a description of input excitation design theory. The chapter concludes with practical considerations for the implementation of the described technique.
- **Chapter 3** discusses digital calibration techniques devoted to ADC nonlinear compensation. The theory of operation of the A/D converter is described and an overview of the most important ADC architectures is presented. A focus is given on the pipeline ADC with 1.5-bit stages and on the redundancy mechanisms that provide robustness against comparators' offsets. An overview of the existing ADC calibration techniques is carried out, describing static and dynamic post compensation techniques with a particular focus on model inversion methods. A calibration technique based on the direct estimation of the post-inverse Volterra series is applied to a Sample and Hold simulated using the 45 nm process by STMicroelectronics. PVT robustness checks are performed and an iterative backward pruning procedure is introduced. The same technique, in conjunction with a radix calibration, is applied to a pipeline ADC with 16 1.5-bit stages. Results using the proposed pruning procedure are shown and compared with the literature.

- **Chapter 4** addresses the problem of post compensation in Time-Interleaved ADCs (TI-ADCs), which require an ad-hoc analysis of the non-idealities that cause distortions. Two approaches for the calibration of a 4-channel TI-ADC are presented: first, the correction method based on perfect reconstruction (PR) filter banks is described and a closed-form solution for the 4-channel architecture is derived and demonstrated with behavioral simulations. Second, a background calibration technique is described using cyclo-stationary filter banks to approximate the reconstruction filters. Complexity reduction is carried out both in the adopted models and in the digital filter implementation, and convergence speed versus linearity is discussed.
- **Chapter 5** deals with the application of post compensation techniques to a receiver chain. The baseband Volterra model is derived and a novel extended model is proposed that is able to represent and compensate nonlinearities in sub-sampling digital-IF receivers. In such architectures, which implement frequency selectivity at IF using active band-pass filters, the aliasing of out-of-band harmonics due to finite attenuation is taken into account. An offline calibration technique using the new model is validated by means of circuital simulations of a 10th-order pass-band active anti-aliasing filter implemented using 5 cascaded biquad stages in the 45 nm process by STMicroelectronics. Validation of the identified post-inverter is carried out using out-of-sample QAM signals and a combination of QAM and strong in-band sinusoidal signals. The iterative pruning technique is used to reduce complexity with negligible impact on linearity.
- **Chapter 6** extends the calibration technique described in Chapter 5 to a multichannel array receiver. A wideband calibration architecture based on Volterra filters is described and simulated using a mixed behavioral and circuital approach. The Volterra model of a digital-IF receiver is extracted using input-output data from circuital simulations. Statistical variations on the model parameters are added to obtain an array of receivers with heterogeneous responses. The performance of the array in terms of directivity and linearity is compared before and after the digital calibration. The chapter concludes with an analysis of complexity and possible parallel realizations of the calibration architecture.
- **Chapter 7** concludes the thesis, describing the achieved results and the open issues that will be addressed in future research activities.

A list of the publications stemming from this research activity is reported at the end of the thesis.

## Chapter 2

## Nonlinear system modeling and estimation

This chapter aims to give an overview of the methods and the theoretical basis of nonlinear system modeling and identification used throughout the thesis. This discussion is not intended as a comprehensive survey on system identification, for which detailed and exhaustive references exist [16, 61], but as a useful analysis limited to the identification of linear-in-the-parameters (LIP) models with deterministic methods. In Sect. 2.1 the Volterra series is presented along with an overview of its most commonly used subsets, and the definition of the p-th order inverse is given. Compensation techniques for nonlinear dynamic systems are described in Sect. 2.2, focusing on the post-inverse system cascade. The methodology to identify a post-inverse LIP system is outlined in Sect. 2.3, giving the basic mathematical tools needed for offline and online estimation of linear parameters, i.e. the Least Squares and Recursive Least Squares methods, and a description of input excitation design theory. Practical considerations for the implementation of the described technique are given in Sect. 2.4.

#### 2.1 Nonlinear System Modeling

The field of nonlinear systems is enormous. The need for nonlinear models arises from many practical situations in which the input-output behavior of a system cannot be represented using classical linear system theory. In the nonlinear regime, many problems are still open and many properties are no longer available: existence and uniqueness of solutions are not guaranteed, closed-form formulas are difficult to come by, and linear superposition cannot be applied. Focusing on analog circuits for communications, most of the time the processing blocks are weakly nonlinear, in the sense that the dominant behavior is that of a linear system but mild nonlinear contributions can be identified. It is precisely these components that represent a limiting factor in applications that require high linearity (e.g. RF and ADC front-ends). The simplest approach to the modeling of a nonlinear system is the power series, which can be used to represent a memoryless nonlinearity up to order P:

$$y[n] = \alpha_1 x[n] + \alpha_2 x^2[n] + \dots + \alpha_P x^P[n] \tag{2.1}$$

Such a model, though, is only able to describe static nonlinearities and cannot capture more complex phenomena such as memory effects and frequency-dependent nonlinearities. A more powerful and general model is the Volterra series.
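The power series of (2.1) can be sketched in a few lines of Python. This is only an illustration: the function name `power_series`, the coefficient values and the test tone are hypothetical choices made here, not values used in the thesis. The example shows that a single tone produces harmonics, and that the model has no memory because delaying the input simply delays the output.

```python
import numpy as np

def power_series(x, alpha):
    """Memoryless polynomial of Eq. (2.1): y[n] = sum_p alpha[p-1] * x[n]**p."""
    x = np.asarray(x, dtype=float)
    y = np.zeros_like(x)
    for p, a in enumerate(alpha, start=1):
        y += a * x**p
    return y

# Hypothetical coefficients: unit linear gain plus mild 2nd- and 3rd-order terms.
alpha = [1.0, 0.05, -0.02]

n = np.arange(64)
x = 0.5 * np.cos(2 * np.pi * 4 * n / 64)   # single tone on FFT bin 4
y = power_series(x, alpha)

# Normalized magnitude spectrum: harmonics appear on bins 8 and 12.
spectrum = np.abs(np.fft.rfft(y)) / len(n)

# No memory: a circular delay of the input produces the same delay at the output.
y_delayed = power_series(np.roll(x, 3), alpha)
```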

#### 2.1.1 Volterra series

The Volterra series [66], developed by Vito Volterra in 1887, is one of the most common representations of nonlinear systems. Due to its intuitive structure and universal approximation capabilities, it has received considerable attention from researchers in different areas, especially in the fields of electronics and communications [38]. The discrete-time Volterra series can be written as:

$$y[n] = \sum_{k=1}^{\infty} y_k[n] \qquad y_k[n] = \sum_{q_1=0}^{\infty} \cdots \sum_{q_k=0}^{\infty} h_k[q_1, \dots, q_k] \cdot \prod_{i=1}^k x[n-q_i] \tag{2.2}$$

where $h_k[q_1, \ldots, q_k]$ is the k-th order Volterra kernel. The series represents a class of polynomial models that can be viewed as a multidimensional extension of the linear convolution, which makes the derivation of its Fourier transform representation easy. When limited to a finite order and a finite memory support, the truncated Volterra series is written as:

$$y[n] = \sum_{k=1}^{K} y_k[n] \qquad y_k[n] = \sum_{q_1=0}^{L-1} \cdots \sum_{q_k=0}^{L-1} h_k[q_1, \dots, q_k] \cdot \prod_{i=1}^{k} x[n-q_i]$$
(2.3)

with K the maximum order and L the maximum lag. The advantages of this model are principally two: it is stable in the bounded-input bounded-output (BIBO) sense and most of all it is linear in the parameters. This enables the use of linear estimation algorithms to identify the parameters of such a model. Although the representation (2.3) has the same memory for all kernel orders, the most general case allows a different memory for each order:
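As a minimal sketch of Eq. (2.3), the truncated series can be evaluated by brute force over all lag combinations. The function name `volterra_output`, the zero initial conditions and the kernel values below are illustrative assumptions, not part of the thesis.

```python
import itertools
import numpy as np

def volterra_output(x, kernels):
    """Evaluate the truncated Volterra series of Eq. (2.3).

    kernels[k-1] is a k-dimensional array of shape (L, ..., L) holding h_k.
    Samples before n = 0 are assumed to be zero.
    """
    x = np.asarray(x, dtype=float)
    y = np.zeros(len(x))
    for h in kernels:
        k, L = h.ndim, h.shape[0]
        # Delayed copies of the input: xd[q][n] = x[n - q]
        xd = [np.concatenate((np.zeros(q), x[:len(x) - q])) for q in range(L)]
        for lags in itertools.product(range(L), repeat=k):
            if h[lags] != 0.0:
                y += h[lags] * np.prod([xd[q] for q in lags], axis=0)
    return y

# Hypothetical kernels with K = 2 and L = 2:
# y[n] = x[n] + 0.5 x[n-1] + 0.1 x[n] x[n-1]
h1 = np.array([1.0, 0.5])
h2 = np.zeros((2, 2))
h2[0, 1] = 0.1
y = volterra_output([1.0, 2.0, 3.0], [h1, h2])
```

The output depends linearly on the kernel entries: scaling every kernel by a constant scales `y` by the same constant, which is the linear-in-the-parameters property exploited by the estimators discussed later.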

$$y[n] = \sum_{k=1}^{K} y_k[n] \qquad y_k[n] = \sum_{q_1=0}^{L_1-1} \cdots \sum_{q_k=0}^{L_k-1} h_k[q_1, \dots, q_k] \cdot \prod_{i=1}^k x[n-q_i]$$
(2.4)

A considerable reduction in the number of parameters can be performed exploiting the symmetry property of the kernels. The Volterra series is symmetric if, for each order k, kernels with different combinations of the same indices are equivalent. We can express each symmetric kernel as:

$$\tilde{h}_k[q_1, \dots, q_k] = \frac{1}{k!} \sum_{\pi(\mathbf{q})} h_k[q_1, \dots, q_k]$$
(2.5)

where  $\pi(\cdot)$  is the permutation operator and **q** is the vector of indices  $q_1, \ldots, q_k$ . Using the symmetric kernels, the Volterra series becomes:

$$y[n] = \sum_{k=1}^{K} y_k[n] \qquad y_k[n] = \sum_{q_1=0}^{L_1-1} \sum_{q_2=q_1}^{L_1-1} \cdots \sum_{q_k=q_{k-1}}^{L_k-1} \tilde{h}_k[q_1,\dots,q_k] \cdot \prod_{i=1}^k x[n-q_i] \quad (2.6)$$

From now on, all the models adopted will be considered symmetric, so we will neglect the tilde for ease of notation. If we set the memory of all kernels to zero, we obtain the power series.
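The saving obtained with symmetric kernels can be quantified with a short sketch. It assumes, for simplicity, the same memory L for every order; the helper names are hypothetical.

```python
import itertools
import math

def symmetric_lags(order, memory):
    """Index tuples q_1 <= ... <= q_k retained by the symmetric form (2.6)."""
    return list(itertools.combinations_with_replacement(range(memory), order))

def count_parameters(max_order, memory, symmetric=True):
    """Total number of kernel coefficients up to max_order."""
    total = 0
    for k in range(1, max_order + 1):
        total += math.comb(memory + k - 1, k) if symmetric else memory ** k
    return total

full = count_parameters(3, 4, symmetric=False)   # 4 + 16 + 64 = 84
reduced = count_parameters(3, 4)                 # 4 + 10 + 20 = 34
```

For K = 3 and L = 4 the full model has 84 coefficients and the symmetric one 34.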

A disadvantage of the Volterra series is that its basis polynomials are not orthogonal. This implies that kernel values are correlated: when estimating a model, a higher number of parameters can give worse fitting performance. To overcome this problem, different orthogonal expansions of the Volterra functionals have been proposed. The most famous are the Wiener functionals $G_n$, obtained by applying the Gram-Schmidt orthogonalization procedure with respect to the Wiener process to the Volterra functionals. The advantage of the orthogonality property is that it allows the Wiener kernels to be measured by cross-correlation techniques using a white Gaussian input.

The main drawback of the Volterra series is the exponential growth in parametric complexity, which implies the need to estimate a huge number of parameters. Many kinds of model complexity reduction exist, based on assumptions that make the model lose its general representation capabilities. In the following we describe the most common lower-complexity models derived from subsets of Volterra kernels [74].

#### Wiener model

The Wiener model is a special case of Volterra model obtained using the condition of kernel separability  $h_k[q_1, q_2, \ldots, q_k] = \alpha_k h[q_1] \cdot h[q_2] \cdots h[q_k]$ :

$$y_W[n] = \sum_{k=1}^{K} \alpha_k \left[ \sum_{q=0}^{L} h[q] x[n-q] \right]^k$$
(2.7)

This model can be represented as the cascade of a linear memory model followed by a memoryless polynomial nonlinearity, as shown in fig.2.1.



Figure 2.1. Wiener model

#### Hammerstein model

The Hammerstein model is obtained setting to zero all the off-diagonal terms of the Volterra series. The terms that remain have  $h_k[q_1, q_2, \ldots, q_k] = \beta_k g[q]$  for  $q_1 = q_2 = \cdots = q_k = q$ . The output of a Hammerstein model can be written as:

$$y_H[n] = \sum_{q=0}^{L} g[q] \sum_{k=1}^{K} \beta_k x^k [n-q]$$
(2.8)

This model can be represented as the cascade of a memoryless polynomial nonlinearity followed by a linear memory model, as shown in fig.2.2.



Figure 2.2. Hammerstein model
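The two cascades of Eqs. (2.7) and (2.8) can be sketched as follows. The function names and the zero initial conditions of the FIR filter are assumptions made for illustration.

```python
import numpy as np

def fir(h, x):
    """Causal FIR filtering, output truncated to the input length."""
    return np.convolve(x, h)[:len(x)]

def wiener(x, h, alpha):
    """Linear filter h followed by the polynomial sum_k alpha[k-1] * v**k, Eq. (2.7)."""
    v = fir(h, x)
    return sum(a * v ** (k + 1) for k, a in enumerate(alpha))

def hammerstein(x, g, beta):
    """Polynomial sum_k beta[k-1] * x**k followed by the linear filter g, Eq. (2.8)."""
    x = np.asarray(x, dtype=float)
    v = sum(b * x ** (k + 1) for k, b in enumerate(beta))
    return fir(g, v)

x = [1.0, 2.0, 3.0]
y_w = wiener(x, h=[1.0, 0.5], alpha=[1.0, 0.1])       # filter first, then polynomial
y_h = hammerstein(x, g=[1.0, 0.5], beta=[1.0, 0.1])   # polynomial first, then filter
```

With identical coefficients the two outputs differ, because the order of the linear and nonlinear blocks matters as soon as the filter has memory. They coincide only when the filter reduces to a single tap.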

#### Memory Polynomial

An extension of the Hammerstein model is given by the Memory Polynomial (MP), in which different filters g[q] are used for different kernels. We obtain:

$$y_{MP}[n] = \sum_{q=0}^{L} \sum_{k=1}^{K} \beta_k[q] x^k[n-q]$$
(2.9)

The memory polynomial model has been used for predistortion of actual power amplifiers under typical operating conditions [53].
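Since the memory polynomial of Eq. (2.9) is linear in the coefficients $\beta_k[q]$, its output can be written as a matrix-vector product. The sketch below builds that regression matrix and recovers hypothetical coefficients from noise-free data. The column ordering (lag first, then order) is an arbitrary choice.

```python
import numpy as np

def mp_matrix(x, K, L):
    """Regression matrix with columns x[n-q]**k, q = 0..L, k = 1..K (Eq. 2.9)."""
    x = np.asarray(x, dtype=float)
    cols = []
    for q in range(L + 1):
        xq = np.concatenate((np.zeros(q), x[:len(x) - q]))   # x[n - q], zero before n = 0
        for k in range(1, K + 1):
            cols.append(xq ** k)
    return np.column_stack(cols)

rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, 200)
beta_true = np.array([1.0, 0.05, -0.02, 0.3, 0.01, 0.0])   # hypothetical values
X = mp_matrix(x, K=3, L=1)
y = X @ beta_true
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
```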

#### 2.1.2 *p*-th order inverse Volterra model

The theory of the p-th order inverse of a nonlinear system was developed by Schetzen and published in 1976 [87]. A p-th order inverse of a given nonlinear system H is defined as a system that, when connected in tandem with H, results in a system in which the second through the p-th order Volterra kernels are zero. Thus, calling T the system operator of the two systems connected in tandem, we can write:

$$T\{x[n]\} = x[n] + \sum_{k=p+1}^{\infty} T_k\{x[n]\}$$
(2.10)

where $T_k$ is the k-th order Volterra operator of the system T. An important property of the p-th order inverse is that the pre-inverse and the post-inverse models are identical, irrespective of the position of the inverse in the cascade. When the two systems do not load each other (which is true in digital processing), the order of the tandem connection only affects the residual terms of order greater than p. The p-th order inverse is computed from the structure of the nonlinear model by solving a set of equations that arise from the cascade of two Volterra systems. This becomes a difficult task for orders higher than 5-7. In [85] a recursive method has been proposed that reduces the implementation complexity of a p-th order inverse.
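A memoryless toy case helps to see Eq. (2.10) at work. Assume, purely for illustration, the forward system $y = x + a_2 x^2$. Its second-order inverse is $z = y - a_2 y^2$, and the tandem connection leaves only terms of order three and higher, whichever block comes first.

```python
import numpy as np

a2 = 0.1                                   # hypothetical second-order coefficient
x = np.linspace(-0.5, 0.5, 101)

def forward(u):
    """Memoryless forward system H."""
    return u + a2 * u ** 2

def inverse2(u):
    """Second-order inverse of H."""
    return u - a2 * u ** 2

post_residual = inverse2(forward(x)) - x   # inverse placed after H
pre_residual = forward(inverse2(x)) - x    # inverse placed before H
```

Expanding the two cascades gives $-2a_2^2x^3 - a_2^3x^4$ for the post-inverse and $-2a_2^2x^3 + a_2^3x^4$ for the pre-inverse: the same block cancels the second-order term in both positions, and the order of connection changes only the residual above order p = 2.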

## 2.2 Compensation techniques for nonlinear dynamic systems

Four main approaches exist for the compensation of the nonlinearities in dynamic systems:

- **LUT based** Look-Up Tables can be used to compensate nonlinear systems with short memory. The two main types are Phase-Plane (using the amplitude and slope of the current sample) and State-Space (using the amplitudes of present and past samples). This method is typically adopted in ADC calibration [47, 98] and will be analyzed in subsection 3.4.2.
- **Feed-Forward architectures** consist in subtracting from the main signal path the distortions, which are regenerated by means of digital processing on a parallel path. Adaptive Interference Canceling (AIC) is based on feed-forward architectures [43, 65, 88, 101].
- **Feed-Back architectures** consist in using feedback loops to generate pre-distorter or post-distorter blocks used in connection with linear blocks to obtain an overall linear system with the desired dynamics. Hirschorn's method is based on a feedback architecture [46, 50].
- **Inverse System Cascade** consists in the series connection of the pre-inverse or post-inverse system before or after the forward system [87, 13]. Adaptive implementations of cascaded inverse systems obviously make use of feed-forward or feed-back signal processing paths.

Inverse systems are widely used in many disciplines, in both linear and nonlinear applications. In the field of communications the main usage is equalization. The nonlinear behavior of power amplifiers driven with low back-off in transmission chains can be compensated using pre-distortion, i.e. an approximation of the system pre-inverse used to linearize the amplifier response. Nonlinear dynamics in receiver chains or in sensors must instead be compensated using a post-distorter. Finding the inverse of a nonlinear system is a nontrivial task that requires knowledge of system identification and modeling, and it can be carried out using different methods. Focusing on the fourth approach, adopted throughout the thesis, when the target application is nonlinear post-compensation we are interested in identifying a post-inverse system, as shown in Fig. 2.3.

**Figure 2.3.** Post-inverse system $S^{-1}$ in series with the forward system S realizes the condition z[n] = x[n]

There are two practical ways to estimate a post-inverse system [50], graphically represented in Figs. 2.4a and 2.4b:

- a) First, forward system  $\hat{S}$  is estimated from input x[n] and output y[n], then the post inverse system  $\hat{S}^{-1}$  is calculated
- b) The post inverse system  $\widehat{S^{-1}}$  is directly estimated using y[n] as input and x[n] as output

Figure 2.4. Two ways to approximate a post-inverse system in an offline compensation: mathematical inversion of the estimation of the forward system (a) and direct estimation of post-inverse system (b)

In the first method (Fig. 2.4a), when dealing with truncated Volterra series, the p-th order inverse [46, 34] has to be computed. This task can become very hard depending on the order and memory of the forward system. The second method (Fig. 2.4b), also used in the Indirect Learning Architecture (ILA), is the most straightforward and flexible way to identify an approximation of the post-inverse system. In [97] it was shown that the analytical Volterra inverse gives much lower computational complexity than the directly estimated inverse. However, owing to its simple architecture, flexibility and ease of application, the second method is adopted throughout the thesis for finding the post-distorters that compensate ADC and RF front-end nonlinearities. We are thus interested in acquiring and further defining the methodology of system identification.

## 2.3 System Identification elements

System Identification consists in determining a mathematical model of a system of interest using a-priori (e.g. error and noise statistics) and a-posteriori (experimental observations) information [16].

The methodology to approach system identification can be summarized in these steps:

- 1) Input stimuli design that is a persistent excitation (PE) for the system under test
- 2) Model structure selection
- 3) Choosing a criterion to measure the "quality" of the estimated model
- 4) Perform the system identification according to steps 1, 2 and 3
- 5) Validate the estimated model using a new out-of-sample set of data

The first three points are the basis of the estimation process and they're not independent. The design of input signals is driven by some knowledge or assumption on the system model and by physical constraints like bandwidth and input-output dynamics. The selection of the model structure can be carried out using information obtained by some preliminary analysis such as a frequency response function that can give a coarse idea of the system behavior.

The estimation process is an optimization problem that requires the minimization of a specified cost function and can be convex or non-convex. The choice of the estimator depends on the type of problem (linear or nonlinear), on the kind of application in which the estimation is needed (offline or adaptive calibration) and on the statistics of the signals. In a linear-in-the-parameters (LIP) model, if the errors are uncorrelated and have zero mean and equal variances, the Gauss-Markov theorem states that the Least Squares (LS) estimator is the Best Linear Unbiased Estimator (BLUE), i.e. it has the lowest variance of the estimated parameters among all linear unbiased estimators. For this reason LS is a common choice in offline LIP estimation problems, and it is used in this thesis in the calibration procedures that do not require adaptive mechanisms.

The final validation of the estimate is required to assess whether the model performs well on new sets of data or overfitting of the model occurred. In the latter case the estimated model describes the system behavior only for the particular realization of the measurements and is not usable in real operational conditions when different data is processed by the system.

In this thesis we focus on truncated Volterra models identification in the time domain. Time-series analysis leads directly to estimates of the model parameters unlike the non-parametric estimates in the frequency domain.

#### 2.3.1 Linear Least Squares estimator

When a LIP model is considered the input-output relation of the system can be easily written using matrix notation:

$$\mathbf{y} = \mathbf{X}\mathbf{h} + \mathbf{w} \tag{2.11}$$

where

**y** is the  $N \times 1$  output samples vector  $[y[0] y[1] \cdots y[N-1]]^T$ 

 $\mathbf{X}$  is the  $N \times P$  input samples matrix expansion

**h** is the  $P \times 1$  vector of parameters to be estimated

 $\mathbf{w}$  is the  $N \times 1$  noise vector

With the only assumption that **w** has zero mean, we want to find the parameters **h** that minimize the squared  $l_2$ -norm of the error  $\mathbf{e} = \mathbf{y} - \mathbf{X}\mathbf{h}$ . The cost function thus becomes:

$$J(\mathbf{h}) = \|\mathbf{e}\|_2^2 = (\mathbf{y} - \mathbf{X}\mathbf{h})^T (\mathbf{y} - \mathbf{X}\mathbf{h})$$
(2.12)

The parameters are obtained solving the convex problem:

$$\hat{\mathbf{h}} = \arg\min_{\mathbf{h}} J(\mathbf{h}) \tag{2.13}$$

To find the global minimum of  $J(\mathbf{h})$  we differentiate it and set the derivative to zero:

$$\frac{\partial J(\mathbf{h})}{\partial \mathbf{h}}\Big|_{\mathbf{h}=\hat{\mathbf{h}}} = -2\mathbf{X}^T \mathbf{y} + 2\mathbf{X}^T \mathbf{X}\hat{\mathbf{h}} = 0$$
(2.14)

The Least Squares Estimator is then:

$$\hat{\mathbf{h}} = (\mathbf{X}^T \mathbf{X})^{-1} \mathbf{X}^T \mathbf{y} = \mathbf{S}^{-1} \boldsymbol{\theta}$$
(2.15)

where  $\mathbf{S} = \mathbf{X}^T \mathbf{X}$  and  $\boldsymbol{\theta} = \mathbf{X}^T \mathbf{y}$ . The LS parameters estimation can be easily extended to a batch of measurements. If we have M input-output data sequences we can write:

$$\begin{bmatrix}
\mathbf{y}_{1} \\
\mathbf{y}_{2} \\
\vdots \\
\mathbf{y}_{M}
\end{bmatrix} = \begin{bmatrix}
\mathbf{X}_{1} \\
\mathbf{X}_{2} \\
\vdots \\
\mathbf{X}_{M}
\end{bmatrix} \mathbf{h} + \begin{bmatrix}
\mathbf{w}_{1} \\
\mathbf{w}_{2} \\
\vdots \\
\mathbf{w}_{M}
\end{bmatrix}$$
(2.16)
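A minimal numerical sketch of Eqs. (2.15) and (2.16) follows. It uses `numpy.linalg.lstsq` rather than forming $(\mathbf{X}^T\mathbf{X})^{-1}$ explicitly, which gives the same solution for a full-rank $\mathbf{X}$ and is numerically better behaved. The data, noise level and parameter values are hypothetical.

```python
import numpy as np

def ls_estimate(X, y):
    """Least-squares solution of y = X h + w, Eq. (2.15)."""
    h, *_ = np.linalg.lstsq(X, y, rcond=None)
    return h

def ls_estimate_batch(X_list, y_list):
    """Stack M data records as in Eq. (2.16) and solve once."""
    return ls_estimate(np.vstack(X_list), np.concatenate(y_list))

rng = np.random.default_rng(1)
h_true = np.array([0.9, -0.2, 0.05])                      # hypothetical parameters
X_list = [rng.standard_normal((50, 3)) for _ in range(4)]
y_list = [X @ h_true + 1e-3 * rng.standard_normal(50) for X in X_list]
h_hat = ls_estimate_batch(X_list, y_list)

# Equivalent normal-equation form: S and theta accumulate over the records
S = sum(X.T @ X for X in X_list)
theta = sum(X.T @ y for X, y in zip(X_list, y_list))
h_normal = np.linalg.solve(S, theta)
```

Stacking the records is equivalent to summing the per-record $\mathbf{S}$ and $\boldsymbol{\theta}$, so a batch can be processed one record at a time without storing the stacked matrix.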

#### 2.3.2 Recursive Least Squares

In applications that require tracking model variations (e.g. due to temperature), an adaptive calibration is needed because the parameter vector changes over time. In terms of computational cost and latency, it is useful not to restart the estimation process from scratch but to update the parameter vector using the previous estimate and the newly gathered data. Considering the estimator in Eq. 2.15 obtained from training sequences of N samples, a new estimator can be calculated by adding the new row $(\mathbf{x}_{N+1}, y_{N+1})$ to the data set:

$$\hat{\mathbf{h}}_{N+1} = \left( \begin{bmatrix} \mathbf{X}_N \\ \mathbf{x}_{N+1} \end{bmatrix}^T \begin{bmatrix} \mathbf{X}_N \\ \mathbf{x}_{N+1} \end{bmatrix} \right)^{-1} \begin{bmatrix} \mathbf{X}_N \\ \mathbf{x}_{N+1} \end{bmatrix}^T \begin{bmatrix} \mathbf{y}_N \\ y_{N+1} \end{bmatrix} = \mathbf{S}_{N+1}^{-1} \boldsymbol{\theta}_{N+1}$$
(2.17)

The matrix  $\mathbf{X}_N$  acquires the new row  $\mathbf{x}_{N+1}$ , a  $1 \times P$  vector, and the vector  $\mathbf{y}_N$  the new row with the sample  $y_{N+1}$ . We have:

$$\mathbf{S}_{N+1} = \mathbf{X}_{N+1}^T \mathbf{X}_{N+1} = \begin{bmatrix} \mathbf{X}_N^T & \mathbf{x}_{N+1}^T \end{bmatrix} \begin{bmatrix} \mathbf{X}_N \\ \mathbf{x}_{N+1} \end{bmatrix} = \mathbf{S}_N + \mathbf{x}_{N+1}^T \mathbf{x}_{N+1}$$
(2.18)

$$\boldsymbol{\theta}_{N+1} = \mathbf{X}_{N+1}^T \mathbf{y}_{N+1} = \begin{bmatrix} \mathbf{X}_N^T & \mathbf{x}_{N+1}^T \end{bmatrix} \begin{bmatrix} \mathbf{y}_N \\ y_{N+1} \end{bmatrix} = \boldsymbol{\theta}_N + \mathbf{x}_{N+1}^T y_{N+1}$$
(2.19)

For adaptive purposes, we add exponential windowing of data:

$$\mathbf{S}_{N+1} = \lambda \mathbf{S}_N + \mathbf{x}_{N+1}^T \mathbf{x}_{N+1} \tag{2.20}$$

$$\boldsymbol{\theta}_{N+1} = \lambda \boldsymbol{\theta}_N + \mathbf{x}_{N+1}^T y_{N+1} \tag{2.21}$$

where  $0 < \lambda \leq 1$  is the "forgetting factor" which gives exponentially less weight to older samples. The matrix inversion lemma [45] can be used to find the expression of  $\mathbf{S}_{N+1}^{-1}$ :

$$\left(\lambda \mathbf{S}_N + \mathbf{x}_{N+1}^T \mathbf{x}_{N+1}\right)^{-1} = \frac{1}{\lambda} \left( \mathbf{S}_N^{-1} - \frac{\mathbf{S}_N^{-1} \mathbf{x}_{N+1}^T \mathbf{x}_{N+1} \mathbf{S}_N^{-1}}{\lambda + \mathbf{x}_{N+1} \mathbf{S}_N^{-1} \mathbf{x}_{N+1}^T} \right)$$
(2.22)

Assuming  $\mathbf{S}^{-1} = \mathbf{P}$ , we can summarize the results as follows:

$$\begin{cases}
\mathbf{P}_{N+1} = \frac{1}{\lambda} \left( \mathbf{P}_N - \frac{\mathbf{P}_N \mathbf{x}_{N+1}^T \mathbf{x}_{N+1} \mathbf{P}_N}{\lambda + \mathbf{x}_{N+1} \mathbf{P}_N \mathbf{x}_{N+1}^T} \right) \\
\boldsymbol{\theta}_{N+1} = \lambda \boldsymbol{\theta}_N + \mathbf{x}_{N+1}^T y_{N+1} \\
\hat{\mathbf{h}}_{N+1} = \mathbf{P}_{N+1} \boldsymbol{\theta}_{N+1}
\end{cases}$$
(2.23)

A smaller  $\lambda$  makes the estimation more sensitive to recent samples, which means more fluctuations in the estimator coefficients. This recursive formulation requires the initialization values  $\mathbf{P}_0$  and  $\boldsymbol{\theta}_0$ . One way can be to gather the first N samples and to compute these values in a non recursive manner. Otherwise, the following initialization values can be assumed [14]:

$$\mathbf{P}_0 = \delta^{-1} \mathbf{I} \tag{2.24}$$

$$\boldsymbol{\theta}_0 = \mathbf{0} \tag{2.25}$$

where  $\delta$  is a very small positive constant.
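The recursion of Eq. (2.23) with the initialization of Eqs. (2.24) and (2.25) can be sketched as below. The function name, the default value of $\delta$ and the test data are assumptions made for illustration.

```python
import numpy as np

def rls(X, y, lam=1.0, delta=1e-3):
    """Recursive least squares following Eq. (2.23), one row of X per step."""
    n_par = X.shape[1]
    P = np.eye(n_par) / delta            # Eq. (2.24)
    theta = np.zeros(n_par)              # Eq. (2.25)
    h = np.zeros(n_par)
    for x_row, y_n in zip(X, y):
        x_row = x_row.reshape(1, -1)     # 1 x P row vector
        Px = P @ x_row.T                 # P x^T (P stays symmetric)
        P = (P - (Px @ Px.T) / (lam + (x_row @ Px).item())) / lam
        theta = lam * theta + x_row.ravel() * y_n
        h = P @ theta
    return h

rng = np.random.default_rng(2)
h_true = np.array([1.0, -0.3, 0.1])      # hypothetical parameters
X = rng.standard_normal((300, 3))
y = X @ h_true
h_rls = rls(X, y)
```

With $\lambda = 1$ this initialization makes the recursion return $(\mathbf{X}^T\mathbf{X} + \delta\mathbf{I})^{-1}\mathbf{X}^T\mathbf{y}$, i.e. the batch LS solution with a small regularization term, which explains why $\delta$ should be small compared with the eigenvalues of $\mathbf{X}^T\mathbf{X}$.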

#### 2.3.3 Input excitation design

Input excitation design is a key aspect of system identification. Intuitively, the essential idea is that the input stimuli shall sufficiently excite the system under test to produce enough information in the output sequences to allow the exact model estimate. This behavior is described by the persistence of excitation (PE) property, which, for deterministic inputs in a truncated Volterra model of order K and memory span M, can be described as follows [78].

Given the sample matrix $\mathbf{S}_N = \mathbf{X}_N^T \mathbf{X}_N$, built over an observation period of length N, and its minimum and maximum eigenvalues $\lambda_{min}$ and $\lambda_{max}$, if these values are bounded by two arbitrarily chosen constants $\rho_1, \rho_2 > 0$ independently of the time index n, then the input sequence is said to be PE of degree M and order K. The PE condition for Volterra systems depends on both memory and order, and is related to the condition number of the sample matrix.
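The link between excitation and the eigenvalues of the sample matrix can be illustrated with a purely linear model of memory 4, chosen here only to keep the example small. A single sinusoid spans a two-dimensional space regardless of the number of lags, so two eigenvalues of $\mathbf{S}$ collapse to zero, while a two-tone input keeps all four well away from zero. Tone bins and record length are hypothetical.

```python
import numpy as np

def lag_matrix(x, memory):
    """Columns x[n], x[n-1], ..., x[n-memory+1] over one period of a periodic input."""
    return np.column_stack([np.roll(x, q) for q in range(memory)])

def sample_matrix_eigs(X):
    """Eigenvalues of S = X^T X in ascending order."""
    return np.linalg.eigvalsh(X.T @ X)

N = 64
n = np.arange(N)
one_tone = np.sin(2 * np.pi * 7 * n / N)
two_tone = one_tone + np.sin(2 * np.pi * 23 * n / N)

eigs_one = sample_matrix_eigs(lag_matrix(one_tone, 4))   # two eigenvalues near zero
eigs_two = sample_matrix_eigs(lag_matrix(two_tone, 4))   # all four bounded away from zero
```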

Different kinds of input excitation can drive the system identification process: random signals [40], pseudo-random binary and multilevel sequences [78, 31], multisines [12] and impulses. Some studies focus on the identification process using the same kind of signals that the device will process when operating, such as PSK and QAM modulated waveforms [25, 94]. This approach may not give good results in terms of channel equalization when other types of signals (e.g. sinusoidal interferers) are processed by the system.

In this thesis we focus on parameter identification using multisine excitations, mainly for two reasons. The first is that deterministic inputs of limited length are preferable, given the high computational power required by the circuit-level transient simulations used to assess calibration performance; random inputs require longer sequences to fulfill the expected statistics on high-order moments. The second is that a laboratory implementation working on RF receivers up to Ku and Ka bands is foreseen, and the generation of impulses or multi-level sequences with sharp edges is difficult to achieve at those carrier frequencies. A real discrete-time multisine can be written as:

$$x[n] = \sum_{m=1}^{M} A_m \sin\left(2\pi \frac{f_m}{f_s}n + \phi_m\right)$$
(2.26)

where $A_m$, $f_m$ and $\phi_m$ are respectively the amplitude, the frequency and the phase of the *m*-th sinusoidal component. The multisine parameters are selected to optimize the estimate. When estimating Volterra Frequency Response Functions (VFRF), one way is to select a combination of $f_m$'s such that kernels of different order produce output components at different frequencies. It should be mentioned that complete separation of the components of different order by frequency separation is impossible [15]. To determine an *n*-th degree VFRF, which is an *n*-dimensional function, a multisine with *n* frequencies is required [12]. To obtain different samples of the VFRF values in a given bandwidth, many tests must be carried out, which may take a lot of time. Frequency separation of the kernels becomes unfeasible when dealing with high-order models and an efficient DFT representation with a small number of bins.

A different approach consists in using a batch of multisines that produces an overdetermined system of equations. The use of different combinations of frequencies can remove the uncertainty left by the nonlinear products mapped on the same bins. Since we are dealing with discrete-time Volterra kernel estimation, a good choice is to use coherent sampling, i.e. to select input frequencies whose ratio to the ADC clock frequency is a rational number. Assuming that an $N_{DFT}$-point DFT is used to perform the spectral analysis for pre- and post-calibration performance assessment, the discrete set of distinct frequencies is determined by:

$$f_m = \frac{J_m}{N_{DFT}} \cdot f_s \quad \text{with } J_m \in \mathbb{N}, \ 1 \le J_m \le \frac{N_{DFT}}{2} \tag{2.27}$$
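The following sketch generates a multisine according to Eqs. (2.26) and (2.27) and checks which DFT bins carry energy. Function names, tone bins and the detection threshold are illustrative assumptions.

```python
import numpy as np

def coherent_multisine(bins, n_dft, amplitudes=None, phases=None):
    """Multisine of Eq. (2.26) with f_m / f_s = J_m / N_DFT as in Eq. (2.27)."""
    bins = np.asarray(bins)
    amplitudes = np.ones(len(bins)) if amplitudes is None else np.asarray(amplitudes)
    phases = np.zeros(len(bins)) if phases is None else np.asarray(phases)
    n = np.arange(n_dft)
    tones = amplitudes[:, None] * np.sin(
        2 * np.pi * bins[:, None] * n[None, :] / n_dft + phases[:, None])
    return tones.sum(axis=0)

def occupied_bins(x, rel_tol=1e-9):
    """Indices of the one-sided DFT bins that carry energy."""
    spectrum = np.abs(np.fft.rfft(x))
    return set(np.flatnonzero(spectrum > rel_tol * spectrum.max()).tolist())

x = coherent_multisine([3, 7, 12], n_dft=64)
coherent = occupied_bins(x)                                    # only bins 3, 7 and 12

# A tone halfway between two bins leaks into the whole spectrum
leaky = occupied_bins(np.sin(2 * np.pi * 3.5 * np.arange(64) / 64))
```

With coherent sampling each tone falls exactly on one bin, so no window is needed and the nonlinear products can be attributed to individual bins.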

The adopted method to select the frequencies of the multisines consists of two steps:

- 1) Search for the combination of n frequencies that, passing through an n-th order polynomial nonlinearity, produces the highest number of components lying on distinct DFT bins. In this process, aliasing of nonlinear products at frequencies higher than Nyquist is taken into account, and the spectrum folding occurring in a subsampling scenario can be included.
- 2) Select other combinations of frequencies that produce non-overlapping nonlinear contributions on the DFT bins not covered in step 1.
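Step 1 can be sketched with a brute-force search. Instead of enumerating the mixing products analytically, this illustrative version raises a unit-amplitude, zero-phase coherent multisine to the given power and looks at the occupied DFT bins. Because the power is taken on the sampled sequence, products above Nyquist are folded automatically. The function names, the candidate set and the amplitude threshold are assumptions, and the search strategy is not necessarily the one implemented in the thesis.

```python
import itertools
import numpy as np

def product_bins(bins, order, n_dft, rel_tol=1e-9):
    """One-sided DFT bins occupied by x**order for a unit-amplitude coherent multisine."""
    n = np.arange(n_dft)
    x = sum(np.sin(2 * np.pi * j * n / n_dft) for j in bins)
    spectrum = np.abs(np.fft.rfft(x ** order))
    return set(np.flatnonzero(spectrum > rel_tol * spectrum.max()).tolist())

def best_combination(candidates, order, n_dft):
    """Exhaustive search: the tone set whose products hit the most distinct bins."""
    return max(itertools.combinations(candidates, order),
               key=lambda c: len(product_bins(c, order, n_dft)))

in_band = product_bins((3, 7), 2, 64)      # DC, 7-3, 2*3, 7+3, 2*7
folded = product_bins((20, 30), 2, 64)     # 2*20 -> 24, 2*30 -> 4, 20+30 -> 14
best = best_combination(range(1, 9), 2, 64)
```

The exhaustive search grows combinatorially with the number of candidate bins, so in practice the candidate set would have to be restricted to the band of interest.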

It is beneficial to use waveforms with different Peak-to-Average Power Ratio (PAPR) when estimating the model parameters in order to catch the system behavior at different power levels keeping the peak amplitude of the excitation constant. The PAPR of a sequence x[n] expressed in decibel is defined as follows:

$$\text{PAPR} = 10 \log_{10} \frac{(\max |x[n]|)^2}{x_{rms}^2[n]} \tag{2.28}$$

The PAPR of a multisine depends on the amplitude and the phase of each sinusoid. From an implementation point of view, a simple approach is to use multisines with a different number of sinusoids, keeping the amplitudes and phases equal. For example, the PAPR goes approximately from 3.5 to 7 dB using multisines with 2 to 5 tones with constant amplitude and zero phases. Using a batch of multisines with a different number of frequencies and PAPR values is somewhat similar to generating multisines with different peak amplitude levels (used in the polynomial interpolation method exploiting the Vandermonde matrix [15]), because the power of each component takes different values while the signal envelope is kept constant.
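Eq. (2.28) translates directly into a few lines of code. The sketch below evaluates it on a single coherently sampled sinusoid, for which the PAPR is $10\log_{10}2 \approx 3.01$ dB. The function name and the test signal are illustrative.

```python
import numpy as np

def papr_db(x):
    """Peak-to-Average Power Ratio of a sequence in decibel, Eq. (2.28)."""
    x = np.asarray(x, dtype=float)
    return 10 * np.log10(np.max(np.abs(x)) ** 2 / np.mean(x ** 2))

n = np.arange(64)
single_tone = np.sin(2 * np.pi * n / 64)   # one full period, peak sampled exactly
papr_single = papr_db(single_tone)
```

The PAPR is invariant to a scaling of the whole waveform, so a batch of multisines can be normalized to the same peak amplitude without changing this figure.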

## 2.4 Practical Volterra post-inverse system estimation



Figure 2.5. Direct estimation of post-inverse system. Ideal (a) architecture and practical implementation (b) that uses the reference signal r[n] considering the unavailability of the real input data u[n]

The direct post-inverse estimation of a system uses the output signal y[n] and a reference signal r[n] that represents the post-compensation target. Referring to the scheme in Fig.2.5b we can write the LS estimate of the parameters:

$$\hat{\mathbf{h}} = (\mathbf{Y}^T \mathbf{Y})^{-1} \mathbf{Y}^T \mathbf{r}$$
(2.29)


When a truncated Volterra model is adopted as the post-distorter, $\mathbf{Y}$ is the $N \times P$ Volterra matrix expansion of the signal y[n]. The number of parameters P is the sum of the parameters of each sub-matrix in the model. For a K-order model we can write:

$$\mathbf{Y} = \begin{bmatrix} \mathbf{Y}^{(1)} & \mathbf{Y}^{(2)} & \cdots & \mathbf{Y}^{(K)} \end{bmatrix} \qquad \qquad \mathbf{\hat{h}} = \begin{bmatrix} \mathbf{\hat{h}}^{(1)} \\ \mathbf{\hat{h}}^{(2)} \\ \vdots \\ \mathbf{\hat{h}}^{(K)} \end{bmatrix} \qquad (2.30)$$

The first order sub-matrix of $\mathbf{Y}$ is an $N \times P_1$ Toeplitz matrix that realizes the linear discrete convolution with $\hat{\mathbf{h}}^{(1)}$. Using the notation $y[n] = y_n$, we have:

$$\mathbf{Y}^{(1)} = \begin{bmatrix} y_0 & 0 & \cdots & 0 \\ y_1 & y_0 & \cdots & 0 \\ y_2 & y_1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ y_{N-1} & y_{N-2} & \cdots & y_{N-P_1} \end{bmatrix} \qquad \qquad \mathbf{\hat{h}}^{(1)} = \begin{bmatrix} \hat{h}_0 \\ \hat{h}_1 \\ \hat{h}_2 \\ \vdots \\ \hat{h}_{P_1-1} \end{bmatrix}$$
(2.31)

The number of columns of the higher-order sub-matrices depends on the order and the memory span of the model. For example, considering a symmetric model with memory span $L_2 = 1$ (i.e. lags 0 and 1), which gives $P_2 = 3$, the second order sub-matrix and the parameters sub-vector are:

$$\mathbf{Y}^{(2)} = \begin{bmatrix} y_0^2 & 0 & 0\\ y_1^2 & y_0^2 & y_1 y_0\\ y_2^2 & y_1^2 & y_2 y_1\\ \vdots & \vdots & \vdots\\ y_{N-1}^2 & y_{N-2}^2 & y_{N-1} y_{N-2} \end{bmatrix} \qquad \qquad \mathbf{\hat{h}}^{(2)} = \begin{bmatrix} \hat{h}_{00}\\ \hat{h}_{11}\\ \hat{h}_{01} \end{bmatrix}$$
(2.32)
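The construction of $\mathbf{Y}$ and the estimate of Eq. (2.29) can be sketched as follows. The helper names are hypothetical, `memories[k-1]` denotes the number of taps used at order k, and the columns of each sub-matrix follow the lexicographic order of the lag tuples (for the second order: 00, 01, 11), which differs from the column order shown in Eq. (2.32) but spans the same model. The forward system used in the example is a made-up memoryless nonlinearity.

```python
import itertools
import numpy as np

def delayed(y, q):
    """y[n - q] with zeros before the first sample."""
    return np.concatenate((np.zeros(q), y[:len(y) - q]))

def volterra_matrix(y, memories):
    """Symmetric Volterra expansion Y = [Y^(1) ... Y^(K)]."""
    y = np.asarray(y, dtype=float)
    cols = []
    for k, taps in enumerate(memories, start=1):
        for lags in itertools.combinations_with_replacement(range(taps), k):
            cols.append(np.prod([delayed(y, q) for q in lags], axis=0))
    return np.column_stack(cols)

def post_inverse_ls(y, r, memories):
    """LS estimate of Eq. (2.29); returns the parameters and the corrected output."""
    Y = volterra_matrix(y, memories)
    h, *_ = np.linalg.lstsq(Y, r, rcond=None)
    return h, Y @ h

# Hypothetical test case: three-tone input through y = x + 0.05 x^2
n = np.arange(256)
x = 0.3 * sum(np.sin(2 * np.pi * j * n / 256) for j in (5, 11, 23))
y = x + 0.05 * x ** 2
h_hat, z = post_inverse_ls(y, r=x, memories=(2, 2, 2))
err_before = np.sqrt(np.mean((y - x) ** 2))
err_after = np.sqrt(np.mean((z - x) ** 2))
```

Since the identity correction (first linear tap equal to one, all other parameters zero) belongs to the model, the LS residual can never exceed the uncorrected error.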

Since the actual input of the system x[n] is not available, the reference r[n] is an ideal representation of the input signal that represents the post-compensation goal:

$$r'[n] = x[n]_{ideal} \qquad r[n] = A\,r'[n-d] \tag{2.33}$$

We can choose a delay offset and a scale factor of the reference signal to drive the estimation process towards a profitable implementation of the correction filter. The advantages in appropriately choosing the delay d and the amplitude A are discussed in the following.

- **Reference amplitude scaling**: when estimating the Volterra kernels of an A/D converter, we expect the system to have approximately unitary gain, which translates into $\max |y[n]| \approx \max |x[n]_{ideal}|$. In contrast, if the inverse model to estimate is that of a receiving chain or a power amplifier, there is an input-output gain of tens of dB. In the latter case the reference shall not have the amplitude of the actual input x[n], otherwise the estimated filter would attenuate the output of the system down to the input amplitude. The reference should have the same amplitude as the output if we want the correction filter to have unitary linear gain. In general, the amplitude of the reference should be chosen as that of the desired post-correction output.
- **Reference delay offset**: delaying the reference impacts the estimation of the unknown parameters. Typically, electronic circuits with dynamic nonlinearities show short memory effects because of the exponentially decaying response of the capacitive nodes. Thus a model with short memory should suffice to obtain good compensation performance. This can be achieved by choosing a delay value that aligns the reference signal to the system output in the time domain. Conversely, a reference signal "far" from the output signal would require a Volterra model with a higher memory lag to include the useful nonlinear terms, with many short-memory terms being useless (ideally zero). This would produce a sparse Volterra model, which worsens the quality of the estimate because of the increased number of parameters and the poorer conditioning of the sample matrix. Usually, the linear lag of the compensation filter is sufficiently high to cover the input-output group delay, so in linear equalization this effect is negligible. Introducing the reference delay is equivalent to adding a time shift in the Volterra model. The choice of the optimum delay can be driven by the post-compensation performance using a fixed lag configuration. In low oversampling conditions, the optimum delay value could also be fractional.
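
One possible way to build the reference of Eq. (2.33) is sketched below. It is only an assumption of this example that the integer delay d is taken from the peak of the cross-correlation between output and ideal input and that the scale factor A matches the peak amplitudes; the text above suggests selecting d from the post-compensation performance, and fractional delays are not handled here.

```python
import numpy as np

def align_reference(r_ideal, y):
    """Delay and scale an ideal reference so that it lines up with the output y."""
    r_ideal = np.asarray(r_ideal, dtype=float)
    y = np.asarray(y, dtype=float)
    # Lag of the cross-correlation peak; index i corresponds to lag i - (len - 1)
    xcorr = np.correlate(y, r_ideal, mode="full")
    d = max(int(np.argmax(xcorr)) - (len(r_ideal) - 1), 0)
    shifted = np.concatenate((np.zeros(d), r_ideal[:len(r_ideal) - d]))
    # Peak matching, so that the correction filter has roughly unitary gain
    A = np.max(np.abs(y)) / np.max(np.abs(r_ideal))
    return A * shifted, d, A

# Hypothetical system: gain of 10 and a pure delay of 5 samples
rng = np.random.default_rng(0)
x = rng.standard_normal(512)
y = 10.0 * np.concatenate((np.zeros(5), x[:-5]))
r, d, A = align_reference(x, y)
```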
# Chapter 3

# ADC digital calibration using post compensation

In modern digital radio receivers the system performance relies heavily on the specifications of the Analog-to-Digital converters. When a high-quality RF analog front-end is employed, the ADC is the bottleneck for the overall system performance, making it very difficult to meet high dynamic range requirements. The reasons that motivate the use of digital post-compensation techniques stem from two main practical needs. The first is to exceed the performance of the converters available on the market when designing applications with beyond state-of-the-art specifications. The second is to enable the use of a converter with lower nominal performance than the design requirements demand. This is the case when stricter constraints on power consumption or area limit the choice of device, or in the very common case in which the ADC is integrated on the same chip as the digital processing section using sub-micron CMOS processes that are less suited to high-precision analog design.

In this chapter digital calibration techniques for ADC nonlinear compensation are discussed and applied, and a method for model complexity reduction is proposed. In Section 3.1 the theory of operation of the A/D converter is described, and in Section 3.2 an overview of the most important ADC architectures is presented. Section 3.3 describes the pipeline ADC with 1.5-bit stages and the redundancy mechanisms that provide robustness against comparator offsets. Section 3.4 gives an overview of ADC calibration techniques, describing static and dynamic post-compensation techniques with a particular focus on model inversion methods. In Sections 3.6 and 3.7 a calibration technique based on the direct estimation of the post-inverse Volterra series is applied to a Sample and Hold and to a pipeline ADC, simulated using the 45 nm process by STMicroelectronics; PVT robustness checks are performed and an iterative backward pruning procedure is introduced.

# 3.1 A/D Converters theory of operation

The Analog-to-Digital Converter is a mixed-signal electronic device that represents the boundary between the analog and the digital domain. It converts a continuous time (CT), continuous amplitude input signal into a discrete time (DT), discrete amplitude one. This process can be realized by two devices, as represented in Fig. 3.1: a Sample and Hold (S/H) stage that samples the input signal s(t) at specific time instants $nT_s$, and a quantization stage that maps the continuous amplitude of the sample onto a discrete set of levels, also called codebook.

Figure 3.1. Concept scheme of an ideal A/D converter

When the analog input is sampled by the S/H uniformly and instantaneously at a frequency  $f_s$ , the output spectrum will be the superposition of infinite replicas of the input one centered around multiples of the sampling frequency. Mathematically, the multiplication of a signal s(t) by a Dirac comb  $\sum \delta(t-nT_s)$  in the time domain is equal to the convolution of the input spectrum  $S(\omega)$  by the Dirac comb  $\frac{2\pi}{T_s} \sum \delta(\omega - m\omega_s)$ . For low-pass signals, if the input signal power is contained in the 0 to  $\frac{f_s}{2}$  band, which is the first Nyquist band, the aliases of the input spectrum won't be superposed, so the original signal will be represented without loss of information. The analog signal can be theoretically reconstructed using an ideal low-pass filter with  $f_c = \frac{f_s}{2}$ .

The uniform b-bit quantization process can be viewed as an operator  $\mathfrak{Q}_b$  that maps the amplitude of the signal  $s[nT_s]$  on a discrete set of equal size quantization regions  $\mathfrak{S}_i$ , with  $i \in \{1 \dots L\}$ , each of them associated to an output reconstruction level  $x_i \in \{x_j, j = 1 \dots L\}$ . Each quantization region is bounded by a lower and upper transition level,  $T_i$  and  $T_{i+1}$ . The distance between adjacent levels is the code width  $\Delta$  and the maximum analog input level is the full scale (FS). The FS and the number of bits of the quantizer limit the conversion accuracy with which the input samples can be approximated. Considering a symmetric quantizer, the input signal headroom goes from -FS to FS giving a total range of 2FS. The value of the code width is equal to:

$$\Delta = \frac{\text{FS}}{2^{b-1}} \tag{3.1}$$

An example of a 3-bit quantizer transfer function is shown in Fig.3.2, where a half code width shift has been applied to bound the maximum conversion error to  $\pm \frac{\Delta}{2}$ .

The quantization error  $e_q[n]$  is the difference between the actual value  $s[nT_s]$  and the quantized one x[n]. This error can be viewed as an additive random noise that is generally correlated with the input signal and assumes particular statistical properties under certain conditions. It is stated in [82] that if  $\Delta$  is sufficiently small and successive samples of the signal fall in distant quantization regions,  $e_q[n]$  is a stationary, white process with its samples uniformly distributed in  $\{-\Delta/2, \Delta/2\}$ , showing zero mean and variance  $\sigma_q^2 = \Delta^2/12$ . The same result can be derived also



Figure 3.2. Transfer function of an ideal 3 bit A/D converter. The linear fit of the mid points of the quantization regions  $\mathfrak{S}_i$  is a perfect straight line.

for a deterministic sinusoidal input, calculating the power of the quantization noise:

$$P_q = \frac{1}{\Delta} \int_{-\Delta/2}^{\Delta/2} t^2 \, dt = \frac{1}{\Delta} \frac{t^3}{3} \Big|_{-\frac{\Delta}{2}}^{\frac{\Delta}{2}} = \frac{2}{\Delta} \frac{\Delta^3}{3 \cdot 8} = \frac{\Delta^2}{12} \tag{3.2}$$
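The result of Eq. (3.2) can be checked numerically. The following is a minimal sketch, assuming an ideal symmetric quantizer whose reconstruction level sits in the middle of each region; the function name `quantize` and the ramp test signal are illustrative choices, not taken from the thesis.

```python
import math

def quantize(s, fs_level, b):
    """Ideal uniform b-bit quantizer on [-FS, FS), mid-region reconstruction (assumed)."""
    delta = fs_level / 2 ** (b - 1)               # code width, Eq. (3.1)
    k = math.floor(s / delta)                     # index of the quantization region
    k = max(-2 ** (b - 1), min(2 ** (b - 1) - 1, k))  # clip to the codebook
    return (k + 0.5) * delta                      # error stays within +/- delta/2

FS, B, N = 1.0, 8, 100_000
delta = FS / 2 ** (B - 1)

# A dense ramp over the full range visits every region almost uniformly.
err_sq = 0.0
for n in range(N):
    s = -FS + 2 * FS * (n + 0.5) / N
    err_sq += (s - quantize(s, FS, B)) ** 2
p_q = err_sq / N                                  # measured quantization noise power

print(p_q, delta ** 2 / 12)                       # the two values should be close
```

The measured error power approaches $\Delta^2/12$ as long as the input spreads evenly over the quantization regions, which is the condition recalled above.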

Real A/D converters are also affected by other types of errors that arise from the imperfections of the actual implementation, both stochastic and deterministic. Mismatches in the components, clock jitter and non-linear behavior produce distortions at the converter output that, together with noise, limit its effective precision. In the following, the main Figures of Merit that describe static and dynamic ADC performance are briefly described.

#### 3.1.1 Performance Metrics

The most useful FoMs regard the noise and the linearity performance of the converter. From a static point of view, we are interested in how much the actual transfer function differs from the ideal one: we can evaluate it measuring the Differential Non Linearity (DNL) and the Integral Non Linearity (INL). The static characteristic of the converter can also show non-monotonicity and missing codes. We have a non-monotonicity when the output code decreases after the input voltage has increased, and a missing code when an output code never occurs for any possible value of the input.

**Differential Non Linearity**: The DNL is the difference between the actual and the ideal code bin width after correcting for static gain, divided by the ideal code width, evaluated for each quantization region and expressed in Least Significant Bits (LSB):

$$DNL_i = \frac{G\Delta_i^{real} - \Delta}{\Delta} \tag{3.3}$$

An ideal converter has an all-zero DNL vector. The presence of missing codes produces DNL values equal to -1. The DNL describes a local relative error. Fig. 3.3 shows a non-ideal transfer characteristic in which DNL measures are highlighted.



Figure 3.3. Transfer function of a non-ideal 3 bit A/D converter. The DNL is positive when the actual bin width is greater than the ideal one, negative when it is smaller.

**Integral Non Linearity**: The INL is a measure of the cumulative effect of DNL errors on the overall transfer function of the converter. From the standard "Terminology and Test Methods for Analog-to-Digital Converters" [6], the INL can be expressed as the difference between the actual and the ideal transition levels after correcting for static gain and offset:

$$INL_i = \frac{GT_i + V_{os} - T_i^{nom}}{\Delta} \tag{3.4}$$

From this expression it is clear that INL describes an absolute error because it is related to the nominal transition levels  $T_i^{nom}$ . Ideally, it should be an all-zero vector. Fig. 3.4 shows a non-ideal transfer characteristic in which INL measures are highlighted.
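As an illustration of Eqs. (3.3) and (3.4), the sketch below computes DNL and INL from a list of transition levels. It assumes that the gain $G$ and offset $V_{os}$ are already known (here $G = 1$, $V_{os} = 0$); the 3-bit example values and the function name `dnl_inl` are hypothetical.

```python
def dnl_inl(t_real, t_nom, gain=1.0, v_os=0.0):
    """DNL per code bin and INL per transition level, both in LSB."""
    delta = t_nom[1] - t_nom[0]                   # ideal code width
    dnl = [(gain * (t_real[i + 1] - t_real[i]) - delta) / delta
           for i in range(len(t_real) - 1)]       # Eq. (3.3)
    inl = [(gain * t_real[i] + v_os - t_nom[i]) / delta
           for i in range(len(t_real))]           # Eq. (3.4)
    return dnl, inl

# 3-bit converter with FS = 1: seven nominal transitions spaced by 0.25.
t_nom = [-0.75 + 0.25 * i for i in range(7)]
t_real = list(t_nom)
t_real[3] = 0.05                                  # one transition displaced by 0.2 LSB

dnl, inl = dnl_inl(t_real, t_nom)
print([round(v, 3) for v in dnl])
print([round(v, 3) for v in inl])
```

With these definitions, the difference between two consecutive INL values equals the DNL of the bin between them, which is the sense in which INL accumulates the DNL errors.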

The dynamic characteristic of the converter can be described by FoMs that include the stochastic and deterministic behaviors. Stochastic errors are summarized by the Signal-to-Noise Ratio (SNR), which is the ratio between the power of a full swing sinusoid and the noise power. Deterministic errors can be linear (gain and offset) and non-linear, the latter more harmful for ADC performance. Non-linear distortions, like those generated by non-uniform sampling (Time-Interleaved ADCs will be analyzed in detail in Section 4), generate harmonic or spur components in the ADC output spectrum.



Figure 3.4. Transfer function of a non-ideal 3 bit A/D converter. The INL is positive when the actual transition precedes the ideal one, negative when it follows.

**Total Harmonic Distortion**: Given an input sinusoid of power  $P_s$  and frequency  $f_0$ , THD is the ratio between the total power of the harmonic distortion components, including their aliases, in the output spectrum of the analog-to-digital converter and the power of the desired signal:

$$\text{THD} = \frac{\sum_{h} |X(f_h)|^2}{|X(f_0)|^2} \tag{3.5}$$

where h runs over the first H harmonics of the frequency  $f_0$  (so that only the harmonics, and not the noise, are captured).

**Spurious Free Dynamic Range**: SFDR is the ratio between the power of the desired signal and the highest harmonic or spurious component:

$$SFDR = \frac{|X(f_0)|}{\max_{f_d \neq f_0} |X(f_d)|} \tag{3.6}$$

It is frequency dependent, so different values of  $f_0$  will produce different SFDR values, which typically worsen as the frequency increases. SFDR is an important FoM because it sets the ratio between the strongest and the weakest signals that can be processed together by the ADC while keeping the latter distinguishable from spurs.
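A short numerical example may help to fix the two definitions. The sketch below evaluates Eqs. (3.5) and (3.6) on a synthetic, coherently sampled sinusoid with a third harmonic of assumed amplitude 0.01; the plain DFT, the helper names and the choice of excluding the DC bin from the spur search are assumptions made for illustration.

```python
import cmath
import math

def dft_power(x):
    """One-sided power spectrum |X(f_k)|^2 for k = 0 .. N/2 (plain DFT)."""
    n = len(x)
    return [abs(sum(x[m] * cmath.exp(-2j * math.pi * k * m / n)
                    for m in range(n))) ** 2
            for k in range(n // 2 + 1)]

def fold(k, n):
    """Bin where harmonic k lands after aliasing into the first Nyquist band."""
    k %= n
    return n - k if k > n // 2 else k

N, K0, H = 256, 5, 5        # K0 cycles in N samples (coherent sampling), H harmonics
A3 = 0.01                   # assumed amplitude of the third harmonic
x = [math.sin(2 * math.pi * K0 * m / N) + A3 * math.sin(2 * math.pi * 3 * K0 * m / N)
     for m in range(N)]
p = dft_power(x)

# Eq. (3.5): power of harmonics 2..H (aliases included) over the signal power.
thd = sum(p[fold(h * K0, N)] for h in range(2, H + 1)) / p[K0]
thd_db = 10 * math.log10(thd)

# Eq. (3.6) in dB: 10*log10 of the power ratio equals 20*log10 of the magnitude ratio.
worst_spur = max(p[d] for d in range(1, N // 2 + 1) if d != K0)
sfdr_db = 10 * math.log10(p[K0] / worst_spur)

print(round(thd_db, 2), round(sfdr_db, 2))
```

Since the only distortion component here is a single harmonic 40 dB below the carrier, THD and SFDR coincide in magnitude; in a real converter they differ as soon as several harmonics or non-harmonic spurs are present.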

The FoM that takes into account both noise and distortions is the Signal-to-Noise and Distortions Ratio (SNDR, or SINAD), used to evaluate the effective resolution of the converter in terms of Effective Number Of Bits (ENOB). An upper bound for the SNDR is given by the Signal-to-Quantization Noise Ratio (SQNR), evaluated for an ideal ADC affected by the quantization noise only.

**SNDR**: The power of a full-scale sinusoidal input signal is:

$$P_{s} = \frac{1}{T} \int_{0}^{T} \text{FS}^{2} \sin^{2}(\omega t) \, dt = \frac{\text{FS}^{2}}{2} \tag{3.7}$$

If only the quantization noise is considered, the expression of the Signal-to-Quantization Noise Ratio can be derived recalling Eq. 3.1:

$$SNDR^{ideal} = SQNR = \frac{P_s}{P_q} = \frac{FS^2}{2} \cdot \frac{12 \cdot 2^{2b-2}}{FS^2} = 1.5 \cdot 2^{2b} \tag{3.8}$$

which, expressed in dB, yields the well-known 6 dB per bit rule of thumb:

$$SNDR_{dB}^{ideal} = SQNR_{dB} = 1.76 + 6.02 \cdot b \tag{3.9}$$

In a real case also electronic noise and distortions will affect the converter, obtaining a lower value of the SNDR.

**ENOB**: The effective resolution of an ADC can be evaluated using the measured SNDR. An equation identical to (3.9) can be written with the real SNDR at the left side of the equation and the ENOB in place of the number of bits b:

$$SNDR_{dB} = 1.76 + 6.02 \cdot ENOB \tag{3.10}$$

From this relation we obtain the effective number of bits of the converter:

$$ENOB = \frac{SNDR_{dB} - 1.76}{6.02} \tag{3.11}$$
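Equations (3.8) and (3.11) translate directly into two one-line helpers. The sketch below is only a numerical illustration; the function names and the 62 dB example value are assumptions.

```python
import math

def sqnr_db(b):
    """Ideal SQNR of a b-bit converter with a full-scale sinusoid, Eq. (3.8) in dB."""
    return 10 * math.log10(1.5 * 2 ** (2 * b))

def enob(sndr_db):
    """Effective number of bits from a measured SNDR in dB, Eq. (3.11)."""
    return (sndr_db - 1.76) / 6.02

print(round(sqnr_db(12), 2))   # ideal 12-bit converter
print(round(enob(62.0), 2))    # hypothetical measured SNDR of 62 dB
```

Feeding the ideal SQNR of a b-bit converter back into `enob` returns b up to the rounding of the constants 1.76 and 6.02.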

# 3.2 ADC Architectures overview

In this section we will give a brief description of the most important ADC architectures: Flash, Successive Approximation and the Sigma-Delta ( $\Sigma$ - $\Delta$ ). A focus on the pipelined ADC architecture will also be given for its widespread diffusion in medium speed and accuracy applications (< 1GS/s, 14 bits). The digital calibration techniques presented here are applicable to any ADC architecture but, depending on the specific errors to be compensated, they can be more or less effective. This description is in no way exhaustive but is sufficient in the context of the thesis to outline the field of application of the correction methods. More accurate discussions on A/D converters architectures can be found in [49, 82, 44].

#### 3.2.1 Flash ADC

The architecture of a Flash ADC is shown in Fig.3.5. It consists of a sample-and-hold stage, a linear voltage reference ladder, an array of comparators and a digital logic section that encodes the array output into a *b*-bit output. The theory of operation is very simple: the sampled input is compared with all the reference thresholds using the  $2^{b} - 1$  comparators. All the comparators whose reference threshold is lower than the input will output a "1", while all the others a "0". The ensemble of the comparators' outputs forms a thermometer code with  $2^{b} - 1$  bits that is then translated into a



Figure 3.5. Flash ADC Architecture

*b*-bit output by the encoder. This is the simplest and fastest ADC architecture but, due to the exponentially growing number of comparators and thresholds with respect to the number of bits, it is limited to low resolutions (< 8 bits) in practical implementations. The maximum achievable converter resolution is limited by the accuracy of the reference thresholds and by the comparator offsets, whose precision must exceed the *b*-bit resolution.

#### 3.2.2 Pipeline ADC

The pipeline ADC divides the conversion process in many steps using many cascaded stages. The architecture of the first stage of a pipeline ADC is shown in Fig.3.6. At each stage, a  $b_j$ -bit conversion is carried out together with the computation of



Figure 3.6. First stage of a pipeline ADC

the residue  $r_j(t)$ , that is the quantization error of the stage extended over the input range. The pipeline introduces a latency in the conversion process equal to the number of stages. In the simplest case, a 1-bit per stage conversion is carried out. Considering a symmetrical input range  $\pm V_R$ , the following steps are performed by each stage:

- 1) A 1-bit ADC checks the input's sign and outputs a digital 1 or 0 depending on whether it is positive or negative
- 2) The DAC outputs  $+V_R/2$  or  $-V_R/2$  accordingly, and this value is subtracted from the original analog input
- 3) The signal obtained is multiplied by 2 and fed to the following stage
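The three steps above can be sketched as a behavioral model. This is a minimal illustration assuming ideal comparators, DACs and gain-of-two amplifiers; the function names and the reconstruction rule (each decision weighted by $2^{-j}$) are illustrative choices that follow from the stage equation, not an implementation taken from the thesis.

```python
def pipeline_1bit(v_in, v_ref, n_stages):
    """Ideal 1-bit-per-stage pipeline: returns the bit list and the final residue."""
    bits, v = [], v_in
    for _ in range(n_stages):
        bit = 1 if v >= 0 else 0                  # step 1: sign decision
        dac = v_ref / 2 if bit else -v_ref / 2    # step 2: DAC level to subtract
        v = 2 * (v - dac)                         # step 3: residue amplified by 2
        bits.append(bit)
    return bits, v

def reconstruct(bits, v_ref):
    """Stage j (counting from 0) contributes +/- V_R/2 scaled by 2**-j."""
    return sum((v_ref / 2 if b else -v_ref / 2) / 2 ** j for j, b in enumerate(bits))

bits, residue = pipeline_1bit(0.3, 1.0, 8)
estimate = reconstruct(bits, 1.0)
print(bits, estimate, residue)
```

Unrolling the stage equation gives $v_{in} = \text{estimate} + r_n / 2^n$, so the conversion error is bounded by $V_R / 2^n$ as long as every residue stays inside $\pm V_R$.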

In practical implementations, a higher number of bits per stage is used. Typically, due to circuital imperfections such as comparator offsets, redundancy is added at each stage by overlapping the quantization regions: this consists in using more comparators than those needed for the effective resolution of the stage, and implementing digital correction using the information of the following stages. A scheme of the pipeline architecture is shown in Fig.3.7. A widespread architecture is the 1.5-bit per stage that will be analyzed in detail in Section 3.7.1. The advantage of pipeline converters



Figure 3.7. Pipeline ADC architecture with digital correction logic

is that the number of stages (comparators) is linearly dependent on resolution, instead of exponentially dependent, like in flash converters. These converters are more complex than flash ADCs, from the architectural point of view, and this causes the achievable sampling frequency to be lower.

#### 3.2.3 Successive Approximation ADC

The SA-ADC architecture is depicted in Fig.3.8. It consists of a Sample-and-Hold, a comparator, a Successive Approximation Register (SAR) and a Digital-to-Analog Converter (DAC).

The principle of operation is an iterative binary search through all possible output codes that converges to the best digital approximation by minimizing the difference between the output of the DAC and the sampled input. For each conversion, the SAR sets one bit per clock cycle, starting from the Most Significant Bit (MSB) of the digital code.



Figure 3.8. Successive Approximation ADC Architecture

Then, for each clock cycle, the present code is converted by the DAC and supplied to the comparator. Depending on the comparator's output, the SAR either keeps the bit at 1 or resets it to zero. When the process arrives at the Least Significant Bit (LSB) of the SAR, the End Of Conversion (EOC) signal is asserted. This ADC architecture is not suited for high-speed applications because it requires b clock cycles to produce a b-bit output sample. The precision of the conversion relies on the precision of the comparator and that of the DAC.
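The binary search performed by the SAR can be sketched in a few lines. The model below assumes an ideal comparator and an ideal DAC spanning $[-V_R, V_R)$ with an offset-binary code; these choices, and the name `sar_convert`, are assumptions made for illustration.

```python
def sar_convert(v_in, v_ref, b):
    """Ideal b-bit successive approximation: one bit decided per clock cycle."""
    code = 0
    lsb = 2 * v_ref / 2 ** b
    for bit in range(b - 1, -1, -1):      # start from the MSB
        trial = code | (1 << bit)         # tentatively set the current bit to 1
        v_dac = -v_ref + trial * lsb      # ideal DAC output for the trial code
        if v_in >= v_dac:                 # comparator decision
            code = trial                  # keep the bit at 1, otherwise it stays 0
    return code

code = sar_convert(0.3, 1.0, 8)
print(code, -1.0 + code * 2.0 / 2 ** 8)   # output code and its analog equivalent
```

The loop runs exactly b times, which is the b-clock-cycle cost per output sample mentioned above.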

#### 3.2.4 $\Sigma$ - $\Delta$ ADC

The sigma-delta ADC is an oversampling converter that uses a 1-bit ADC and noise shaping techniques to obtain a slow but very accurate conversion, trading speed for resolution. The bandwidth of the input signal is much smaller than the Nyquist band (say, a fraction 1/M of it). The scheme of a first-order  $\Sigma$ - $\Delta$  converter is shown in Fig.3.9. The analog section consists of an integrator, a comparator (1-bit ADC) and a 1-bit DAC, followed by the digital section that includes a low-pass filter and a decimator. The feedback DAC maintains the average output of the integrator near



Figure 3.9. Sigma-Delta ADC Architecture

the comparator's reference level. At the comparator's output, the density of "ones" is proportional to the amplitude of the input signal. For an increasing input the comparator generates a greater number of "ones," and vice versa for a decreasing input. The integrator is a first-order filter in the feedback loop that acts as low-pass for the input signal and high-pass for the quantization noise. Thanks to this behavior and to the oversampling, the noise is pushed toward high frequencies. The digital 1-bit stream is then low-pass filtered with  $f_c = f_s/M$ . If the input has been sampled at  $f_s$ , the filtered-output data rate can therefore be reduced by the decimator to  $f_s/M$  without loss of information. This first-order converter provides a 9dB improvement in SNR for every doubling of the sampling rate [5].
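The relation between input amplitude and density of "ones" can be observed with a behavioral model of the loop. The sketch below assumes an ideal discrete-time integrator, a comparator with threshold at zero and a feedback DAC with levels $\pm 1$; the block average used as low-pass filter and decimator is a deliberately crude stand-in, and all names are hypothetical.

```python
def sigma_delta_1st(x):
    """First-order modulator: integrator, 1-bit quantizer, 1-bit feedback DAC."""
    integ, fb, bits = 0.0, 0.0, []
    for sample in x:
        integ += sample - fb              # integrate input minus DAC feedback
        bit = 1 if integ >= 0 else 0      # comparator (1-bit ADC)
        fb = 1.0 if bit else -1.0         # 1-bit DAC level for the next cycle
        bits.append(bit)
    return bits

def decimate(bits, m):
    """Average the +/-1 stream over blocks of m samples (boxcar filter, assumed)."""
    return [sum(2 * b - 1 for b in bits[i:i + m]) / m
            for i in range(0, len(bits) - m + 1, m)]

u = 0.25                                  # DC input, must stay inside (-1, 1)
bits = sigma_delta_1st([u] * 4096)
density = sum(bits) / len(bits)           # expected to approach (1 + u) / 2
print(density, decimate(bits, 512)[:3])
```

Because the integrator state stays bounded, the average of the feedback levels tracks the input, so the density of ones converges to $(1 + u)/2$ for a DC input $u$.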

# 3.3 Pipeline ADC with 1.5-bit stages

Pipeline architectures are very common in communication applications because they can combine both good accuracy (~ 12 - 14 bits) and good speed (~ 1 - 100MS/s) at the same time. Different types of stages can be employed, going typically from 1 to 4 bits per stage. Each stage is also called a Multiplying DAC (MDAC) for the presence of the  $2^{b}$  gain. As already mentioned, due to imperfections in the circuital implementation that cause offset errors in the comparators, robust MDAC stages can be realized by exploiting digital redundancy.

To better understand the effect of a comparator's offset on a pipeline, we can take the cascade of two 1-bit MDACs with the first one affected by offset. The 1-bit MDAC has only one zero-crossing comparator that outputs 1 bit, thus having two possible output states. The transfer characteristics of the ideal 1-bit MDAC and of the cascade of two 1-bit stages are shown in Fig.3.10. When the first comparator has



Figure 3.10. 1-bit MDAC transfer characteristic (a) and corresponding 2 stages cascade (b) normalized to  $V_R$ 

an offset, it produces an output residue out of the range  $\{-1, 1\}$  that saturates the second stage (and the rest of pipeline if we consider more than two stages) as shown in Fig.3.11a.

To counteract the saturation problem, an input-output characteristic that has a



Figure 3.11. 1-bit MDAC transfer characteristic affected by offset (a) and corresponding 2 stages cascade considering the second stage ideal (b) normalized to  $V_R$ 

margin before saturation is used. Offsets can still impact the behavior of the stage, but, as long as their effect doesn't exceed the redundant margin, they don't impact the pipeline performance. The simplest MDAC that implements redundancy in the quantization regions is the 1.5-bit architecture, shown in Fig.3.12. The 1.5-bit



Figure 3.12. 1.5-bit MDAC architecture

MDAC has 2 output bits,  $b_0$  and  $b_1$ , and has three possible output states, depending on the value of the input signal with respect to the thresholds  $-\frac{V_R}{4}$  and  $\frac{V_R}{4}$ . The three combinations can be associated with a three-state variable D, useful to express the input-output characteristic of the 1.5-bit MDAC. The possible 2-bit outputs are mapped in Table 3.1.

| $V_{IN}$ range                                         | $b_1$ | $b_0$ | D  |
|--------------------------------------------------------|-------|-------|----|
| $\left\{-V_R, -\frac{V_R}{4}\right\}$                  | 0     | 0     | -1 |
| $\left\{-\frac{V_R}{4}, \frac{V_R}{4}\right\}$         | 0     | 1     | 0  |
| $\left\{\frac{V_R}{4}, V_R\right\}$                    | 1     | 1     | 1  |

Table 3.1. Output bit mapping of a 1.5-bit MDAC

Using this notation, we can write:

$$V_{OUT} = 2V_{IN} - DV_R \tag{3.12}$$

Considering again the case of two stages cascade, the transfer characteristic of the single 1.5-bit MDAC and that of the cascade are shown in Fig.3.13a. In this case it



Figure 3.13. 1.5-bit MDAC transfer characteristic (a) and corresponding 2 stages cascade (b) normalized to  $V_R$ 

is clear that the comparators' offset can't cause the saturation of the following stages of the pipeline. Digital correction is realized by properly combining the output codes of each stage, exploiting the redundancy of the 1.5-bit MDAC. It can be noted in Table 3.2 that the codes around the thresholds of the first stage are logically equivalent. Digital correction makes the converter insensitive to comparators' offset up to a certain extent. The maximum amount of offset that the stage can tolerate is  $\pm \frac{V_R}{4}$ 

| $V_{IN}$ range                                            | $D_1$ | $D_2$ | $2D_1 + D_2$ |
|-----------------------------------------------------------|-------|-------|--------------|
| $\left\{-V_R, -\frac{5V_R}{8}\right\}$                    | -1    | -1    | -3           |
| $\left\{-\frac{5V_R}{8},-\frac{3V_R}{8}\right\}$          | -1    | 0     | -2           |
| $\left\{-\frac{3V_R}{8},-\frac{V_R}{4}\right\}$           | -1    | 1     | -1           |
| $\left\{-\frac{V_R}{4},-\frac{V_R}{8}\right\}$            | 0     | -1    | -1           |
| $\left\{-\frac{V_R}{8},\frac{V_R}{8}\right\}$             | 0     | 0     | 0            |
| $\left\{\frac{V_R}{8}, \frac{V_R}{4}\right\}$             | 0     | 1     | 1            |
| $\left\{\frac{V_R}{4},\frac{3V_R}{8}\right\}$             | 1     | -1    | 1            |
| $\left\{\frac{3V_R}{8},\frac{5V_R}{8}\right\}$            | 1     | 0     | 2            |
| $\left\{\frac{5V_R}{8}, V_R\right\}$                      | 1     | 1     | 3            |

**Table 3.2.** Output digital redundancy of the cascade of two 1.5-bit MDACs.  $D_1$  and  $D_2$  are the output states of the first and second stage respectively

due to the output available range  $(\pm \frac{V_R}{2})$  and the stage gain equal to 2. In Section 3.7 a switched-capacitor implementation of the 1.5-bit architecture described here will be shown, using a 45 nm CMOS process by STMicroelectronics.
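The offset tolerance discussed above can be explored with a behavioral model of Eq. (3.12) and of the code combination $2D_1 + D_2$ of Table 3.2, extended to an arbitrary number of stages. The sketch below is an illustration under stated assumptions: ideal gain-of-two stages, per-threshold offsets drawn at random within $\pm 0.2\,V_R$ (inside the $\pm V_R/4$ limit), and hypothetical function names.

```python
import random

def mdac_1p5(v_in, v_ref, off_lo=0.0, off_hi=0.0):
    """One 1.5-bit stage with comparator offsets: returns (D, residue), Eq. (3.12)."""
    if v_in < -v_ref / 4 + off_lo:
        d = -1
    elif v_in < v_ref / 4 + off_hi:
        d = 0
    else:
        d = 1
    return d, 2 * v_in - d * v_ref

def pipeline_1p5(v_in, v_ref, offsets):
    """Cascade of stages; the code generalizes 2*D1 + D2 of Table 3.2."""
    code, v = 0, v_in
    for off_lo, off_hi in offsets:
        d, v = mdac_1p5(v, v_ref, off_lo, off_hi)
        code = 2 * code + d               # digital correction: weighted sum of D_j
    return code, v

V_REF, N_STAGES = 1.0, 10
rng = random.Random(0)
offsets = [(rng.uniform(-0.2, 0.2), rng.uniform(-0.2, 0.2)) for _ in range(N_STAGES)]

worst = 0.0
for i in range(2001):                     # sweep the whole input range
    v = -V_REF + 2 * V_REF * i / 2000
    code, _ = pipeline_1p5(v, V_REF, offsets)
    worst = max(worst, abs(v - code * V_REF / 2 ** N_STAGES))
print(worst, V_REF / 2 ** N_STAGES)       # worst error against the bound V_R / 2^n
```

In this model every residue stays within $\pm V_R$ as long as the offsets do not exceed $V_R/4$, so the reconstructed value $\text{code} \cdot V_R / 2^n$ never deviates from the input by more than $V_R / 2^n$. For instance, with $V_{IN} = -0.3\,V_R$ a first-stage threshold shifted from $-0.25\,V_R$ to $-0.35\,V_R$ changes $(D_1, D_2)$ from $(-1, 1)$ to $(0, -1)$, and both give the code $-1$, as in Table 3.2.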

Since other kinds of errors in addition to comparators' offset can affect actual ADC realizations, many different post compensation techniques have been developed to overcome the resulting performance limitations.

# 3.4 Post compensation methods for ADCs

In this section an overview of the main error correction techniques applied to A/D converters is given, following the outline used by [63]. These techniques are based on four main methods, as summarized in [10]:

- Architecture-based
- Dithering
- Look Up Tables
- Post inversion models

The architecture-based methods are specific for each of the ADC architectures. In the previous Section we have seen the mechanism of digital redundancy implemented by the 1.5-bit stages that eliminates the risk of saturation and missing codes. Another widespread error correction method for the pipeline ADC is the radix calibration: due to finite opamp gain and capacitors' mismatch, the inter-stage gain between MDACs is different from the ideal value of  $2^b$ . Thus the real radix of each stage must be estimated by the calibration algorithm in order to obtain the corrected reconstructed output [24].

#### 3.4.1 Dithering

Dithering methods are based on the statistical theory of quantization and rely on the idea that adding some kind of noise to the signal before the quantization process improves the converter's performance. The main aims of dithering are to decorrelate the input and the quantization error and to randomize the INL/DNL patterns. The fulfillment of the quantizing theorem [104] is a necessary condition (Widrow [105]) for the adoption of the simple pseudo-quantization noise model consisting in an additive noise source with predictable statistics at the output of an ideal quantizer. The injection of a proper dither can help the input signal satisfy that condition.

With a time-invariant DNL characteristic, a given input value is always affected by the same error and produces the same deterministic distortion. The dither can occasionally change the quantization region to which that input value is associated, thus changing the DNL and eliminating the deterministic nature of the distortion.

Particular applications such as [9] use averaging in conjunction with dithering to increase the resolution of the ADC but are effective only for slowly varying signals (similarly to oversampling in  $\Sigma$ - $\Delta$ ). A satisfactory analysis of the statistical theory of quantization is out of the scope of this thesis and can be found in [42, 59].
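The combination of dithering and averaging can be illustrated on a DC input lying between two quantization levels. The sketch below assumes an ideal mid-tread quantizer and a non-subtractive uniform dither of width $\Delta$; these choices and the function names are assumptions for illustration, not the scheme of [9].

```python
import random

def quantize(v, delta):
    """Ideal mid-tread quantizer with code width delta."""
    return delta * round(v / delta)

def averaged_reading(v, delta, n_avg, dither, rng):
    """Average of n_avg conversions of the same input, with or without dither."""
    total = 0.0
    for _ in range(n_avg):
        d = rng.uniform(-delta / 2, delta / 2) if dither else 0.0
        total += quantize(v + d, delta)
    return total / n_avg

rng = random.Random(1)
DELTA, V = 1.0, 0.3                       # input at 0.3 LSB above a level
plain = averaged_reading(V, DELTA, 20000, False, rng)
dithered = averaged_reading(V, DELTA, 20000, True, rng)
print(plain, dithered)
```

Without dither every conversion returns the same code, so averaging brings no benefit and the 0.3 LSB error remains. With dither the input occasionally crosses into the adjacent region, and the average of the codes approaches the true value.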

We focus on the LUT-based and post inversion methods that, as already specified in Chapter 2, are applicable to the more general compensation of nonlinearities in dynamic systems.

#### 3.4.2 Look Up Table based methods

The LUT-based compensation techniques have been widely applied to the correction of static and dynamic ADC errors. The method consists in using the output samples from the ADC as the index of a table which can have one or more dimensions. The entry value in the table is either added to or used to replace the current ADC output sample. LUT methods differ mainly in the indexing scheme with which the table index is generated and in the type of value stored in the table. Furthermore, LUT methods rely on other steps common to different compensation techniques, such as the selection of proper input "calibration" signals and the estimation algorithms and criteria already seen in Section 2.3. In the following we describe the two specific aspects of LUTs.

#### **Indexing scheme**

The indexing scheme specifies how the table index is generated using the output samples. Correction methods are based on static, state-space and phase-plane architectures, each of them requiring a proper indexing scheme that determines the size and structure of the LUTs. In the static indexing case, each output *b*-bit code x[n], or only a subset, is mapped to an index, requiring  $2^b$  table entries. Fewer than *b* bits can be used if the table contains correction values that are added only to a limited input range. This kind of method doesn't take into account dynamic errors, so it is limited to narrowband applications.

The state-space indexing takes into account the error dynamics using the current and a certain number of past values of the output code. If N previous samples are used, an N-dimensional state-space indexing is realized. If the full b-bit code is used for each dimension, the number of the table entries is  $2^{(N+1)b}$  [95]. It is clear that the memory requirements for the LUT grow exponentially with the memory span, so practical implementations are limited to medium resolutions and lags not greater than 2. To overcome the memory limitation, different approaches have been proposed based on the reduction of the size of the stored values, made by truncation [99] or with a more sophisticated bit mask selecting a subset of the b bits [64].
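The growth of the table size with the memory span can be made concrete with a small sketch. Here the index is built by concatenating the current code with the past ones, and the optional `keep` argument retains only the most significant bits of the past samples as a simple stand-in for the truncation approach; the function names and the concatenation order are assumptions, not the schemes of [95] or [99].

```python
def state_space_index(codes, b, keep=None):
    """codes = [x[n], x[n-1], ..., x[n-N]]; keep = MSBs retained for past samples."""
    keep = b if keep is None else keep
    mask = (1 << b) - 1
    idx = codes[0] & mask                          # current sample, full resolution
    for c in codes[1:]:
        idx = (idx << keep) | ((c & mask) >> (b - keep))
    return idx

def table_entries(b, n_past, keep=None):
    """Number of LUT entries addressed by the index above."""
    keep = b if keep is None else keep
    return 2 ** (b + n_past * keep)

print(table_entries(12, 1))            # full 12-bit codes, one past sample
print(table_entries(12, 1, keep=4))    # past sample truncated to its 4 MSBs
print(state_space_index([0b1010, 0b1100], 4, keep=2))
```

With full-resolution codes the count reduces to $2^{(N+1)b}$; truncating the past samples to `keep` bits brings it down to $2^{b + N \cdot keep}$ at the cost of a coarser description of the signal history.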

In the phase-plane indexing, the index is built using the present sample and an estimation of the signal slope [75]. N-dimensional phase-plane LUTs include the derivatives up to the N-th order [29]. The approximation of the derivative can be computed as a backward difference using the output samples, or with a differentiator filter, either digital or analog (in the latter case another ADC is needed to sample the analog derivative). The considerations on memory occupation are the same as for state-space LUTs.

#### Table values

The data inserted into each LUT register can be a replacement or a correction value. In the former case, the calibrated output value is stored into the LUT and is fed to the output when the actual output code (used as index) points to its location. In the latter, only the difference between the calibrated and the actual output is stored in the table. The calibrated output is obtained summing the actual output and the corresponding correction value stored in the LUT. Using correction values can improve memory occupancy at the expense of a slightly increased architectural complexity and power consumption.
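The two options can be compared on a toy static table. The calibrated values below are hypothetical numbers for a 3-bit converter, chosen only to show that both tables produce the same output while the correction table holds smaller values.

```python
# Hypothetical calibrated value for each 3-bit output code (index = actual code).
calibrated = [0, 1, 2, 4, 4, 5, 7, 7]

replacement_lut = list(calibrated)                                   # stores the output
correction_lut = [cal - code for code, cal in enumerate(calibrated)] # stores the error

def apply_replacement(code):
    return replacement_lut[code]

def apply_correction(code):
    return code + correction_lut[code]    # one extra adder in the data path

# Magnitude bits needed per entry (a sign bit is also needed for corrections).
bits_replacement = max(abs(v) for v in replacement_lut).bit_length()
bits_correction = max(abs(v) for v in correction_lut).bit_length()
print(correction_lut, bits_replacement, bits_correction)
```

The saving in word length per entry is what reduces memory occupancy, while the adder in `apply_correction` is the extra architectural cost mentioned above.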

#### 3.4.3 Post inversion methods

Many correction methods rely on the mathematical model of the ADC errors and its inverse. The application of such methods is limited to post-inversion when applied to ADC compensation, but the same mathematical principles also apply to the pre-distortion case (e.g. PA linearization). The inverse models are used in cascade after the ADC in order to obtain an overall system with improved performance in terms of linearity and thus resolution.

Some methods are based on the modeling of specific ADC error sources and non-linear contributions like the nonlinearity of the amplifier's open-loop gain, offset in the comparators and capacitors' mismatch [36, 21, 51]. A large part of ADC compensation methods are based on more general models, both for direct and inverse system modeling. The most common are the Volterra series [96] and its subsets (Memory Polynomial [74], Modified Generalized Memory Polynomial [90], Hammerstein model [89]), Wiener model [96], Chebyshev polynomials [8] and orthogonal polynomials [106]. The theory on inverse system estimation presented in Chapter 2 applies to the calibration techniques based on these linear-in-the-parameters models. In the following we apply digital calibration based on model inversion using a grey box approach: the target post-distorter is assumed to be a Volterra model but no specific information from the knowledge of the system response is used to prune the model. A backward iterative algorithm is used to reduce the model complexity while maximizing the post compensation performance.

# 3.5 ADC calibration techniques

The models we have seen in the previous section are used to represent and correct the converter static and dynamic errors. The estimation phase of the unknown model parameters or the LUT entries can be done once before ADC operation or continuously to track system variations. Depending on how the estimation and correction procedures are carried out, we can divide these techniques in foreground and background calibrations.

#### 3.5.1 Foreground calibration

Foreground calibration requires the interruption of the normal ADC operation for the model parameters estimation one or more times, depending on how fast the system parameters change. Offline calibration may require only one estimation phase when the system is robust to environmental variations or when multiple calibration coefficient sets are computed to take into account different operational conditions. Examples of foreground ADC calibration can be found in [41] and [24]. The main advantage of this technique is that the estimation phase can be done with a selected set of input test signals and the least squares method, without worrying about the convergence speed of iterative algorithms. On the other hand, it may not be possible to interrupt the ADC operation, or the system parameters may change too rapidly, requiring the use of background calibration.
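A foreground estimation step with least squares can be sketched on a toy problem. The example below assumes a memoryless converter model $x = s + 0.02 s^2 - 0.05 s^3$, a known sinusoidal test signal and a third-order polynomial post-inverse; the distortion coefficients, the helper names and the use of the normal equations are all illustrative assumptions, far simpler than the models used in the thesis.

```python
import math

def solve(a, b):
    """Solve a small linear system by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [bv] for row, bv in zip(a, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [mr - f * mc for mr, mc in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit_post_inverse(x, s, order):
    """Least-squares fit of s ~ sum_k c_k x^k, k = 1..order (linear in the c_k)."""
    phi = [[xi ** k for k in range(1, order + 1)] for xi in x]   # regressors
    gram = [[sum(r[i] * r[j] for r in phi) for j in range(order)]
            for i in range(order)]
    rhs = [sum(r[i] * si for r, si in zip(phi, s)) for i in range(order)]
    return solve(gram, rhs)                                      # normal equations

def rms(values):
    return math.sqrt(sum(v * v for v in values) / len(values))

s = [0.9 * math.sin(2 * math.pi * 7 * n / 256) for n in range(256)]  # test signal
x = [v + 0.02 * v ** 2 - 0.05 * v ** 3 for v in s]                   # distorted output
c = fit_post_inverse(x, s, 3)
y = [sum(ck * xi ** (k + 1) for k, ck in enumerate(c)) for xi in x]  # corrected output

rms_before = rms([xi - si for xi, si in zip(x, s)])
rms_after = rms([yi - si for yi, si in zip(y, s)])
print(c, rms_before, rms_after)
```

Since the post-inverse is linear in its coefficients, the estimate is obtained in one shot from the recorded calibration data, with no step size or convergence time to tune.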

#### 3.5.2 Background calibration

Background calibration is a continuous process that is carried out in parallel to the normal ADC operation. Many background calibration techniques in literature focus on the correction of static errors due to capacitors mismatch and finite amplifier gain. Typically, when addressing the correction of specific errors, these techniques belong to the architecture-based methods, and are therefore targeted at a particular ADC architecture. Among the background calibration techniques there are correlation-based, skip and fill and queue-based ones. The correlation-based techniques rely on statistical properties of the errors, such as the dithering methods described before or the random swapping of the capacitors in the MDAC proposed in [84]. The skip and fill technique [54, 73] consists in skipping the conversion of one sample and using that time slot to perform calibration. The missed sample is filled with a polynomial interpolation using leading and lagging samples. The queue-based methods [41, 32] applied to pipeline ADCs need two different clock domains, one for the S/H and one for the MDACs, with a significant increase in design complexity. The faster MDACs will have empty conversion cycles in which to perform calibration. To correct also frequency dependent errors (i.e. memory effects), background calibration using inverse dynamical models is used. The most common approach requires the use of a slower but more accurate ADC that acts as a reference channel, as shown in Fig.3.14. Trade-offs between convergence speed of the iterative algorithms (that



Figure 3.14. Reference channel based background calibration architecture

must be sufficient to track system variations) and correction accuracy have to be taken into account. Focusing on the adoption of LIP models, iterative algorithms for a linear estimation problem can be used. In Sect. 4.6 a background calibration technique is applied to a Time-Interleaved ADC using RLS algorithm to estimate the parameters of a linear filter bank.
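The reference-channel idea can be sketched with a simple iterative update. The example below uses the LMS algorithm instead of RLS purely for brevity, a memoryless polynomial corrector as the LIP model, and an ideal reference ADC that delivers one sample every `m_ref` samples of the main channel; the distortion model, step size and names are assumptions for illustration.

```python
import random

def basis(x, order):
    """Regressors of a memoryless polynomial corrector (linear in the parameters)."""
    return [x ** k for k in range(1, order + 1)]

def lms_calibrate(x, ref, order, mu, m_ref):
    """LMS update of the corrector using the slower reference channel."""
    w = [1.0] + [0.0] * (order - 1)               # start from the identity corrector
    for n in range(0, len(x), m_ref):             # reference available every m_ref samples
        phi = basis(x[n], order)
        err = ref[n] - sum(wi * pi for wi, pi in zip(w, phi))
        w = [wi + mu * err * pi for wi, pi in zip(w, phi)]
    return w

def rms_error(x, s, w):
    sq = [(si - sum(wi * pi for wi, pi in zip(w, basis(xi, len(w))))) ** 2
          for xi, si in zip(x, s)]
    return (sum(sq) / len(sq)) ** 0.5

rng = random.Random(0)
s = [rng.uniform(-1, 1) for _ in range(40000)]    # input seen by both channels
x = [v - 0.05 * v ** 3 for v in s]                # main ADC with cubic distortion
w = lms_calibrate(x, s, order=3, mu=0.1, m_ref=4)

before = rms_error(x, s, [1.0, 0.0, 0.0])
after = rms_error(x, s, w)
print(w, before, after)
```

The step size `mu` embodies the trade-off mentioned above: a larger value tracks parameter variations faster but leaves a noisier steady-state estimate.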

# 3.6 S/H digital calibration in 45nm CMOS process

Sample & Hold stages are the front-end of most analog-to-digital converters and many S/H implementations are based on switched capacitor (SC) circuit techniques. SC circuits are discrete-time, continuous-amplitude functional blocks which store information as charge held in capacitors, and use switches and operational amplifiers to manipulate this charge and perform signal processing functions, such as summation, integration and filtering. Due to errors in the physical implementation such as switch and amplifier nonlinearities and capacitor mismatch, or to particular operational conditions like incomplete settling and slew rate limitation, the performance of the S/H can be a limiting factor for the linearity of the overall system, especially at high sampling frequencies.

Discrete-time Volterra filters can be used as post distorters to improve the linearity of such mixed-signal circuits. Due to exponential growth of the number of parameters to be estimated, computational costs could be unsustainable. Subsets of Volterra kernels with a reduced number of parameters can be used to model specific nonlinearities, for instance the nonlinear switch on-resistance [77, 86]. [77] achieves a performance improvement of more than 20dB using hundreds of coefficients, whereas without complexity reduction the number of parameters would have run in the thousands. [86] uses the p-th order Volterra inverse and develops a model which reduces complexity (to tens of parameters) with a linearity improvement of about 10dB, up to close to 30dB for larger models and using inherently more linear analog circuit techniques such as bootstrap switches. These techniques are tailored for a specific model of distortion and may thus be less effective in a more general case in which switch nonlinearities, amplifier nonlinearities and incomplete signal settlings are present, which is often the case in low-power high-speed S/H stages. [86] also shows that some circuit techniques can be less amenable to calibration, as switches implemented using transmission gates achieve lower linearity improvement.

In the following we describe the application of digital calibration to a S/H in a 45 nm CMOS STMicroelectronics process using direct estimation of the inverse Volterra model. We show that Volterra kernels of limited complexity (short memory) which use a specific lag for each order of nonlinearity, after careful pruning of the model to eliminate the parameters which add little to overall performance, achieve robust performance improvement.

#### 3.6.1 Switched Capacitors S/H

The simulated circuit is a fully-differential flipped-around S/H shown in Fig.3.15a, that uses Correlated Double Sampling (CDS) and Bottom Plate Sampling (BPS) techniques, with a folded cascode amplifier and transmission gate switches.



Figure 3.15. Flipped-around fully differential Sample and Hold scheme (a) and associated clocking scheme (b)

During the Sample Phase  $(\phi_1)$ , the input signal charges the capacitors  $C_H$ , which are closed towards the "virtual" ground of the closed loop amplifier. The capacitor voltage is equal to the input, as long as the sampler's bandwidth, limited by the switches' resistance, is higher than the signal bandwidth. Before the end of  $\phi_1$ , the switches  $\phi_{1e}$  open, leaving the capacitors floating. This technique eliminates charge injection and clock feedthrough. When  $\phi_1$  is over, the input signal at the switching-off instant is held on the capacitor. During the Hold Phase  $(\phi_2)$ , the voltage stored in the capacitor is fed to the output. If the operational amplifier is ideal, the output voltage is equal to the voltage on the capacitor, which is equal to the input voltage at the end of  $\phi_1$ . The output voltage is ready at the end of  $\phi_2$ , so the S/H output introduces a  $\frac{T_s}{2}$  delay.

The S/H has a clock frequency  $f_s = 50$ MHz (clock period  $T_s = 20$ ns) and shows incomplete settling at this speed. No analog techniques to improve the switches' accuracy have been employed, such as clock voltage doublers or input dependent bootstrapping: simple Transmission Gate (TG) switches are used. The fully differential amplifier has an open-loop gain of 36 dB, a gain-bandwidth product of 250 MHz and consumes 30 µW. The CMFB is also implemented using a switched capacitor architecture.

#### 3.6.2 Volterra parameters estimation

The symmetrical truncated Volterra series described in 2.1.1 is adopted to model the S/H post-inverse system.

$$y[n] = \sum_{k=1}^{K} y_k[n] \qquad y_k[n] = \sum_{q_1=0}^{L_k-1} \cdots \sum_{q_k=q_{k-1}}^{L_k-1} h_k[q_1, \dots, q_k] \cdot \prod_{i=1}^k x[n-q_i] \qquad (3.13)$$

Different configurations of memory lags and maximum order have been tested, with different memory lags for each kernel order  $(L_i \text{ not necessarily equal to } L_j)$ . Since the circuit is fully differential, even order distortions are usually negligible, thus odd order kernels are mainly used in the following. Monte Carlo simulations including mismatch have shown that a few even-order terms (including DC offset) suffice.
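As a minimal sketch of how Eq. (3.13) can be evaluated, the Python fragment below enumerates the non-redundant lag tuples  $q_1 \le \dots \le q_k$  of a symmetric kernel and sums the corresponding products. The function names, the dictionary layout of the kernels and the zero padding before the first sample are illustrative assumptions, not part of the simulation flow used in this work.

```python
from itertools import combinations_with_replacement
from math import prod


def kernel_indices(order, taps):
    """Non-redundant lag tuples q1 <= ... <= qk of a symmetric kernel."""
    return list(combinations_with_replacement(range(taps), order))


def volterra_output(x, n, kernels):
    """Evaluate y[n] as in Eq. (3.13).

    kernels maps the order k to a dict {(q1, ..., qk): h_k[q1, ..., qk]}.
    Samples before the start of x are treated as zero (an assumption).
    """
    def sample(i):
        return x[i] if 0 <= i < len(x) else 0.0

    return sum(h * prod(sample(n - q) for q in lags)
               for kernel in kernels.values()
               for lags, h in kernel.items())
```

For instance, a third-order kernel with two taps has only four free coefficients, `kernel_indices(3, 2)`, instead of the eight of the full cubic kernel.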

A set of 30 input-output signals have been simulated, and then "sampled" at 50MS/s in the Cadence simulation environment (remembering the in-out half clock cycle delay). Coherent sampling is adopted, as described in Subsection 2.3.3, for the selection of 30 sinusoids in the first Nyquist band:

$$s_{in,j}(t) = A\sin(2\pi f_j t) \quad \text{with } f_j = \frac{j}{64}f_s, \quad j \in \{1, \dots, 31\},\ j \neq 16 \qquad (3.14)$$

The frequency  $\frac{f_s}{4}$  has not been used because, due to aliasing, all of its odd harmonics fall on  $\frac{f_s}{4}$  itself and all of its even harmonics fall on DC or  $\frac{f_s}{2}$ , so that distortion cannot be separated from the fundamental.
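The aliasing argument can be checked with a few lines of Python. The sketch below folds the harmonics of a coherent tone  $f_j = \frac{j}{64}f_s$  back into the first Nyquist band; the helper name `aliased_bin` is hypothetical.

```python
def aliased_bin(tone_bin, harmonic, n_bins=64):
    """Bin in [0, n_bins/2] where a harmonic of a coherent tone lands."""
    k = (tone_bin * harmonic) % n_bins
    return min(k, n_bins - k)


# The 30 calibration tones of Eq. (3.14): j = 1..31 excluding j = 16.
tones = [j for j in range(1, 32) if j != 16]

# Harmonics of f_s/4 (j = 16): odd ones fold onto bin 16, even ones onto 0 or 32.
odd_images = {aliased_bin(16, h) for h in range(1, 20, 2)}
even_images = {aliased_bin(16, h) for h in range(2, 20, 2)}
```

For any other tone in the set, for example  $j = 5$ , the third harmonic lands on a distinct bin (`aliased_bin(5, 3)` is 15) and can be told apart from the fundamental.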

Among the 30 waveforms, 22 are used for parameters' estimation and the other 8 for out-of-sample validation of the robustness of the technique: linearity improvement is similar for in-sample and out-of-sample tones, implying robust performance improvement also for signals not included in the calibration set. Least Squares estimation of the model parameters has been performed in the time domain using all the 22 input-output waveform data points as a batch.
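The batch estimation step can be sketched as follows, assuming NumPy is available. The regression matrix holds one column per non-redundant product of delayed samples of the distorted output, and the post-inverse coefficients are those that best map it onto the reference input in the least-squares sense. Function names, the `lags` dictionary (order mapped to number of taps) and the synthetic data in the usage note are illustrative assumptions; this is not the code used to produce the results of this Section.

```python
import itertools

import numpy as np


def regressors(x, lags):
    """Regression matrix for a symmetric Volterra model.

    lags maps each kernel order k to its number of taps (hypothetical layout).
    Returns the matrix and the sample indices n for which all taps exist.
    """
    x = np.asarray(x, dtype=float)
    max_taps = max(lags.values())
    n = np.arange(max_taps - 1, len(x))
    cols = []
    for order, taps in sorted(lags.items()):
        for q in itertools.combinations_with_replacement(range(taps), order):
            col = np.ones(len(n))
            for qi in q:
                col = col * x[n - qi]
            cols.append(col)
    return np.column_stack(cols), n


def estimate_post_inverse(distorted, reference, lags):
    """Least-squares fit of the post-inverse coefficients in the time domain."""
    a, n = regressors(distorted, lags)
    target = np.asarray(reference, dtype=float)[n]
    coeffs, *_ = np.linalg.lstsq(a, target, rcond=None)
    return coeffs
```

In a batch setting, the waveforms of all the estimation tones would be stacked row-wise into a single regression matrix before calling `np.linalg.lstsq`.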

#### 3.6.3 Simulation results

In the following, simulations are reported with a lag structure  $[L_1, L_3, L_5, L_7]$ , implying that the lag of the kernel of order k is  $L_k$ . To account for mismatch effects in Monte Carlo simulations, which create even-order distortions, terms of order 0 (offset), 2 and 4 have been added, with  $L_0 = L_2 = L_4 = 0$ : a constant term and two terms  $x^2[n]$  and  $x^4[n]$  are sufficient for correction, at a computational cost of 3 additional multiplications. More complex even-order kernels do not improve linearity further. Linearity improvement is defined as the difference between the minima of the SNDR after and before calibration in a specified band. Gain flatness is the variation of the linear gain in the same band. The results improve on those already published in [19] thanks to a delay added to the reference signals in the estimation phase, as explained in Subsection 2.4, which does not add complexity to the correction filter.



Figure 3.16. Comparison between pre and post calibration SNDR and Gain, using a [20, 2, 2, 2] lag configuration

Fig.3.16 shows that an improvement of 20dB can be obtained from DC to 80% of the Nyquist band with lags [20, 2, 2, 2]. There are 87 coefficients to estimate, and gain flatness in the band of interest is below 0.01dB. Out-of-sample frequencies are shown using markers. Fig. 3.17 shows the same figure for lags [20, 4, 2, 2], with 113 free coefficients and a slightly higher SNDR improvement of 23.6dB. Simulations do not include noise, so that SNDR=-THD. Calibration cannot improve SNR, and noise would only increase the duration of the offline estimation phase.

Figs. 3.18 and 3.19 show the effect of pruning. The adopted method is a backward pruning technique based on an iterative algorithm that, starting from the largest number of parameters given by the initial lag structure, discards at each step the parameter that impacts linearity the least. The quality criterion that drives the pruning algorithm is the post calibration SNDR. Removing a few parameters improves both linearity and computational cost, and a large reduction in the number of parameters can be achieved while preserving the linearity enhancement obtained without pruning.
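The pruning loop can be sketched as a greedy backward elimination. In the fragment below the quality criterion is passed in as a callable `score`, which in this context would re-estimate the model on the retained terms and return the post calibration SNDR; here it is left abstract, and all names are hypothetical.

```python
def backward_prune(terms, score, min_terms=1):
    """Greedy backward elimination of model terms.

    terms: list of hashable term identifiers (e.g. kernel lag tuples).
    score: callable mapping a tuple of terms to a quality figure, higher
           is better (assumed to re-fit the model on those terms).
    Returns a list of (remaining_terms, score) pairs, one per step.
    """
    current = list(terms)
    history = [(tuple(current), score(tuple(current)))]
    while len(current) > min_terms:
        # Try removing each term in turn and keep the least harmful removal.
        candidates = [(score(tuple(t for t in current if t != r)), r)
                      for r in current]
        best_score, removed = max(candidates, key=lambda c: c[0])
        current.remove(removed)
        history.append((tuple(current), best_score))
    return history
```

Each step costs one evaluation of `score` per surviving term, so the total cost grows quadratically with the size of the initial model; the returned history allows the smallest model meeting a given target to be selected afterwards.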



Figure 3.17. Comparison between pre and post calibration SNDR and Gain, using a [15, 4, 2, 2] lag configuration



Figure 3.18. SNDR improvement and gain variations with pruning, starting from a [20, 2, 2, 2] lag configuration

Fig. 3.18 shows that, starting from lags [20, 2, 2, 2], the number of parameters can be reduced down to 39 while keeping more than 12 dB of improvement. The peak linearity improvement is 25.7 dB with 70 parameters, and 24 dB are obtained with 58 parameters. Fig. 3.19 shows that, starting from lags [20, 4, 2, 2], a peak linearity gain of 26 dB can be achieved with 87 parameters, a 24 dB gain with 57 parameters and 12 dB with 37 parameters. Because the minimum number of coefficients for a given SNDR improvement varies with the initial lag structure, many simulations have been performed to achieve a given improvement with a minimal number of coefficients.



Figure 3.19. SNDR improvement and gain variations with pruning, starting from a [20, 4, 2, 2] lag configuration

Out-of-sample data have been used to test algorithm performance with signals not used in estimation. There is no significant difference between in-sample and out-of-sample frequencies. Temperature and voltage variations have also been tested. Offline calibration techniques need the calibrated system to be stable against operating conditions because parameters are kept constant after estimation. The uncalibrated S/H shows an SNDR variation of more than 3 dB between 7 and 47 °C. Temperature variations of  $\pm 10$  °C and supply voltage variations of  $\pm 1\%$  (12 mV) have little effect, especially for simpler models. Fig. 3.20 shows the post calibration SNDR of the typical case against the  $V_{DD}$  variations using a pruned model with 54 parameters. The typical SNDR improvement is 23 dB: at 99% and 101% of  $V_{DD}$  it is 4.5 dB and 2 dB less, respectively. Parameter sets optimized for different operating conditions may be stored in a Look-Up Table, increasing the operational range. Monte Carlo simulations show that a limited number of even-order correction terms (3 including offset) are sufficient.

Figure 3.20. Comparison between post calibration SNDR in case of supply voltage variations of  $\pm 1\%$  using a 54 parameters pruned model

# 3.7 ADC pipeline digital calibration in 45nm CMOS process

Calibration using Volterra models with iterative pruning, presented in the previous Section [19] for a sample and hold stage, can be extended to pipeline ADCs, and its performance advantage increases with the sampling frequency of the ADC. This approach achieves better linearity than other simplified Volterra models found in the literature, at comparable complexity. Volterra models are better suited for representing weakly non-linear systems with mild distortions. For this reason ADC front-end stages, such as SHAs [77, 86], can be more accurately represented with Volterra models, as they do not contain comparators, which produce heavily nonlinear behaviour. The correction of ADCs can therefore be expected to require more complex models, in terms of higher order rather than higher lags.

Model complexity is a limiting factor in the applicability of Volterra models. The literature on ADC calibration usually employs a different approach. Volterra kernels used for generic ADCs are based on a priori hypotheses on the structure of the kernels [77, 86, 91, 68] to reduce the number of parameters. In [68], a very compact model is used, as it is a second-order model of mixed products of the input and its derivative, approximated as a central difference. The memory span is thus limited to 1 lead and 1 lag samples, equivalently to a Volterra model with order and lag equal to 2. This model can be extended to higher orders. In [77] a simplified model is obtained by forcing  $h_k[q_1, \ldots, q_k] = 0$  for  $q_2, \ldots, q_k \neq 0$  in Eq. 3.13, i.e. the term of order k is the product of a polynomial memoryless term of order k - 1 and a linear term with memory. This model is also used in [67], though in the frequency domain, as described in Subsection III.A in that paper. In [91], two models are used – memory polynomial [74] and modified generalized memory polynomial [90] – to compensate a commercial ADC, reaching, however, a limited 10 dB gain in SFDR.

Other approaches [89, 57] use pruned models such as Hammerstein and Wiener models. We show that these approaches may be less effective, and sometimes ineffective, for the calibration of high-speed pipeline ADCs.

In the following, we apply digital calibration based on Volterra filtering to a pipeline ADC with 1.5-bit MDACs, after a radix-based calibration [56] used to correct errors such as finite gain and capacitor mismatch. Only the output of the pipeline ADC (after conventional calibration) is used in our non-linear calibration technique. This makes the technique suitable for calibrating off-the-shelf components, as it does not require modifications to the ADC hardware [91].

Performance is assessed with respect to the clock frequency and the number of stages. The iterative pruning technique shown in the previous Section is applied and improvements are discussed.

#### 3.7.1 Switched Capacitors Pipeline ADC

The pipeline ADC has a S/H stage followed by 16 1.5-bit MDACs, simulated in the CMOS 45 nm STMicroelectronics process with a 1.2 V power supply. The amplifier is a two-stage Miller-compensated operational transconductance amplifier (OTA) with a telescopic cascode as first stage. Each fully differential amplifier has a CMFB with resistive partitioning and a diode-loaded differential pair. The reference voltage is  $1 V_{pp}$  differential and it is buffered using one buffer per stage.

Both the S/H and the MDAC stages are implemented using the switched capacitor technique. The S/H topology is the same as in Fig. 3.15a. All the switches are transmission gates. Each stage is composed of:

- the 2-bit sub-ADC, consisting of two dynamic comparators
- the sub-DAC, realized as a simple multiplexer that converts the 2-bit representation into the three level signal D
- the summing node and the multiplier implemented as a whole

The dynamic comparator topology is shown in Fig. 3.21. The clock signal controls both the differential pairs and the output buffer with only one phase. When the clock signal is low, the differential pairs are disabled and both outputs are reset. When it is high, the differential pairs are enabled and control the regenerative loop above them.



Figure 3.21. Latch comparator circuit

The circuit that produces twice the difference between the input data and D is represented in Fig. 3.22. The principle is similar to that of a S/H stage, but the digital circuits change the input-output relation in order to obtain the typical characteristic of a 1.5-bit MDAC.

Figure 3.22. Circuit that implements subtraction and multiplication of the input and the *D* signals

During  $\phi_1$ , the charge stored in the two capacitors equals  $2C_H V_{in}$ . During  $\phi_2$ , with an ideal operational amplifier, the charge stored in the two capacitors is  $C_H(V_{out} + DV_R)$ . Equating the two terms produces the desired relation:  $V_{out} = 2V_{in} - DV_R$ .

The digital circuits compare the input voltage with the thresholds of the two comparators  $\left(-\frac{V_R}{4} \text{ and } \frac{V_R}{4}\right)$  and select one of the three input reference voltages of the multiplexer  $\left(-V_R, 0 \text{ and } V_R\right)$  in order to obtain what we have called  $DV_R$ .
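The ideal behaviour of the stage can be summarised in a short behavioural sketch: each stage decides  $D \in \{-1, 0, 1\}$  from the two thresholds and passes the residue  $2V_{in} - DV_R$  to the next one. The function names and the normalisation  $V_R = 1$  are assumptions made for illustration; non-idealities are deliberately left out.

```python
def mdac_stage(v_in, v_ref=1.0):
    """Ideal 1.5-bit stage: return (D, residue) with residue = 2*v_in - D*v_ref."""
    if v_in > v_ref / 4:
        d = 1
    elif v_in < -v_ref / 4:
        d = -1
    else:
        d = 0
    return d, 2.0 * v_in - d * v_ref


def pipeline_convert(v_in, n_stages=16, v_ref=1.0):
    """Cascade n_stages ideal stages; return the digits and the last residue."""
    digits, v = [], v_in
    for _ in range(n_stages):
        d, v = mdac_stage(v, v_ref)
        digits.append(d)
    return digits, v


def reconstruct(digits, v_ref=1.0):
    """Digital estimate of the input: sum over k of D_k * v_ref / 2^k, k = 1..N."""
    return sum(d * v_ref / 2 ** (k + 1) for k, d in enumerate(digits))
```

Inverting the stage relation gives  $V_{in} = \sum_{k=1}^{N} D_k V_R 2^{-k} + 2^{-N} V_{res}$ , so the reconstruction error of the ideal pipeline is the final residue scaled by  $2^{-N}$ .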

Circuit non-idealities such as opamp finite gain and input capacitance and capacitor mismatch produce linear static errors in the gain of the MDAC. Switch and amplifier nonlinearities introduce more complex nonlinear errors that can show memory effects. If the signals in the system are not perfectly settled, residual memory effects can arise, in which outputs partially depend upon previous samples. This generally occurs because capacitors do not discharge completely, and their final value depends both on the initial condition and on the final state which would be reached if enough time were available. This behavior becomes more apparent as the clock frequency increases.

#### 3.7.2 Simulation results

The methodology for the selection of the post inverse model, the design of the input data set and the estimation of the parameters is the same adopted for the sample and hold in the previous Section. The ADC's SNDR has been defined as that of the tone from DC to 80% of the Nyquist frequency with the highest distortion. All the 30 frequencies are considered: if the model overfits the data, out-of-sample tones have lower SNDR. The pipeline was originally designed to work with a clock frequency of 50 MHz, but thanks to digital calibration it has been possible to push it up to 125 MHz. Power consumption does not change appreciably with the clock frequency.



Figure 3.23. SNDR improvement against nominal ADC resolution using a lag structure [30, 4, 2, 2, 1, 1, 1, 0, 0, 0].

Fig. 3.23 shows the SNDR improvement (in ENOB) against nominal ADC resolution and sampling period, using a lag structure [30, 4, 2, 2, 1, 1, 1, 0, 0, 0] for odd orders from 1 to 19. The number of parameters is 162 without pruning. The nominal resolution of the pipeline is the number of MDAC stages plus 1. The Volterra model has been used to simulate the improvement both of the S/H stage alone (assuming an ideal ADC) and of the whole pipeline ADC. Simulations with 8 ns sampling period show that the ADC has about 9 bits of ENOB before calibration and close to 11.5 after. The S/H's ENOB is 10.5 bits and reaches 14 bits after calibration. A memoryless polynomial model with odd-order kernels from 3 to 19 has also been simulated: it improves linearity by 0.5 bit at 16 and 12 ns of clock period, but it has no effect at 8 ns (125 MS/s). The effects of pruning are shown in Fig. 3.24 for the three sampling frequencies, starting from the lag structure [30, 4, 2, 2, 1, 1, 1, 0, 0, 0]. Pruning initially improves linearity, and reduces model complexity by a factor of about 2.
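The parameter counts quoted for the unpruned models can be cross-checked with a short calculation. The sketch below assumes that a lag  $L_k$  corresponds to  $L_k + 1$  taps (delays from 0 to  $L_k$ ), so that a symmetric kernel of order k has  $\binom{L_k + k}{k}$  non-redundant coefficients; this interpretation is an assumption, chosen because it reproduces the 162 coefficients of the lag structure above and the 113 coefficients of the [20, 4, 2, 2] S/H model. Function names are hypothetical.

```python
from math import comb


def kernel_size(order, lag):
    """Non-redundant coefficients of a symmetric kernel with lag + 1 taps."""
    taps = lag + 1
    return comb(taps + order - 1, order)


def model_size(lags, orders):
    """Total coefficient count for one lag per kernel order."""
    return sum(kernel_size(k, lag) for k, lag in zip(orders, lags))


adc_model = model_size([30, 4, 2, 2, 1, 1, 1, 0, 0, 0], range(1, 20, 2))
sh_model = model_size([20, 4, 2, 2], [1, 3, 5, 7])
```

The same formula shows why high orders are only affordable with very short memory: a kernel of order 19 with zero lag contributes a single coefficient, while adding one lag to it would raise its count to 20.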

The models in [77, 68] and in [91, 74, 90] have been used to calibrate our 8 ns sampling time data set. Table 3.3 reports the best results we have found for each algorithm.

The model in [68] is simple but not effective. The MP model in [91] has limited effectiveness (about 0.5 bit peak improvement), with a low parameter count. The MGMP model is marginally better, but more complex. The model in [77] is more effective, yielding a maximum improvement of about 1.2 bits with 205 coefficients, and about 0.9 bit with 21 coefficients; its ENOB improvement saturates at 1.2 bits. Our pruned Volterra model achieves performance improvements larger than 1.2 bits, up to 2.5 bits, at a cost of about 40 to 101 parameters.
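To relate the improvements expressed in bits to those expressed in dB in the previous Section, the standard conversion between SNDR and ENOB for a full-scale sinusoid can be used (a textbook relation, recalled here only for convenience):

$$\mathrm{ENOB} = \frac{\mathrm{SNDR}_{\mathrm{dB}} - 1.76}{6.02} \qquad \Rightarrow \qquad \Delta\mathrm{ENOB} = \frac{\Delta\mathrm{SNDR}_{\mathrm{dB}}}{6.02}$$

Under this relation an improvement of 2.5 bits corresponds to about 15 dB of SNDR, and the 1.2 bits at which the model of [77] saturates correspond to about 7.2 dB.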



Figure 3.24. SNDR improvement against number of coefficients using a lag structure [30, 4, 2, 2, 1, 1, 1, 0, 0, 0].

| Reference   | Max Order                                       | Max Lag | Complexity | ΔENOB (bit) |
|-------------|-------------------------------------------------|---------|------------|-------------|
| [68]        | 3–19                                            | -       | -          | 0           |
| [77]        | 5                                               | 3       | 21         | 0.9         |
|             | 9                                               | 20      | 205        | 1.2         |
| [91] (MP)   | 5                                               | 2       | 6          | 0.5         |
| [91] (MGMP) | 9                                               | 4       | 20         | 0.4         |
|             | 11                                              | 5       | 30         | 0.7         |
| This work   | Pruned from the model in Fig. 3.24 (8 ns curve) |         | 53         | 1.5         |
|             |                                                 |         | 72         | 2           |
|             |                                                 |         | 101        | 2.5         |

Table 3.3. Linearity improvement and complexity for various models

# 3.8 Conclusions

In this chapter an offline post compensation technique based on the Volterra series has been applied to calibrate a S/H stage and a pipeline ADC. An iterative pruning algorithm has been used to reduce the computational complexity of the model, demonstrating that a slightly better performance is reached with fewer parameters than the complete model. In the S/H, improvements of 24 dB are obtained using 57 model parameters after pruning. Moreover, it is possible to enhance performance for pipeline ADCs driven at much higher sampling frequencies than the nominal one, as the Volterra model can correct for the effects of the non-linear dynamics of the circuits. The performance improvement is in fact particularly significant for the largest simulated sampling frequency of 125 MS/s.

# Chapter 4

# Time-Interleaved ADC calibration using filter banks

Widespread applications such as direct sampling receivers, radar and instrumentation require analog-to-digital converters with both high speed and high linearity. Pushing ADC technology to the limit may not be sufficient, given the even greater challenges in IC design brought by scaling down to nanometer technology nodes. One way to meet the combined speed-resolution goal and to overcome the technology limit is to exploit parallelism: Time-Interleaved ADCs achieve high conversion rates by interleaving samples coming from multiple slow and accurate ADCs connected in parallel. An M-channel architecture produces a digital output sampled at M times the sampling frequency of the single ADC. From a theoretical point of view, with identical converters on each channel and a perfect clocking scheme, there is no limit to increasing M (however, the analog bandwidth of each ADC must be greater than or equal to the full input bandwidth). In practice, gain, bandwidth and timing mismatches between the channels produce distortions in the reconstructed output, limiting the effective number of bits of the overall converter. For this reason calibration techniques are mandatory to fully exploit the potential of interleaving. While analog calibration architectures are resource consuming in terms of area and power [83], digital ones become ever cheaper thanks to the increasing component density and lower power supply. Different calibration techniques have been proposed that use additional circuitry or known input signal properties to estimate these errors and correct them using digital processing [44, 55, 103].

In this chapter two approaches for the calibration of a 4-channel TI-ADC are presented. First, the correction method based on perfect reconstruction (PR) filter banks is described, and a closed-form solution for the 4-channel architecture is derived and demonstrated with behavioral simulations. Second, a background calibration technique using cyclo-stationary filter banks along with fixed-point complexity reduction methods is presented. In Section 4.1 the general architecture of a TI-ADC and the behavioral model based on analysis-synthesis filters are described. Section 4.2 introduces the simple problem of finding perfect reconstruction filters for a 2-channel architecture, while in Section 4.3 the filters for the 4-channel TI-ADC are calculated in closed form. In Section 4.4 the first-order Taylor approximation of the PR filters is derived, and in Section 4.5 the exact and approximated forms are validated by simulations. In Section 4.6 a background calibration technique exploiting cyclo-stationary filter banks is described, complexity reduction is carried out both in the adopted models and in the filter implementation, and convergence speed versus linearity is discussed. Section 4.7 concludes.

# 4.1 TI-ADC Architecture

An *M*-channel Time-Interleaved ADC architecture is depicted in Fig. 4.1 and the related clocking scheme is represented in Fig. 4.2. The analog input x(t) is sampled by each sub-ADC at a rate of  $f_s/M$  with a delay of one clock period  $T_s$  between adjacent channels. At the output, a multiplexer merges the sub-ADC outputs into a single output running at  $f_s$ .
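The effect of channel mismatches on such an architecture can be illustrated with a small behavioural sketch. Each output sample n is taken by channel n mod M, with its own gain error and timing skew (in units of  $T_s$ ); for a coherent tone, the mismatches act as a periodic modulation with period M and generate spurs at  $k\frac{f_s}{M} \pm f_{in}$ . All names and the mismatch values in the usage note are illustrative assumptions.

```python
import cmath
import math


def ti_adc_sample(freq_bin, n_samples, gains, skews):
    """Sample a unit sine with an M-channel TI-ADC; skews are in units of T_s."""
    m = len(gains)
    out = []
    for n in range(n_samples):
        ch = n % m  # channel that converts sample n
        t = n + skews[ch]
        out.append((1 + gains[ch])
                   * math.sin(2 * math.pi * freq_bin * t / n_samples))
    return out


def dft_mag(x):
    """Normalised DFT magnitude over bins 0..N/2 (direct O(N^2) evaluation)."""
    n = len(x)
    return [abs(sum(x[i] * cmath.exp(-2j * math.pi * k * i / n)
                    for i in range(n))) / n
            for k in range(n // 2 + 1)]


def spur_bins(freq_bin, n_samples, m):
    """Bins of the interleaving spurs k*fs/M +/- f_in folded into [0, N/2]."""
    bins = set()
    for k in range(1, m):
        for b in (k * n_samples // m + freq_bin, k * n_samples // m - freq_bin):
            b %= n_samples
            bins.add(min(b, n_samples - b))
    return sorted(bins)
```

With 64 samples, a tone on bin 5 and four channels, `spur_bins(5, 64, 4)` gives bins 11, 21 and 27; calling `dft_mag(ti_adc_sample(5, 64, [0, 0.01, -0.01, 0.005], [0, 0.01, -0.02, 0.005]))` shows energy on exactly those bins, while matched channels leave them empty.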



Figure 4.1. M-Channel TI-ADC architecture



Figure 4.2. *M*-Channel TI-ADC clocking scheme

Mathematical models of the architecture in Fig. 4.1 are well known in the literature [102]. A practical representation consists in describing the input-output signals in the frequency domain using digital modeling of the non-idealities of the analog section (i.e. gain, bandwidth, timing skew). To better understand the mathematical representation issues, the simple 2-channel architecture [82] is described first, introducing the notation used in the 4-channel reconstruction problem.

# 4.2 2-channel TI-ADC

#### 4.2.1 Ideal reconstruction



Figure 4.3. 2 Channel TI-ADC ideal signal reconstruction

The signals on the upper and lower branch at the dotted interface of figure 4.3 are:

$$X_0(e^{j\omega}) = \frac{1}{2} \left[ X(e^{j\frac{\omega}{2}}) + X(e^{j\frac{\omega}{2} - j\pi}) \right]$$
(4.1)

$$X_1(e^{j\omega}) = \frac{1}{2} \left[ X(e^{j\frac{\omega}{2}})e^{-j\frac{\omega}{2}} + X(e^{j\frac{\omega}{2}-j\pi})e^{-j\frac{\omega}{2}-j\pi} \right]$$
(4.2)

Before summing up, the signals become:

$$Y_0(e^{j\omega}) = \frac{1}{2} \left[ X(e^{j\omega}) + X(e^{j\omega-j\pi}) \right] e^{-j\omega}$$
(4.3)

$$Y_1(e^{j\omega}) = \frac{1}{2} \left[ X(e^{j\omega})e^{-j\omega} + X(e^{j\omega-j\pi})e^{-j\omega-j\pi} \right]$$
(4.4)

The output of the ideal TI-ADC is the sum of these two signals:

$$\begin{aligned} Y(e^{j\omega}) &= \frac{1}{2} \left[ 2X(e^{j\omega}) + X(e^{j\omega-j\pi}) + X(e^{j\omega-j\pi})e^{-j\pi} \right] e^{-j\omega} \\ &= \frac{1}{2} \left[ 2X(e^{j\omega}) + X(e^{j\omega-j\pi}) - X(e^{j\omega-j\pi}) \right] e^{-j\omega} \\ &= X(e^{j\omega})e^{-j\omega} \end{aligned}$$
(4.5)

Thanks to ideal architecture symmetry the terms containing aliasing are cancelled.

#### 4.2.2 Introducing time skew on branches: non integer delay

In a real environment the presence of a static time skew error in each branch must be considered. If the delay  $\Delta T_s$  is an integer multiple of the sampling period (i.e.  $\Delta$ is an integer) its transfer function is easily represented in the DFT periodic domain:

$$H(e^{j\omega}) = e^{-j\omega\Delta} \tag{4.6}$$

The periodicity of this expression in the frequency domain can be shown:

$$H(e^{j(\omega+2\pi)}) = e^{-j(\omega+2\pi)\Delta} = e^{-j\omega\Delta}e^{-j2\pi\Delta} = e^{-j\omega\Delta} \cdot 1 = H(e^{j\omega})$$
(4.7)

If  $\Delta$  has a non-integer value, the transfer function as written no longer satisfies the  $2\pi$  periodicity constraint in frequency, so the domain over which it is defined has to be stated explicitly and the function extended periodically outside it. This must be taken into account when performing frequency translations (typical in multi-rate processing).

$$H(e^{j\omega}) = e^{-j\omega\Delta} \qquad -\pi < \omega < \pi \tag{4.8}$$

The phase of the delay transfer function is represented in figure 4.4.



Figure 4.4. Phase response of delay element transfer function
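The need for an explicit periodic domain can be seen numerically. The sketch below wraps the frequency into the principal interval before applying Eq. (4.8), which makes the response  $2\pi$ -periodic for any  $\Delta$ , whereas the unwrapped exponential of Eq. (4.6) is periodic only for integer  $\Delta$ . The helper names are hypothetical.

```python
import cmath
import math


def wrap(omega):
    """Map omega to the principal interval [-pi, pi)."""
    return (omega + math.pi) % (2 * math.pi) - math.pi


def delay_response(omega, delta):
    """Eq. (4.8): exp(-j*omega*delta) on the principal interval, periodic outside."""
    return cmath.exp(-1j * wrap(omega) * delta)


def naive_response(omega, delta):
    """Eq. (4.6) applied without wrapping: not 2*pi-periodic for fractional delta."""
    return cmath.exp(-1j * omega * delta)
```

For  $\Delta = 0.25$  and  $\omega = 0.3$ , `delay_response` returns the same value at  $\omega$  and  $\omega + 2\pi$ , while `naive_response` differs by the factor  $e^{-j2\pi\Delta}$ . A shift by  $\pi$ , as used in the following subsections, is evaluated in the same way with `delay_response(omega - math.pi, delta)`.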

#### 4.2.3 Non-ideal reconstruction

If we consider a real TI-ADC, gain errors in each channel must be considered too. Offset errors can be neglected for ease of notation because their correction model is very simple. Gain and offset errors produce modulation effects that generate interleaving spurs at the output. Assuming that the channel responses are affected by gain errors and timing skews the following frequency domain transfer function is added in each branch as analysis filter.

$$H_i(e^{j\omega}) = (1+g_i)e^{-j\omega\Delta_i} \qquad -\pi < \omega < \pi \tag{4.9}$$

Due to the non-integer values of  $\Delta_i$  the  $2\pi$ -periodicity of the  $H_i(e^{j\omega})$  is obtained defining the function on a bounded but periodic domain.

Consider now the scheme represented in figure 4.5 with the synthesis filters  $F_i(e^{j\omega})$ : the signals on the upper and lower branch at the dotted interface are

$$X_{0}(e^{j\omega}) = \frac{1}{2} \left[ X(e^{j\frac{\omega}{2}}) H_{0}(e^{j\frac{\omega}{2}}) + X(e^{j\frac{\omega}{2}-j\pi}) H_{0}(e^{j\frac{\omega}{2}-j\pi}) \right]$$
(4.10)  

$$\begin{aligned} X_{1}(e^{j\omega}) &= \frac{1}{2} \left[ X(e^{j\frac{\omega}{2}}) H_{1}(e^{j\frac{\omega}{2}}) e^{-j\frac{\omega}{2}} + X(e^{j\frac{\omega}{2}-j\pi}) H_{1}(e^{j\frac{\omega}{2}-j\pi}) e^{-j\frac{\omega}{2}+j\pi} \right] \\ &= \frac{1}{2} \left[ X(e^{j\frac{\omega}{2}}) H_{1}(e^{j\frac{\omega}{2}}) - X(e^{j\frac{\omega}{2}-j\pi}) H_{1}(e^{j\frac{\omega}{2}-j\pi}) \right] e^{-j\frac{\omega}{2}} \end{aligned}$$
(4.11)



Figure 4.5. 2 Channel TI-ADC non-ideal signal reconstruction

The two signals at the summing node become:

$$Y_0(e^{j\omega}) = \frac{1}{2} \left[ X(e^{j\omega}) H_0(e^{j\omega}) + X(e^{j\omega-j\pi}) H_0(e^{j\omega-j\pi}) \right] F_0(e^{j\omega}) e^{-j\omega}$$
(4.12)

$$Y_1(e^{j\omega}) = \frac{1}{2} \left[ X(e^{j\omega}) H_1(e^{j\omega}) - X(e^{j\omega - j\pi}) H_1(e^{j\omega - j\pi}) \right] F_1(e^{j\omega}) e^{-j\omega}$$
(4.13)

The condition of perfect reconstruction up to the one-sample delay,  $Y(e^{j\omega}) = Y_0(e^{j\omega}) + Y_1(e^{j\omega}) = X(e^{j\omega})e^{-j\omega}$ , generates the two-equation system below:

$$\begin{cases} H_0(e^{j\omega})F_0(e^{j\omega}) + H_1(e^{j\omega})F_1(e^{j\omega}) = 2\\ H_0(e^{j\omega-j\pi})F_0(e^{j\omega}) - H_1(e^{j\omega-j\pi})F_1(e^{j\omega}) = 0 \end{cases}$$
(4.14)

The first equation of (4.14) represents an equalization condition while the second one is a non-aliasing condition. The timing skews produce spectral replicas due to aliasing.

#### 4.2.4 Reconstruction without equalization

If we neglect the first condition we can assume  $F_0(e^{j\omega}) = 1$  and solve the second equation:

$$F_1(e^{j\omega}) = \frac{H_0(e^{j\omega-j\pi})}{H_1(e^{j\omega-j\pi})}$$
(4.15)

The mathematical representation of  $H_i(e^{j\omega-j\pi})$  in the periodic DFT domain is:

$$H_i(e^{j\omega-j\pi}) = (1+g_i)e^{-j\omega\Delta_i + j\operatorname{sign}(\omega)\pi\Delta_i} \qquad -\pi < \omega < \pi \qquad (4.16)$$

The graphical representation of its phase is shown in figure 4.6. The expression of the reconstruction filter becomes

$$F_1(e^{j\omega}) = \frac{1+g_0}{1+g_1} e^{-j[\omega-\pi \operatorname{sign}(\omega)](\Delta_0 - \Delta_1)}$$
(4.17)

#### 4.2.5 Reconstruction with equalization

For ease of notation, set  $[\cdot]_i(e^{j\omega}) = [\cdot]_i$  and  $[\cdot]_i(e^{j\omega-j\pi}) = [\cdot]_i^{\pi}$ . We can write the equation system in matrix notation and solve it using Cramer's rule.

$$\begin{pmatrix} H_0 & H_1 \\ H_0^{\pi} & -H_1^{\pi} \end{pmatrix} \begin{pmatrix} F_0 \\ F_1 \end{pmatrix} = \begin{pmatrix} 2 \\ 0 \end{pmatrix}$$
(4.18)



Figure 4.6. Phase response of  $H(e^{j\omega-j\pi})$  transfer function

$$F_0 = \frac{2H_1^{\pi}}{H_0 H_1^{\pi} + H_0^{\pi} H_1} \qquad F_1 = \frac{2H_0^{\pi}}{H_0 H_1^{\pi} + H_0^{\pi} H_1}$$
(4.19)

Substituting the values of  $H_i$  in the expressions gives

$$F_{0} = \frac{2(1+g_{1})e^{-j\omega\Delta_{1}+j\pi\Delta_{1}\operatorname{sign}(\omega)}}{(1+g_{0})(1+g_{1})e^{-j\omega(\Delta_{0}+\Delta_{1})}(e^{j\pi\Delta_{1}\operatorname{sign}(\omega)}+e^{j\pi\Delta_{0}\operatorname{sign}(\omega)})}$$
(4.20)

$$F_1 = \frac{2(1+g_0)e^{-j\omega\Delta_0 + j\pi\Delta_0\,\mathrm{sign}(\omega)}}{(1+g_0)(1+g_1)e^{-j\omega(\Delta_0 + \Delta_1)}(e^{j\pi\Delta_1\,\mathrm{sign}(\omega)} + e^{j\pi\Delta_0\,\mathrm{sign}(\omega)})} \tag{4.21}$$

The expressions (4.20) and (4.21) can be reduced removing common factors:

$$F_0 = \frac{2e^{j\pi\Delta_1\operatorname{sign}(\omega)}}{(1+g_0)e^{-j\omega\Delta_0}(e^{j\pi\Delta_1\operatorname{sign}(\omega)} + e^{j\pi\Delta_0\operatorname{sign}(\omega)})}$$
(4.22)

$$F_1 = \frac{2e^{j\pi\Delta_0 \operatorname{sign}(\omega)}}{(1+g_1)e^{-j\omega\Delta_1}(e^{j\pi\Delta_1\operatorname{sign}(\omega)} + e^{j\pi\Delta_0\operatorname{sign}(\omega)})}$$
(4.23)

The sum of complex exponentials at the denominator can be rewritten using the identity (4.24)

$$e^{jx} + e^{jy} = 2\cos\left(\frac{x-y}{2}\right)e^{j(\frac{x+y}{2})}$$
(4.24)

The expressions become

$$F_0 = \frac{e^{j\omega\Delta_0} \cdot e^{j\pi\Delta_1 \operatorname{sign}(\omega)} \cdot e^{-j\frac{\pi(\Delta_1 + \Delta_0)}{2}\operatorname{sign}(\omega)}}{(1 + g_0)\cos\left[\frac{\pi(\Delta_1 - \Delta_0)}{2}\operatorname{sign}(\omega)\right]}$$
(4.25)

$$F_1 = \frac{e^{j\omega\Delta_1} \cdot e^{j\pi\Delta_0 \operatorname{sign}(\omega)} \cdot e^{-j\frac{\pi(\Delta_1 + \Delta_0)}{2}\operatorname{sign}(\omega)}}{(1+g_1)\cos\left[\frac{\pi(\Delta_1 - \Delta_0)}{2}\operatorname{sign}(\omega)\right]}$$
(4.26)

Further reductions can be done since the cosine is an even function

$$F_0 = \frac{e^{j\omega\Delta_0} \cdot e^{j\frac{\pi(\Delta_1 - \Delta_0)}{2}\operatorname{sign}(\omega)}}{(1 + g_0)\cos\left[\frac{\pi}{2}(\Delta_1 - \Delta_0)\right]}$$
(4.27)

$$F_{1} = \frac{e^{j\omega\Delta_{1}} \cdot e^{-j\frac{\pi(\Delta_{1}-\Delta_{0})}{2}\operatorname{sign}(\omega)}}{(1+g_{1})\cos\left[\frac{\pi}{2}(\Delta_{1}-\Delta_{0})\right]}$$
(4.28)
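The closed forms (4.27) and (4.28) can be checked numerically against the two conditions of system (4.14). The sketch below evaluates the analysis filters of Eq. (4.9) on the wrapped frequency axis and verifies both the equalization and the non-aliasing condition at a few frequencies; function names and mismatch values are illustrative assumptions.

```python
import cmath
import math


def wrap(omega):
    """Map omega to the principal interval [-pi, pi)."""
    return (omega + math.pi) % (2 * math.pi) - math.pi


def channel_response(omega, gain, skew):
    """Eq. (4.9): (1 + g) * exp(-j*omega*Delta), extended periodically."""
    return (1 + gain) * cmath.exp(-1j * wrap(omega) * skew)


def synthesis_filters(omega, g0, d0, g1, d1):
    """Eqs. (4.27)-(4.28) for omega in (-pi, pi), omega != 0."""
    s = math.copysign(1.0, omega)
    c = math.cos(math.pi * (d1 - d0) / 2)
    rot = cmath.exp(1j * math.pi * (d1 - d0) / 2 * s)
    f0 = cmath.exp(1j * omega * d0) * rot / ((1 + g0) * c)
    f1 = cmath.exp(1j * omega * d1) / rot / ((1 + g1) * c)
    return f0, f1


def pr_residuals(omega, g0, d0, g1, d1):
    """Residuals of the equalization and non-aliasing conditions of (4.14)."""
    f0, f1 = synthesis_filters(omega, g0, d0, g1, d1)
    h0, h1 = channel_response(omega, g0, d0), channel_response(omega, g1, d1)
    h0p = channel_response(omega - math.pi, g0, d0)
    h1p = channel_response(omega - math.pi, g1, d1)
    return h0 * f0 + h1 * f1 - 2, h0p * f0 - h1p * f1
```

Both residuals vanish to machine precision for frequencies of either sign, for example with `pr_residuals(0.4, 0.02, 0.03, -0.01, -0.05)`.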

# 4.3 4-channel TI-ADC Perfect Reconstruction Filters

In previous works the problem of finding a suitable filter basis to approximate the perfect reconstruction filters has been addressed [70], using numerical simulations to assess the one with better performance in terms of post-compensation linearity. In [81] the problem of finding synthesis filters for perfect reconstruction as a function of analysis filters is addressed for an M-channel architecture, but numerical methods are used and no closed-form solution is presented. In this Section the closed-form calculation for 4-channel TI-ADC PR filters is presented, when the analysis filters model gain and timing errors.



Figure 4.7. 4 Channel TI-ADC non-ideal signal reconstruction

Figure 4.7 shows the discrete-time equivalent architecture of a 4-channel TI-ADC where, for each channel  $i = \{0, 1, 2, 3\}$ ,  $H_i(e^{j\omega})$  is a frequency response that can model gain, bandwidth and timing errors [70] and  $F_i(e^{j\omega})$  are the reconstruction filters needed to fulfill the condition of perfect output reconstruction. Additional integer delays are added to align the output samples. The expressions of the signals after the downsamplers and before the summing node are:

$$X_i(e^{j\omega}) = \frac{1}{4} \sum_{k=0}^{3} X(e^{j\frac{\omega-2k\pi}{4}}) H_i(e^{j\frac{\omega-2k\pi}{4}}) e^{-ji(\frac{\omega-2k\pi}{4})}$$
(4.29)

$$Y_{i}(e^{j\omega}) = \frac{1}{4} \sum_{k=0}^{3} X\left[e^{j\left(\omega - \frac{2k\pi}{4}\right)}\right] H_{i}\left[e^{j\left(\omega - \frac{2k\pi}{4}\right)}\right] e^{ji\frac{k\pi}{2}} \cdot F_{i}(e^{j\omega})e^{-j3\omega}$$
(4.30)

Calibration consists in solving the system of equations with four unknowns  $F_i(e^{j\omega})$  arising from the condition of perfect reconstruction.

$$Y(e^{j\omega}) = \sum_{i=0}^{3} Y_i(e^{j\omega}) \equiv X(e^{j\omega})e^{-j3\omega}$$
(4.31)

Using the notation  $[\cdot]_i(e^{j\frac{\omega-2k\pi}{4}}) = [\cdot]_i^k$  and considering that  $e^{ji\frac{k\pi}{2}} \in \{\pm 1, \pm j\}$ , the system becomes:

$$\begin{pmatrix} H_0^0 & H_1^0 & H_2^0 & H_3^0 \\ H_0^1 & jH_1^1 & -H_2^1 & -jH_3^1 \\ H_0^2 & -H_1^2 & H_2^2 & -H_3^2 \\ H_0^3 & -jH_1^3 & -H_2^3 & jH_3^3 \end{pmatrix} \begin{pmatrix} F_0(e^{j\omega}) \\ F_1(e^{j\omega}) \\ F_2(e^{j\omega}) \\ F_3(e^{j\omega}) \end{pmatrix} = \begin{pmatrix} 4 \\ 0 \\ 0 \\ 0 \end{pmatrix}$$
(4.32)

The linear system can be simplified by neglecting the equalization condition (the first row) and fixing  $F_0(e^{j\omega}) = 1$ . Only the three alias-cancellation conditions are then enforced, which leaves a linear error on the reconstructed output.

$$\begin{pmatrix} jH_1^1 & -H_2^1 & -jH_3^1 \\ -H_1^2 & H_2^2 & -H_3^2 \\ -jH_1^3 & -H_2^3 & jH_3^3 \end{pmatrix} \begin{pmatrix} F_1(e^{j\omega}) \\ F_2(e^{j\omega}) \\ F_3(e^{j\omega}) \end{pmatrix} = \begin{pmatrix} -H_0^1 \\ -H_0^2 \\ -H_0^3 \end{pmatrix}$$
(4.33)
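As a numerical illustration of (4.33), the reduced system can be solved frequency by frequency. The sketch below assumes the gain and timing model $H_i(e^{j\omega}) = (1+g_i)e^{-j\omega\Delta_i}$ on $-\pi < \omega < \pi$, extended periodically; the helper names `h_alias` and `solve_reconstruction` and the mismatch values are hypothetical and chosen only for the example.

```python
import numpy as np

def h_alias(omega, k, gain, delta):
    """H_i evaluated at (omega - k*pi/2), with the frequency wrapped to [-pi, pi)."""
    wk = np.mod(omega - k * np.pi / 2 + np.pi, 2 * np.pi) - np.pi
    return (1.0 + gain) * np.exp(-1j * wk * delta)

def solve_reconstruction(omega, gains, deltas):
    """Solve (4.33) for F_1, F_2, F_3 at one frequency, with F_0 fixed to 1."""
    # Hk[k, i] = H_i^k, the k-th aliased response of channel i
    Hk = np.array([[h_alias(omega, k, gains[i], deltas[i]) for i in range(4)]
                   for k in range(4)])
    # Rotation factors exp(j*i*k*pi/2) that appear in (4.32)
    k_idx, i_idx = np.meshgrid(np.arange(4), np.arange(4), indexing="ij")
    A = Hk * np.exp(1j * np.pi / 2 * k_idx * i_idx)
    # Rows k = 1, 2, 3 are the alias-cancellation conditions of (4.33)
    F123 = np.linalg.solve(A[1:, 1:], -A[1:, 0])
    return A, np.concatenate(([1.0 + 0j], F123))

# Hypothetical mismatch values, channel 0 taken as the reference
gains = [0.0, 0.01, -0.01, 0.005]
deltas = [0.0, 0.005, -0.005, 0.002]
A, F = solve_reconstruction(1.0, gains, deltas)
alias_residual = A[1:] @ F       # the three aliasing terms, expected to vanish
linear_term = A[0] @ F / 4       # remaining linear response, close to but not exactly 1
```

With all mismatches set to zero the solution reduces to $F_i = 1$, and `linear_term` shows the residual linear error that is accepted when the equalization row is dropped.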

A key step in the calculation is to find a mathematical representation for the phases of the aliased fractional-delay transfer functions that is valid within the periodic interval  $-\pi < \omega < \pi$ .

$$\angle H_i \left[ e^{j(\omega - \frac{\pi}{2})} \right] = -\omega \Delta_i + \pi \Delta_i \operatorname{sign} \left( \omega + \frac{\pi}{2} \right) - \frac{\pi \Delta_i}{2}$$

$$\angle H_i \left[ e^{j(\omega - \pi)} \right] = -\omega \Delta_i + \pi \Delta_i \operatorname{sign}(\omega)$$

$$\angle H_i \left[ e^{j(\omega - \frac{3\pi}{2})} \right] = -\omega \Delta_i + \pi \Delta_i \operatorname{sign} \left( \omega - \frac{\pi}{2} \right) + \frac{\pi \Delta_i}{2}$$

(4.34)

The functions in (4.34) are represented in Figs. 4.8a, 4.8b and 4.8c.
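The sign-function expressions in (4.34) can be checked against a direct evaluation of the wrapped phase. The following sketch assumes a pure fractional delay with phase $-\omega\Delta_i$ on $-\pi < \omega < \pi$; the function names and the value of `delta` are illustrative only.

```python
import numpy as np

def wrapped_phase(omega, shift, delta):
    """Phase of a fractional delay at (omega - shift), frequency wrapped to [-pi, pi)."""
    w = np.mod(omega - shift + np.pi, 2 * np.pi) - np.pi
    return -w * delta

def phase_434(omega, k, delta):
    """Sign-function expressions of (4.34) for the shifts k*pi/2, k = 1, 2, 3."""
    s = np.sign
    if k == 1:
        return -omega * delta + np.pi * delta * s(omega + np.pi / 2) - np.pi * delta / 2
    if k == 2:
        return -omega * delta + np.pi * delta * s(omega)
    return -omega * delta + np.pi * delta * s(omega - np.pi / 2) + np.pi * delta / 2

# Grid inside (-pi, pi) that avoids the jump points 0 and +/- pi/2
omega = np.linspace(-3.0, 3.0, 121) + 1e-3
delta = 0.03  # hypothetical timing skew
max_err = max(
    np.max(np.abs(wrapped_phase(omega, k * np.pi / 2, delta) - phase_434(omega, k, delta)))
    for k in (1, 2, 3)
)
```

The two evaluations agree to rounding error away from the discontinuities, where the value of the sign function at zero is a matter of convention.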

To use Cramer's rule to solve the linear system we need to calculate the determinant of the  $3 \times 3$  square matrix in (4.33) called **H**:

$$\det(\mathbf{H}) = -jH_1^2 (H_2^3 H_3^1 + H_2^1 H_3^3) + H_2^2 (H_1^3 H_3^1 - H_3^3 H_1^1) - jH_3^2 (H_1^3 H_2^1 + H_1^1 H_2^3)$$
(4.35)

Defining:

$$\Psi = -j|H_1||H_2||H_3|e^{-j\omega(\Delta_1 + \Delta_2 + \Delta_3)}, \qquad S(\cdot) = \text{sign}(\cdot)$$
(4.36)

the expression becomes:

$$\begin{aligned} \det(\mathbf{H}) = \Psi \Big\{ & e^{j\pi\Delta_{1}S(\omega)} \left[ e^{j\pi \left[ \Delta_{2} \left( S(\omega - \frac{\pi}{2}) + \frac{1}{2} \right) + \Delta_{3} \left( S(\omega + \frac{\pi}{2}) - \frac{1}{2} \right) \right]} + e^{j\pi \left[ \Delta_{2} \left( S(\omega + \frac{\pi}{2}) - \frac{1}{2} \right) + \Delta_{3} \left( S(\omega - \frac{\pi}{2}) + \frac{1}{2} \right) \right]} \right] \\ & + e^{j\pi\Delta_{2}S(\omega)} \left[ e^{j\pi \left[ \frac{1}{2} + \Delta_{1} \left( S(\omega - \frac{\pi}{2}) + \frac{1}{2} \right) + \Delta_{3} \left( S(\omega + \frac{\pi}{2}) - \frac{1}{2} \right) \right]} + e^{j\pi \left[ -\frac{1}{2} + \Delta_{1} \left( S(\omega + \frac{\pi}{2}) - \frac{1}{2} \right) + \Delta_{3} \left( S(\omega - \frac{\pi}{2}) + \frac{1}{2} \right) \right]} \right] \\ & + e^{j\pi\Delta_{3}S(\omega)} \left[ e^{j\pi \left[ \Delta_{1} \left( S(\omega - \frac{\pi}{2}) + \frac{1}{2} \right) + \Delta_{2} \left( S(\omega + \frac{\pi}{2}) - \frac{1}{2} \right) \right]} + e^{j\pi \left[ \Delta_{1} \left( S(\omega + \frac{\pi}{2}) - \frac{1}{2} \right) + \Delta_{2} \left( S(\omega - \frac{\pi}{2}) + \frac{1}{2} \right) \right]} \right] \Big\} \end{aligned}$$
(4.37)

We introduce the piecewise functions  $R(\omega)$  and  $Z(\omega)$  whose graphical representations are shown in Figs. 4.9a and 4.9b:

$$R(\omega) = S\left(\omega - \frac{\pi}{2}\right) - S\left(\omega + \frac{\pi}{2}\right) + 1 \tag{4.38}$$

$$Z(\omega) = \frac{1}{2} \left[ S\left(\omega - \frac{\pi}{2}\right) + S\left(\omega + \frac{\pi}{2}\right) \right]$$
(4.39)


**Figure 4.8.** Phase of  $H_i\left[e^{j\left(\omega-\frac{\pi}{2}\right)}\right]$  (a),  $H_i\left[e^{j\left(\omega-\pi\right)}\right]$  (b) and  $H_i\left[e^{j\left(\omega-\frac{3\pi}{2}\right)}\right]$  (c)

Using the trigonometric identity (4.24) and the functions (4.38) and (4.39) we can rewrite the determinant obtaining the expression in (4.40).

$$\det(\mathbf{H}) = 2\Psi \left\{ e^{j\pi\Delta_1 S(\omega)} \cos\left[\frac{\pi(\Delta_2 - \Delta_3)R(\omega)}{2}\right] e^{j\pi(\Delta_2 + \Delta_3)Z(\omega)} + e^{j\pi\Delta_2 S(\omega)} \cos\left[\frac{\pi + \pi(\Delta_1 - \Delta_3)R(\omega)}{2}\right] e^{j\pi(\Delta_1 + \Delta_3)Z(\omega)} + e^{j\pi\Delta_3 S(\omega)} \cos\left[\frac{\pi(\Delta_1 - \Delta_2)R(\omega)}{2}\right] e^{j\pi(\Delta_1 + \Delta_2)Z(\omega)} \right\}$$
(4.40)

The following rules can be applied to the piecewise function  $R(\omega)$ :

$$\cos\left[k \cdot R(\omega)\right] = \cos(k) \tag{4.41}$$

$$\sin\left[k \cdot R(\omega)\right] = R(\omega) \cdot \sin(k) \tag{4.42}$$

Using the new piecewise function  $Q(\omega) = Z(\omega) - S(\omega)$  depicted in Fig. 4.9c and grouping the common factor  $e^{j\pi(\Delta_1+\Delta_2+\Delta_3)S(\omega)} = \Gamma^{S(\omega)}$  gives the expression:

$$\det(\mathbf{H}) = 2\Psi\Gamma^{S(\omega)} \left\{ \cos\left[\frac{\pi(\Delta_2 - \Delta_3)}{2}\right] e^{j\pi(\Delta_2 + \Delta_3)Q(\omega)} + R(\omega) \sin\left[\frac{\pi(\Delta_3 - \Delta_1)}{2}\right] e^{j\pi(\Delta_1 + \Delta_3)Q(\omega)} + \cos\left[\frac{\pi(\Delta_1 - \Delta_2)}{2}\right] e^{j\pi(\Delta_1 + \Delta_2)Q(\omega)} \right\}$$
(4.43)




**Figure 4.9.** R function (a), Z function (b) and Q function (c)

In order to factorize (4.43) we evaluate it in the 4 frequency intervals  $-\pi(1-\frac{b}{2}) < \omega < -\frac{\pi}{2}(1-b)$ , with  $b = \{0, 1, 2, 3\}$ , for further simplifications and then reassemble the results back in one expression using the piecewise functions found.
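The values taken by the piecewise functions in the four intervals can be tabulated numerically. The sketch below evaluates $S$, $R$, $Z$ and $Q$ at the midpoint of each interval and checks the rules (4.41) and (4.42); the test constant `k` is arbitrary.

```python
import numpy as np

def S(w):
    return np.sign(w)

def R(w):  # (4.38)
    return S(w - np.pi / 2) - S(w + np.pi / 2) + 1

def Z(w):  # (4.39)
    return 0.5 * (S(w - np.pi / 2) + S(w + np.pi / 2))

def Q(w):  # Q = Z - S
    return Z(w) - S(w)

# Midpoints of the intervals b = 0, 1, 2, 3: -3pi/4, -pi/4, pi/4, 3pi/4
b = np.arange(4)
mid = -np.pi * (1 - b / 2) + np.pi / 4

for bi, w in zip(b, mid):
    print(f"b={bi}: S={S(w):+.0f} R={R(w):+.0f} Z={Z(w):+.0f} Q={Q(w):+.0f}")

# Rules (4.41) and (4.42) hold because R only takes the values +1 and -1
k = 0.37
rule_cos = np.allclose(np.cos(k * R(mid)), np.cos(k))
rule_sin = np.allclose(np.sin(k * R(mid)), R(mid) * np.sin(k))
```

The printed rows give $R = 1$, $Q = 0$ in the two outer intervals and $R = -1$, $Q = \pm 1$ in the two inner ones.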

Inside the intervals with b = 0 and b = 3 we have  $R(\omega) = 1$  and  $Q(\omega) = 0$ . The expression can be written as

$$\det(\mathbf{H}^{(0,3)}) = 8\Psi\Gamma^{S(\omega)}\cos\left[\frac{\pi(\Delta_1 - \Delta_3)}{4}\right]\sin\left[\frac{\pi(\Delta_1 - \Delta_2 - 1)}{4}\right]\sin\left[\frac{\pi(\Delta_2 - \Delta_3 - 1)}{4}\right]$$
(4.44)
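The factorization leading to (4.44) can be made explicit. As a sketch, with the shorthand $a = \Delta_1 - \Delta_2$ and $b = \Delta_2 - \Delta_3$ (introduced here only for this derivation, so that $\Delta_1 - \Delta_3 = a + b$), setting $R(\omega) = 1$ and $Q(\omega) = 0$ in (4.43) reduces the term in braces to

$$\cos\frac{\pi b}{2} + \cos\frac{\pi a}{2} - \sin\frac{\pi(a+b)}{2} = 2\cos\frac{\pi(a+b)}{4}\left[\cos\frac{\pi(a-b)}{4} - \sin\frac{\pi(a+b)}{4}\right]$$

where the sum-to-product formula is applied to the two cosines and the double-angle formula to the sine. The bracket is itself a product of two sines:

$$2\sin\frac{\pi(a-1)}{4}\sin\frac{\pi(b-1)}{4} = \cos\frac{\pi(a-b)}{4} - \cos\left[\frac{\pi(a+b)}{4} - \frac{\pi}{2}\right] = \cos\frac{\pi(a-b)}{4} - \sin\frac{\pi(a+b)}{4}$$

The braces therefore equal $4\cos\left[\frac{\pi(\Delta_1-\Delta_3)}{4}\right]\sin\left[\frac{\pi(\Delta_1-\Delta_2-1)}{4}\right]\sin\left[\frac{\pi(\Delta_2-\Delta_3-1)}{4}\right]$, and multiplying by the prefactor $2\Psi\Gamma^{S(\omega)}$ gives (4.44).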

Inside the interval with b = 1 we have  $R(\omega) = -1$  and  $Q(\omega) = 1$ . The expression can be written as

$$\det(\mathbf{H}^{(1)}) = \det(\mathbf{H}^{(0,3)}) \left( -je^{j\frac{\pi\Delta_1}{2}} + e^{j\frac{\pi\Delta_2}{2}} + je^{j\frac{\pi\Delta_3}{2}} \right) e^{j\frac{\pi(\Delta_1 + \Delta_2 + \Delta_3)}{2}}$$
(4.45)

Inside the interval with b = 2 we have  $R(\omega) = -1$  and  $Q(\omega) = -1$ . The expression can be written as

$$\det(\mathbf{H}^{(2)}) = \det(\mathbf{H}^{(0,3)}) \left( j e^{-j\frac{\pi\Delta_1}{2}} + e^{-j\frac{\pi\Delta_2}{2}} - j e^{-j\frac{\pi\Delta_3}{2}} \right) e^{-j\frac{\pi(\Delta_1 + \Delta_2 + \Delta_3)}{2}}$$
(4.46)

By visual inspection it can be noted that only the final parenthesis and the complex exponential change between the four intervals. Multiplying each imaginary unit by  $Q(\omega)$  and considering the new piecewise function  $P(\omega) = S(\omega) + \frac{Q(\omega)}{2}$ , an expression valid in all the intervals is obtained:

$$\det(\mathbf{H}) = 8\Psi\Gamma^{P(\omega)}\cos\left[\frac{\pi(\Delta_1 - \Delta_3)}{4}\right]\sin\left[\frac{\pi(\Delta_1 - \Delta_2 - 1)}{4}\right]\sin\left[\frac{\pi(\Delta_2 - \Delta_3 - 1)}{4}\right] \cdot \left(-jQ(\omega)e^{jQ(\omega)\frac{\pi\Delta_1}{2}} + e^{jQ(\omega)\frac{\pi\Delta_2}{2}} + jQ(\omega)e^{jQ(\omega)\frac{\pi\Delta_3}{2}}\right)$$
(4.47)
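The closed form (4.47) can be compared with a direct numerical evaluation of the determinant of the matrix in (4.33). The sketch below assumes $H_i(e^{j\omega}) = (1+g_i)e^{-j\omega\Delta_i}$ with $g_i > -1$, so that $|H_i| = 1 + g_i$; the function names and the mismatch values are hypothetical.

```python
import numpy as np

def S(w):
    return np.sign(w)

def Z(w):
    return 0.5 * (S(w - np.pi / 2) + S(w + np.pi / 2))

def Q(w):
    return Z(w) - S(w)

def P(w):
    return S(w) + Q(w) / 2

def h_alias(w, k, gain, delta):
    """H_i at (w - k*pi/2), frequency wrapped to [-pi, pi)."""
    wk = np.mod(w - k * np.pi / 2 + np.pi, 2 * np.pi) - np.pi
    return (1 + gain) * np.exp(-1j * wk * delta)

def det_direct(w, g, d):
    """Determinant of the 3x3 matrix in (4.33); g and d hold channels 1..3."""
    H = lambda i, k: h_alias(w, k, g[i - 1], d[i - 1])
    M = np.array([[1j * H(1, 1), -H(2, 1), -1j * H(3, 1)],
                  [-H(1, 2), H(2, 2), -H(3, 2)],
                  [-1j * H(1, 3), -H(2, 3), 1j * H(3, 3)]])
    return np.linalg.det(M)

def det_closed_form(w, g, d):
    """Closed-form determinant (4.47)."""
    d1, d2, d3 = d
    q = Q(w)
    psi = -1j * np.prod(1 + np.asarray(g)) * np.exp(-1j * w * (d1 + d2 + d3))
    gamma_p = np.exp(1j * np.pi * (d1 + d2 + d3) * P(w))
    trig = (np.cos(np.pi * (d1 - d3) / 4)
            * np.sin(np.pi * (d1 - d2 - 1) / 4)
            * np.sin(np.pi * (d2 - d3 - 1) / 4))
    tail = (-1j * q * np.exp(1j * q * np.pi * d1 / 2)
            + np.exp(1j * q * np.pi * d2 / 2)
            + 1j * q * np.exp(1j * q * np.pi * d3 / 2))
    return 8 * psi * gamma_p * trig * tail

g = (0.01, -0.02, 0.005)              # hypothetical gain errors, channels 1..3
d = (0.01, -0.02, 0.015)              # hypothetical timing skews, channels 1..3
test_freqs = (-2.5, -0.8, 0.8, 2.5)   # one point inside each interval b = 0..3
direct = np.array([det_direct(w, g, d) for w in test_freqs])
closed = np.array([det_closed_form(w, g, d) for w in test_freqs])
```

The comparison is made at interior points of the four intervals, since the value at the jump frequencies depends on the convention adopted for the sign function at zero.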

Table 4.1. Permutation of the indices of the  $\Delta_i$  and  $H_i$  used to find the other three determinants. The first column lists the reference indices and the other three contain the indices that replace them.

| $\det(\mathbf{H})$ | $\det(\mathbf{H}_1)$ | $\det(\mathbf{H}_2)$ | $\det(\mathbf{H}_3)$ |
|--------------------|----------------------|----------------------|----------------------|
| 1                  | 2                    | 3                    | 0                    |
| 2                  | 3                    | 0                    | 1                    |
| 3                  | 0                    | 1                    | 2                    |

The other three determinants needed to calculate the solutions are derived from (4.47) with the permutation of the indices of the  $\Delta_i$  and  $H_i$  shown in Table 4.1. Note that  $\Psi$  and  $\Gamma$  are also affected by this permutation, since they contain  $\Delta_i$  and  $H_i$ . The reconstruction filters are then calculated as the ratio between two determinants, following Cramer's rule, and after further simplification of the common factors we finally obtain the expressions (4.48), (4.49) and (4.50).

$$F_{1}(e^{j\omega}) = \frac{|H_{0}|}{|H_{1}|} \cdot \frac{\cos\left[\frac{\pi(\Delta_{2}-\Delta_{0})}{4}\right] \sin\left[\frac{\pi(\Delta_{3}-\Delta_{0}-1)}{4}\right]}{\cos\left[\frac{\pi(\Delta_{1}-\Delta_{3})}{4}\right] \sin\left[\frac{\pi(\Delta_{1}-\Delta_{2}-1)}{4}\right]} \cdot \frac{-jQ(\omega)e^{jQ(\omega)\frac{\pi(\Delta_{2}-\Delta_{0})}{2}} + e^{jQ(\omega)\frac{\pi(\Delta_{3}-\Delta_{0})}{2}} + jQ(\omega)}{-jQ(\omega) + e^{jQ(\omega)\frac{\pi(\Delta_{2}-\Delta_{1})}{2}} + jQ(\omega)e^{jQ(\omega)\frac{\pi(\Delta_{3}-\Delta_{1})}{2}}} \cdot e^{-j[\omega-\pi Z(\omega)](\Delta_{0}-\Delta_{1})}$$
(4.48)

$$F_{2}(e^{j\omega}) = \frac{|H_{0}|}{|H_{2}|} \cdot \frac{\sin\left[\frac{\pi(\Delta_{3}-\Delta_{0}-1)}{4}\right] \sin\left[\frac{\pi(\Delta_{0}-\Delta_{1}-1)}{4}\right]}{\sin\left[\frac{\pi(\Delta_{2}-\Delta_{3}-1)}{4}\right] \sin\left[\frac{\pi(\Delta_{1}-\Delta_{2}-1)}{4}\right]} \cdot \frac{-jQ(\omega)e^{jQ(\omega)\frac{\pi(\Delta_{3}-\Delta_{0})}{2}} + 1 + jQ(\omega)e^{jQ(\omega)\frac{\pi(\Delta_{1}-\Delta_{0})}{2}}}{-jQ(\omega)e^{jQ(\omega)\frac{\pi(\Delta_{1}-\Delta_{2})}{2}} + 1 + jQ(\omega)e^{jQ(\omega)\frac{\pi(\Delta_{3}-\Delta_{2})}{2}}} \cdot e^{-j[\omega-\pi Z(\omega)](\Delta_{0}-\Delta_{2})}$$
(4.49)

$$F_{3}(e^{j\omega}) = \frac{|H_{0}|}{|H_{3}|} \cdot \frac{\cos\left[\frac{\pi(\Delta_{2}-\Delta_{0})}{4}\right] \sin\left[\frac{\pi(\Delta_{0}-\Delta_{1}-1)}{4}\right]}{\cos\left[\frac{\pi(\Delta_{1}-\Delta_{3})}{4}\right] \sin\left[\frac{\pi(\Delta_{2}-\Delta_{3}-1)}{4}\right]} \cdot \frac{-jQ(\omega) + e^{jQ(\omega)\frac{\pi(\Delta_{1}-\Delta_{0})}{2}} + jQ(\omega)e^{jQ(\omega)\frac{\pi(\Delta_{2}-\Delta_{0})}{2}}}{-jQ(\omega)e^{jQ(\omega)\frac{\pi(\Delta_{1}-\Delta_{3})}{2}} + e^{jQ(\omega)\frac{\pi(\Delta_{2}-\Delta_{3})}{2}} + jQ(\omega)} \cdot e^{-j[\omega-\pi Z(\omega)](\Delta_{0}-\Delta_{3})}$$
(4.50)

# 4.4 First-Order Taylor Approximation

In order to use calibration techniques based on linear estimation methods, the first-order approximation of the filters must be derived. We can write the first order Taylor expansion of a multivariable function in the following way:

$$f(\mathbf{x}) \approx f(\mathbf{c}) + \nabla f(\mathbf{c}) \cdot (\mathbf{x} - \mathbf{c})$$
 (4.51)

where  $\nabla f(\mathbf{c})$  is the vector of partial derivatives evaluated in  $\mathbf{x} = \mathbf{c}$ . We define:

$$\frac{\Delta_i - \Delta_0}{2} = x_i \tag{4.52}$$

The total number of variables is 6 but each transfer function  $F_i$  depends on 4 variables only:  $\epsilon_i, x_1, x_2$  and  $x_3$ . This implies the calculation of 4 partial derivatives for each  $F_i$ . For ease of notation and calculus consider the function  $F_i$  as the product of three functions:

$$F_i(e^{j\omega}, \mathbf{x}) = G_i(g_i) \cdot I_i(x_1, x_2, x_3) \cdot D_i(e^{j\omega}, x_1, x_2, x_3)$$
(4.53)

 $G_i$  contains the dependence on the gain errors, while  $I_i$  and  $D_i$  contain the dependence on the timing skews: $I_i$ is the frequency-independent part and $D_i$ the frequency-dependent part. For example, the three functions for the decomposition of  $F_1$  are:

$$G_1 = \frac{|H_0|}{|H_1|} = \frac{1+g_0}{1+g_1} \tag{4.54}$$

$$I_{1} = \frac{\cos\left[\frac{\pi x_{2}}{2}\right] \sin\left[\frac{\pi(x_{3}-1/2)}{2}\right]}{\cos\left[\frac{\pi(x_{1}-x_{3})}{2}\right] \sin\left[\frac{\pi(x_{1}-x_{2}-1/2)}{2}\right]}$$
(4.55)

$$D_{1} = \frac{-jQ(\omega)e^{jQ(\omega)\pi x_{2}} + e^{jQ(\omega)\pi x_{3}} + jQ(\omega)}{-jQ(\omega) + e^{jQ(\omega)\pi(x_{2}-x_{1})} + jQ(\omega)e^{jQ(\omega)\pi(x_{3}-x_{1})}} \cdot e^{j[\omega-\pi Z(\omega)]2x_{1}}$$
(4.56)

To evaluate Taylor's expansion of the  $F_i$  functions we need  $F_i(e^{j\omega}, \mathbf{c})$  and  $\nabla F_i(e^{j\omega}, \mathbf{c})$ , with  $\mathbf{c} = \mathbf{0}$ . It's easy to verify that  $F_i(e^{j\omega}, \mathbf{0}) = 1$ . For the calculation of the partial derivatives, neglecting the dependencies from the variables in the notation, it's useful to obtain the relations:

$$\frac{\partial F_i}{\partial g_0}\Big|_{\mathbf{x}=\mathbf{0}} = \frac{\partial G_i}{\partial g_0} \cdot I_i \cdot D_i \Big|_{\mathbf{x}=\mathbf{0}} = \frac{\partial G_i}{\partial g_0}\Big|_{\mathbf{x}=\mathbf{0}} = 1$$
(4.57)

$$\left. \frac{\partial F_i}{\partial g_i} \right|_{\mathbf{x}=\mathbf{0}} = \left. \frac{\partial G_i}{\partial g_i} \cdot I_i \cdot D_i \right|_{\mathbf{x}=\mathbf{0}} = \left. \frac{\partial G_i}{\partial g_i} \right|_{\mathbf{x}=\mathbf{0}} = -1$$
(4.58)

$$\frac{\partial F_i}{\partial x_j}\Big|_{\mathbf{x}=\mathbf{0}} = G_i \cdot \left[\frac{\partial I_i}{\partial x_j} + \frac{\partial D_i}{\partial x_j}\right]\Big|_{\mathbf{x}=\mathbf{0}} = \left[\frac{\partial I_i}{\partial x_j} + \frac{\partial D_i}{\partial x_j}\right]\Big|_{\mathbf{x}=\mathbf{0}}$$
(4.59)

### 4.4.1 Compact expression synthesis

The first-order expansions of the filters with  $\mathbf{c} = \mathbf{0}$  are calculated. Considering the relation  $1 - 2Q^2(\omega) = R(\omega)$  the expressions are:

$$F_1(e^{j\omega}, \mathbf{x}) \approx 1 + \epsilon_1 + \pi \left[\frac{R(\omega)}{2} + jQ(\omega) + j\frac{2\omega}{\pi} - j2Z(\omega)\right] \cdot x_1 + \\ -\pi \left[\frac{R(\omega)}{2} + jQ(\omega)\right] \cdot x_2 - \pi \left[\frac{R(\omega)}{2} - jQ(\omega)\right] \cdot x_3$$
(4.60)

$$F_2(e^{j\omega}, \mathbf{x}) \approx 1 + \epsilon_2 + \pi R(\omega) \cdot x_1 + j\pi \left[\frac{2\omega}{\pi} - 2Z(\omega)\right] \cdot x_2 - \pi R(\omega) \cdot x_3 \qquad (4.61)$$

$$F_3(e^{j\omega}, \mathbf{x}) \approx 1 + \epsilon_3 + \pi \left[\frac{R(\omega)}{2} + jQ(\omega)\right] \cdot x_1 + \pi \left[\frac{R(\omega)}{2} - jQ(\omega)\right] \cdot x_2 + \\ + \pi \left[-\frac{R(\omega)}{2} + jQ(\omega) + j\frac{2\omega}{\pi} - j2Z(\omega)\right] \cdot x_3$$
(4.62)

with  $\epsilon_i = g_0 - g_i$ . To produce a more compact representation assume:

$$F_a(\omega) = \pi \left[ \frac{R(\omega)}{2} + jQ(\omega) \right]$$
(4.63)

$$F_b(\omega) = j2 \left[ \omega - \pi Z(\omega) \right] \tag{4.64}$$

The expressions of the approximated filters are:

$$F_1\left(e^{j\omega}\right) \approx 1 + \epsilon_1 + \left[F_a(\omega) + F_b(\omega)\right] x_1 - F_a(\omega)x_2 - F_a^*(\omega)x_3 \tag{4.65}$$

$$F_2(e^{j\omega}) \approx 1 + \epsilon_2 + \pi R(\omega)x_1 + F_b(\omega)x_2 - \pi R(\omega)x_3 \tag{4.66}$$

$$F_3(e^{j\omega}) \approx 1 + \epsilon_3 + F_a(\omega)x_1 + F_a^*(\omega)x_2 - [F_a^*(\omega) - F_b(\omega)]x_3 \qquad (4.67)$$
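The accuracy of the first-order model can be illustrated by comparing (4.65) with the exact product $G_1 \cdot I_1 \cdot D_1$ of (4.54), (4.55) and (4.56). The following sketch uses hypothetical mismatch values and hypothetical function names; it only checks that the residual is small and shrinks roughly quadratically when all mismatches are halved.

```python
import numpy as np

def S(w):
    return np.sign(w)

def R(w):
    return S(w - np.pi / 2) - S(w + np.pi / 2) + 1

def Z(w):
    return 0.5 * (S(w - np.pi / 2) + S(w + np.pi / 2))

def Q(w):
    return Z(w) - S(w)

def f1_exact(w, g0, g1, x):
    """F_1 = G_1 * I_1 * D_1 from (4.54), (4.55) and (4.56)."""
    x1, x2, x3 = x
    q = Q(w)
    G = (1 + g0) / (1 + g1)
    I = (np.cos(np.pi * x2 / 2) * np.sin(np.pi * (x3 - 0.5) / 2)
         / (np.cos(np.pi * (x1 - x3) / 2) * np.sin(np.pi * (x1 - x2 - 0.5) / 2)))
    num = -1j * q * np.exp(1j * q * np.pi * x2) + np.exp(1j * q * np.pi * x3) + 1j * q
    den = (-1j * q + np.exp(1j * q * np.pi * (x2 - x1))
           + 1j * q * np.exp(1j * q * np.pi * (x3 - x1)))
    D = num / den * np.exp(1j * (w - np.pi * Z(w)) * 2 * x1)
    return G * I * D

def f1_first_order(w, g0, g1, x):
    """First-order model (4.65), with eps_1 = g0 - g1."""
    x1, x2, x3 = x
    Fa = np.pi * (R(w) / 2 + 1j * Q(w))      # (4.63)
    Fb = 2j * (w - np.pi * Z(w))             # (4.64)
    return 1 + (g0 - g1) + (Fa + Fb) * x1 - Fa * x2 - np.conj(Fa) * x3

omega = np.linspace(-3.0, 3.0, 241) + 1e-3   # avoids 0 and +/- pi/2
g0, g1 = 2e-3, -1e-3                          # hypothetical gain errors
x_base = np.array([1e-3, -5e-4, 7.5e-4])      # hypothetical x_1, x_2, x_3

def max_error(scale):
    x = scale * x_base
    exact = f1_exact(omega, scale * g0, scale * g1, x)
    approx = f1_first_order(omega, scale * g0, scale * g1, x)
    return np.max(np.abs(exact - approx))

err_full, err_half = max_error(1.0), max_error(0.5)
```

A residual that falls by about a factor of four when the mismatches are halved is the behavior expected from a correct first-order expansion.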

The filters' responses are Hermitian as expected, and thus real in the time domain. The filters are a linear combination<sup>1</sup> of (4.63) and (4.64), whose magnitude and phase responses are shown in Figs. 4.10 and 4.11. In other works [30, 72, 23] these responses are approximated using linear combinations of simple filters (e.g. low-pass, Hilbert and differentiator filters), but with a higher number of free parameters. The results in this work set a lower bound of 4 parameters (3 time skews and 1 gain error) for each channel.



**Figure 4.10.**  $F_a(\omega)$  filter magnitude and phase responses



**Figure 4.11.**  $F_b(\omega)$  filter magnitude and phase responses

<sup>&</sup>lt;sup>1</sup>Note that  $\pi R(\omega) = F_a(\omega) + F_a^*(\omega)$ .

# 4.5 Behavioral Simulations and Results

To validate the exact and approximated results, numerical simulations are performed. In the test environment gain and timing mismatches are simulated using fractional delay filters on each channel with specified error parameters  $\Delta_i$ . The same parameters are then used to generate the filter responses with which each channel is filtered. A length of 256 taps is used to realize both the exact and the approximated filters in the frequency domain, and the impulse responses are then calculated by the inverse DFT. The comparison between the un-calibrated output and the output calibrated with the exact and approximated filters, in the case of 1% gain error and 0.5% time skews, is shown in Fig. 4.12. The exact filters completely remove the distortions, proving the correctness of the calculations. Calibrating with the first-order approximations gives more than 40 dB of SFDR improvement.
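The frequency-sampling step can be sketched as follows for the first-order filter (4.65). This is a minimal illustration and not the simulation code used for Fig. 4.12: the mismatch values and the function name are hypothetical, and no window is applied.

```python
import numpy as np

N = 256  # filter length used in the simulations described above

def S(w):
    return np.sign(w)

def R(w):
    return S(w - np.pi / 2) - S(w + np.pi / 2) + 1

def Z(w):
    return 0.5 * (S(w - np.pi / 2) + S(w + np.pi / 2))

def Q(w):
    return Z(w) - S(w)

def first_order_f1(w, eps1, x):
    """First-order reconstruction filter (4.65) sampled at the frequencies w."""
    x1, x2, x3 = x
    Fa = np.pi * (R(w) / 2 + 1j * Q(w))
    Fb = 2j * (w - np.pi * Z(w))
    return 1 + eps1 + (Fa + Fb) * x1 - Fa * x2 - np.conj(Fa) * x3

w = 2 * np.pi * np.fft.fftfreq(N)            # DFT grid covering [-pi, pi)
F1 = first_order_f1(w, eps1=0.01, x=(0.005, -0.005, 0.0025))
h1 = np.fft.ifft(F1)                         # impulse response, zero delay at index 0
h1_causal = np.fft.fftshift(h1.real)         # centered taps for a causal FIR
```

Because the sampled response is Hermitian, the imaginary part of `h1` is at rounding level and can be discarded; the shift by $N/2$ samples adds the integer delay needed for causality.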



Figure 4.12. Comparison between output signal spectra before and after calibration using the exact and the approximated reconstruction filters for a known combination of timing skews and gain errors.

# 4.6 Background calibration using cyclostationary filter banks

Perfect reconstruction filters which cancel distortions due to aliasing exist under certain assumptions [79], and calibration can be interpreted as approximating these filters to minimize distortions [72]. This can be performed by adaptive FIR structures, but least squares (LS) estimation of large models is computationally expensive and prone to numerical stability issues [82]. It is thus important to find models which are both accurate and simple: the number of free parameters and computational costs should be minimized.



Figure 4.13. 4-Channel TI-ADC error model

Fig. 4.13 shows the discrete-time equivalent model of a 4-channel TI-ADC affected by the channel filters  $H_i(e^{j\omega})$ , similar to the one presented in Section 4.3. These filters represent the analog frequency response of each channel and can model gain, timing, or bandwidth errors, or any linear mismatch among the channels. When these responses are mismatched, recombination of the channels produces aliasing at the output. A cyclo-stationary filter with period M=4 can be used to perform mismatch correction. Such a filter can be implemented by an architecture identical to that shown in Fig. 4.13, with the correction filters  $G_i(e^{j\omega})$  in place of  $H_i(e^{j\omega})$ . The input of the correction block is  $Y(e^{j\omega})$ , and the calibrated output is  $Z(e^{j\omega})$ :

$$Z\left(e^{j\omega}\right) = \sum_{k=0}^{3} Y\left(e^{j\omega-j\frac{\pi}{2}k}\right) \left[\frac{1}{4}\sum_{i=0}^{3} G_i\left(e^{j\omega-j\frac{\pi}{2}k}\right)e^{j\frac{\pi}{2}ik}\right]$$
(4.68)

The analytical solution of the system of equations arising from the perfect reconstruction condition can be carried out using an approach similar to that adopted in the previous Section or an approximation can be found using adaptive methods. Calibration forces  $Z(e^{j\omega}) \approx X(e^{j\omega})$  by adjusting the correction filters  $G_i(e^{j\omega})$ , which depend on the error filters  $H_i(e^{j\omega})$ . Usually, the  $H_i(e^{j\omega})$ 's are not known and the  $G_i(e^{j\omega})$ 's are estimated adaptively. It is common to only cancel the aliasing terms and leave a linear error, as this doesn't affect linearity.
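One common time-domain reading of a periodic filter with period $M = 4$ is an FIR filter whose tap set is selected by the output sample index modulo 4. The sketch below follows that reading as an assumption; the function name `periodic_fir` and the tap values are hypothetical and are not taken from the architecture in Fig. 4.13.

```python
import numpy as np

def periodic_fir(y, taps):
    """Apply an M-periodic time-varying FIR: output sample n uses taps[n % M]."""
    taps = np.asarray(taps, dtype=float)
    M, L = taps.shape
    y = np.asarray(y, dtype=float)
    ypad = np.concatenate((np.zeros(L - 1), y))   # zero initial state
    z = np.empty(y.size)
    for n in range(y.size):
        window = ypad[n:n + L][::-1]              # y[n], y[n-1], ..., y[n-L+1]
        z[n] = taps[n % M] @ window
    return z

rng = np.random.default_rng(0)
y = rng.standard_normal(64)
h = np.array([0.05, 0.9, 0.05])

# With four identical tap sets the periodic filter is an ordinary FIR filter
z_same = periodic_fir(y, np.tile(h, (4, 1)))
z_ref = np.convolve(y, h)[:y.size]

# With different tap sets each output phase gets its own correction filter
taps_mismatch = np.tile(h, (4, 1)) + 0.01 * rng.standard_normal((4, 3))
z_periodic = periodic_fir(y, taps_mismatch)
```

In this form the estimation problem separates naturally into four adaptive filters, one for each output phase.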



Figure 4.14. 4-Channel TI-ADC calibration architecture (a) and corresponding clocking scheme (b)

Fig. 4.14a shows the poly-phase architecture of the calibration system, described in [22, 72]. As in previous literature, a reference channel is used, which is periodically aligned with the other channels. The calibration algorithm compares the outputs of the channel under calibration, y[n], and of the reference channel, r[n], when they are aligned, and chooses the correction coefficients to minimize their difference, obtaining the calibrated output z[n]. At steady state, if both the estimates and the model are accurate, each channel has the same frequency response as the reference channel, and aliasing distortions are minimized. The algorithm also corrects for offset mismatches, not shown for simplicity. Adding a reference channel has an overhead in terms of area and power consumption, whose relative weight falls as the number of channels grows. With two different clock domains, signal integrity is of the essence to avoid spurs. Circuit-level (voltage buffers) and layout-level (guard rings, decoupling capacitors) techniques are required for a robust design. In [60] a comparable scheme has been implemented and measured in 130 nm CMOS. Fig. 4.15 shows the approximation of the desired correction filters' frequency response as a linear combination of base filters. The adaptive filters estimate the coefficients of this combination. This architecture can simulate the algorithms in [22, 72, 23] and a generic FIR model [72], by changing the number and frequency responses of the base filters.

Figure 4.15. Approximation of the desired frequency response with a filter base. The calibrated output  $Z(e^{j\omega})$  is computed from the uncalibrated output  $Y(e^{j\omega})$  given the weights  $\alpha_{ij}$  estimated by the adaptive loop (offset correction not shown)

### 4.6.1 Expression of the filter base

The error filters modeling gain, timing and bandwidth mismatches,  $g_i$ ,  $t_i$  and  $b_i$ , can be expressed as [71]:

$$H_i(\omega) = \frac{(1+g_i)e^{-j\omega t_i}}{1+j\omega/b_i}$$

$$(4.69)$$

Alternatively, the mixed signal model for bandwidth mismatch can be used [71], but it has been shown to be equivalent to the analog model in (4.69) if the bandwidth of the filters is just a few times larger than the Nyquist frequency [71].

Assuming (4.69), a numerical solution for the four correction filters, shown in Fig. 4.16, can be found for a given realization of the mismatch parameters. The shape of the frequency response of the correction filters is due to frequency translations, which create discontinuities at  $\frac{\pi}{2}$  ( $f_S/4$ ) both in level and slope. This suggests the use of low-pass filters with a bandwidth of  $\frac{\pi}{2}$  (0.25 in normalized frequency) and their combination with other filters, to replicate the frequency responses in Fig. 4.16. Low-pass filters were not present in previous works [22, 23, 72], whose models could not replicate the observed discontinuity in the middle of the Nyquist band. The model used in [22] approximated the correction filters as  $G_i(e^{j\omega}) \approx 1 + a_{i1} + j\omega a_{i2}$ . Its accuracy is limited, because the model cannot replicate the discontinuities at  $\frac{\pi}{2}$ , nor the different slopes of the signals in the two halves of the spectrum.

Figure 4.16. Correction filters shape (errors have 1% standard deviation, mean bandwidth is  $4f_S$ , one random realization is shown for the four channels).

The following model yields much better accuracy. The residual error is parabolic, so that all the linear-in-frequency terms are cancelled. Furthermore, removal of any term yields a linear (in frequency) error:

$$G_{i}(e^{j\omega}) \approx 1 + a_{i1} + j\omega a_{i2} + G_{H}(e^{j\omega})a_{i3} + G_{L}(e^{j\omega})a_{i4} + G_{H}(e^{j\omega})G_{L}(e^{j\omega})a_{i5} + j\omega G_{H}(e^{j\omega})a_{i6} + j\omega G_{L}(e^{j\omega})a_{i7} + j\omega G_{H}(e^{j\omega})G_{L}(e^{j\omega})a_{i8}$$
(4.70)

In this expression,  $G_H(e^{j\omega})$  is the Hilbert filter, and  $G_L(e^{j\omega})$  the low-pass filter with  $\frac{\pi}{2}$  bandwidth. The approximation error after least squares fitting is shown in Fig. 4.17. With 8 parameters, it is lower than 0.04%. The estimation of a model with 8 parameters can be cumbersome. Monte Carlo simulations have been performed to derive the distribution of the  $a_{ij}$  coefficients and to determine the most relevant base filters for calibration. The most important is the differentiator term  $a_{i2}$ , while the terms  $a_{i1}$ ,  $a_{i3}$ ,  $a_{i4}$ , and  $a_{i5}$  are also significant. The resulting simplified model is:

$$G_i(e^{j\omega}) \approx 1 + a_{i1} + j\omega a_{i2} + G_H(e^{j\omega})a_{i3} + G_L(e^{j\omega})a_{i4} + G_H(e^{j\omega})G_L(e^{j\omega})a_{i5}$$
(4.71)

The maximum approximation error of this model is 0.2% (not shown). Though less accurate, the model is easier to estimate, and less computationally expensive.
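The least squares fit of the coefficients in (4.71) can be sketched on a frequency grid. The example below is only illustrative: it assumes an ideal Hilbert response $-j\,\mathrm{sign}(\omega)$ and an ideal brick-wall low-pass response for $G_H$ and $G_L$, and it fits a synthetic target built from known, hypothetical coefficients rather than the correction filters of Fig. 4.16.

```python
import numpy as np

# Frequency grid in (-pi, pi) that avoids 0 and +/- pi/2
n_pts = 512
w = -np.pi + 2 * np.pi * (np.arange(n_pts) + 0.5) / n_pts

G_H = -1j * np.sign(w)                        # ideal Hilbert filter (assumption)
G_L = (np.abs(w) < np.pi / 2).astype(float)   # ideal low-pass, bandwidth pi/2 (assumption)

# Columns multiply a_i1 ... a_i5 in (4.71)
B = np.column_stack([np.ones_like(w), 1j * w, G_H, G_L, G_H * G_L])

# Synthetic target generated from hypothetical coefficients
a_true = np.array([0.01, -0.004, 0.002, 0.003, -0.001])
target = 1 + B @ a_true

# Least squares estimate of the coefficients from the sampled response
a_hat, *_ = np.linalg.lstsq(B, target - 1, rcond=None)
fit_error = np.max(np.abs(1 + B @ a_hat - target))
```

For a target that lies outside the span of the base filters, `fit_error` plays the role of the approximation error quoted in the text, and the estimated coefficients are in general complex unless the real and imaginary parts are fitted jointly with real unknowns.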

Adding a second-order differentiator to the model (4.70) yields a 9-parameter model whose maximum error is 0.0013%:

$$G_{i}(e^{j\omega}) \approx 1 + a_{i1} + j\omega a_{i2} + G_{H}(e^{j\omega})a_{i3} + G_{L}(e^{j\omega})a_{i4} + G_{H}(e^{j\omega})G_{L}(e^{j\omega})a_{i5} + j\omega G_{H}(e^{j\omega})a_{i6} + j\omega G_{L}(e^{j\omega})a_{i7} + j\omega G_{H}(e^{j\omega})G_{L}(e^{j\omega})a_{i8} - \omega^{2}a_{i9}$$
(4.72)



Figure 4.17. Approximation error of using (4.70) to fit the filters in Fig. 4.16

### 4.6.2 Complexity reduction

The FIR structures implementing the base filters in (4.70) and (4.71) are truncated to 2T+1 terms, from -T to T, windowed, and translated for causality. T=10 in the following. A few products have been spared by scaling the maximum coefficient of each filter to 1. This comes at no cost in terms of accuracy because, if one of the filters is scaled, its coefficient is inversely scaled by the adaptive algorithm. As the correction filters are approximated as a linear combination of filters, they can be expressed in another base. Because this process does not affect the accuracy of the model, it is possible to minimize complexity by maximizing the number of zeros and ones in the correction filters, thus reducing the number of multipliers. For instance, a differentiator filter has impulse response  $(-1)^n/n$ , whereas a Hilbert filter (if scaled to a maximum value of 1) has impulse response 1/n for odd values of n, and 0 for even values. The odd coefficients are the same (except for the sign) for the two filters. Hence, instead of using a Hilbert and a differentiator filter (requiring 3T/2-2 products), the differentiator filter can be substituted with a linear combination of the two whose odd coefficients are zero (total complexity: T-2 products), saving T/2 products. This procedure has been used, after grouping the filters into symmetrical and anti-symmetrical ones, for the models (4.70) and (4.71), as well as for those in [23]. With no cost in terms of accuracy, the improvement in complexity is significant for the new models.
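The change of base for the differentiator can be illustrated directly on the truncated impulse responses. The sketch below ignores the window and the causal shift, and only shows that the sum of the two filters has zero odd coefficients; it does not attempt to reproduce the product counts quoted above.

```python
import numpy as np

T = 10
n = np.arange(-T, T + 1)
nonzero = n != 0
odd = n % 2 != 0

# Differentiator taps (-1)^n / n, with a zero central tap
diff = np.zeros(n.size)
diff[nonzero] = (-1.0) ** n[nonzero] / n[nonzero]

# Hilbert taps scaled to a maximum of 1: 1/n for odd n, 0 for even n
hilb = np.zeros(n.size)
hilb[odd] = 1.0 / n[odd]

# New base element: the odd taps cancel, only the even taps 1/n remain
combo = diff + hilb

taps_before = np.count_nonzero(diff) + np.count_nonzero(hilb)
taps_after = np.count_nonzero(combo) + np.count_nonzero(hilb)
```

The pair (`hilb`, `combo`) spans the same space as (`hilb`, `diff`), so the adaptive loop can absorb the change in its coefficients while the filtering section handles fewer nonzero taps.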

A further, approximate, technique has also been employed. Many coefficients in the filter banks are small, as they fall as 1/n or faster. By removing the coefficients lower in absolute value than 0.01, the models in [23] save 2 coefficients (in the differentiator), and the models (4.71), (4.70) and (4.72) save 6, 14 and 19, respectively. The threshold 0.01 is the largest we could choose before reducing average linearity in Monte Carlo simulations.

### 4.6.3 Behavioral simulations and results

The three models (4.71), (4.70) and (4.72) have been simulated for a 4-channel Nyquist TI-ADC affected by offset, gain, timing and bandwidth mismatches of 5% standard deviation (average bandwidth is twice  $f_S$ ). Because of the offset term, the three models have 6, 9 and 10 parameters, respectively. The input signal is bandpass noise from 20% to 80% of the Nyquist band ([ $0.1f_s, 0.4f_s$ ]). Models in [22, 23, 72] have also been simulated.



Figure 4.18. Complexity / accuracy trade-off for several correction filter models.

Fig. 4.18 shows a comparison of average linearity (in ENOB) versus computational complexity (defined as the number of products plus 3 times the number of divisions per sample) of several algorithms, divided in five groups. These are: adaptive FIR filters with  $T = 3, \ldots, 8$ ; models obtained from the real (orders 1, 2, 3) and complex (orders 1, 2) Taylor expansions described in [23] (called REAL and CPLX in Fig. 4.18); models derived from [23] for different values of the parameter  $N_L = 1, \ldots, 5$  (equation (10) in [23]); and the three new models. The new models are better (more accurate or less expensive) than FIR filters and real and complex Taylor expansions, but are no better than the models obtained from [23] before the new complexity reduction techniques described in Subsection 4.6.2 are used. Fig. 4.19 shows the results of using the complexity reduction techniques only for the two families of algorithms with the best performance (and the FIR filters, for reference). The new models are now more efficient than those in [23], because the reduction in numerical complexity in the filtering section is more pronounced with the new models. Further complexity reduction may be achieved because the new models have 6, 9 and 10 parameters, whereas the closest models in [23] in terms of linearity have 7, 11 and 13  $(N_L = 2, 4, 5)$ . Numerical accuracy is required mainly in the adaptive loop, because of potential numerical instability. It may thus be possible to reduce the number of bits in the fixed-point implementation of the filtering section. This would further increase the relative efficiency of the new models.

Figure 4.19. Complexity / accuracy trade-off after optimization.

| Model           | Products (filt.) | Products (est.) | Products (corr.) | Products (tot.) | Divisions (est.) | ENOB |
|-----------------|------------------|-----------------|------------------|-----------------|------------------|------|
| Eq. (4.71)      | 8                | 17.4            | 5                | 30.4            | 1.4              | 9.1  |
| $[23], N_L = 2$ | 4                | 22.4            | 6                | 32.4            | 1.6              | 9.0  |
| Eq. (4.70)      | 14               | 34.2            | 8                | 56.2            | 2                | 10.7 |
| $[23], N_L = 4$ | 2                | 48.4            | 10               | 60.4            | 2.4              | 10.7 |
| Eq. (4.72)      | 15               | 41              | 9                | 65              | 2.2              | 10.9 |
| $[23], N_L = 5$ | 1                | 65              | 12               | 80              | 2.8              | 11   |

Table 4.2. Computational costs of the models after optimization

Table 4.2 shows the computational costs of the models after optimization, compared with those in [23], together with the average linearity expressed in ENOB (bits). The cost is expressed in number of products and divisions per sample. Divisions are only used during estimation [11], whereas multiplications are required during filtering, estimation, and correction.
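The complexity metric defined for Fig. 4.18 (products plus three times the divisions per sample) can be applied to the totals listed in Table 4.2. Using the same weighting for the table entries is an assumption made here for illustration; the numbers are copied from the table and the names are hypothetical.

```python
# (total products, divisions) per sample, copied from Table 4.2
table_4_2 = {
    "Eq. (4.71)": (30.4, 1.4),
    "[23], N_L=2": (32.4, 1.6),
    "Eq. (4.70)": (56.2, 2.0),
    "[23], N_L=4": (60.4, 2.4),
    "Eq. (4.72)": (65.0, 2.2),
    "[23], N_L=5": (80.0, 2.8),
}

DIV_WEIGHT = 3  # one division counted as three products, as in the metric of Fig. 4.18

def weighted_cost(products, divisions, div_weight=DIV_WEIGHT):
    """Single complexity figure: products + div_weight * divisions."""
    return products + div_weight * divisions

costs = {name: weighted_cost(p, d) for name, (p, d) in table_4_2.items()}
for name, cost in costs.items():
    print(f"{name:12s} {cost:6.1f}")
```

Under this weighting each new model remains cheaper than the model from [23] listed directly below it in the table.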

The three new models and the ones in [23] have been compared in terms of stability in fixed-point arithmetic and convergence speed. Fig. 4.20 shows the models in [23] and the new models (4.70) and (4.71) in 24-bit fixed-point arithmetic. Model (4.72) is no more accurate than (4.71) in fixed point and is not shown. The new models have the same speed and steady-state accuracy as the models in [23] in fixed-point arithmetic. Convergence takes about 2400 samples for the models (4.70) and [23] with  $N_L = 4$ , and 1000 samples for the models (4.71) and [23] with  $N_L = 2$ . With one update every 20 samples (when the reference and the calibrated channels are aligned) this means 120 and 50 updates to convergence, respectively.

Figure 4.20. Convergence speed and average accuracy of the models, with  $N_L$  in [23] chosen to achieve comparable steady-state accuracy to (4.70) and (4.71), respectively

# 4.7 Conclusions and future work

Two different approaches to the calibration of TI-ADCs have been shown: the analytical solution of the Papoulis model and the approximation of the correction filters in the cyclo-stationary architecture. In the first approach the exact form of the 4-channel TI-ADC perfect reconstruction filters has been calculated and validated by numerical simulation. The first-order approximation of the solutions has been derived and expressed in a compact form. The lower bound of 4 free parameters needed for calibrating gain and timing mismatches has been reached. Fixed-point performance has not been analyzed yet, and practical adaptive estimation algorithms have not been tested; these topics will be addressed in future studies. In the second approach a background calibration based on cyclo-stationary (periodic time-varying) filter banks is addressed. The correction is performed with a 4-periodic filter at the output of the adding node of the TI-ADC, turning the estimation problem into four separate adaptive filtering problems with one quarter of the complexity with respect to the Papoulis architecture. Three new models for the approximation of the correction filter are described in terms of a linear combination of base filters, and these models are more accurate than the ones in [72]. After applying a complexity reduction the new models are also better than the ones in [23], with comparable convergence speed and accuracy in fixed-point implementation and lower computational cost. The methodology has been developed for 4-channel TI-ADCs and may be extended to a higher number of channels in future works.

# Chapter 5

# Digital-IF receiver nonlinear calibration

Future mobile multi-antenna architectures require an ever growing number of receivers, imposing critical constraints on the cost, power and size of each receiver chain. The straightforward approach is to exploit nanometer CMOS processes even for the analog processing section and to minimize power consumption. Moreover, data rates in wireless communications continue to increase by a factor of approximately five every four years [37] (the mobile 5G specification calls for a 20 Gb/s peak download data rate [48]). In these conditions, achieving high linearity and dynamic range for wide-band multi-carrier communications or radar applications becomes extremely difficult. Performance can be improved by analog design, at the price of increased complexity and power consumption. Furthermore, the design of analog compensation architectures heavily depends on the technological node and represents a recurring cost for successive implementations with future scaled processes. Conversely, calibration techniques implemented in the digital domain can have a smaller impact on the power consumption and are "portable" with respect to technological node scaling, provided that the implemented model is capable of representing the behavior of the non-linear analog section.

Different digital signal processing (DSP) techniques have been presented in the literature for the compensation of RF front-end and analog baseband nonlinearities, mainly using feed-forward or post-distortion architectures. [43, 65, 101] present feed-forward techniques for Adaptive Interference Cancellation (AIC) applied to Direct Conversion Receiver (DCR) architectures. DCRs are widespread in Software Defined Radio (SDR) applications: they have little selectivity at RF, featuring wideband front-ends and flexible digital processing. AIC methods require adaptive signal processing algorithms that often assume statistical independence between signals in different frequency bands, which is true for physically uncorrelated sources. Some results are limited to particular waveforms, as in [88], where the detection and compensation algorithm works only on two-tone signals, and others hold only in particular conditions, as in [80], where the blind identification technique works only for strongly nonlinear systems, being limited by the convergence precision of the detection algorithm. [13] describes a mixed analog-digital co-design of a receiver using a sparse Volterra equalizer. Orthogonal Matching Pursuit (OMP) is used to estimate the parameters of a sparse subset of the real-valued Volterra series. The same approach has been used in [93] to build a nonlinear equalizer IC processor (NLEQ) able to compensate the nonlinear behavior of commercial-off-the-shelf (COTS) low-pass systems. This method relies upon real-valued processing and is not straightforwardly applicable to communication systems exploiting baseband complex signal processing.

In this chapter baseband Volterra models are analyzed and a novel one is proposed that is able to represent and compensate nonlinearities in sub-sampling digital-IF receivers. An offline calibration technique using the new model is validated by means of behavioral and circuit-level simulations on a bandpass anti-aliasing filter implemented in a 45 nm process by ST Microelectronics.

# 5.1 Receiver Target Architecture

Calibration techniques for linear and nonlinear equalization are applied in the digital domain using complex baseband notation. This approach has been chosen instead of using real low-pass representation, as in [13], for its greater flexibility because it can be used in both Digital-IF receivers and native IQ receivers that have one or two real outputs respectively. In the first architecture, shown in Fig. 5.1, the baseband components are extracted in the digital domain while in the second one, shown in Fig. 5.2, the extraction is made in the analog domain by an IQ mixer. Even if the data format for baseband processing is the same, the IQ architecture can show additional impairments due to IQ imbalance.



Figure 5.1. Digital-IF Receiver Architecture with digital I/Q extraction

In this study the Digital-IF architecture [33] has been chosen as a target to develop and validate models and algorithms for RF system digital calibration. The Digital-IF receiver architecture shown in Fig. 5.1 is composed of an RF band-pass filter (RFF), a Low Noise Amplifier (LNA), a mixer for the down-conversion to IF, an Anti-Aliasing Filter (AAF), the A/D converter and the digital baseband processing section. A typical choice is to exploit sub-sampling: using a lower sampling rate decreases power consumption, but care must be taken with the Signal-to-Noise Ratio (SNR) because of noise folding.

Figure 5.2. IQ Receiver Architecture

The digital IQ extraction is carried out by multiplying the digitized IF signal by two quadrature sinusoidal sequences, $\cos(2\pi f_{IF}nT_s)$ and $\sin(2\pi f_{IF}nT_s)$, and low-pass filtering the results to obtain the in-phase and quadrature sequences $y_I[n]$ and $y_Q[n]$, respectively. The complex baseband output sequence is then:

$$\tilde{y}[n] = y_I[n] + j \ y_Q[n] \tag{5.1}$$

An efficient way to implement the quadrature signals is to choose an intermediate frequency  $f_{IF}$  equal to a quarter of the sampling frequency:

$$f_{IF} = \frac{f_s}{4} \tag{5.2}$$

With this condition the quadrature sinusoids become:

$$s_I[n] = \cos(2\pi f_{IF} n T_s) = \cos\left(\frac{\pi}{2}n\right) \to \{1, 0, -1, 0, \dots\} \tag{5.3}$$

$$s_Q[n] = \sin(2\pi f_{IF} n T_s) = \sin\left(\frac{\pi}{2}n\right) \to \{0, 1, 0, -1, \dots\} \tag{5.4}$$

Mixing can then be implemented efficiently without multiplications, by replacing the odd samples on the in-phase branch and the even samples on the quadrature branch with zeros and alternating the sign of the remaining samples. This technique can also be applied when exploiting sub-sampling, choosing an intermediate frequency in the center of the Nyquist band used and remembering to invert the sign of the quadrature component when the Nyquist band is even.
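The multiplier-free mixing described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `fs4_mix` and the random test sequence are hypothetical, and the low-pass filtering that follows the mixers is omitted.

```python
import numpy as np

def fs4_mix(x):
    """Mix a real sequence with cos(pi*n/2) and sin(pi*n/2), as in (5.3)-(5.4),
    using only zero insertion and sign changes (no multiplications)."""
    x = np.asarray(x, dtype=float)
    i_branch = np.zeros_like(x)
    q_branch = np.zeros_like(x)
    # cos(pi*n/2) -> {1, 0, -1, 0, ...}: keep even samples, alternate their sign
    i_branch[0::4] = x[0::4]
    i_branch[2::4] = -x[2::4]
    # sin(pi*n/2) -> {0, 1, 0, -1, ...}: keep odd samples, alternate their sign
    q_branch[1::4] = x[1::4]
    q_branch[3::4] = -x[3::4]
    return i_branch, q_branch

# Hypothetical test input: any real sampled IF sequence would do
rng = np.random.default_rng(0)
x = rng.standard_normal(64)
n = np.arange(x.size)
i_branch, q_branch = fs4_mix(x)

# Reference obtained with explicit multiplications
i_ref = x * np.cos(np.pi * n / 2)
q_ref = x * np.sin(np.pi * n / 2)
print(np.allclose(i_branch, i_ref), np.allclose(q_branch, q_ref))
```

The comparison with the explicit products only checks that the two formulations coincide; in hardware the slicing would become a multiplexer and a sign inversion.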

# 5.2 Baseband Volterra models for bandpass systems

Most of the literature [58, 108, 35, 92] focuses on baseband Volterra series for power amplifier modeling and pre-distortion applications. Dealing with the mitigation of receiver nonlinearities, [43], [65] and [88] identify specific nonlinear terms in a Direct Conversion Receiver (DCR) generated by the cascade of the RF front-end analog section (LNA) and the BB section, but they do not use a generalized model and do not deal with sampling frequency issues. In [13] a real sparse Volterra model is used for both the I and Q channels without exploiting the efficiency of the baseband notation. In this Section the complex-signal baseband Volterra model for bandpass systems is derived and an extension is proposed that includes the aliasing of out-of-band harmonic distortions in active bandpass filters exploiting sub-sampling down-conversion.

### 5.2.1 Classic baseband model derivation

The Volterra series for a discrete-time non-linear system is given by:

$$y[n] = \sum_{k=1}^{K} y_k[n] \qquad y_k[n] = \sum_{q_1} \cdots \sum_{q_k} h_k[q_1, \dots, q_k] \cdot \prod_{i=1}^{k} x[n-q_i]$$
(5.5)

where K is the maximum order of nonlinearity and $h_k$ is the k-th order Volterra kernel. The system memory is specified by the limits of the sums. Assume also that the input sequence is obtained by sampling an analog signal that satisfies the Nyquist criterion. Since we are modeling a bandpass system we can write the input signal with the complex envelope notation:

$$x[n] = \Re\left\{\tilde{x}[n]\mathrm{e}^{j\omega_c n}\right\} = \frac{1}{2}\left(\tilde{x}[n]\mathrm{e}^{j\omega_c n} + \tilde{x}^*[n]\mathrm{e}^{-j\omega_c n}\right)$$
(5.6)

where $\omega_c$ is the normalized carrier frequency equal to $\frac{\tilde{\omega}_c}{\omega_s}$. We rewrite the above equation in the following way:

$$x[n] = \frac{1}{2} \sum_{b=0}^{1} \bar{x}_b[n] \exp\left[(-1)^b j\omega_c n\right] \tag{5.7}$$

where b is a binary index, $b \in \{0, 1\}$. With this notation the argument of the product in (5.5) becomes:

$$x[n-q_i] = \frac{1}{2} \sum_{b=0}^{1} \bar{x}_b[n-q_i] \exp\left[(-1)^b j\omega_c n\right] \exp\left[(-1)^{b+1} j\omega_c q_i\right] \tag{5.8}$$

in which

$$\bar{x}_b[n] = \begin{cases} \tilde{x}[n], & \text{if } b = 0\\ \tilde{x}^*[n], & \text{if } b = 1 \end{cases} \tag{5.9}$$

We can rewrite (5.5) as:

$$y_k[n] = \frac{1}{2^k} \sum_{q_1} \cdots \sum_{q_k} h_k[q_1, \dots, q_k] \cdot \prod_{i=1}^k \sum_{b=0}^1 \bar{x}_b[n-q_i] \exp\left[(-1)^b j\omega_c n\right] \exp\left[(-1)^{b+1} j\omega_c q_i\right] \tag{5.10}$$

Using the identity:

$$\prod_{i=1}^{k} \sum_{b=0}^{1} f_{b}[q_{i}] = \underbrace{\sum_{b_{1}=0}^{1} \cdots \sum_{b_{k}=0}^{1}}_{k \text{ sums}} \prod_{i=1}^{k} f_{b_{i}}[q_{i}]$$
(5.11)

we can rewrite (5.10) as:

$$\begin{split} y_{k}[n] &= \frac{1}{2^{k}} \sum_{q_{1}} \cdots \sum_{q_{k}} h_{k}[q_{1} \dots q_{k}] \sum_{b_{1}=0}^{1} \cdots \sum_{b_{k}=0}^{1} \prod_{i=1}^{k} \left\{ \bar{x}_{b_{i}}[n-q_{i}]e^{\left[(-1)^{b_{i}}j\omega_{c}n\right]}e^{\left[(-1)^{b_{i}+1}j\omega_{c}q_{i}\right]} \right\} \\ &= \frac{1}{2^{k}} \sum_{b_{1}=0}^{1} \cdots \sum_{b_{k}=0}^{1} \sum_{q_{1}} \cdots \sum_{q_{k}} h_{k}[q_{1} \dots q_{k}] \prod_{i=1}^{k} \bar{x}_{b_{i}}[n-q_{i}] \cdot \exp\left[j\omega_{c}q_{i}\sum_{n=1}^{k} (-1)^{b_{n}+1}\right] \exp\left[j\omega_{c}n\sum_{n=1}^{k} (-1)^{b_{n}}\right] \end{split} \tag{5.12}$$

We can see that for k = 1 the resulting expression for the first order linear term is:

$$y_1[n] = \frac{1}{2} \sum_{b_1=0}^{1} \sum_{q_1} h_1[q_1] \cdot \bar{x}_{b_1}[n-q_1] \cdot \exp\left[j\omega_c q_1(-1)^{b_1+1}\right] \exp\left[j\omega_c n(-1)^{b_1}\right]$$
(5.13)

As in (5.6), the expression of the output signal can be written as:

$$y[n] = \Re\left\{\tilde{y}[n]e^{j\omega_c n}\right\} = \frac{1}{2}\left(\tilde{y}[n]e^{j\omega_c n} + \tilde{y}^*[n]e^{-j\omega_c n}\right)$$
(5.14)

By inspection we determine the expression for the output complex envelope of the first-order term:

$$\tilde{y}_1[n] = \sum_{q_1} h_1[q_1] \cdot \bar{x}_0[n-q_1] \cdot \exp\left(-j\omega_c q_1\right) = \sum_{q_1} \tilde{h}_1[q_1] \cdot \tilde{x}[n-q_1] \tag{5.15}$$

with:

$$\tilde{h}_1[q_1] = h_1[q_1] \cdot \exp(-j\omega_c q_1)$$
(5.16)
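The first-order result can be checked numerically. The sketch below assumes an arbitrary real kernel `h1`, a random complex envelope and a normalized carrier `omega_c` (all hypothetical values) and compares the real bandpass filtering with the baseband-equivalent filtering of (5.15) and (5.16).

```python
import numpy as np

rng = np.random.default_rng(1)
N, Q = 256, 5                       # sequence length and kernel length (assumed)
omega_c = 0.5 * np.pi               # assumed normalized carrier, rad/sample
n = np.arange(N)

x_env = rng.standard_normal(N) + 1j * rng.standard_normal(N)   # complex envelope
h1 = rng.standard_normal(Q)                                    # real kernel h_1[q]

# Real bandpass input (5.6) and first-order term of (5.5), zero initial state
x = np.real(x_env * np.exp(1j * omega_c * n))
y = np.convolve(x, h1)[:N]

# Baseband-equivalent kernel (5.16) and output envelope (5.15)
h1_bb = h1 * np.exp(-1j * omega_c * np.arange(Q))
y_env = np.convolve(x_env, h1_bb)[:N]

# Back to the real bandpass signal through (5.14)
y_from_env = np.real(y_env * np.exp(1j * omega_c * n))
print(np.allclose(y, y_from_env))
```

The two paths agree sample by sample because the relation is purely algebraic; no band-limiting assumption on the envelope is needed for the linear term.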

Focusing on (5.12), we can see that the expression is composed of a sum of k-dimensional sums multiplied by complex exponentials. We can represent this signal as:

$$y_k[n] = \sum_{\pi(\mathbf{B}_k)} g_k[\mathbf{B}_k, n] \cdot \exp\left[jm(\mathbf{B}_k)\omega_c n\right]$$
(5.17)

where:

- $\mathbf{B}_k$ is a k-element binary vector such that $\mathbf{B}_k(i) = b_i$
- $\pi(\mathbf{B}_k)$ is the set containing the $2^k$ permutations of k binary values $\{0, 1\}$
- $g_k[\mathbf{B}_k, n] = \frac{1}{2^k} \sum_{q_1} \cdots \sum_{q_k} h_k[q_1 \dots q_k] \prod_{i=1}^k \bar{x}_{b_i}[n-q_i] \exp\left[j\omega_c q_i \sum_{n=1}^k (-1)^{b_n+1}\right]$
- $m(\mathbf{B}_k) = \sum_{n=1}^k (-1)^{b_n}$ is equal to the difference between the number of zeros and the number of ones in $\mathbf{B}_k$. For even values of k, $m(\mathbf{B}_k)$ has even values $\{0, \pm 2, \pm 4, \ldots, \pm k\}$; for odd values of k, $m(\mathbf{B}_k)$ has odd values $\{\pm 1, \pm 3, \ldots, \pm k\}$.

To understand the "useful" contributions of the non-linear terms to the bandpass output we must analyze the problem in the frequency domain. Using the modulation property of the discrete-time Fourier transform:

$$\mathfrak{F}\left\{f[n]\mathrm{e}^{j\omega_{c}n}\right\} = F\left(\mathrm{e}^{j(\omega-\omega_{c})}\right) \tag{5.18}$$

we can write the Fourier transform of 5.17 as:

$$Y_k(e^{j\omega}) = \sum_{\pi(\mathbf{B}_k)} G_k\left[\mathbf{B}_k, e^{j[\omega - m(\mathbf{B}_k)\omega_c]}\right]$$
(5.19)

where $G_k(\mathbf{B}_k, e^{j\omega})$ is the Fourier transform of $g_k[\mathbf{B}_k, n]$, whose bandwidth is k times that of the input signal. The above equation tells us that the spectrum of the output of the k-th order Volterra kernel is composed of $2^k$ contributions centered around integer multiples of the carrier frequency $\omega_c$, as shown in Figure 5.3. Only the ones around $\pm \omega_c$ remain after a bandpass filter (e.g. an anti-aliasing filter); these are generated by odd values of k. In a system with K (odd) as the maximum order of non-linearity we can apply the change of variable:

$$k = 2l - 1 \quad \text{with } l \in \mathbb{N},\ 1 \le l \le \frac{K+1}{2} \tag{5.20}$$

The output signal expression becomes:

$$y_{2l-1}[n] = \sum_{\pi(\mathbf{B}_{2l-1})} g_{2l-1}[\mathbf{B}_{2l-1}, n] \cdot \exp\left[jm(\mathbf{B}_{2l-1})\omega_c n\right]$$
(5.21)

Among the $2^{2l-1}$ vectors given by $\pi(\mathbf{B}_{2l-1})$ only the ones in which $m(\mathbf{B}_{2l-1}) = \pm 1$ (the difference between the number of zeros and the number of ones is $\pm 1$) must be considered. The target subset is composed of twice the permutations of l elements in 2l - 1 positions, because the l elements are zeros in one case and ones in the other. We define $\mathbf{B}_{2l-1}^{l}$ as the generic vector containing l zeros and, symmetrically, $\mathbf{B}_{2l-1}^{l-1}$ as the generic vector containing l ones. We can write:

$$\pi\left(\mathbf{B}_{2l-1}^{l}\right) = \left\{\pi(\mathbf{B}_{2l-1}) : m(\mathbf{B}_{2l-1}) = 1\right\}$$
(5.22)

$$\pi\left(\mathbf{B}_{2l-1}^{l-1}\right) = \left\{\pi(\mathbf{B}_{2l-1}) : m(\mathbf{B}_{2l-1}) = -1\right\}$$
(5.23)

The number of permutations is given by the multinomial coefficient.

$$\operatorname{card}\left[\pi\left(\mathbf{B}_{2l-1}^{l}\right)\right] = \operatorname{card}\left[\pi\left(\mathbf{B}_{2l-1}^{l-1}\right)\right] = \binom{2l-1}{l,l-1} = \frac{(2l-1)!}{l!(l-1)!} = \frac{1}{2}\binom{2l}{l} \quad (5.24)$$
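The counting argument behind (5.24) can be verified by brute force. The sketch below enumerates all binary vectors of length k and groups them by $m(\mathbf{B}_k)$; the helper names `m_value` and `count_by_m` are hypothetical.

```python
from itertools import product
from math import comb

def m_value(bits):
    """m(B_k): number of zeros minus number of ones in the binary vector."""
    return sum((-1) ** b for b in bits)

def count_by_m(k):
    """Count the 2^k binary vectors of length k grouped by their m value."""
    counts = {}
    for bits in product((0, 1), repeat=k):
        m = m_value(bits)
        counts[m] = counts.get(m, 0) + 1
    return counts

for l in (1, 2, 3):
    k = 2 * l - 1
    counts = count_by_m(k)
    # Right-hand side of (5.24): half the central binomial coefficient
    print(k, dict(sorted(counts.items())), comb(2 * l, l) // 2)
```

For odd k only odd values of m appear, and the number of vectors with m = 1 equals the number with m = -1, in agreement with (5.24).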

The useful output signal around  $\pm \omega_c$  can be written as the sum of the components generated by the two subsets:

$$y_{2l-1}^{\pm\omega_c}[n] = \sum_{\pi\left(\mathbf{B}_{2l-1}^l\right)} g_{2l-1}\left[\mathbf{B}_{2l-1}^l, n\right] \cdot e^{j\omega_c n} + \sum_{\pi\left(\mathbf{B}_{2l-1}^{l-1}\right)} g_{2l-1}\left[\mathbf{B}_{2l-1}^{l-1}, n\right] \cdot e^{-j\omega_c n}$$
(5.25)



Figure 5.3. Nonlinear contributions to output spectrum. Even orders kernels don't produce components around  $f_c$ 

We are interested in simplifying the functions inside the sums for the vector subsets that we are considering. For each vector  $\mathbf{B}_{2l-1}^{l}$  we can write:

$$\prod_{i=1}^{2l-1} \bar{x}_{b_i}[n-q_i] = \underbrace{\prod_{i=1}^{l} \bar{x}_0[n-q_i]}_{l \text{ zeros}} \cdot \underbrace{\prod_{i=l+1}^{2l-1} \bar{x}_1[n-q_i]}_{l-1 \text{ ones}}$$
(5.26)

Therefore the first sum in 5.25 becomes:

$$\sum_{\pi(\mathbf{B}_{2l-1}^{l})} g_{2l-1} \left[ \mathbf{B}_{2l-1}^{l}, n \right] = \frac{1}{2^{2l}} \binom{2l}{l} \sum_{q_{1}} \cdots \sum_{q_{2l-1}} h_{2l-1}[q_{1} \dots q_{2l-1}] \cdot \prod_{i=1}^{l} \tilde{x}[n-q_{i}] \cdot \prod_{i=l+1}^{2l-1} \tilde{x}^{*}[n-q_{i}] \cdot \prod_{i=1}^{2l-1} e^{-j\omega_{c}q_{i}}$$
(5.27)

If we consider the same summation limits for each variable $q_i$ in (5.25), we can use arbitrary indexes for the terms inside the products, obtaining the key condition for the conjugate symmetry. The second sum can be written as:

$$\sum_{\pi(\mathbf{B}_{2l-1}^{l-1})} g_{2l-1} \left[ \mathbf{B}_{2l-1}^{l-1}, n \right] = \frac{1}{2^{2l}} \binom{2l}{l} \sum_{q_1} \cdots \sum_{q_{2l-1}} h_{2l-1} [q_1 \dots q_{2l-1}] \cdot \prod_{i=1}^l \tilde{x}^* [n-q_i] \cdot \prod_{i=l+1}^{2l-1} \tilde{x} [n-q_i] \cdot \prod_{i=1}^{2l-1} e^{j\omega_c q_i}$$
(5.28)

The equations (5.27) and (5.28) verify the following identity:

$$\sum_{\pi\left(\mathbf{B}_{2l-1}^{l-1}\right)} g_{2l-1}\left[\mathbf{B}_{2l-1}^{l-1}, n\right] = \left\{\sum_{\pi\left(\mathbf{B}_{2l-1}^{l}\right)} g_{2l-1}\left[\mathbf{B}_{2l-1}^{l}, n\right]\right\}^{*}$$
(5.29)

We have reached a useful mathematical notation for determining the bandpass response between input and output complex envelopes. In [58] this notation is called RF Volterra Model (RF-VM). Considering (5.14) and (5.29), by inspection of (5.25) we can write:

$$\begin{split} \tilde{y}_{2l-1}[n] &= \\ &= \frac{1}{2^{2l-1}} \binom{2l}{l} \sum_{q_1} \cdots \sum_{q_{2l-1}} h_{2l-1}[q_1 \dots q_{2l-1}] \prod_{i=1}^l \tilde{x}[n-q_i] \prod_{i=l+1}^{2l-1} \tilde{x}^*[n-q_i] \prod_{i=1}^{2l-1} e^{-j\omega_c q_i} \\ &= \sum_{q_1} \cdots \sum_{q_{2l-1}} \tilde{h}_{2l-1}[q_1 \dots q_{2l-1}] \cdot \prod_{i=1}^l \tilde{x}[n-q_i] \prod_{i=l+1}^{2l-1} \tilde{x}^*[n-q_i] \end{split}$$
(5.30)

with

$$\tilde{h}_{2l-1}[q_1\dots q_{2l-1}] = \frac{1}{2^{2l-1}} \begin{pmatrix} 2l\\ l \end{pmatrix} h_{2l-1}[q_1\dots q_{2l-1}] \prod_{i=1}^{2l-1} e^{-j\omega_c q_i}$$
(5.31)
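The scalar factor in (5.31) can be checked on the simplest case, a memoryless cubic, for which $h_3$ reduces to a unit impulse at zero lag. The sketch below is only an illustration: the envelope is random, `omega_c` is an assumed value, and the weight 1/4 of the term around $3\omega_c$ is taken from the binomial expansion of the cube so that the identity can be closed.

```python
import numpy as np
from math import comb

def in_band_gain(l):
    """Scalar factor of (5.31): binom(2l, l) / 2^(2l-1)."""
    return comb(2 * l, l) / 2 ** (2 * l - 1)

rng = np.random.default_rng(2)
N = 128
omega_c = 0.3 * np.pi                       # assumed normalized carrier
n = np.arange(N)
x_env = rng.standard_normal(N) + 1j * rng.standard_normal(N)
carrier = np.exp(1j * omega_c * n)

x = np.real(x_env * carrier)                # real bandpass signal (5.6)
y3 = x ** 3                                 # memoryless third-order term

# In-band envelope from (5.30) with l = 2, plus the term around 3*omega_c
fund = in_band_gain(2) * x_env * np.abs(x_env) ** 2
hd3 = 0.25 * x_env ** 3
y3_rebuilt = np.real(fund * carrier + hd3 * carrier ** 3)
print(in_band_gain(2), np.allclose(y3, y3_rebuilt))
```

With memory, the same factor multiplies every kernel coefficient, which is why it can be absorbed into $\tilde{h}_{2l-1}$.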

Actual implementations of Volterra filters require a finite support and are thus able to represent only finite-memory systems. Considering a truncated Volterra series with a finite memory M for the kernel of order 2l - 1 and exploiting the symmetry of the kernels, we can write:

$$\tilde{y}_{2l-1}[n] = \sum_{q_1=0}^{M} \sum_{q_2=q_1}^{M} \cdots \sum_{q_l=q_{l-1}}^{M} \sum_{q_{l+1}=0}^{M} \cdots \sum_{q_{2l-1}=q_{2l-2}}^{M} \tilde{h}_{2l-1}[q_1 \dots q_{2l-1}] \prod_{i=1}^{l} \tilde{x}[n-q_i] \prod_{i=l+1}^{2l-1} \tilde{x}^*[n-q_i] \tag{5.32}$$

Note that different kernels can have different values of memory lags.
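To give a feeling for how the truncated sums of (5.32) translate into a regression matrix, the following sketch builds the third-order (l = 2) regressors with memory M. The function names `delayed` and `third_order_basis` and the test envelope are hypothetical; zero initial conditions are assumed.

```python
import numpy as np
from itertools import combinations_with_replacement

def delayed(x, q):
    """Return x[n - q] with zero initial conditions."""
    out = np.zeros_like(x)
    out[q:] = x[: x.size - q]
    return out

def third_order_basis(x_env, M):
    """Regressors x[n-q1] * x[n-q2] * conj(x[n-q3]) with q1 <= q2,
    i.e. the l = 2 case of (5.32) with kernel symmetry exploited."""
    x_env = np.asarray(x_env, dtype=complex)
    lags = range(M + 1)
    columns, index = [], []
    for q1, q2 in combinations_with_replacement(lags, 2):
        for q3 in lags:
            columns.append(delayed(x_env, q1) * delayed(x_env, q2)
                           * np.conj(delayed(x_env, q3)))
            index.append((q1, q2, q3))
    return np.column_stack(columns), index

# Hypothetical envelope, only used to show the matrix dimensions
x_env = np.exp(2j * np.pi * 0.05 * np.arange(32))
Phi, index = third_order_basis(x_env, M=2)
print(Phi.shape, (2 + 1) ** 3)   # symmetric count versus full count
```

For M = 2 the symmetric form needs 18 columns instead of the 27 of the full triple sum, which is the saving obtained by restricting $q_2 \ge q_1$.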

### 5.2.2 Model extension considering finite out-of-band attenuation

In practical bandpass systems the out-of-band attenuation is finite. It depends on the order and the circuit implementation of the bandpass filter (the anti-aliasing filter in a receiver chain). In a multistage active filter the distortions are generated in each stage, so the ones arising in the last stage are not attenuated. Moreover, in an active filter with gain, the highest distortions are generated in the last stages, which work with higher signal amplitudes. Depending on the operating point of the active circuits, non-negligible distortion contributions can be found around $\pm 3\omega_c$ that can be higher than the fifth-order in-band distortions. If such a spectrum is sampled, the out-of-band distortions can alias in band depending on the value of the sampling frequency. Figures 5.4 and 5.5 show the output of an anti-aliasing filter with a pass band of 10 MHz and a center frequency of 30 MHz. The input of the filter is a two-tone signal with frequencies 25 and 25.3125 MHz. To obtain the first spectrum, the signal is sampled at 320 MS/s, allowing the identification of the spurious products in the frequency domain up to the fifth harmonic, while the second is obtained by sampling the signal at 40 MS/s, the actual choice used for down-converting it to a center frequency of 10 MHz with sub-sampling. It is clear that the unwanted HD3 and HD5 components alias back into the pass band of the filter.

This behavior can be modeled by including out-of-band distortion terms centered around $\pm 3\omega_c$ and $\pm 5\omega_c$ in the equivalent baseband Volterra model. Considering only the third-order harmonic distortions for the sake of simplicity, we are interested in the vectors given by $\pi(\mathbf{B}_{2l-1})$ with $m(\mathbf{B}_{2l-1}) = \pm 3$ (the difference between the number of zeros and the number of ones is $\pm 3$). The target subset is composed of twice the permutations of l + 1 elements in 2l - 1 positions, because the l + 1 elements are zeros in one case and ones in the other. We define $\mathbf{B}_{2l-1}^{l+1}$ as the generic vector containing l + 1 zeros and, symmetrically, $\mathbf{B}_{2l-1}^{l-2}$ as the generic vector containing l + 1 ones. We can write:

$$\pi \left( \mathbf{B}_{2l-1}^{l+1} \right) = \left\{ \pi(\mathbf{B}_{2l-1}) : m(\mathbf{B}_{2l-1}) = 3 \right\}$$
(5.34)

$$\pi\left(\mathbf{B}_{2l-1}^{l-2}\right) = \left\{\pi(\mathbf{B}_{2l-1}) : m(\mathbf{B}_{2l-1}) = -3\right\}$$
(5.35)

The number of permutations is given by the multinomial coefficient.

$$\operatorname{card}\left[\pi\left(\mathbf{B}_{2l-1}^{l+1}\right)\right] = \operatorname{card}\left[\pi\left(\mathbf{B}_{2l-1}^{l-2}\right)\right] = \binom{2l-1}{l+1,l-2} = \frac{(2l-1)!}{(l+1)!(l-2)!} = \frac{1}{2}\frac{l-1}{l+1}\binom{2l}{l}$$
(5.36)



Figure 5.4. Spectrum of the Anti-Aliasing Filter output with  $f_s = 320$ MS/s.



Figure 5.5. Spectrum of the sub-sampled Anti-Aliasing Filter output with  $f_s = 40$ MS/s.

The output signal containing third-order harmonic distortions can be written as the sum of the components generated by the two subsets:

$$y_{2l-1}^{\pm 3\omega_c}[n] = \sum_{\pi\left(\mathbf{B}_{2l-1}^{l+1}\right)} g_{2l-1}^{\pm 3\omega_c} \left[\mathbf{B}_{2l-1}^{l+1}, n\right] \cdot e^{j3\omega_c n} + \sum_{\pi\left(\mathbf{B}_{2l-1}^{l-2}\right)} g_{2l-1}^{\pm 3\omega_c} \left[\mathbf{B}_{2l-1}^{l-2}, n\right] \cdot e^{-j3\omega_c n} \tag{5.37}$$

We are interested in simplifying the functions inside the sums for the subsets of vectors that we are considering. For each vector $\mathbf{B}_{2l-1}^{l+1}$ we can write:

$$\prod_{i=1}^{2l-1} \bar{x}_{b_i}[n-q_i] = \underbrace{\prod_{i=1}^{l+1} \bar{x}_0[n-q_i]}_{l+1 \text{ zeros}} \cdot \underbrace{\prod_{i=l+2}^{2l-1} \bar{x}_1[n-q_i]}_{l-2 \text{ ones}}$$
(5.38)

Therefore the first sum in 5.37 becomes:

$$\sum_{\pi(\mathbf{B}_{2l-1}^{l+1})} g_{2l-1}^{\pm 3\omega_c} \left[ \mathbf{B}_{2l-1}^{l+1}, n \right] = \frac{1}{2^{2l}} \frac{l-1}{l+1} \binom{2l}{l} \sum_{q_1} \cdots \sum_{q_{2l-1}} h_{2l-1} [q_1 \dots q_{2l-1}] \prod_{i=1}^{l+1} \tilde{x}[n-q_i] \prod_{i=l+2}^{2l-1} \tilde{x}^*[n-q_i] \prod_{i=1}^{2l-1} e^{-j3\omega_c q_i} \tag{5.39}$$

The second sum can be written as:

$$\sum_{\pi(\mathbf{B}_{2l-1}^{l-2})} g_{2l-1}^{\pm 3\omega_c} \left[ \mathbf{B}_{2l-1}^{l-2}, n \right] = \frac{1}{2^{2l}} \frac{l-1}{l+1} \binom{2l}{l} \sum_{q_1} \cdots \sum_{q_{2l-1}} h_{2l-1} [q_1 \dots q_{2l-1}] \prod_{i=1}^{l+1} \tilde{x}^* [n-q_i] \prod_{i=l+2}^{2l-1} \tilde{x} [n-q_i] \prod_{i=1}^{2l-1} e^{j3\omega_c q_i} \tag{5.40}$$

The two sums are complex conjugates of each other. Since we are considering the baseband notation with respect to $\omega_c$, we must compare expression (5.37) with (5.14). This lets us identify the expression of the complex envelope:

$$\begin{split} \tilde{y}_{2l-1}^{\pm 3\omega_{c}}[n] &= \\ &= \frac{e^{j2\omega_{c}n}}{2^{2l-1}} \frac{l-1}{l+1} \binom{2l}{l} \sum_{q_{1}} \cdots \sum_{q_{2l-1}} h_{2l-1}[q_{1}\dots q_{2l-1}] \prod_{i=1}^{l+1} \tilde{x}[n-q_{i}] \prod_{i=l+2}^{2l-1} \tilde{x}^{*}[n-q_{i}] \prod_{i=1}^{2l-1} e^{-j3\omega_{c}q_{i}} \\ &= \sum_{q_{1}} \cdots \sum_{q_{2l-1}} \tilde{h}_{2l-1}^{\pm 3\omega_{c}}[q_{1}\dots q_{2l-1}] \prod_{i=1}^{l+1} \tilde{x}[n-q_{i}] \prod_{i=l+2}^{2l-1} \tilde{x}^{*}[n-q_{i}] \cdot e^{j2\omega_{c}n} \end{split} \tag{5.41}$$

with

$$\tilde{h}_{2l-1}^{\pm 3\omega_c}[q_1\dots q_{2l-1}] = \frac{1}{2^{2l-1}} \frac{l-1}{l+1} \binom{2l}{l} h_{2l-1}[q_1\dots q_{2l-1}] \prod_{i=1}^{2l-1} e^{-j3\omega_c q_i}$$
(5.42)
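As a worked instance of (5.36) and (5.42), consider the third-order kernel (l = 2). Only the vector with three zeros (and, symmetrically, the one with three ones) satisfies $m = \pm 3$, so that:

$$\operatorname{card}\left[\pi\left(\mathbf{B}_{3}^{3}\right)\right] = \frac{3!}{3!\,0!} = 1 \qquad \frac{1}{2^{3}} \cdot \frac{1}{3} \binom{4}{2} = \frac{1}{4}$$

For a memoryless cubic term this scalar, together with the factor 3/4 given by (5.31) for l = 2, matches the familiar identity $\cos^3\theta = \frac{3}{4}\cos\theta + \frac{1}{4}\cos 3\theta$.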

Focusing for example on the contributions around $3\omega_c$ generated by third order kernels (l = 2, HD3), we have:

$$\tilde{y}_{3}^{\pm 3\omega_{c}}[n] = \sum_{q_{1}} \sum_{q_{2}} \sum_{q_{3}} \tilde{h}_{3}^{\pm 3\omega_{c}}[q_{1}, q_{2}, q_{3}] \cdot \tilde{x}[n-q_{1}] \cdot \tilde{x}[n-q_{2}] \cdot \tilde{x}[n-q_{3}] \cdot e^{j2\omega_{c}n} \tag{5.43}$$

The result in (5.41) is directly applicable when considering the architecture of the IQ receiver in Fig. 5.2, where the baseband components are extracted in the analog domain and then sampled. For the digital-IF receiver, where the sampling process is done before the quadrature mixing, further analysis is carried out in the following subsection, also considering sub-sampling and aliasing effects. Nevertheless, thanks to the calculations made so far we gain insight into how to model both IMD3 and HD3, and we can thus deduce the mathematical representation of the other distortion components, reported in Table 5.1 up to the fifth order.

| Component  | Expression                                          |
|------------|-----------------------------------------------------|
| IM3        | $\tilde{x}[n] \cdot \lvert\tilde{x}[n]\rvert^2$     |
| IM5        | $\tilde{x}[n] \cdot \lvert\tilde{x}[n]\rvert^4$     |
| HD3        | $\tilde{x}[n]^3$                                    |
| HD5        | $\tilde{x}[n]^5$                                    |
| IM5 to HD3 | $\tilde{x}[n]^4 \cdot \tilde{x}^*[n]$               |

Table 5.1. Baseband polynomial distortion components up to fifth order

### 5.2.3 Baseband modeling of sub-sampled bandpass systems

The goal of this analysis is to model RF distortions directly at baseband, including the aliasing of the in-band and out-of-band distortions in a sub-sampling receiver. If we sample the analog output of a linear bandpass system with a bandwidth B centered in one of the Nyquist zones, by the Nyquist criterion we must use a sampling frequency $f_s > 2B$ to prevent information loss due to aliasing. When a more realistic nonlinear bandpass system is taken into account (e.g. RF receivers, active filters) we must consider its in-band distortion components up to a certain order. The bandwidth of the distortions is proportional to their order: the third-order distortion bandwidth is 3B, the fifth-order one is 5B, etc. Considering for example only third-order systems, the distortion spectrum behaves differently depending on the choice of $f_s$: we can identify the values of $f_s$ that produce no distortion aliasing and the ones that produce distortion aliasing that does not overlap with the useful signal spectrum.

- $f_s \ge 6B \rightarrow$  no third order distortions aliasing
- $4B \leq f_s < 6B \rightarrow$  third order distortions aliasing outside the useful signal spectrum
- $2B \leq f_s < 4B \rightarrow$  third order distortions aliasing inside the useful signal spectrum

Thus depending on the choice of  $f_s$  and B, different spectra are observed in the first Nyquist zone after bandpass sampling. Classical modeling of IMD at baseband  $(\tilde{x}[n] \cdot |\tilde{x}[n]|^2)$  is unable to describe this behavior for every value of  $f_s$ .
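The three regimes listed above can be captured by a small helper. The function name `imd3_aliasing_regime` and the returned labels are hypothetical; the thresholds are the ones given in the list.

```python
def imd3_aliasing_regime(fs, bandwidth):
    """Classify third-order distortion aliasing for a bandpass signal of
    bandwidth B sampled at fs, following the three cases listed above."""
    if fs < 2 * bandwidth:
        raise ValueError("fs < 2B: the useful signal itself would alias")
    if fs >= 6 * bandwidth:
        return "no third-order aliasing"
    if fs >= 4 * bandwidth:
        return "aliasing outside the useful signal spectrum"
    return "aliasing inside the useful signal spectrum"

# Values in MHz; the first pair is the one used later in the behavioral simulations
print(imd3_aliasing_regime(40.0, 15.0))
print(imd3_aliasing_regime(40.0, 10.0))
print(imd3_aliasing_regime(320.0, 15.0))
```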

Let us focus on the bandpass signal in Fig. 5.6, which meets the constraint $4B \leq f_s < 6B$. The central frequency is supposed to fall at the center of an odd Nyquist zone, so there is no need to flip the spectrum in the first zone after sampling. After the sampling process the spectrum of the real digitized signal looks like the one shown in Fig. 5.7. Low-IF down-conversion is performed with $f_{IF} = f_s/4$ if the original carrier frequency $f_c$ is centered in a Nyquist zone. Aliasing of the distortion spectrum occurs. We obtain the real and the imaginary components of the baseband signal by multiplying the real sampled signal by the cosine and the sine at $f_s/4$ and then low-pass filtering with a half-band filter. The actual baseband spectrum is depicted in Fig. 5.8a.

Figure 5.6. Real analog spectrum before the sampling process, composed of the useful signal and the IMD3

Figure 5.7. Real digital spectrum after the sampling process



Figure 5.8. Comparison between the BF spectrum obtained after sub-sampling and the one generated directly at baseband

As can be seen in Fig. 5.8b, the classical way of modeling IMD does not reproduce the actual behavior of a sub-sampled bandpass nonlinear system when $f_s < 6B$, because it does not take the spectrum folding into account properly. Even if we use half the sampling frequency at baseband prior to the generation of the IMD component, the aliasing on the complex spectrum will not reproduce the folding behavior of a real symmetric spectrum.

To correctly model the distortions at baseband a conjugated replica of the IMD must be added, shifted in the frequency domain by $f_s/2$, and the result must then be low-pass filtered with a half-band filter. The resulting spectrum is shown in Fig. 5.9. The $f_s/2$ shift corresponds to a multiplication by $\exp(j\pi n)$ because the sampling frequency is $f_s$:

$$\exp\left(j2\pi\frac{f_s}{2}n\frac{1}{f_s}\right) = \exp(j\pi n) = (-1)^n \tag{5.44}$$

The IMD components can be written as:

$$\tilde{y}_{IMD}[n] = LPF\{\tilde{x}[n] \cdot |\tilde{x}[n]|^2 + \tilde{x}^*[n] \cdot |\tilde{x}[n]|^2(-1)^n\}$$
(5.45)

The above method also applies to the modeling of harmonic distortions in nonlinear bandpass systems such as active multistage filters. The HD components alias into the first Nyquist zone (NZ) after sub-sampling and are also affected by folding due to the constraint $f_s < 6B$. Thanks to the analysis in the previous subsection we know that the term $\tilde{x}[n]^3$ produces HD3, but we have to take into account spectral folding depending on the value of $3f_c$. Considering again the case of a useful signal spectrum centered in an NZ, we can write:

$$f_c = \frac{f_s}{4} + k\frac{f_s}{2} \qquad \qquad 3f_c = \frac{f_s}{4} + (3k+1)\frac{f_s}{2} \qquad (5.46)$$



Figure 5.9. Complex digital spectrum with the addition of the shifted conjugated replica of the IMD components and low-pass filtered

Since $k$ and $3k+1$ always have opposite parity, for each value of k the first and the third harmonic fall in Nyquist zones of opposite parity (one even and one odd). This behavior implies the conjugation of the HD3 component in the model. We must include the frequency shift and low-pass filtering as in the IMD case, obtaining the following formula:

$$\tilde{y}_{HD3}[n] = LPF\{(\tilde{x}^*)^3[n] + \tilde{x}^3[n](-1)^n\}$$
(5.47)

The overall RF extended Volterra model (RFeVM) will include both IMD and HD components with memory.
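A memoryless version of the two building blocks (5.45) and (5.47) can be sketched as follows. This is not the implementation used in the thesis: the windowed-sinc design in `halfband_fir`, its length, the function name `rfevm_basis` and the test tone are all assumptions made for illustration.

```python
import numpy as np

def halfband_fir(num_taps=63):
    """Windowed-sinc low-pass with cutoff at fs/4 (assumed design, odd length)."""
    k = np.arange(num_taps) - (num_taps - 1) / 2
    return 0.5 * np.sinc(k / 2) * np.hamming(num_taps)

def rfevm_basis(x_env, lpf):
    """Memoryless IMD and HD3 terms of (5.45) and (5.47)."""
    x_env = np.asarray(x_env, dtype=complex)
    n = np.arange(x_env.size)
    alt = (-1.0) ** n                     # exp(j*pi*n): the fs/2 shift of (5.44)
    mag2 = np.abs(x_env) ** 2

    def lowpass(s):
        # Symmetric FIR applied with 'same' alignment (no group delay)
        return np.convolve(s, lpf, mode="same")

    imd = lowpass(x_env * mag2 + np.conj(x_env) * mag2 * alt)
    hd3 = lowpass(np.conj(x_env) ** 3 + x_env ** 3 * alt)
    return imd, hd3

# Hypothetical slow complex tone, well inside the half-band filter passband
N = 512
n = np.arange(N)
x_env = np.exp(2j * np.pi * 0.02 * n)
imd_term, hd3_term = rfevm_basis(x_env, halfband_fir())
```

For such a narrowband input the shifted replicas fall in the stopband of the half-band filter, so only the direct IMD term and the conjugated HD3 term survive; with wideband inputs the replicas contribute the folded part of the spectrum.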

#### 5.2.4 Behavioral simulations

To prove the validity of the RFeVM a bandpass nonlinear system with memory has been simulated at IF with a sampling frequency high enough to represent HD3 without aliasing. The presence of HDs in the output spectrum of active bandpass filters has been shown by circuit simulations in Subsection 5.2.2. After sub-sampling the BF output is computed. Then the classical RFVM and the proposed RFeVM are used to model the post-inverse system that equalizes the linear and nonlinear transfer functions. The model parameters have been estimated using Least Squares on a batch of 17 three-tone input-output BF signals. The signal bandwidth is B = 15 MHz centered around $f_c = 30$ MHz. Nonlinearities with memory are generated at $f'_s = 320$ MHz, obtaining:

$$y_{IF}[n] = x_{in}[n] + \alpha x_{in}^3[n] + \beta x_{in}^2[n]x_{in}[n-1]$$
(5.48)
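A direct transcription of (5.48) is shown below as a sketch. The values of `alpha` and `beta` are placeholders, since they are not reported here, and the two-tone stimulus simply reuses the frequencies quoted in Subsection 5.2.2 for illustration.

```python
import numpy as np

def if_nonlinearity(x_in, alpha, beta):
    """Evaluate (5.48) with a zero initial condition for x_in[-1]."""
    x_in = np.asarray(x_in, dtype=float)
    x_prev = np.concatenate(([0.0], x_in[:-1]))      # x_in[n-1]
    return x_in + alpha * x_in ** 3 + beta * x_in ** 2 * x_prev

# Illustrative two-tone stimulus sampled at 320 MHz (frequencies from Subsection 5.2.2)
fs_hi = 320e6
n = np.arange(4096)
x_in = 0.5 * (np.cos(2 * np.pi * 25e6 * n / fs_hi)
              + np.cos(2 * np.pi * 25.3125e6 * n / fs_hi))
y_if = if_nonlinearity(x_in, alpha=-0.1, beta=0.02)   # placeholder coefficients
```

The term with `beta` introduces one sample of memory, which is what makes the distortion frequency dependent.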

The sub-sampling frequency is  $f_s = 40 \text{ MHz}$ :  $f_c$  is centered in the second NZ and the  $3f_c$  component in the fifth. In this case the relation  $2B \leq f_s < 4B$  holds. Both the BF Volterra models are set symmetric and with a maximum memory equal to 2.
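The Nyquist zone bookkeeping for these values can be reproduced with two short helpers. The function names `nyquist_zone` and `alias_frequency` are hypothetical, and frequencies are expressed in MHz.

```python
def nyquist_zone(f, fs):
    """Index (starting from 1) of the Nyquist zone containing frequency f."""
    return int(f // (fs / 2)) + 1

def alias_frequency(f, fs):
    """Frequency in the first Nyquist zone where f appears after sampling at fs."""
    r = f % fs
    return r if r <= fs / 2 else fs - r

fs, fc = 40.0, 30.0    # MHz, values used in this subsection
print(nyquist_zone(fc, fs), alias_frequency(fc, fs))
print(nyquist_zone(3 * fc, fs), alias_frequency(3 * fc, fs))

# Parity check for carriers centered in a zone, as in (5.46)
for k in range(6):
    f_center = fs / 4 + k * fs / 2
    z1 = nyquist_zone(f_center, fs)
    z3 = nyquist_zone(3 * f_center, fs)
    print(k, z1, z3, (z1 + z3) % 2 == 1)
```

Both the carrier and its third harmonic alias to $f_s/4$, but from zones of opposite parity, which is why the HD3 term must be conjugated in the model.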

In Fig. 5.10 the post-calibration spectra obtained with the RFVM and the RFeVM are compared. The input signal is an out-of-sample two-tone signal with frequencies near the bounds of the passband, which produces IMD and HD3 affected by aliasing. It is clear that the RFVM is not able to correct these distortions. The RFeVM compensates for both IMD and HD affected by aliasing: it has no out-of-band components thanks to the low-pass filters included in the model. Higher-order residual components are generated due to the cascade of two third-order systems (the original and the post-inverse one).



Figure 5.10. Comparison between post calibration spectra using RFVM and RFeVM using an out-of-sample detached two-tone signal

Fig. 5.11 shows that the RFVM is only able to compensate IMD not affected by aliasing but fails to correct HD components that alias in band. In Fig. 5.12 the comparison between SFDR enhancements obtained with the two models is shown. The test has been performed using a set of 25 two-tone out-of-sample signals with frequencies lying on adjecent FFT bins. The RFeVM achieves more than 20dB of SFDR enhancemet within all the passband while the classical model shows very poor performance due to its inability to compensate the harmonic distortions. In such a test the IMD aliasing does not occur because each test signal is narrowband. One way to check the wideband behavior of the estimated models is to evaluate the post calibration Noise to Power Ratio (NPR) using an out-of-sample multisine signal with a notch at the center of the spectrum. Fig.5.13 shows that using the RFeVM the spectral regrowth decreases by more than 10dB while it gets worse by the same value using the RFVM. The NPR value before calibration is 23.7dB, and becomes 39.6dB after calibrating with RFeVM and 20.7dB using RFVM. From these simulations we can say that the classical RFVM can model nonlinear bandpass systems without out of band distortions (i.e. only nonlinear systems followed by passive band pass filters) and it is limited to the condition  $f_s > 4B$  that ensures no IMD aliasing in the useful signal band. The proposed RFeVM can model nonlinear



Figure 5.11. Comparison between post calibration spectra using RFVM and RFeVM using an out-of-sample two-tone signal



Figure 5.12. Comparison between post correction SFDR of the two models using out-of-sample two-tone signals

bandpass systems even with HDs (i.e. active multistage filters) and pushes the required $f_s$ value down ($f_s > 2B$) towards critically sampled systems (with IMD experiencing in-band aliasing). The price to be paid is a higher number of model parameters due to the presence of HD terms in the RFeVM. The presence of a half-band filter for each



Figure 5.13. Comparison between the post-calibration spectra using a wideband notched multisine, useful for the computation of the Noise Power Ratio (NPR)

model parameter is only required in the offline estimation phase, where out-of-band signals contribute to magnitude errors in the Least Squares algorithm. Once the parameters have been determined, the RFeVM filter implementation requires only one half-band filter after the linear combination of the model components.

# 5.3 System operating point impact on digital calibration performance

Post-distortion can be applied to improve the Dynamic Range (DR) of receiver chains. The nonlinear behavior of a receiver is highly dependent on the power level of the input signal, which determines the compression level of the system. In the selection of the input stimuli needed for the calibration procedure, care must be taken in choosing the peak amplitude of the signals and their Peak-to-Average Power Ratio (PAPR). These parameters affect the operating range of the digitally enhanced receiver. The approach can be summarized in these steps:

- Set the level of the input signals so that at the output the distortions exceed the noise floor.
- Acquire N periods of the useful signals (averaging the noise lowers the estimation error).
- Estimate the parameters. If the Volterra model has been chosen appropriately, the post-correction distortions are pushed down to the noise floor.
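The three steps above can be sketched numerically as follows. This is only an illustrative sketch under stated assumptions: the system is a hypothetical memoryless cubic nonlinearity, and the post-inverse model is a small memory-polynomial basis (delayed linear and cubic terms) standing in for a full Volterra model. The function names `average_periods` and `regressors` are not taken from the thesis.

```python
import numpy as np

def average_periods(y, n_periods):
    """Average n_periods consecutive periods of a periodic record (lowers the noise)."""
    period = len(y) // n_periods
    return y[: period * n_periods].reshape(n_periods, period).mean(axis=0)

def regressors(y, lags=3):
    """Hypothetical basis: delayed linear and cubic terms of the measured output."""
    cols = []
    for d in range(lags):
        yd = np.roll(y, d)  # circular delay, valid because the record is periodic
        cols += [yd, yd ** 3]
    return np.column_stack(cols)

rng = np.random.default_rng(0)
period, n_periods = 256, 64
n = np.arange(period)

# Step 1: two-tone stimulus strong enough for the distortion to exceed the noise
x_ref = 0.4 * (np.sin(2 * np.pi * 37 * n / period) + np.sin(2 * np.pi * 41 * n / period))
y_one = x_ref - 0.1 * x_ref ** 3  # assumed weakly nonlinear system
y_meas = np.tile(y_one, n_periods) + 1e-3 * rng.standard_normal(period * n_periods)

# Step 2: acquire N periods and average them
y_avg = average_periods(y_meas, n_periods)

# Step 3: least-squares estimate of the post-inverse parameters
A = regressors(y_avg)
theta, *_ = np.linalg.lstsq(A, x_ref, rcond=None)
x_hat = A @ theta

def rms(e):
    return np.sqrt(np.mean(e ** 2))

err_before, err_after = rms(y_avg - x_ref), rms(x_hat - x_ref)
print(f"RMS error before: {err_before:.2e}, after: {err_after:.2e}")
```

Because the identity mapping belongs to the chosen basis, the least-squares residual cannot exceed the uncorrected error; how far below it falls depends on how well the basis matches the actual nonlinearity.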

Increasing the compression level of the system gives rise to a higher-order set of distortions that exponentially degrades the condition number of the sample matrix. The estimated Volterra model is capable of representing the input-output behavior of the system for input amplitudes lower than or equal to the amplitude used during the estimation phase. The modeling error increases when the input level is increased further, because higher-order kernels are absent from the model.



Figure 5.14. Ideal dynamic range improvement

This method extends the operating region of the system up to the input level used in the estimation phase and it is particularly advantageous if the system is working in low compression. In these conditions, there is an increase in both SNR and SFDR that translates into a dynamic range improvement, as shown in Fig. 5.14. The closer the amplifying chain is to saturation, the smaller the digital enhancement of the DR will be: the effects of compression are highlighted in Fig. 5.15.



Figure 5.15. Dynamic range improvement limited by compression

The most suitable architecture for the application of digital calibration is then a system with weak nonlinearities, even with memory, and low noise floor. Regarding IF bandpass filters in receiver chains, the ones based on Gm-C architectures are characterized by medium linearity and poor noise performance while the ones based on OPA-RC architectures show better noise figures.

# 5.4 40nm CMOS AAF calibration

In a digital-IF receiver architecture the active anti-aliasing filter with gain is the major contributor to system non-linearity because it works with the highest signal swing. This behavior has been verified by simulating an entire receiving chain in a 40 nm process, composed of LNA, mixer and AAF: when the output of the filter (designed with a lower supply voltage to reduce the overall power consumption) shows the effects of heavy compression with fifth and seventh harmonics, its input still has good linearity. In this Section a calibration technique with the proposed model is applied to an active multistage anti-aliasing bandpass filter designed in 40 nm CMOS using mixed circuital and behavioral simulations, according to the scheme in Fig. 5.16.



Figure 5.16. Calibration test-bed using behavioral and circuital simulation environments.

In the system under test a bandpass signal centered at 30 MHz with a 10 MHz bandwidth is filtered with an Anti-Aliasing Filter (described in detail in subsection 5.4.1) and then sub-sampled in the second Nyquist zone at $f_S = 40 \text{ MS/s}$. The resulting digitized IF spectrum is flipped (conjugated) and centered at $f_{IF} = 10 \text{ MHz}$. Baseband components are extracted with a quadrature NCO using $f_{LO} = f_{IF} = f_S/4$, inverting the sign of the imaginary part to obtain the correct BF spectrum.
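The frequency plan just described can be checked with a short numerical sketch. The helper `alias_frequency` and the 31 MHz test tone are illustrative assumptions, not part of the original test-bed. A tone 1 MHz above the 30 MHz carrier should end up at +1 MHz at baseband after sub-sampling, $f_S/4$ mixing and sign inversion of the imaginary part.

```python
import numpy as np

def alias_frequency(f, fs):
    """Return (apparent frequency in [0, fs/2], Nyquist zone, spectrum inverted?)."""
    zone = int(f // (fs / 2)) + 1
    f_mod = f % fs
    inverted = f_mod > fs / 2
    return (fs - f_mod if inverted else f_mod), zone, inverted

fs = 40e6
f_if, zone, inverted = alias_frequency(30e6, fs)              # carrier -> 10 MHz, zone 2, flipped
band_edges = [alias_frequency(f, fs)[0] for f in (25e6, 35e6)]  # edges swap: 15 MHz and 5 MHz

# Sub-sample a tone 1 MHz above the carrier, then mix with the fs/4 NCO
N = 400
n = np.arange(N)
s = np.cos(2 * np.pi * 31e6 * n / fs)
nco = np.exp(-1j * np.pi / 2 * n)   # takes the values 1, -j, -1, j
bb = np.conj(s * nco)               # conjugation undoes the spectral flip

freqs = np.fft.fftfreq(N, d=1 / fs)
spec = np.abs(np.fft.fft(bb))
in_band = np.abs(freqs) < 5e6       # crude stand-in for the low-pass filter
f_peak = freqs[in_band][np.argmax(spec[in_band])]
print(f_if, zone, inverted, band_edges, f_peak)
```

Choosing $f_{LO} = f_S/4$ makes the NCO samples trivial (only $\pm 1$ and $\pm j$), so the mixer reduces to sign changes and real/imaginary swaps.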

The input stimuli are generated in MATLAB and saved in two formats: IF high-sampling-rate waveforms in ASCII format, needed to excite the device under test, and BF waveforms used as reference signals for the estimation process. The files containing the IF waveforms are imported in Cadence Virtuoso and played with a Piece Wise Linear File (PWLF) generator. Transient simulations are used to extract the output waveforms, and the parameter estimation process is carried out in the time domain. It is very important to simulate the "analog" section with sufficiently high accuracy to obtain the desired numerical precision. This is achieved by appropriately choosing different simulation settings:

- **IF waveform sampling frequency**: The PWLF generator reads a two-column file with time and amplitude values. The sampling frequency of the waveform must be high enough to limit the distortions due to linear interpolation of the source. In this study the value of 80 GS/s is adopted.
- **Transient simulation strobe**: The output of the system under test is ideally sampled by an ADC with a sampling frequency $f_S$. Using the default variable time step in the simulation, data at the sampling instants will be produced by linear interpolation when the actual simulated time instants do not coincide with the wanted ones. The strobe option can be used specifying the **strobeperiod**, 25 ns in this study, to add all the ADC sampling instants to the time points evaluated by the simulator and thus remove the interpolation error.
- **Simulator parameters**: Since we are simulating the cascade of circuits with moderate Q factors, the approximation errors due to Newton's method worsen because of oscillation phenomena of the resonant networks affecting the algorithm convergence. For this reason it is important to set the Accuracy Defaults to conservative and tighten the reltol (Newton's method relative tolerance) parameter. Controlling vabstol and iabstol we set the absolute values of the tolerable voltage and current errors. The adopted values for reltol, vabstol and iabstol are respectively $10^{-8}$, $10^{-10}$ and $10^{-12}$.

|             | BQ1  | BQ2  | BQ3  | BQ4  | BQ5  |
|-------------|------|------|------|------|------|
| $f_0$ (MHz) | 30.8 | 39.8 | 23.3 | 37.5 | 25.5 |
| Q           | 2.6  | 3.3  | 3.1  | 3.1  | 3.4  |

**Table 5.2.** Values of $f_0$ and Q of each biquadratic cell

The sampled simulation output is imported in MATLAB where it is used for the Volterra kernels estimation and for the validation process with out-of-sample waveforms. All the processing is done at baseband, using the same IQ extraction blocks for symmetry both on the reference and output signal. The IQ extraction is performed with a quadrature mixer followed by a 40-tap Butterworth low-pass filter.

#### 5.4.1 Anti-Aliasing Filter design

The bandpass anti-aliasing filter is realized by the cascade of 5 biquadratic (BQ) cells, obtaining a $10^{th}$-order transfer function. The central frequency of the filter is 30 MHz with a 3 dB bandwidth of 10 MHz and its gain is 16 dB. The values of $f_0$ and Q for the biquads are reported in Table 5.2. Each biquad is implemented using the OTA-C fully-differential architecture depicted in Fig. 5.17.



Figure 5.17. BQ architecture

For each BQ stage the values of $f_0$, Q, and the gain K depend on the capacitors $C_{1,2}$ and on the transconductances $G_{mj}$ (with $j \in \{1, \ldots, 4\}$) according to the relations in (5.49):

$$f_0 = \frac{1}{2\pi} \sqrt{\frac{G_{m3}G_{m4}}{C_1 C_2}} \qquad Q = \frac{\sqrt{G_{m3}G_{m4}}}{G_{m2}} \sqrt{\frac{C_1}{C_2}} \qquad K = \frac{G_{m1}}{G_{m2}} \tag{5.49}$$

The gain K of each BQ cell can be chosen in order to keep the peak magnitude of the frequency response of each stage equal to the one at the output. We must consider that at the  $C_2$  node the BQ shows a 2-pole low-pass frequency response with a resonance peak near the cut-off frequency proportional to its Q factor. From a linearity point of view it is important to keep also this peak value at the same level of the bandpass output in order to let the OTA  $G_{m4}$  work with the same input swing as the other OTAs. The internal voltage scaling on a node  $n_x$  by a factor  $\gamma$  is carried out increasing the impedance seen at  $n_x$  and multiplying all the transconductances leaving that node by the same factor  $\gamma$ . In our case this means:

$$C'_2 = C_2/\gamma \qquad \text{and} \qquad G'_{m4} = \gamma G_{m4} \tag{5.50}$$

The need for a linear transconductance impacts the OTA design. For this implementation the topology in [7], represented in Fig.5.18, has been used. In this



Figure 5.18. OTA architecture

transconductor a linear resistor R is used to perform the voltage-to-current conversion. The input voltage buffer is realized creating low impedance nodes across the resistor using local negative feedback. The current obtained from the resistor is delivered to the output by source coupled pairs. The design parameters used for the transconductor are shown in Table 5.3. The power consumption of the OTA is  $225 \,\mu\text{W}$ .

In the filter design, the value $G_{m2} = G_{m3} = 50 \,\mu\text{A}/\text{V}$ is adopted using $R_2 = R_3 = 20 \,\text{k}\Omega$. To obtain the desired values of gain K and low-pass node voltage

| Parameter | Value    | Unit  |
|-----------|----------|-------|
| $V_{DD}$  | 1.5      | V     |
| $I$       | 25       | μA    |
| $R$       | 10 to 20 | kΩ    |
| M1        | 40/0.2   | μm/μm |
| M2        | 10/0.1   | μm/μm |
| Mcn       | 15/0.4   | μm/μm |
| Mcp       | 50/0.5   | μm/μm |
| $V_{cn}$  | 646      | mV    |
| $V_{cp}$  | 832      | mV    |

Table 5.3. Design parameters for the OTA in Fig.5.18

scaling $\gamma$, lower values of $R_1$ and $R_4$ are needed. The value of R affects the local loop gain of the voltage buffer and thus the transconductor linearity: there is a trade-off between the lowest usable resistance and the desired linearity. To overcome this limit at the expense of power consumption, several OTAs in parallel are used to realize the $G_{m1}$ and $G_{m4}$ transconductances with $R_1 = R_4 = 10 \text{ k}\Omega$.

The fully-differential topology of the OTAs requires the use of a CMFB circuit. The adopted BQ architecture has only two independent high impedance nodes: output ports of transconductors  $G_{m1}, G_{m2}$  and  $G_{m4}$  are connected together at the bandpass output and the  $G_{m3}$  alone at the low-pass output. So two CMFBs suffice to control the common mode of the BQ circuit. The circuit implemented for the CMFBs is shown in Fig.5.19.

The parameters adopted for the CMFB are reported in Table 5.4. The power consumption of the CMFB is $120 \,\mu\text{W}$.

| Parameter | Value                 | Unit  |
|-----------|-----------------------|-------|
| $V_{DD}$  | 1.5                   | V     |
| I(Mbn)    | 40                    | μA    |
| M1        | 2.5/0.5               | μm/μm |
| M2        | $2 \times 25/0.25$    | μm/μm |
| M3        | $2 \times 25/0.75$    | μm/μm |
| Mbn       | $2 \times 7.5 / 0.75$ | μm/μm |
| $V_{Bn}$  | 470                   | mV    |
| $V_{cp}$  | 832                   | mV    |

Table 5.4. Design parameters for the CMFB in Fig.5.19

In Table 5.5 the capacitance values and the number of OTAs for each BQ are reported. The total power consumption of the filter is 9.975 mW, less than 1 mW



Figure 5.19. CMFB circuit

per pole.

|         | BQ1               | BQ2               | BQ3               | BQ4               | BQ5               |
|---------|-------------------|-------------------|-------------------|-------------------|-------------------|
| $C_1$   | $1\mathrm{pF}$    | $0.94\mathrm{pF}$ | $1.78\mathrm{pF}$ | $0.95\mathrm{pF}$ | $1.72\mathrm{pF}$ |
| $C_2$   | $0.52\mathrm{pF}$ | $0.47\mathrm{pF}$ | $0.57\mathrm{pF}$ | $0.41\mathrm{pF}$ | $0.63\mathrm{pF}$ |
| # OTAs  | 8                 | 7                 | 8                 | 7                 | 9                 |
| # CMFBs | 2                 | 2                 | 2                 | 2                 | 2                 |

 Table 5.5. Capacitance values and number of OTAs used for the implementation of the multistage anti-aliasing filter
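The stated total power can be cross-checked against the per-block figures. The sketch below assumes that only the OTAs (225 μW each) and the CMFB circuits (120 μW each) contribute to the budget, with the block counts taken from Table 5.5; the variable names are illustrative.

```python
P_OTA_UW = 225    # power of one OTA, in microwatts
P_CMFB_UW = 120   # power of one CMFB circuit, in microwatts

n_otas = {"BQ1": 8, "BQ2": 7, "BQ3": 8, "BQ4": 7, "BQ5": 9}
n_cmfbs = {name: 2 for name in n_otas}
n_poles = 2 * len(n_otas)  # each biquad contributes two poles

total_uw = sum(n_otas.values()) * P_OTA_UW + sum(n_cmfbs.values()) * P_CMFB_UW
total_mw = total_uw / 1000
per_pole_mw = total_mw / n_poles

print(f"total = {total_mw} mW, per pole = {per_pole_mw} mW")
```

Under this assumption the 39 OTAs and 10 CMFBs add up to 9.975 mW, i.e. just under 1 mW for each of the 10 poles.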

The frequency response of the designed filter is shown in Fig. 5.20. The linear frequency axis is used to emphasize the lower attenuation value on the third Nyquist zone (40 to 60 MHz) with respect to the first. The output noise power is calculated integrating the output noise spectral density in the filter bandwidth considering also the spectrum intervals of higher Nyquist zones because of noise folding. We obtain  $P_n = -64 \,\mathrm{dB}$  on the 10 MHz bandwidth. The dynamic range is evaluated as the ratio between the power of the useful signal when IMD reaches the noise floor and the noise power:

$$DR = \frac{P_S|_{\text{IMD}=P_n}}{P_n} \tag{5.51}$$

Using a 2-tone signal the IMD reaches  $-64 \,\mathrm{dB}$  with an output peak amplitude of  $320 \,\mathrm{mV}$  giving a dynamic range  $DR = 51.8 \,\mathrm{dB}$ .



Figure 5.20. Anti-Aliasing Filter AC response

#### 5.4.2 Digital calibration results

The calibration technique described in Section 2.4 has been applied to the AAF using both the classic RFVM and the proposed RFeVM, described in Section 5.2.3. The input excitation waveforms used for parameter estimation are a batch of multisines: ten 5-tone and eight 2-tone signals. The selection of the frequencies of the multisines is linked to the DFT resolution with which we analyze the output spectrum. We adopt $N_{FFT} = 128$ to represent the complex spectrum between -20 and 20 MHz. The filter passband ranges from -5 to 5 MHz at baseband, so 33 discrete frequency bins, spaced by $\Delta_f = \frac{40}{128} = 0.3125$ MHz, are available for the input excitation design. An iterative algorithm has been designed to choose a subset of ten 5-tuples among all the 237336 combinations given by $\binom{33}{5}$. This algorithm searches for combinations of frequencies such that the union set of all their IMD products covers all distinct FFT bins. The number of IMD products depends on the number of frequencies of the multisine and on the system nonlinearity order. A set of 16 2-tone signals has been selected spanning the whole passband: one half has been used for the estimation phase and the other half for the out-of-sample validation.
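The bin count and the size of the search space quoted above are easy to reproduce. The selection step is sketched below as a greedy search over random candidate 5-tuples. This is a stand-in heuristic, not the thesis algorithm, whose details are not given here. The helper names and the restriction to third-order products $k_1 + k_2 - k_3$ are assumptions made for illustration.

```python
import random
from math import comb

FS_MHZ, N_FFT, HALF_BW_MHZ = 40.0, 128, 5.0
delta_f = FS_MHZ / N_FFT                       # 0.3125 MHz per bin
half = int(HALF_BW_MHZ / delta_f)              # 16 bins on each side of DC
passband_bins = list(range(-half, half + 1))   # 33 bins in total
n_tuples = comb(len(passband_bins), 5)         # size of the search space

def imd3_bins(tones):
    """Bins hit by third-order products k1 + k2 - k3 (tone bins included)."""
    return {k1 + k2 - k3 for k1 in tones for k2 in tones for k3 in tones}

def greedy_select(n_signals=10, n_candidates=300, seed=0):
    """Pick, at each step, the random candidate that covers the most new bins."""
    rng = random.Random(seed)
    covered, chosen = set(), []
    for _ in range(n_signals):
        candidates = [tuple(sorted(rng.sample(passband_bins, 5)))
                      for _ in range(n_candidates)]
        best = max(candidates, key=lambda t: len(imd3_bins(t) - covered))
        chosen.append(best)
        covered |= imd3_bins(best)
    return chosen, covered

chosen, covered = greedy_select()
print(delta_f, len(passband_bins), n_tuples, len(covered))
```

An exhaustive search over all 237336 tuples is also feasible offline; the greedy variant is shown only because it keeps the example short.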

Different values of model lags have been tested, exploring trade-offs between accuracy and complexity of the calibration technique. In the following simulations the notation $[L_1\ L_3\ L_5\ L_7\ L_9]$ is adopted to specify the memory depth of each odd kernel used. Analyzing the output spectrum of the AAF, the presence of third- and fifth-order intermodulation and harmonic distortions is clear. The use of higher-order kernels can improve the nonlinear compensation, correcting also the nonlinearities produced by the cascade of the original system and the Volterra filter itself (e.g. third-order IMD products passing through a third-order kernel).

The improved non-linear compensation capabilities of the proposed model are shown in Fig. 5.21 using a [10 1 1 0 0] lag configuration, showing a wideband SFDR improvement of more than 24 dB against the classic RFVM, which is limited by in-band HD aliasing. The plot is realized interleaving in-sample and out-of-sample 2-tone signals, which show the same trend in SFDR enhancement. The numbers of parameters of the RFVM and the RFeVM are 31 and 43 respectively.



Figure 5.21. Comparison between the SFDR of the system without calibration and the one obtained after calibrating with classic RF Volterra model (RFVM) and the proposed RF extended Volterra model (RFeVM)

The better fitting capability of the RFeVM is also demonstrated by comparing the two models with different lag settings producing a comparable number of parameters. In Fig. 5.22 the RFeVM with a [10 1 0 0 0] lag configuration, which produces a 27-parameter model, is compared with the 31-parameter RFVM shown in the previous figure. The RFeVM performs 4 dB better with fewer parameters.



Figure 5.22. Comparison between the SFDR of the system without calibration and the one obtained after calibrating with a 31 parameters RFVM and a 27 parameters RFeVM.

The multisine signals used in the estimation phase have a peak amplitude of 0.45 V and the worst output IMD reaches -52 dB, exceeding the noise floor by 12 dB. The calibration procedure pushes the distortions below the noise floor so the improvement in dynamic range is given by the extension of the output swing and is limited by noise. Passing from 0.32 V<sub>p</sub> to 0.45 V<sub>p</sub> represents a 3 dB improvement in DR. The value of the new DR should be measured increasing the amplitude of 2-tone test signal until the post calibration IMD reaches the noise floor, so 3 dB is a conservative value.
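As a worked check of the figures quoted in this paragraph, using the noise power $P_n = -64\,\mathrm{dB}$ obtained in subsection 5.4.1, the IMD margin over the noise floor and the swing extension are:

$$-52\,\mathrm{dB} - (-64\,\mathrm{dB}) = 12\,\mathrm{dB} \qquad \Delta DR \approx 20\log_{10}\frac{0.45}{0.32} = 20\log_{10}(1.406) \approx 2.96\,\mathrm{dB} \approx 3\,\mathrm{dB}$$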

The validation procedure of the estimated calibration parameters is also carried out using out-of-sample modulated waveforms and a combination of a weak modulated signal and a sinusoidal interferer. First, the performance of the calibration filter is assessed using a 10 MHz 16QAM waveform. The operations of filtering, rotation and symbol timing recovery are implemented in MATLAB to demodulate the signal. Fig. 5.23 shows the comparison between the demodulated constellations of the 16QAM signal with a peak amplitude of 0.47 V without calibration, after a 20-tap linear equalizer and using the RFeVM with [10 1 0 0 0]. The EVM of the uncalibrated



Figure 5.23. Constellation plot of a 10 MHz 16QAM waveform with 0.47 V peak amplitude, without calibration, after a 20 taps linear equalizer and after 27 taps RFeVM filtering using the estimated parameters. The EVM for the three constellations is respectively -29 dB, -48 dB and -55 dB

output is $-29 \,\mathrm{dB}$. The effect of compression on the constellation symbols due to nonlinearities is very low for this amplitude value; there is indeed a small difference between the EVM in the linear EQ case, $-48 \,\mathrm{dB}$, and in the RFeVM case, $-55 \,\mathrm{dB}$. The RFeVM outperforms the conventional RFVM by 5 dB in EVM (not shown).

Higher performance improvement in terms of EVM can be seen on a weak 5 MHz 16QAM signal received together with a strong 2-tone whose IMD products fall into the modulated signal band. Fig. 5.24 shows the comparison between the constellations in presence of an in-band interferer produced by intermodulation distortions using linear equalization and RFeVM filtering. In this case the linear equalizer is unable to compensate the nonlinear distortion, reaching a  $-39.2 \,\mathrm{dB}$  EVM, against the  $-55 \,\mathrm{dB}$  value using RFeVM calibration. The "Lin EQ" constellation shows the presence of the in-band sinusoidal interferer because the symbols are spread on circles.

Computational complexity is the bottleneck of practical implementations of Volterra filters, due to the high number of parameters. When selecting the lag configuration of a Volterra model we produce filters with a fixed number of coefficients with a



Figure 5.24. Constellation plot of a 5 MHz 0.04 V<sub>p</sub> 16QAM waveform in the presence of in-band sinusoidal IMD3 product of a 0.45 V<sub>p</sub> 2-tone, without calibration, after a 20 taps linear equalizer and after 27 taps RFeVM filtering using the estimated parameters. The EVM for the three constellations is respectively  $-29 \,\text{dB}$ ,  $-39.2 \,\text{dB}$  and  $-55 \,\text{dB}$ 

growing difference in the number of parameters between different configurations as the lag values increase. Although we demonstrate that calibration with low memory lags is possible even for multistage active circuits in moderate compression, a lower number of parameters could be necessary while guaranteeing the same (or only slightly lower) performance. Many pruned models described in the literature are based on an a priori reduction of the model terms (with loss of generality). A different approach is a posteriori pruning, which discards single parameters using some information about their weight in the estimated post-inverse model. Since the Volterra kernels are not an orthonormal basis, discarding the smallest parameter obtained by an LS estimation is not a good criterion to reduce the complexity. One method is the projection of the Volterra kernels on an orthogonal basis [27, 52], e.g. Wiener G-functionals [66], Laguerre [17, 107] or Kautz [26] polynomials, removing the smallest parameters from the new representation and then going back to a Volterra series with a reduced parameter set. The adopted method is a backward pruning technique based on an iterative algorithm that discards at each step the parameter that impacts linearity the least. Fig. 5.25 shows the post-calibration SFDR and the EVM of a $10 \,\mathrm{MHz}$ 16QAM signal versus model complexity starting from a [10 1 0 0 0] lag configuration. The quality criterion that drives the pruning algorithm is the post-calibration SFDR. The results show that almost the same performance as the full model can be reached with 20 filter coefficients instead of 27. Using only 20 parameters (11 for linear equalization) we obtain a $63.2 \,\mathrm{dB}$ SFDR and an EVM equal to $-54.7 \,\mathrm{dB}$. This pruning search algorithm performs a number of iterations equal to:

$$\frac{N_{NL}(N_{NL}+1)}{2} \tag{5.52}$$

where  $N_{NL}$  is the number of parameters excluding the linear kernel. The number of iterations grows rapidly with the increasing model order and maximum lag, but it is an offline operation that does not weigh on real time processing.
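A minimal sketch of such a backward pruning loop is given below, assuming a generic regression matrix `A` whose first `n_linear` columns belong to the linear kernel. The cost used here is the mean squared error rather than the post-calibration SFDR used in the thesis, and the data are synthetic; both choices are assumptions made to keep the example self-contained.

```python
import numpy as np

def mse(x_hat, x_ref):
    return float(np.mean(np.abs(x_hat - x_ref) ** 2))

def backward_prune(A, x_ref, n_linear, cost=mse):
    """Remove nonlinear columns one at a time, always dropping the column whose
    removal degrades `cost` the least. Returns the removal order and trial count."""
    active = list(range(A.shape[1]))
    order, trials = [], 0
    while len(active) > n_linear:
        best_col, best_cost = None, np.inf
        for c in active:
            if c < n_linear:          # the linear kernel is never pruned
                continue
            keep = [k for k in active if k != c]
            theta, *_ = np.linalg.lstsq(A[:, keep], x_ref, rcond=None)
            trials += 1               # one LS re-estimation per candidate
            val = cost(A[:, keep] @ theta, x_ref)
            if val < best_cost:
                best_col, best_cost = c, val
        active.remove(best_col)
        order.append((best_col, best_cost))
    return order, trials

# Synthetic demo: 3 "linear" columns and 5 "nonlinear" ones, three of them useless
rng = np.random.default_rng(1)
A = rng.standard_normal((200, 8))
theta_true = np.array([1.0, 0.5, -0.3, 0.2, 0.0, 0.1, 0.0, 0.0])
x_ref = A @ theta_true + 1e-6 * rng.standard_normal(200)

order, trials = backward_prune(A, x_ref, n_linear=3)
n_nl = 5
print([c for c, _ in order], trials, n_nl * (n_nl + 1) // 2)
```

Pruning all the way down to the linear kernel requires $N_{NL} + (N_{NL}-1) + \dots + 1$ re-estimations, which is the count in (5.52); for the 27-parameter model with 11 linear taps, $N_{NL} = 16$ gives 136 trials.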

The concept of "digitally enhanced analog circuits" can be assessed using some Figure of Merit (FoM) and evaluating its enhancement after digital calibration. Different FoMs are reported in literature to evaluate the performance of analog



Figure 5.25. SFDR and EVM versus the number of parameters using an iterative pruning technique that discards the parameter that impacts SFDR the least.

filters [100]:

$$FoM_1 = \frac{P_d}{8k_BT \cdot f_c \cdot N_p \cdot DR} \tag{5.53}$$

$$FoM_2 = \frac{P_d}{f_c \cdot N_p \cdot SFDR \cdot Q} \tag{5.54}$$

$$FoM_3 = \frac{P_d \cdot A}{f_c \cdot N_p \cdot SFDR \cdot IIP3} \tag{5.55}$$

Digital calibration impacts on linearity, power consumption, and occupied area of the digital processing section. We can see that the first expression gets better if the relative DR increase is greater than that of power consumption. The second expression is more relaxed, with the constraint impacting the SFDR increase, always greater than that of the DR. Using  $FoM_3$  the additional ratio between area and IIP3 must be considered. We can write:

$$\Delta \text{FoM}_1 = \frac{\Delta P_d}{\Delta DR} \qquad \Delta \text{FoM}_2 = \frac{\Delta P_d}{\Delta SFDR} \qquad \Delta \text{FoM}_3 = \Delta \text{FoM}_2 \frac{\Delta A}{\Delta IIP3} \tag{5.56}$$

The values  $\Delta P_d$  and  $\Delta A$  depend on the implementation of the digital section.

# 5.5 Conclusions and future work

System-level calibration of entire processing chains enables the use of a single correction model to correct the nonlinear effects arising in a sequence of different blocks, such as a multistage active filter. This has the potential to greatly reduce the digital overhead of the correction algorithms, if the resulting aggregate models do not become too complex. A Volterra model is agnostic about the specific causes of nonlinearities, except by postulating that they are continuous and (for numerical feasibility) relatively short-memory. The new RFeVM makes the baseband processing for nonlinear compensation feasible even for critically sampled systems exploiting sub-sampling. Significant performance improvement can be achieved by relatively simple models using low-lag Volterra kernels. A 23 dB enhancement in SFDR has been reached using a pruned 20-parameter model. The dynamic range of the filter increases by 3 dB if we consider an output peak amplitude of 0.45 V. The true post-calibration DR has a greater value and must be assessed by increasing the test signal amplitude until the IMD reaches the noise floor.

Most of the processing power is required for linear equalization, which is always present in receivers to reduce Inter-Symbol Interference (ISI), so the overhead is likely to be small. In an experimental setup the noise on the measurements in the estimation phase can be reduced using the mean of many acquisitions of the periodic test signals.

A laboratory testbed for the calibration of RF receiver chains is under development at the Thales Alenia Space Italy premises. FPGA firmware and Arbitrary Waveform Generator control software for synchronous data generation and acquisition have been developed. Different RF chains will be tested to assess model complexity for different architectures.

### Chapter 6

### RF Array Receiver Calibration

Exploiting the spatial domain of the satellite communication channels is a key feature for next-generation payloads, enhancing link robustness against interferers and enabling a high reuse of time-frequency resources. The most straightforward way to implement these capabilities is to take advantage of the onboard spatial diversity given by antenna arrays and digital beam forming network (DBFN) techniques. The figures of merit of a receiving phased array satcom antenna, such as directivity and process gain, are highly dependent on the receivers' frequency-selective responses and on the frequency response mismatch between different channels, which gives rise to linear distortions. Moreover the presence of receiver nonlinearities, typically higher in low-power RF circuitry, will also impact the spatial resolution and the dynamic range of the system. To compensate these errors due to RF chain non-idealities in satcom DBFN, many solutions relying on digital signal processing after A/D conversion have been proposed. Classical amplitude and phase correction algorithms are only suitable for narrowband linear systems; more complex linear filtering calibration techniques cover the wideband case, but are not able to compensate non-linear distortions [18].

In this chapter the digital calibration technique based on the baseband Volterra model described in the previous chapter is extended to a multi-channel array receiver. Using mixed behavioral and circuital simulations we demonstrate the non-linear wideband equalization of the antenna receivers' response, thus enhancing directivity, process gain and spurious free dynamic range of the array. The proposed calibration algorithm requires a least squares parameter estimation of the post-inverse Volterra passband model that maximizes the linearity of the overall system response. This method foresees the transmission of a set of known pilot references, such as multitone or chirp signals, in order to estimate and fine tune the optimal calibration coefficients. The RF chain response correction is carried out in the digital domain with a Volterra non-linear expansion of the received signals followed by linear filtering (FIR). The validity of the proposed calibration technique is discussed with the simulation results from an RF receiving chain affected by non-linearities.

#### 6.1 Introduction

Next-generation SatCom payloads envisage an ever-growing digital domain section that enables higher flexibility with respect to communication protocols and Digital Signal Processing (DSP) functions thanks to hardware reconfiguration capabilities. It is possible to extend these advantages to the antenna section by employing Direct Radiating Antenna (DRA) arrays and DBFN processing capable of realizing programmable spot coverage and adaptive interference robustness. These multichannel architectures require a calibration phase to homogenise each channel response to a target response and thus reach the maximum achievable performance. Depending on the hardware implementation of the DBFN and the features of the processed signal, different kinds of architectures could be required to implement the calibration algorithm. Therefore, interest is growing in flexible calibration stages able to adapt to different operating scenarios.

In this study we focus on the calibration of a receiving phased array according to the scheme in Fig. 6.1, therefore aiming to compensate the errors arising from both the antenna elements and the receiver chains. The non-linear modelling by means of the Volterra series makes the proposed techniques applicable even to a transmitting phased array, in which the non-idealities of power amplifiers and antenna elements can be corrected using pre-distortion. Volterra series based calibration techniques are also applied to Sample and Hold Amplifier (SHA) stages [19] and to A/D converters [20], even if in some cases more specific models are needed [21]. For the sake of simplicity, a uniform linear array (ULA) is considered, without limiting the generality of the proposed solution that is extensible to multi-dimensional array simply using the proper array manifolds.

#### 6.2 Wideband Volterra calibration architecture



Figure 6.1. DBFN wideband calibration architecture

The architecture of a wideband $N$-channel DBFN is depicted in Fig. 6.1; it includes a calibration stage made up of $N$ Volterra filters implemented in the digital domain after the A/D converters. A phased array consists of $N$ antenna elements, each one with its own radiation pattern, distributed over one, two or three spatial dimensions. By suitably combining the signals on the $N$ elements, a steerable beam pattern can be synthesized with gain and directivity dependent on the array geometry and on the properties of each antenna.



Figure 6.2. Uniform Linear Array scheme

Consider the far-field transmitted passband signal $x_{RF}(t) = \Re \{x_{BF}(t)e^{j\omega_{RF}t}\}$ impinging on the ULA from the direction $\theta$ depicted in Fig. 6.2. Each element of the array receives the signal with a delay $\tau_n = \frac{nD\sin\theta}{c}$, with $\theta$ the angle of arrival, $D$ the element spacing and $c$ the speed of light. The signal on the $n$-th ideal antenna is then:

$$y_{RF}^{n}(t) = x_{RF}(t-\tau_{n}) = \Re\left\{x_{BF}(t-\tau_{n})e^{j\omega_{RF}(t-\tau_{n})}\right\} = \Re\left\{x_{BF}(t-\tau_{n})e^{-j\omega_{RF}\tau_{n}}e^{j\omega_{RF}t}\right\} \tag{6.1}$$

In a narrowband scenario the BF component can be considered constant over a time period equal to the maximum relative delay $\frac{(N-1)D\sin\theta}{c}$, so the delay between elements can be represented simply by a phase shift $\Phi_n = -\omega_{RF}\tau_n$ of the baseband signal:

$$y_{RF_{NB}}^{n}(t) \approx \Re \left\{ x_{BF}(t) e^{j\omega_{RF}(t-\tau_{n})} \right\} = \Re \left\{ x_{BF}(t) e^{j\Phi_{n}} e^{j\omega_{RF}t} \right\} \tag{6.2}$$

The mathematical boundary between narrow and wide band can be derived from the first order Taylor expansion of the BF signal:

$$x_{BF}(t - \tau_n) \approx x_{BF}(t) - \tau_n \cdot x'_{BF}(t) = x_{BF}(t) \cdot \left[1 - \tau_n \frac{x'_{BF}(t)}{x_{BF}(t)}\right]$$
(6.3)

The constraint arises from the derivative of the BF signal, in which each frequency component is multiplied by its own frequency, so that the highest frequency component dominates. The narrowband approximation holds if  $\tau_n \cdot \omega_{max} \ll 1$  for the maximum value of  $\tau_n$ , i.e. for n = N - 1.
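The condition $\tau_n \cdot \omega_{max} \ll 1$ can be checked numerically. The sketch below evaluates the product for the last element of a 32-element ULA at endfire, for a few bandwidths. The carrier, the spacing, the bandwidth values and the choice $\omega_{max} = \pi B$ (a complex baseband signal occupying $\pm B/2$) are assumptions made for the example.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def narrowband_margin(num_elements, spacing_m, theta_rad, bandwidth_hz):
    """Return tau_max * omega_max for the last element (n = N - 1)."""
    tau_max = (num_elements - 1) * spacing_m * abs(math.sin(theta_rad)) / C
    omega_max = 2.0 * math.pi * (bandwidth_hz / 2.0)  # highest baseband frequency
    return tau_max * omega_max


f_rf = 1.5e9                  # assumed carrier frequency in Hz
spacing = 0.5 * C / f_rf      # assumed half-wavelength spacing
for bw in (1e6, 5e6, 100e6):  # assumed signal bandwidths in Hz
    margin = narrowband_margin(32, spacing, math.pi / 2, bw)
    print(f"B = {bw / 1e6:6.1f} MHz -> tau_max * omega_max = {margin:.4f}")
```

A product well below one indicates that a phase shift is an adequate model of the delay; when it approaches or exceeds one, the fractional-delay treatment of the wideband case is needed.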

If an RF receiver is modelled as a linear time invariant (LTI) system, it can be represented by means of its baseband frequency response  $H_{BF}(\omega)$ . In the narrowband case the receiver's frequency response can be approximated with its value at the carrier frequency, which is the DC value of the baseband equivalent  $H_{BF}(0) = A_0 e^{j\phi_0}$ . The relation of the output signal with respect to the transmitted signal becomes:

$$z_{BF_{NB}}^{n}(t) = A_{0}^{n} x_{BF}(t) e^{j(\Phi_{n} + \phi_{0}^{n})}$$
(6.4)

Compensating the different values of  $A_0^n$  and  $\phi_0^n$  among channels can be done with APC (Amplitude and Phase Correction) algorithms in the narrowband case. When the received signal is wideband the baseband output signal before the A/D conversion can be written in the frequency domain as:

$$Z_{BF}^{n}(\omega) = H_{BF}^{n}(\omega) \cdot X_{BF}^{n}(\omega) e^{-j\omega\tau_{n}} e^{j\Phi_{n}}$$
(6.5)

In this case the calibration and the beam forming algorithms must be based on FIR filtering in the digital domain: the calibration algorithm needs to find the frequency transfer function that equalizes the overall channel response and the digital beam former must synthesize a fractional delay on the baseband component.

In a more realistic scenario, the RF extended Volterra model presented in the previous Chapter can be adopted, including the behavior of wideband electronic circuits and their memory effects. From the DSP point of view of each receiving chain, Volterra filtering is similar to Multi Input Single Output (MISO) linear filtering in which all the equivalent inputs are obtained by multiplying delayed replicas of the input signal.

#### 6.3 System setup for parameters estimation

The global array calibration is obtained by applying the calibration procedure shown in the previous Section to each channel. Knowing the direction of arrival of the reference signal gives us the expected delay of the received signal at the output of each channel (in the ideal case). In this way we can apply the calibration algorithm using reference signals with different delays for different channels. This is equivalent to applying the inverse of a steering vector to the reference, obtaining a local reference for each channel. This procedure realizes both nonlinear compensation with Volterra kernels and group delay equalization among the channels with the linear FIR filtering section.
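In symbols, the per-channel local reference can be sketched as follows. The notation $d^{n}$ and $D^{n}$ is introduced here for illustration and does not appear in the original text; the expression simply takes (6.5) with an ideal receiver response $H_{BF}^{n}(\omega) = 1$:

$$D^{n}(\omega) = X_{BF}(\omega)\, e^{-j\omega\tau_{n}} e^{j\Phi_{n}} \quad\Longleftrightarrow\quad d^{n}(t) = x_{BF}(t-\tau_{n})\, e^{j\Phi_{n}}$$

Under this reading, the calibration filter of channel $n$ is fitted so that its output, driven by the measured $z_{BF}^{n}$, approximates $d^{n}$.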

#### 6.3.1 DBFN simulation model

The DBFN testbench comprises both a circuit-level model and a behavioural simulation model. With the first, a truncated Volterra series model is extracted from an L-band receiver implemented in 40nm STMicroelectronics CMOS technology using the Cadence Virtuoso simulation tool [2]; the receiver is composed of an LNA, a Mixer, an Anti-aliasing passband filter and an IF-Amplifier. The extracted model is then used in the MATLAB simulation environment, adding Gaussian variability to the parameters with a standard deviation equal to 10% of the nominal parameter value. The models obtained are used as the receiver responses for each of the 32 channels of a ULA. Each antenna element is modelled by a complex Fourier series of 5 terms with a standard deviation equal to 5% of the nominal directivity. The model adopted for DBFN simulation is shown in Fig. 6.3.

The directivity of the array is simulated using a chirp signal from a known direction as input to each element, then the beamformer scans a discrete set of directions and the synthesized sum sequence is correlated by means of a matched filter on the chirp waveform.
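The matched-filter step of this procedure can be sketched in a few lines of Python. The example covers only the correlation of a received sequence with the chirp template, not the beamformer scan; the chirp length, the normalised bandwidth, the delay and the function names are all assumptions.

```python
import cmath
import math


def chirp(num_samples, bandwidth_norm):
    """Complex baseband linear chirp sweeping +/- bandwidth_norm / 2 (cycles/sample)."""
    rate = bandwidth_norm / num_samples  # sweep rate in cycles per sample^2
    return [cmath.exp(1j * math.pi * rate * (n - num_samples / 2) ** 2)
            for n in range(num_samples)]


def matched_filter(signal, template):
    """Magnitude of the cross-correlation; index m is the lag of the template start."""
    n_out = len(signal) - len(template) + 1
    return [abs(sum(signal[m + i] * template[i].conjugate()
                    for i in range(len(template))))
            for m in range(n_out)]


template = chirp(64, 0.5)
delay = 20                                    # assumed propagation delay in samples
rx = [0j] * delay + template + [0j] * 30      # noiseless, undistorted reception
out = matched_filter(rx, template)
peak_lag = max(range(len(out)), key=out.__getitem__)
```

In this noiseless case the output peaks at the inserted delay with a height equal to the template energy; channel mismatches reduce the peak and raise the side lobes, which is the effect the calibration aims to recover.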



Figure 6.3. DBFN simulation model

#### 6.4 Simulation Results

The calibration technique is based on the LS parameter estimation of the post-inverse Volterra system of each array receiving channel. The selected Volterra model has the first and the third order kernels with lags 10 and 1 respectively, producing a 17-parameter calibration filter. The key to obtaining a correct estimation of the parameters is to choose input calibration signals that represent persistent excitations for the system. Ten three-tone signals that produce output distortion spurs on distinct frequency bins have been used for the estimation phase. The post-calibration directivity obtained with a chirp signal impinging on the array from the direction  $\theta = -\pi/7$  is shown in the comparison plot of Fig. 6.4.
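One way to arrive at the figure of 17 parameters is sketched below. It assumes a complex baseband model whose third-order terms have the form $x(n-i)\,x(n-j)\,x^{*}(n-k)$ with $i \le j$ and any $k$; this term structure is a common choice for baseband Volterra models and is an assumption here, since the section does not spell it out.

```python
from itertools import combinations_with_replacement


def linear_terms(lag):
    """Delays of the first-order kernel: x[n-k], k = 0..lag."""
    return [(k,) for k in range(lag + 1)]


def third_order_terms(lag):
    """Delays (i, j, k) of x[n-i] * x[n-j] * conj(x[n-k]) with i <= j and any k."""
    pairs = combinations_with_replacement(range(lag + 1), 2)
    return [(i, j, k) for (i, j) in pairs for k in range(lag + 1)]


n_linear = len(linear_terms(10))       # first-order kernel, lag 10
n_cubic = len(third_order_terms(1))    # third-order kernel, lag 1
n_params = n_linear + n_cubic
```

Under this assumption the linear kernel contributes 11 coefficients and the third-order kernel 6, for a total of 17.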



Figure 6.4. Pre and post calibration directivity comparison

The results show a gain of 1dB on the peak of the matched filter output and a reduction of 7dB on the highest side lobe level. The directivity loss is mainly due to linear distortions and bandwidth mismatches between channels, so this result can be obtained even with a linear FIR calibration. The effects of calibration on the processed signals can be seen by comparing the transmitted and received waveforms with the post-calibration one. In Fig. 6.5 the real part of the complex baseband chirp signals is shown; the effects of the linear equalization are noticeable.

Figure 6.5. Comparison between TX, RX and CAL chirp waveforms

The nonlinear behavior of the receivers contributes mainly to Spurious Free Dynamic Range (SFDR) degradation and spectral regrowth. These effects are mitigated by the Volterra nonlinear calibration. The comparison between the transmitted, received and post-calibration spectra of a three-tone signal is shown in Fig. 6.6. The SFDR of the system without calibration is 46dB; the dominant distortions are generated by the Anti-Aliasing active filter, implemented as a 5-stage Biquad cascade. An 18dB improvement in SFDR is obtained after the calibration filter.

To validate the proposed calibration technique on different kinds of out-of-sample signals, a 5MHz bandwidth 16QAM waveform, together with a high amplitude two-tone signal whose IMD3 component falls into the modulated bandwidth, has been processed by the receiving chain and then filtered using the Volterra filter with the estimated coefficients. The Error Vector Magnitude (EVM) has been computed at the output of the Anti-Aliasing Filter without any correction, after linear equalization and after applying the proposed nonlinear calibration. The baseband Volterra model used for the calibration has the linear kernel with lag 10, the third order kernel with lag 2, the fifth with lag 1 and the seventh with lag 0.

Fig. 6.7 shows the comparison between the three constellations: linear equalization alone improves the EVM, and nonlinear calibration improves it further. The EVM is -30dB before equalization, -39dB after linear equalization and -47dB after nonlinear calibration. The absence of one constellation symbol is due only to the small number of symbols collected in the circuit-level transient simulation, a consequence of the low symbol rate and of the high ratio between simulation time and simulated time.
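For reference, an EVM figure in dB can be computed from received and ideal symbols as in the sketch below. It assumes the RMS definition normalised to the average reference power, which is one of several conventions in use and is not necessarily the one adopted for the results above; the 16QAM grid and the 1% gain error are illustrative.

```python
import math


def evm_db(received, reference):
    """RMS error vector magnitude in dB, normalised to the reference power."""
    err = sum(abs(r - s) ** 2 for r, s in zip(received, reference))
    ref = sum(abs(s) ** 2 for s in reference)
    return 10.0 * math.log10(err / ref)


levels = (-3, -1, 1, 3)
reference = [complex(i, q) for i in levels for q in levels]  # ideal 16QAM points
received = [1.01 * s for s in reference]                     # assumed 1% gain error
evm = evm_db(received, reference)
```

A pure 1% gain error gives an EVM of -40dB with this definition, which helps to place the -30dB, -39dB and -47dB values on a familiar scale.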



Figure 6.6. Comparison between TX, RX and CAL 3-tone signal spectra



Figure 6.7. Received weak 5MHz 16QAM constellation processed together with a strong 2-tone signal using linear and nonlinear calibration.

#### 6.5 Implementation complexity and parallelism

With Moore's law no longer holding for single-core CPUs, the trend in high performance computing is to exploit parallel architectures which perform several (from a couple to thousands) operations at the same time. An algorithm can be parallelized to the extent that it performs many independent operations that do not rely on the output of previous instructions, and possesses sufficient data locality to exploit the bandwidth of wide (for instance, 64 to 512 bits) digital buses. In the framework of flexible / reconfigurable payloads, DSP algorithms can exploit the parallelism of programmable space-qualified platforms based on a mixed many-core CPU - FPGA architecture [62]. Next generation many-core CPUs, such as the RC-64 [39], have many instances of cores (10-1000), which are simpler than the cores in conventional CPUs and can scale parallelism up by more than an order of magnitude with respect to them. FPGAs can be configured to perform parallel computations at the hardware level, and are usually programmed in VHDL, though support for OpenCL has recently been introduced, making FPGA development somewhat more similar to that of CPUs and GPUs.

The efficiency in the use of a parallel architecture is strongly dependent on the type of algorithm being performed: in FIR digital filters and FIR filter banks (including the DFT algorithm), efficiency is usually high. On the other hand, for IIR and adaptive filters, which make use of feedback loops, efficiency is much lower. Besides the computing power, other limitations arise from the bandwidth of the bus connecting the device and its RAM, and of the link between the acceleration board and the host computer. Finally, memory limitations may be an issue if, for instance, tens or hundreds of GB of data are to be stored in RAM.

The correction algorithms that compensate linear mismatches and nonlinear distortions in the receiver array make use of FIR filters and Volterra kernels. Parameter estimation is performed offline, during a calibration procedure, and the results are stored in the correction hardware to post-process the receivers' outputs and improve system performance. Both multiplications and FIR-filters can be efficiently implemented in most parallel hardware, as there are few data dependencies and data locality is high. Hardware which is optimized to perform multiply-and-accumulate (MAC) operations is particularly suited for this purpose. Beam-forming can be seen as a form of complex matrix multiplication, an algorithm which can be easily parallelized, as it consists of a sequence of multiply-and-accumulate elementary operations. Computing the rotation matrix from angle data requires trigonometric functions, which can be efficiently handled by hardware with specialized units for transcendental functions.

The Volterra kernels used to improve dynamic range have a FIR architecture, as the output depends only on the current and past input samples and not on previous output values. This makes these kernels easier to parallelize, avoiding strong data dependence. Volterra kernels can be expressed as a bank of FIR filters whose inputs are polynomials of the input signal. These polynomials always contain a term with zero lag and are called "primary functions". If we consider a second-order kernel with lag  $L_2 = 4$ , there are 15 terms in the kernel, from  $x_n^2$  to  $x_{n-4}^2$ . Some of these terms can be expressed as a delay of other terms:  $x_{n-4}^2$  is  $x_n^2$  delayed by four samples. The primary functions are the Volterra terms which cannot be obtained as a delay of other terms. In a second-order kernel with a lag of 4, these are  $x_n^2$ ,  $x_n x_{n-1}$ ,  $x_n x_{n-2}$ ,  $x_n x_{n-3}$  and  $x_n x_{n-4}$ , and there are 5, 4, 3, 2 and 1 terms (the primary function and its delayed copies) in the full kernel, respectively. A FIR filter with 5 taps can take  $x_n^2$  as input; another FIR filter with 4 taps can take  $x_n x_{n-1}$  as input, and so on. The output of the kernel is the sum of the outputs of the FIR filters. A similar structure is also valid for higher order kernels. A third-order kernel of length  $L_3$  will have  $(L_3 + 1)(L_3 + 2)(L_3 + 3)/6$  terms, and  $(L_3 + 1)(L_3 + 2)/2$  primary functions. Furthermore, the primary functions of the second-order kernel can be used to compute those of the third-order kernel, and so on:  $x_n x_{n-1}^2$  can for instance be computed as  $(x_n x_{n-1}) \cdot x_{n-1}$ . There may be issues in the parallel implementation of Volterra kernels on some parallel hardware, especially GPUs: for instance, the FIR filters after the primary functions generator tend to be short, and the algorithm's nested summations are not independent. Alternative implementations may use the symmetric kernel approach, or frequency-domain algorithms based on the overlap-add or overlap-save methods.
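The decomposition of a second-order kernel into primary functions followed by short FIR filters can be checked with the Python sketch below, which compares the direct double sum with the FIR-bank form for a lag of 4. The coefficient values, the test signal and the function names are arbitrary assumptions; samples before the start of the record are taken as zero.

```python
import math


def second_order_direct(x, h):
    """Direct form: y[n] = sum over i <= j of h[(i, j)] * x[n-i] * x[n-j]."""
    y = []
    for n in range(len(x)):
        acc = 0.0
        for (i, j), coeff in h.items():
            if n - j >= 0:  # j >= i, so x[n-i] is also available
                acc += coeff * x[n - i] * x[n - j]
        y.append(acc)
    return y


def second_order_fir_bank(x, h, lag):
    """Same kernel as a bank of FIR filters fed by primary functions.

    Primary function d is p_d[n] = x[n] * x[n-d]. The term x[n-i] * x[n-j]
    with j - i = d is p_d delayed by i, so branch d is a FIR with lag + 1 - d taps.
    """
    y = [0.0] * len(x)
    for d in range(lag + 1):
        p = [x[n] * x[n - d] if n - d >= 0 else 0.0 for n in range(len(x))]
        taps = [h[(i, i + d)] for i in range(lag + 1 - d)]
        for n in range(len(x)):
            y[n] += sum(t * p[n - i] for i, t in enumerate(taps) if n - i >= 0)
    return y


lag = 4
# Arbitrary kernel coefficients, one per pair (i, j) with 0 <= i <= j <= lag.
h = {(i, j): 1.0 / (1 + i + 2 * j)
     for i in range(lag + 1) for j in range(i, lag + 1)}
x = [math.sin(0.3 * n) for n in range(32)]

branch_lengths = [lag + 1 - d for d in range(lag + 1)]
y_direct = second_order_direct(x, h)
y_bank = second_order_fir_bank(x, h, lag)
max_diff = max(abs(a - b) for a, b in zip(y_direct, y_bank))
```

The two forms agree to within rounding error, and the five branches have 5, 4, 3, 2 and 1 taps. Each branch is an independent FIR filter, which is what makes the structure suitable for MAC-oriented parallel hardware.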

#### 6.6 Conclusions

The digital calibration technique presented in the previous Chapter and applied to an AAF (the last stage of a digital-IF receiver) has been extended to a receiving DBFN and has been demonstrated with mixed circuit-level and behavioral simulations. The joint linear and nonlinear equalization capabilities of the proposed technique enhance the array directivity and the overall system SFDR. An overview of the parallel architectures that can implement efficient Volterra filters has been carried out, and a future implementation will be realized on the many-core rad-hard processor RC-64 [39] using a parallel programming approach in C language. The most critical activity of the DBFN calibration remains the test setup for the synchronized generation and acquisition of the test waveforms. A possible realization would require an anechoic chamber with controlled source positioning to characterize the spatial dependency of the array calibration coefficients.

# Chapter 7 Conclusions

The research activity reported in this thesis has been motivated by the technological trend of the "dirty RF" paradigm. The methodology for applying digital calibration algorithms to single and multi-channel systems has been developed, together with the tools to analyze and identify nonlinearities in complex systems such as TI-ADCs and sub-sampling receivers. The main contributions concern the identification and compensation of dynamic nonlinearities using post-distortion methods based on Volterra models, and the a posteriori complexity reduction of the models. The achieved results are very promising in the perspective of a growing gap between analog circuit performance and the progress of digital functions. Using a coarse estimate, analog performance (ADC FoM) doubles every 5 years while digital performance (MIPS) doubles every 1.5 years. This means an analog/digital gap of about  $2^{10}/2^{3} = 128$  times in 15 years. These facts suggest that digital calibration will be ever more important and necessary for future CMOS mixed-signal systems.

A number of considerations must be made about the new prospects opened up by the research. Although Volterra models have the advantage of being intuitive and general, they are efficient only in the identification of short-memory, smooth (low order) nonlinearities. New models can be synthesized that include short-memory polynomial functionals and particular discontinuous functions, determined by the specific system under calibration. Different algorithms for parameter estimation shall be considered, especially for high complexity models that produce ill-conditioned sample matrices. The Orthogonal Matching Pursuit (OMP) could be used as a forward pruning algorithm for the identification of a sparse Volterra model.
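To make the idea of forward pruning concrete, the following Python sketch implements a basic real-valued OMP over a small set of candidate Volterra regressors. It is an illustration of the general algorithm and not an implementation from this work; the candidate set, the test signal and the helper names `solve_ls` and `omp` are assumptions, and the least-squares step uses plain normal equations, which would not be advisable for ill-conditioned problems.

```python
import math


def solve_ls(columns, y):
    """Least squares via normal equations and Gauss-Jordan elimination."""
    k = len(columns)
    a = [[sum(ci * cj for ci, cj in zip(columns[r], columns[c])) for c in range(k)]
         + [sum(ci * yi for ci, yi in zip(columns[r], y))] for r in range(k)]
    for p in range(k):
        piv = max(range(p, k), key=lambda r: abs(a[r][p]))
        a[p], a[piv] = a[piv], a[p]
        for r in range(k):
            if r != p:
                f = a[r][p] / a[p][p]
                a[r] = [ar - f * ap for ar, ap in zip(a[r], a[p])]
    return [a[r][k] / a[r][r] for r in range(k)]


def omp(columns, y, num_terms):
    """Greedy forward selection of num_terms regressors, refitted at each step."""
    norms = [sum(v * v for v in col) ** 0.5 for col in columns]
    selected, coeffs = [], []
    residual = list(y)
    for _ in range(num_terms):
        # Normalised correlation of each unused regressor with the residual.
        scores = [abs(sum(c * r for c, r in zip(col, residual))) / nrm
                  if idx not in selected else -1.0
                  for idx, (col, nrm) in enumerate(zip(columns, norms))]
        selected.append(max(range(len(columns)), key=scores.__getitem__))
        chosen = [columns[i] for i in selected]
        coeffs = solve_ls(chosen, y)
        residual = [yi - sum(w * col[n] for w, col in zip(coeffs, chosen))
                    for n, yi in enumerate(y)]
    return dict(zip(selected, coeffs))


# Assumed example: pick 2 of 4 candidate terms for a memoryless cubic system.
x = [math.sin(0.7 * n) + 0.5 * math.sin(1.9 * n) for n in range(1, 65)]
candidates = [
    x[1:],                      # x[n]
    x[:-1],                     # x[n-1]
    [v * v for v in x[1:]],     # x[n]^2
    [v ** 3 for v in x[1:]],    # x[n]^3
]
y = [a - 0.2 * c for a, c in zip(candidates[0], candidates[3])]
model = omp(candidates, y, num_terms=2)  # maps regressor index to coefficient
```

Since Volterra regressors are strongly correlated, the greedy choice is not guaranteed to recover the true support; this is the same non-orthogonality that limits the backward pruning approach.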

Other receiver architectures shall be addressed to understand the behavior and the potential of digital post compensation. Due to the widespread use of Software Defined Radios (SDR), both in medium and high speed applications, deeper research should be carried out to analyze the distortions arising from wideband LNAs in the presence of blockers and interferers.

The experience acquired in digital calibration should be exploited in analog-digital co-design. Starting from the anti-aliasing filter implemented in Chapter 5, an improved design should be realized by minimizing the filter's noise figure and correcting the degraded nonlinear behavior in post processing. The optimum target is to match the noise floor level with the highest post-calibration spur, obtaining the best achievable dynamic range.

#### 7.1 Summary of the research contributions

The problem of complexity reduction has been addressed by implementing an iterative backward pruning algorithm. This algorithm performs a selection of the Volterra kernels, discarding the ones that impact linearity the least. This performance driven method produces a suboptimal solution, because the search covers only a finite subset of all the possible kernel combinations and because the Volterra functionals are not an orthogonal basis. Despite these limitations, the proposed method outperforms pruning based on an a priori selection of the kernels, which limits the generality of the Volterra model.

The analysis of the distortions in a sub-sampling receiver, mainly caused by the active anti-aliasing filter, has been the starting point for the derivation of an extended baseband equivalent Volterra model. This complex-valued model is able to represent out-of-band harmonic distortions that fold in the pass-band due to sub-sampling and the aliasing of the IMD components. This generalization enables the use of baseband processing schemes independently from the ratio between  $f_s$ and the bandwidth of the system (in the limits of the Nyquist theorem) and the order of nonlinearity.

The methodology developed for efficient time-series identification techniques using transient simulations simplifies the analog/digital co-design: circuit level simulations can be employed to assess the pre-calibration performance of the system; then, using behavioral simulations, a database of Monte Carlo realizations of the post layout system can be obtained and a statistical estimate of post-calibration performance and yield can be carried out. In this way digital calibration can be viewed as an integral part of the design process.

In the study of TI-ADCs, a methodology for solving the Papoulis equations in the presence of time skews and gain errors has been presented. The closed form of perfect reconstruction filters has been calculated and validated by numerical simulation for a 4-channel architecture. The shape of such filters has suggested the use of new base filters in the cyclo-stationary calibration architecture, obtaining better performance in terms of accuracy and complexity with respect to the state of the art.

#### 7.2 Future works

Along with the aforementioned contributions, this thesis has left open issues that will be addressed in future research activities.

Regarding the post compensation of the digital-IF receiver, the proposed model can be improved by covering the cases of  $f_{IF} \neq n \frac{f_s}{4}$  and by adding new terms to represent IQ mismatches in quadrature sampling receivers. These further enhancements will make the model even more flexible and general with respect to the receiver architectures and the analog impairments.

Concerning the research on TI-ADCs, two topics must be further investigated: first, the implementation of a practical adaptive estimation algorithm, together with the analysis of fixed-point performance in the Papoulis architecture starting from the presented results; second, the analytical derivation of the closed form solution of the correction filter in the cyclo-stationary architecture.

The development of a testbed for the calibration of RF receivers is in progress in the Thales Alenia Space Italy laboratories. FPGA firmware and control software for synchronous data generation / acquisition have been developed. Coherent sampling between excitation design, generator output and acquisition front-end will be guaranteed by feeding all the devices with a common 10 MHz reference. Different RF chains will be tested to assess model complexity for different architectures and carrier frequencies. A multi-channel extension addressed to array processing in the space environment is foreseen, exploiting the architectural parallelism of a novel rad-hard many-core processor to implement efficient Volterra filters and beam forming.
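With a shared reference clock, test tones can be placed so that an integer number of cycles fits in the acquisition record. The Python sketch below picks such a frequency close to a target, using the common practice of choosing a cycle count that is coprime with the record length; this rule, the numerical values and the function name are assumptions and are not a description of the testbed software.

```python
import math


def coherent_tone(f_target_hz, fs_hz, record_len):
    """Tone near f_target with an integer cycle count coprime with record_len."""
    cycles = max(1, round(f_target_hz * record_len / fs_hz))
    while math.gcd(cycles, record_len) != 1:
        cycles += 1  # step up until the cycle count is coprime with the record
    return cycles, cycles * fs_hz / record_len


# Assumed example: 100 MS/s acquisition, 4096-sample record, tone near 10 MHz.
cycles, f_tone = coherent_tone(10e6, 100e6, 4096)
```

A tone chosen in this way falls exactly on one FFT bin, so its harmonics and intermodulation products can be measured without windowing.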

## List of Publications

- F. Centurelli, P. Monsurrò, F. Rosato, D. Ruscio and A. Trifiletti, "Calibrating sample and hold stages with pruned Volterra kernels", *Electronics Letters*, vol. 51 no. 25, 2015
- F. Centurelli, P. Monsurrò, F. Rosato, D. Ruscio and A. Trifiletti, "Calibration of pipeline ADC with pruned Volterra kernels", *Electronics Letters*, vol. 52 no. 16, 2016
- P. Monsurrò, F. Rosato and A. Trifiletti, "New models for the calibration of 4-channel Time-Interleaved ADCs using filter banks", *Transactions on Circuits and Systems II, Express Briefs*, volume PP, Issue 99, 2017
- F. Rosato, P. Monsurrò and A. Trifiletti, "Perfect Reconstruction Filters for 4-Channels Time-Interleaved ADC Affected by Mismatches", *European Conference on Circuit Theory and Design*, 2017
- G. Tomasicchio, G. Lulli, M. La Ferla, P. Monsurrò, F. Rosato and A. Trifiletti, "A non-linearities calibration technique for DBFN in SatCom payloads", 22nd Ka and Broadband Communications Conference, 2016
- G. Tomasicchio, G. Lulli, P. Monsurrò, F. Rosato, P. Tommasino and A. Trifiletti, "Models for wideband nonlinearities in SatCom payloads receiver channels and their parallelism", 23rd Ka and Broadband Communications Conference, 2017
- F. Rosato, P. Monsurrò and A. Trifiletti, "Subsampling receiver digital calibration", submitted to Transactions on Circuits and Systems II, Express Briefs, 2018

Outside the scope of this thesis, the author has contributed to the conference paper:

G. Tomasicchio, A. Fiaschetti, M. La Ferla, G. Pastore, F. Rosato and V. Schena, "An Advanced SatCom System with Digital Processing Payload for Machine to Machine Applications", 22nd Ka and Broadband Communications Conference, 2016

### Bibliography

- [1] BiCMOS process technology. http://www.st.com/content/st\_com/en/about/innovation---technology/BiCMOS.html.
- [2] Cadence Virtuoso Analog Design Environment. https://www.cadence.com/content/cadence-www/global/en\_US/home/tools/custom-ic-analog-rf-design/circuit-design/virtuoso-analog-design-environment.html.
- [3] Intel's 10 nm Technology: Delivering the Highest Logic Transistor Density in the Industry Through the Use of Hyper Scaling. https://newsroom.intel.com/newsroom/wp-content/uploads/sites/11/2017/09/10-nm-icf-fact-sheet.pdf.
- [4] ITRS 2015 Overview. https://www.dropbox.com/s/6eskh6bwdcuzpsa/1507\_11\_Paolo%200verview\_Out.pdf?dl=0.
- [5] Maxim Integrated Demystifying Delta-Sigma ADCs. https://www.maximintegrated.com/en/app-notes/index.mvp/id/1870.
- [6] IEEE Standard for Terminology and Test Methods for Analog-to-Digital Converters. IEEE Std 1241-2010 (Revision of IEEE Std 1241-2000), (2011), 1. doi:10.1109/IEEESTD.2011.5692956.
- [7] ACOSTA, L., CARVAJAL, R. G., JIMENEZ, M., RAMIREZ-ANGULO, J., AND LOPEZ-MARTIN, A. A CMOS transconductor with 90 dB SFDR and low sensitivity to mismatch. In 2006 IEEE International Symposium on Circuits and Systems, pp. 4 pp.-72 (2006). doi:10.1109/ISCAS.2006.1692524.
- [8] ADAMO, F., ATTIVISSIMO, F., GIAQUINTO, N., AND TROTTA, A. A/D converters nonlinearity measurement and correction by frequency analysis and dither. *IEEE Transactions on Instrumentation and Measurement*, **52** (2003), 1200. doi:10.1109/TIM.2003.815981.
- [9] AUMALA, O. AND HOLUB, J. Dithering design for measurement of slowly varying signals. *Measurement*, 23 (1998), 271. Available from: http://www.sciencedirect.com/science/article/pii/S0263224198000372, doi: https://doi.org/10.1016/S0263-2241(98)00037-2.
- [10] BALESTRIERI, E., DAPONTE, P., AND RAPUANO, S. A state of the art on ADC error compensation methods. *IEEE Transactions on Instrumentation and Measurement*, 54 (2005), 1388. doi:10.1109/TIM.2005.851083.

- [11] BIERMAN, G. Factorization Methods for Discrete Sequential Estimation. Dover Books on Mathematics Series. Dover Publications (2006). ISBN 9780486449814. Available from: https://books.google.it/books?id=5AZjEmUKXGcC.
- [12] BJORSELL, N., RONNOW, D., AND HANDEL, P. Measuring Volterra kernels of analog to digital converters using a stepped three-tone scan. In 2006 IEEE Instrumentation and Measurement Technology Conference Proceedings, pp. 1047–1050 (2006). doi:10.1109/IMTC.2006.328342.
- [13] BOLSTAD, A., MILLER, B. A., GETTINGS, K., ERICSON, M., KIM, H., GREEN, M., AND SANTIAGO, D. Sparse polynomial equalization of an RF receiver via algorithm, analog, and digital codesign. In 2012 Conference Record of the Forty Sixth Asilomar Conference on Signals, Systems and Computers (ASILOMAR), pp. 1102–1106 (2012). doi:10.1109/ACSSC.2012.6489190.
- [14] BONTEMPI, G. AND TAIEB, S. B. Statistical foundations of machine learning. OTexts: Melbourne, Australia. (2017). Available from: https://www.otexts.org/book/sfml.
- [15] BOYD, S., TANG, Y., AND CHUA, L. Measuring Volterra kernels. *IEEE Transactions on Circuits and Systems*, **30** (1983), 571. doi:10.1109/TCS.1983.1085391.
- [16] BRUNI, C. AND FERRONE, C. Metodi di stima per il filtraggio e l'identificazione dei sistemi. Ingegneria industriale e dell'informazione. Aracne (2009). ISBN 9788854822733. Available from: https://books.google.it/books?id=G4GKPgAACAAJ.
- [17] CAMPELLO, R. J., FAVIER, G., AND DO AMARAL, W. C. Optimal expansions of discrete-time Volterra models using Laguerre functions. *Automatica*, 40 (2004), 815. Available from: http://www.sciencedirect.com/science/article/pii/S0005109803003807, doi: https://doi.org/10.1016/j.automatica.2003.11.016.
- [18] CENTURELLI, F., MONSURRÒ, P., ROMANO, F., SCOTTI, G., TOMMASINO, P., AND TRIFILETTI, A. Using feed array networks to control distortions in antenna reflector for astrophysical radio-astronomy, vol. 9145. SPIE (2014). ISBN 9780819496133. doi:10.1117/12.2055521.
- [19] CENTURELLI, F., MONSURRÒ, P., ROSATO, F., RUSCIO, D., AND TRIFILETTI, A. Calibrating sample and hold stages with pruned Volterra kernels. *Electronics Letters*, **51** (2015), 2094. doi:10.1049/el.2015.3269.
- [20] CENTURELLI, F., MONSURRÒ, P., ROSATO, F., RUSCIO, D., AND TRIFILETTI, A. Calibration of pipeline ADC with pruned Volterra kernels. *Electronics Letters*, **52** (2016), 1370. doi:10.1049/el.2016.1601.
- [21] CENTURELLI, F., MONSURRÒ, P., AND TRIFILETTI, A. Behavioral Modeling for Calibration of Pipeline Analog-To-Digital Converters. *IEEE Transactions on Circuits and Systems I: Regular Papers*, 57 (2010), 1255. doi:10.1109/TCSI.2009.2033532.

- [22] CENTURELLI, F., MONSURRÒ, P., AND TRIFILETTI, A. Efficient Digital Background Calibration of Time-Interleaved Pipeline Analog-to-Digital Converters. *IEEE Transactions on Circuits and Systems I: Regular Papers*, **59** (2012), 1373. doi:10.1109/TCSI.2011.2177003.
- [23] CENTURELLI, F., MONSURRÒ, P., AND TRIFILETTI, A. Improved Digital Background Calibration of Time-Interleaved Pipeline A/D Converters. *IEEE Transactions on Circuits and Systems II: Express Briefs*, **60** (2013), 86. doi:10.1109/TCSII.2012.2235014.
- [24] CHANG, D.-Y., LI, J., AND MOON, U.-K. Radix-based digital calibration techniques for multi-stage recycling pipelined ADCs. *IEEE Transactions on Circuits and Systems I: Regular Papers*, **51** (2004), 2133. doi:10.1109/TCSI.2004.836863.
- [25] CHENG, C.-H. AND POWERS, E. J. Optimal Volterra kernel estimation algorithms for a nonlinear communication system for PSK and QAM inputs. *IEEE Transactions on Signal Processing*, 49 (2001), 147. doi:10.1109/78.890357.
- [26] DA ROSA, A., CAMPELLO, R. J., AND AMARAL, W. C. Choice of free parameters in expansions of discrete-time Volterra models using Kautz functions. *Automatica*, 43 (2007), 1084. Available from: http://www.sciencedirect.com/science/article/pii/S0005109807000738, doi: https://doi.org/10.1016/j.automatica.2006.12.007.
- [27] DA ROSA, A., CAMPELLO, R. J. G. B., FERREIRA, P. A. V., OLIVEIRA, G. H. C., AND AMARAL, W. C. Robust expansion of uncertain Volterra kernels into orthonormal series. In *Proceedings of the 2010 American Control Conference*, pp. 5465–5470 (2010). doi:10.1109/ACC.2010.5530965.
- [28] DASGUPTA, K. Self-healing Techniques for RF and mm-Wave Transmitters and Receivers. Ph.D. thesis, California Institute of Technology (2015).
- [29] DEYST, J. P., VYTAL, J. J., BLASCHE, P. R., AND SIEBERT, W. M. Wideband distortion compensation for bipolar flash analog-to-digital converters. In [1992] Conference Record IEEE Instrumentation and Measurement Technology Conference, pp. 290–294 (1992). doi:10.1109/IMTC.1992.245131.
- [30] DUC, H. L., JABBOUR, C., DESGREYS, P., JAMIN, O., AND NGUYEN, V. T. A fully digital background calibration of timing skew in undersampling TI-ADC. In 2014 IEEE 12th International New Circuits and Systems Conference (NEWCAS), pp. 53–56 (2014). doi:10.1109/NEWCAS.2014.6933983.
- [31] EKSIOGLU, E. M. AND KAYRAN, A. H. Nonlinear system identification using deterministic multilevel sequences. In 2002 14th International Conference on Digital Signal Processing Proceedings. DSP 2002 (Cat. No.02TH8628), vol. 2, pp. 947–950 vol.2 (2002). doi:10.1109/ICDSP.2002.1028246.

- [32] ERDOGAN, O. E., HURST, P. J., AND LEWIS, S. H. A 12-b digital-backgroundcalibrated algorithmic ADC with -90-dB THD. *IEEE Journal of Solid-State Circuits*, **34** (1999), 1812. doi:10.1109/4.808906.
- [33] FAKATSELIS, J. AND CHESTER, D. B. Subsampling digital IF receiver implementations. In Southcon/96 Conference Record, pp. 92–97 (1996). doi:10.1109/SOUTHC.1996.535049.
- [34] FANG, Y.-W., JIAO, L.-C., ZHANG, X.-D., AND PAN, J. On the convergence of Volterra filter equalizers using a pth-order inverse approach. *IEEE Transactions on Signal Processing*, 49 (2001), 1734. doi:10.1109/78.934144.
- [35] FEHRI, B. AND BOUMAIZA, S. Baseband Equivalent Volterra Series for Behavioral Modeling and Digital Predistortion of Power Amplifiers Driven With Wideband Carrier Aggregated Signals. *IEEE Transactions on Microwave Theory and Techniques*, **62** (2014), 2594. doi:10.1109/TMTT.2014.2360387.
- [36] FEI, Y., SIN, S. W., SENG-PAN, U., AND MARTINS, R. P. A digital background nonlinearity calibration algorithm for pipelined ADCs. In 2010 Asia Pacific Conference on Postgraduate Research in Microelectronics and Electronics, pp. 115–118 (2010). doi:10.1109/PRIMEASIA.2010.5604949.
- [37] FETTWEIS, G., LOHNING, M., PETROVIC, D., WINDISCH, M., ZILLMANN, P., AND RAVE, W. Dirty RF: a new paradigm. In 2005 IEEE 16th International Symposium on Personal, Indoor and Mobile Radio Communications, vol. 4, pp. 2347–2355 Vol. 4 (2005). doi:10.1109/PIMRC.2005.1651863.
- [38] GIANNAKIS, G. B. AND SERPEDIN, E. A bibliography on nonlinear system identification. *Signal Processing*, 81 (2001), 533. Special section on Digital Signal Processing for Multimedia. Available from: http://www.sciencedirect.com/science/article/pii/S0165168400002310, doi: https://doi.org/10.1016/S0165-1684(00)00231-0.
- [39] GINOSAR, R. Ramon Chips RC64 Many core rad-hard processor. http://www.ramon-chips.com/RC64brief.Oct%202016.pdf (2017).
- [40] GLENTIS, G. O. A., KOUKOULAS, P., AND KALOUPTSIDIS, N. Efficient algorithms for Volterra system identification. *IEEE Transactions on Signal Processing*, 47 (1999), 3042. doi:10.1109/78.796438.
- [41] GRACE, C. R., HURST, P. J., AND LEWIS, S. H. A 12 b 80 MS/s pipelined ADC with bootstrapped digital calibration. In 2004 IEEE International Solid-State Circuits Conference (IEEE Cat. No.04CH37519), pp. 460–539 Vol.1 (2004). doi:10.1109/ISSCC.2004.1332793.
- [42] GRAY, R. M. AND STOCKHAM, T. G. Dithered quantizers. IEEE Transactions on Information Theory, 39 (1993), 805. doi:10.1109/18.256489.
- [43] GRIMM, M., ALLÉN, M., MARTTILA, J., VALKAMA, M., AND THOMA, R. Joint Mitigation of Nonlinear RF and Baseband Distortions in Wideband Direct-Conversion Receivers. *IEEE Transactions on Microwave Theory and Techniques*, 62 (2014), 166. doi:10.1109/TMTT.2013.2292603.

- [44] GUSTAVSSON, M., WIKNER, J. J., AND TAN, N. CMOS Data Converters for Communications (2002).
- [45] HENDERSON, H. V. AND SEARLE, S. R. On Deriving the Inverse of a Sum of Matrices. *SIAM Review*, 23 (1981), 53. Available from: http://dx.doi.org/10.1137/1023004, doi:10.1137/1023004.
- [46] HIRSCHORN, R. Invertibility of multivariable nonlinear control systems. *IEEE Transactions on Automatic Control*, 24 (1979), 855. doi:10.1109/TAC.1979.1102181.
- [47] HUMMELS, D. M., COOK, R. W., AND IRONS, F. H. Discrete-time dynamic compensation of analog-to-digital converters. In 1993 IEEE International Symposium on Circuits and Systems, pp. 1144–1147 vol.2 (1993). doi:10.1109/ISCAS.1993.393907.
- [48] INTERNATIONAL TELECOMMUNICATION UNION (ITU). Draft new Report ITU-R M.[IMT-2020.TECH PERF REQ] - Minimum requirements related to technical performance for IMT-2020 radio interface(s). https://www.itu.int/md/R15-SG05-C-0040/en (2017).
- [49] JESPERS, P. G. A. Integrated Converters: D to A and A to D Architectures, Analysis and Simulation. Oxford University Press (2001). ISBN 0198564465. Available from: https://www.amazon.com/Integrated-Converters-Architectures-Simulation-Engineering/dp/0198564465.
- [50] JUNG, Y. AND ENQVIST, M. Estimating models of inverse systems. In 52nd IEEE Conference on Decision and Control, pp. 7143–7148 (2013). doi:10.1109/CDC.2013.6761022.
- [51] KEANE, J. P., HURST, P. J., AND LEWIS, S. H. Digital background calibration for memory effects in pipelined analog-to-digital converters. *IEEE Transactions on Circuits and Systems I: Regular Papers*, **53** (2006), 511. doi:10.1109/TCSI.2005.858760.
- [52] KIBANGOU, A. Y., FAVIER, G., AND HASSANI, M. M. Selection of generalized orthonormal bases for second-order Volterra filters. Signal Processing, 85 (2005), 2371. Available from: http://www.sciencedirect.com/science/article/pii/S0165168405001258, doi:10.1016/j.sigpro.2005.02.020.
- [53] KIM, J. AND KONSTANTINOU, K. Digital predistortion of wideband signals based on power amplifier model with memory. *Electronics Letters*, **37** (2001), 1417. doi:10.1049/el:20010940.
- [54] KWAK, S. U., SONG, B. S., AND BACRANIA, K. A 15-b, 5-Msample/s low-spurious CMOS ADC. *IEEE Journal of Solid-State Circuits*, **32** (1997), 1866. doi:10.1109/4.643645.

- [55] LEE, S., CHANDRAKASAN, A. P., AND LEE, H. S. A 1 GS/s 10b 18.9 mW Time-Interleaved SAR ADC With Background Timing Skew Calibration. *IEEE Journal of Solid-State Circuits*, 49 (2014), 2846. doi:10.1109/JSSC.2014.2362851.
- [56] LI, J. AND MOON, U.-K. Background calibration techniques for multistage pipelined ADCs with digital redundancy. *IEEE Transactions on Circuits and Systems II: Analog and Digital Signal Processing*, 50 (2003), 531. doi:10.1109/TCSII.2003.816921.
- [57] LIANG, P., GUOCANG, S., HAIHUA, D., AND MING, C. A Wiener model based post-calibration of ADC nonlinear distortion. In 2014 IEEE Workshop on Electronics, Computer and Applications, pp. 366–370 (2014). doi:10.1109/IWECA.2014.6845633.
- [58] LIMA, E. G., CUNHA, T. R., TEIXEIRA, H. M., PIROLA, M., AND PEDRO, J. C. Base-band derived Volterra series for power amplifier modeling. In 2009 IEEE MTT-S International Microwave Symposium Digest, pp. 1361–1364 (2009). doi:10.1109/MWSYM.2009.5165958.
- [59] LIPSHITZ, S. P., WANNAMAKER, R. A., AND VANDERKOOY, J. Quantization and dither: A theoretical survey. J. Audio Eng. Soc, 40 (1992), 355. Available from: http://www.aes.org/e-lib/browse.cfm?elib=7047.
- [60] LIU, W. AND CHIU, Y. Time-Interleaved Analog-to-Digital Conversion With Online Adaptive Equalization. *IEEE Transactions on Circuits and Systems I: Regular Papers*, **59** (2012), 1384. doi:10.1109/TCSI.2011.2177005.
- [61] LJUNG, L. System Identification: Theory for the User. Prentice Hall information and system sciences series. Prentice Hall PTR (1999). ISBN 9780136566953. Available from: https://books.google.it/books?id=nHFoQgAACAAJ.
- [62] LULLI, G., IACOMACCI, F., LOSQUADRO, G., AND TOMASICCHIO, G. Software Radio for OBP Systems Implementations: Architecture and Technology for a Reconfigurable Platform. In 19th Ka and Broadband Communications Conference (2013).
- [63] LUNDIN, H. AND HÄNDEL, P. Look-Up Tables, Dithering and Volterra Series for ADC Improvements, pp. 249–275. Springer Berlin Heidelberg, Berlin, Heidelberg (2014). ISBN 978-3-642-39655-7. Available from: https://doi.org/10.1007/978-3-642-39655-7\_8, doi:10.1007/978-3-642-39655-7\_8.
- [64] LUNDIN, H., SKOGLUND, M., AND HÄNDEL, P. A criterion for optimizing bit-reduced post-correction of AD converters. *IEEE Transactions on Instrumentation and Measurement*, 53 (2004), 1159. doi:10.1109/TIM.2004.831441.
- [65] MARTTILA, J., ALLÉN, M., KOSUNEN, M., STADIUS, K., RYYNANEN, J., AND VALKAMA, M. Reference receiver enabled digital cancellation of nonlinear out-of-band blocker distortion in wideband receivers. In 2016 IEEE Global Conference on Signal and Information Processing (GlobalSIP), pp. 684–688 (2016). doi:10.1109/GlobalSIP.2016.7905929.

- [66] MATHEWS, V. AND SICURANZA, G. Polynomial signal processing. Wiley series in telecommunications and signal processing. Wiley (2000). ISBN 9780471034148. Available from: https://books.google.it/books?id=xvNSAAAAMAAJ.
- [67] MEDAWAR, S., HÄNDEL, P., MURMANN, B., BJÖRSELL, N., AND JANSSON, M. Dynamic Calibration of Undersampled Pipelined ADCs by Frequency Domain Filtering. *IEEE Transactions on Instrumentation and Measurement*, 62 (2013), 1882. doi:10.1109/TIM.2013.2248289.
- [68] MIKULIK, P. AND SALIGA, J. Volterra filtering for integrating ADC error correction, based on an a priori error model. *IEEE Transactions on Instrumentation and Measurement*, **51** (2002), 870. doi:10.1109/TIM.2002.803513.
- [69] MONSURRÒ, P. Metodologie di progetto di circuiti integrati mixed-signal per l'elaborazione del segnale a R.F. Ph.D. thesis, Sapienza Università di Roma.
- [70] MONSURRÒ, P., ROSATO, F., AND TRIFILETTI, A. New models for the calibration of 4-channel Time-Interleaved ADCs using filter banks. *IEEE Transactions on Circuits and Systems-II* (2017).
- [71] MONSURRÒ, P. AND TRIFILETTI, A. Subsampling Models of Bandwidth Mismatch for Time-Interleaved Converter Calibration. *IEEE Transactions on Circuits and Systems II: Express Briefs*, 62 (2015), 957. doi:10.1109/TCSII.2015.2458131.
- [72] MONSURRÒ, P. AND TRIFILETTI, A. Calibration of Time-Interleaved ADCs via Hermitianity-Preserving Taylor Approximations. *IEEE Transactions on Circuits and Systems II: Express Briefs*, 64 (2017), 357. doi:10.1109/TCSII.2016.2562184.
- [73] MOON, U.-K. AND SONG, B.-S. Background digital calibration techniques for pipelined ADCs. *IEEE Transactions on Circuits and Systems II: Analog and Digital Signal Processing*, 44 (1997), 102. doi:10.1109/82.554434.
- [74] MORGAN, D. R., MA, Z., KIM, J., ZIERDT, M. G., AND PASTALAN, J. A Generalized Memory Polynomial Model for Digital Predistortion of RF Power Amplifiers. *IEEE Transactions on Signal Processing*, 54 (2006), 3852. doi:10.1109/TSP.2006.879264.
- [75] MOULIN, D. Real-time equalization of A/D converter nonlinearities. In *IEEE International Symposium on Circuits and Systems*, pp. 262–267 vol.1 (1989). doi:10.1109/ISCAS.1989.100341.
- [76] MURMANN, B. AND BOSER, B. Digitally assisted analog integrated circuits. *Queue*, 2 (2004), 64. Available from: http://doi.acm.org/10.1145/984458.984494, doi:10.1145/984458.984494.
- [77] NIKAEEN, P. AND MURMANN, B. Digital Compensation of Dynamic Acquisition Errors at the Front-End of High-Performance A/D Converters. *IEEE Journal of Selected Topics in Signal Processing*, 3 (2009), 499. doi:10.1109/JSTSP.2009.2020575.

- [78] NOWAK, R. D. AND VAN VEEN, B. D. Random and pseudorandom inputs for Volterra filter identification. *IEEE Transactions on Signal Processing*, 42 (1994), 2124. doi:10.1109/78.301847.
- [79] PAPOULIS, A. Generalized sampling expansion. IEEE Transactions on Circuits and Systems, 24 (1977), 652. doi:10.1109/TCS.1977.1084284.
- [80] PENG, L. AND MA, H. Design and Implementation of Software-Defined Radio Receiver Based on Blind Nonlinear System Identification and Compensation. *IEEE Transactions on Circuits and Systems I: Regular Papers*, 58 (2011), 2776. doi:10.1109/TCSI.2011.2151050.
- [81] PRENDERGAST, R. S., LEVY, B. C., AND HURST, P. J. Reconstruction of band-limited periodic nonuniformly sampled signals through multirate filter banks. *IEEE Transactions on Circuits and Systems I: Regular Papers*, **51** (2004), 1612. doi:10.1109/TCSI.2004.832781.
- [82] PROAKIS, J. AND MANOLAKIS, D. Digital Signal Processing: Principles, Algorithms, and Applications. Simon & Schuster Books For Young Readers (1992). ISBN 9780023968150. Available from: https://books.google.it/books?id=ywgfAQAAIAAJ.
- [83] RAZAVI, B. Design Considerations for Interleaved ADCs. IEEE Journal of Solid-State Circuits, 48 (2013), 1806. doi:10.1109/JSSC.2013.2258814.
- [84] RYU, S.-T., RAY, S., SONG, B.-S., CHO, G.-H., AND BACRANIA, K. A 14-b linear capacitor self-trimming pipelined ADC. *IEEE Journal of Solid-State Circuits*, **39** (2004), 2046. doi:10.1109/JSSC.2004.835823.
- [85] SARTI, A. AND PUPOLIN, S. Recursive techniques for the synthesis of a pth-order inverse of a Volterra system. *Transactions on Emerging Telecommunications Technologies*, **3** (1992), 315.
- [86] SATARZADEH, P., LEVY, B. C., AND HURST, P. J. Digital Calibration of a Nonlinear S/H. *IEEE Journal of Selected Topics in Signal Processing*, 3 (2009), 454. doi:10.1109/JSTSP.2009.2020557.
- [87] SCHETZEN, M. Theory of pth-order inverses of nonlinear systems. IEEE Transactions on Circuits and Systems, 23 (1976), 285. doi:10.1109/TCS.1976.1084219.
- [88] SCHLEMBACH, F., GRIMM, M., AND THOMAE, R. S. Real-time Implementation of a DSP-based Algorithm on USRP for Mitigating Non-linear Distortions in the Receiver RF Front-end. In *ISWCS 2013; The Tenth International Symposium on Wireless Communication Systems*, pp. 1–5 (2013).
- [89] SCHMIDT, C., COUSSEAU, J. E., FIGUEROA, J. L., WICHMAN, R., AND WERNER, S. ADC post-compensation using a Hammerstein model. In 2009 Argentine School of Micro-Nanoelectronics, Technology and Applications, pp. 71–76 (2009).

- [90] SCHMIDT, C. A., COUSSEAU, J. E., FIGUEROA, J. L., WICHMAN, R., AND WERNER, S. Non-linearities modelling and post-compensation in continuous-time ΣΔ modulators. *IET Microwaves, Antennas & Propagation*, 5 (2011), 1796. doi:10.1049/iet-map.2011.0204.
- [91] SCHMIDT, C. A., LIFSCHITZ, O., COUSSEAU, J. E., FIGUEROA, J. L., AND JULIAN, P. Methodology and Measurement Setup for Analog-to-Digital Converter Postcompensation. *IEEE Transactions on Instrumentation and Measurement*, 63 (2014), 658. doi:10.1109/TIM.2013.2295877.
- [92] SINGERL, P. Complex Baseband Modeling and Digital Predistortion for Wideband RF Power Amplifiers (2006).
- [93] SONG, W., ET AL. Proceedings of the 13th Annual High Performance Embedded Computing Workshop. In Proceedings of the 13th Annual High Performance Embedded Computing Workshop (2009). Available from: http://www.ll.mit.edu/HPEC/agendas/proc09/agenda.html.
- [94] TSENG, C. H. Estimation of cubic nonlinear bandpass channels in orthogonal frequency-division multiplexing systems. *IEEE Transactions on Communications*, 58 (2010), 1415. doi:10.1109/TCOMM.2010.05.080573.
- [95] TSIMBINOS, J. Identification and Compensation of Nonlinear Distortion. Ph.D. thesis, School of Electronic Engineering, University of South Australia (1995).
- [96] TSIMBINOS, J. AND LEVER, K. V. Applications of higher-order statistics to modelling, identification and cancellation of nonlinear distortion in high-speed samplers and analogue-to-digital converters using the Volterra and Wiener models. In [1993 Proceedings] IEEE Signal Processing Workshop on Higher-Order Statistics, pp. 379–383 (1993). doi:10.1109/HOST.1993.264531.
- [97] TSIMBINOS, J. AND LEVER, K. V. Computational complexity of Volterra based nonlinear compensators. *Electronics Letters*, **32** (1996), 852. doi:10.1049/el:19960544.
- [98] TSIMBINOS, J. AND LEVER, K. V. Improved error-table compensation of A/D converters. *IEE Proceedings - Circuits, Devices and Systems*, **144** (1997), 343. doi:10.1049/ip-cds:19971589.
- [99] TSIMBINOS, J., MARWOOD, W., BEAUMONT-SMITH, A., AND LIM, C. C. Results of A/D converter compensation with a VLSI chip. In *Final Program and Abstracts on Information, Decision and Control*, pp. 289–293 (2002). doi:10.1109/IDC.2002.995413.
- [100] UHRMANN, H., KOLM, R., AND ZIMMERMANN, H. Analog Filters in Nanometer CMOS. Springer Series in Advanced Microelectronics. Springer Berlin Heidelberg (2013). ISBN 9783642380136. Available from: https://books.google.it/books?id=k3i8BAAAQBAJ.
- [101] VALKAMA, M., HAGH GHADAM, A. S., ANTTILA, L., AND RENFORS, M. Advanced digital signal processing techniques for compensation of nonlinear distortion in wideband multicarrier radio receivers. *IEEE Transactions on Microwave Theory and Techniques*, **54** (2006), 2356. doi:10.1109/TMTT.2006.875274.

- [102] VOGEL, C. AND JOHANSSON, H. Time-interleaved analog-to-digital converters: status and future directions. In 2006 IEEE International Symposium on Circuits and Systems, pp. 4 pp.-3389 (2006). doi:10.1109/ISCAS.2006.1693352.
- [103] WEI, H., ZHANG, P., SAHOO, B. D., AND RAZAVI, B. An 8 Bit 4 GS/s 120 mW CMOS ADC. *IEEE Journal of Solid-State Circuits*, 49 (2014), 1751. doi:10.1109/JSSC.2014.2313571.
- [104] WIDROW, B. A Study of Rough Amplitude Quantization by Means of Nyquist Sampling Theory. *IRE Transactions on Circuit Theory*, 3 (1956), 266. doi:10.1109/TCT.1956.1086334.
- [105] WIDROW, B. Statistical analysis of amplitude-quantized sampled-data systems. Transactions of the American Institute of Electrical Engineers, Part II: Applications and Industry, 79 (1961), 555. doi:10.1109/TAI.1961.6371702.
- [106] ZHANG, Y. AND EL-SANKARY, K. Orthogonal polynomials non-linearity compensation for a digital VCO-based ADC. *Electronics Letters*, **52** (2016), 915. doi:10.1049/el.2016.0717.
- [107] ZHU, A. AND BRAZIL, T. J. RF power amplifier behavioral modeling using Volterra expansion with Laguerre functions. In *IEEE MTT-S International Microwave Symposium Digest, 2005.*, pp. 4 pp.- (2005). doi:10.1109/MWSYM.2005.1516787.
- [108] ZHU, A. AND BRAZIL, T. J. An Overview of Volterra Series Based Behavioral Modeling of RF/Microwave Power Amplifiers. In 2006 IEEE Annual Wireless and Microwave Technology Conference, pp. 1–5 (2006). doi:10.1109/WAMICON.2006.351917.