Table of Contents

Monday – Wednesday Technical Sessions

Session 1 – Plenary Session
Monday, 9/10/2012, 8:15 am
Session Chair: Aurangzeb Khan, Altia Systems
Oak Ballroom

8:15 am Welcome and Opening Remarks
Awards Presentations
Keynote Speaker Introduction
Tom Andre, General Chairman

Keynote Presentation, Architecture and Circuits for Energy Efficient Computing, Bill Dally, Stanford University

The end of Dennard scaling has made almost all of computing energy limited. The focus is now on delivering a better user experience within a constrained power envelope. This requires optimization at all levels. At the system level, moving energy consuming services to the cloud enables compelling user experiences in mobile devices. Energy efficient architecture involves minimizing data supply, instruction supply, and scheduling overhead. Energy efficient circuits reduce the energy of data movement, make flip-flops and clocking more efficient, and remove voltage and timing guardbands. This talk will discuss the challenge of energy efficient computing and give some examples of recent progress at several levels.

Session 2 – Behavioral Modeling for RF and AMS
Monday, 9/10/2012, 10:00 am
Oak Ballroom
Session Chair: Colin McAndrew, Freescale
Session Co-Chair: Brian Chen, Agilent

10:00 am Introduction
This session addresses X-parameters and their use in RF device modeling, behavioral modeling for AMS verification and event-driven AMS simulation.


This paper reviews three modern transistor modeling flows enabled by large-signal waveform and/or X-parameter measurements from a commercially available nonlinear vector network analyzer (NVNA). Parameter extraction to waveform data, advanced electro-thermal and trap-dependent III-V FET model construction, and X-parameters applied to transistors are demonstrated, along with their advantages and limitations.

10:55 am True Event-Driven Simulation of Analog/Mixed-Signal Behaviors in...

This paper presents a true event-driven simulation methodology for analog/mixed-signal systems. The proposed methodology expresses an analog waveform with a unified basis function, and an analytical form of the output waveform is derived directly without involving time integration. The accuracy and speed of the proposed methodology are demonstrated through a DFE example.

A Hybrid Electrical-Behavioral Modeling Approach for Pre- and Post-Silicon Electrical Validation (INVITED), N. Hakim, A. Bhaduri, K. Donepudi, S. Bodapati, Intel Corporation

This paper outlines a method to scale the simulation of analog mixed-signal circuits with correct accounting of voltage, temperature, and variation sensitivities. The results demonstrate large speedup and scalability with little loss in accuracy, and illustrate the method's applications in circuit validation.

Session 3 – RF & mm Wave Power Amplifiers and Transmitters
Monday, 9/10/2012, 10:00 am
Fir Ballroom
Session Chair: Julian Tham, Broadcom
Session Co-Chair: Rick Booth, Consultant

10:00 am Introduction

Transistors and power amplifiers are critical building blocks in wireless systems and dramatically affect system power consumption. This session discusses low power transmitter and power amplifier design techniques.

10:05 am A 17pJ/bit 915MHz 8PSK/O-QPSK Transmitter for High Data Rate Biomedical Applications, M. M. Izad, C.-H. Heng, National University of Singapore

A new 8PSK/O-QPSK transmitter architecture based on a sub-harmonic injection-locked ring oscillator and a digitally modulated power amplifier is proposed. The fabricated prototype in 65nm CMOS occupies an area of 0.038mm² and consumes 938µW from a 0.8-V supply at a data rate of 55Mbps. The chip achieves an energy efficiency of 17pJ/b, which is the lowest reported to date.


This paper reviews directions for CMOS PA development for handset applications, including FET stacking, envelope tracking, digital predistortion, and architectures based on digital control. These techniques, along with the use of SOI, high resistivity substrates, and integrated RF switches, enhance the attractiveness of CMOS for multiband, multimode LTE amplifiers.

11:20 am High Power, High Efficiency Stacked mmWave Class-E-like Power Amplifiers in 45nm SOI CMOS, Anandaroop Chakrabarti, Harish Krishnaswamy, Columbia University

Design guidelines and fundamental performance limits for stacked CMOS Class-E-like mmWave PAs are presented. Two 45GHz 45nm SOI prototypes, with 2 and 4 stacked devices, exhibit PAEs of 32.3% (highest for CMOS mmWave PAs) and 15%, and saturated output powers of 16dBm and 20dBm (highest for CMOS mmWave PAs) respectively.

A 20 dBm Q-Band SiGe Class-E Power Amplifier With 25% Peak PAE, K. Datta, J. Roderick, H. Hashemi, University of Southern California

A Q-band two-stage Class-E power amplifier is designed and fabricated in a 0.13 μm SiGe HBT BiCMOS process. A mm-wave Class-E architecture considering the effect of various interconnect parasitics, impact ionization-induced negative base current and instability, is adopted to achieve high power efficiency. The measured performance of the fabricated chip shows 20 dBm maximum output power, 25% peak power added efficiency, and 10 dB power gain across 4GHz centered around 45GHz for a supply voltage of 2.5 V.

Session 4 – Analog Techniques

Monday, 9/10/2012, 10:00 am
Pine Ballroom
Session Chair: Don Thelen, ON Semiconductor
Session Co-Chair: Jerry Jiang, Broadcom

10:00 am Introduction

Recent advances in analog techniques have yielded very low jitter PLLs and VCOs based on FBAR and LC resonators, improved class D amplifiers and bandgap references employing switched-capacitor techniques.

10:05 am A 1.5GHz 0.2ps RMS Jitter 1.5mW Divider-less FBAR ADPLL in 65nm CMOS (INVITED), Julie R. Hu, Richard C. Ruby*, Brian P. Otis, University of Washington, *Avago Technologies, Inc

This paper presents a low power, low jitter, PVT-stable film-bulk acoustic wave resonator (FBAR) based all digital phase-locked loop (ADPLL) in a 65nm CMOS process. We introduce a power-efficient integer-N ADPLL architecture, where the digitally-controlled FBAR oscillator (FBAR DCO) achieves phase-lock to a reference clock without any explicit frequency dividers in the feedback path. The simplified divider-less ADPLL has a reduced phase difference at the input of the phase-frequency detector, avoiding a lengthy, power-hungry time-to-digital converter (TDC). The ADPLL consumes 1.5mW of power and has a measured integrated RMS jitter of 0.19ps from 10kHz to 40MHz frequency offset at 1.5GHz carrier frequency. The measured frequency tuning range of 6300ppm for this ADPLL is wide enough to cover the FBAR frequency variations over PVT and provide moderate frequency modulation or channelization. This low power high performance FBAR ADPLL can be used in low power radios, high performance ADCs, and high speed data links.

10:30 am A 2.7GHz 3.9mW Mesh-BJT LC-VCO with -204dBc/Hz FOM in 65nm CMOS, T.-W. Chung, T.-C. Huang, S. Chung, M.-C. Huang, C.-C. Lin, C.-H. Chern, and F.-L. Hsueh*, Taiwan Semiconductor Manufacturing Company (TSMC), San Jose, CA, *TSMC, Hsinchu, Taiwan was with TSMC

Standards with narrow channel spacing have stringent requirements for in-band phase noise of oscillators. In this paper, we propose an LC-VCO that utilizes a novel Mesh-BJT to suppress in-band phase noise while maintaining low power consumption. The proposed 2.7GHz LC-VCO consumes 3.9mW from a 1.5V supply and the figures of merit (FOM) at 10kHz and 100kHz offsets are -222dBc/Hz and -204dBc/Hz in 65nm CMOS. The Mesh-BJT is compatible with standard CMOS and no extra masks are required. To the best of our knowledge, this is the best FOM ever reported at low offsets (<100kHz) for a VCO in standard CMOS.

10:55 am **Linearization of Class D Amplifiers**, P. Balmelli, J. Khoury, E. Viegas, P. Santos, V. Pereira, Silicon Laboratories Inc

This paper reviews the main sources of nonlinearity in a class-D amplifier using DSP-based signal spreading techniques to control EMI in common-mode class-BD PWM. Two original solutions based on analog feedback techniques are described. Measurements of a 3W class-D audio amplifier implemented in a 110 nm CMOS process are provided.

11:20 am **A CMOS Switched-Capacitor Fractional Bandgap Reference**, W. Biederman, D. Yeager, E. Alon, J. Rabaey, University of California, Berkeley

An architecture for generating a voltage reference at a fraction of the silicon bandgap is proposed. It uses a two-phase switched-capacitor network to add multiples and fractions of VBE and ΔVBE to achieve a near-zero temperature coefficient without the use of resistors or op-amps. The 0.0055mm² circuit, implemented entirely on-chip in 65nm CMOS, produces a voltage of 423mV, has a measured σ of 2.2%, and consumes 138nA while operating at a supply as low as 750mV at -35°C.

Session 5 – Panel Session - "Will Photonics Dominate Electrical Backplane Transceivers in Five Years?"

Monday, 9/10/2012, 10:00 am
Cedar Ballroom
Moderator: Bill Walker, Fujitsu Laboratories of America

Since being introduced more than three decades ago, photonics has supplanted electrical transceivers for long-haul telecommunications, and is rapidly gaining market share in short distance links to 100-m. Active optical cables are replacing electrical links in the server room. The next frontier is the backplane – links up to 1-m with 1-2 connectors. To replace electrical links, optical links will need to demonstrate competitive or superior bandwidth density, energy efficiency, reliability, and cost. These criteria have not been met in today's server backplanes, all of which are electrical and operate up to 14-Gbps per lane. Within five years, electrical backplane links will operate at 25-Gbps and above per lane. Will optical links be competitive then? If not, when? Will VCSEL-based backplanes be viable, or do we need to wait for silicon photonics? This panel will help to shed light on these questions.

Panelists:
Tony Chan Carusone, Associate Professor
University of Toronto

Anthony Lentine, Principal Member of Technical Staff
Sandia Labs

Michael Hochberg, Associate Professor
University of Delaware

Ichiro Fujimori, Senior Director
Broadcom

Session 6 – Modeling & Design for Variability and Reliability
Monday, 9/10/2012, 1:30 pm
Oak Ballroom
Session Chair: Trent McConaghy, Solido Design
Session Co-Chair: Hidetoshi Onodera, Kyoto University

1:30 pm Introduction

This session explores modeling and design techniques for statistical variability and aging/reliability, both at the circuit level, and at the device level.

1:35 pm Large-Scale Statistical Performance Modeling of Analog and Mixed-Signal Circuits (INVITED), X. Li, W. Zhang, F. Wang, Carnegie Mellon University

This paper presents the recent development of statistical performance modeling and its important applications. We focus on two core techniques, sparse regression (SR) and Bayesian model fusion (BMF), that facilitate large-scale performance modeling with low computational cost. The efficacy of SR and BMF is compared to other traditional modeling approaches.

2:25 pm Designing Reliable Analog Circuits in an Unreliable World (INVITED), G. Gielen, E. Maricau, P. DeWitt, Katholieke Universiteit Leuven

Reliability is one of the major concerns in designing integrated circuits in deep nanometer CMOS technologies. Problems related to transistor aging like BTI or soft breakdown cause time-dependent circuit performance degradation. Variability only makes these things more severe. This creates a need for innovative design techniques and tools that help designers cope with these reliability and variability problems. This invited overview paper gives a brief description of device aging models. It also presents tools for the efficient analysis and identification of reliability problems in analog circuits. Finally, it proposes solutions for the design of resilient, self-healing circuits.

3:15 pm  BREAK


Aging mechanisms, such as Negative Bias Temperature Instability (NBTI), limit the lifetime of CMOS designs. Recent NBTI data exhibits an excessive amount of randomness and fast recovery, which are difficult to handle with the conventional power-law model (t^n). Such discrepancies further pose a challenge for long-term reliability prediction in real circuit operation. To overcome these barriers, this work (1) proposes a logarithmic model (log(t)) that is derived from the trapping/de-trapping assumptions; (2) practically explains the aging statistics and the non-monotonic behavior under dynamic voltage scaling (DVS); and (3) comprehensively validates the new model with 65nm silicon data. Compared to previous models, the new result captures the essential role of the recovery phase under DVS, reducing unnecessary guard-banding in reliability protection.

Modeling Local Variation of Low-Frequency Noise in MOSFETs via Sum of Lognormal Random Variables, Bo Yu, Xin Li, James Yonemura, Zhiyuan Wu, Jung-Suk Goo, Ciby Thuruthiyil, Ali Icel, GLOBALFOUNDRIES Inc.

We investigate the geometry dependence for the local variation of low-frequency noise in MOSFETs via the sum of lognormal random variables. A compact model has been developed and applied to the measured data with excellent match, and therefore enables the coverage of low-frequency noise statistics in circuit design.

Impact of Subthreshold Hump on Bulk-bias Dependence of Offset Voltage Variability in Weak and Moderate Inversion Regions, K. Sakakibara, T. Kumamoto, K. Arimoto, Renesas Electronics Corporation

In MOSFETs with an STI structure, full suppression of the subthreshold hump is difficult. As a result, offset voltage variability differs from wafer to wafer, even in the moderate inversion region. By using a ring gate structure, we have found that the bulk-bias dependence of offset voltage variability becomes predictable even in the weak inversion region.

Session 7 – High-Speed Wireline Transceivers and Clocking
Monday, 9/10/2012, 1:30 pm
Fir Ballroom
Session Chair: Gerrit den Besten, NXP Semiconductors
Session Co-Chair: Shunichi Kaeriyama, Renesas Electronics Corporation

1:30 pm Introduction

Advanced wireline multi-GHz transceivers and functions are presented, including clocking, synchronizing, driving, equalization and data acquisition.

1:35 pm A Digital Phase-Locked Loop with Calibrated Coarse and Stochastic Fine TDC, A. Samarah, A. Chan Carusone, University of Toronto

A coarse-fine time-to-digital converter (TDC) is presented with a calibrated coarse stage followed by a stochastic fine stage. On power-up, a calibration algorithm based on a code density test is used to minimize nonlinearities in the coarse TDC. By using a balanced mean method, the number of registers required for the calibration algorithm is reduced by 30%. Based upon the coarse TDC results, the appropriate clock signals are multiplexed into a stochastic fine TDC. The TDC is incorporated into a 1.9 - 2.54 GHz digital phase locked loop (DPLL) in 0.13 µm CMOS. The DPLL consumes a total of 15.2 mW of which 4.4 mW are consumed in the TDC. Measurements show an in-band phase noise of -107 dBc/Hz which is equivalent to 4 ps TDC resolution, approximately an order of magnitude better than an inverter delay in this process technology. The integrated random jitter is 213 fs. The calibration reduces worst-case spurs by 16 dB.

A frequency agile multiplying injection-locked oscillator (MILO) suitable for fast power cycling is presented. Edge detectors and multiple injection sites extend the aggregate lock range of two MILOs to 55.7% of the 3.16-GHz center frequency. Monitoring circuits identify the correct MILO and power off the other within 10 reference clock cycles.

An 8.5–11.5Gbps SONET transceiver with referenceless frequency acquisition is designed in a 65nm digital CMOS process. A modified digital quadricorrelator frequency detector (M-DQFD) is incorporated into an LC-based VCO coarse tuning adjustment. The transceiver complies with stringent SONET OC-192 jitter requirements. Within a 400μs acquisition time, the RX achieves a high-frequency jitter tolerance of 0.58UIpp at 10mVpp-diff input sensitivity. The TX serial output exhibits a random jitter (RJ) of 205fs (rms). The transceiver occupies 0.97mm² and consumes 141mA at 1.0V.

A dynamic rate adjustable interface is designed in a 40-nm LP CMOS process. On-the-fly dynamic rate change is enabled by an all-digital frequency multiplier that detects a reference frequency change, and accordingly provides a 4x multiplied clock without any idle time. The clock multiplier, along with matched source synchronous clocking and clock equalization, allows blind reference clock shifting to scale the data rate from 1.6 to 6.4 Gb/s within 6.125ns without idle time or bit errors during transitions. The interface efficiency is 2.6 mW/Gb/s @6.4 Gb/s & 3.4 mW/Gb/s @3.2 Gb/s when using reduced clock swing and external transmitter swing at the reduced data rates.

This paper describes the design of the architecture and circuit blocks for backplane communication transceivers. A channel study investigates the major challenges in the design of high-speed reconfigurable transceivers. Architectural solutions resolving channel-induced signal distortions are proposed and their effectiveness on various channels is investigated. Subsequently, the paper describes the design of a 0.6-13.1Gb/s fully-adaptive backplane transceiver embedded in state-of-the-art low-leakage 28nm CMOS FPGAs. The receiver front-end utilizes a 3-stage CTLE, a 7-tap speculative DFE, and a 4-tap sliding DFE to remove the immediate post-cursor ISI up to 64 taps. The clocking network provides continuous operation range between 0.6-13.1Gb/s. The transceiver achieves BER < 10^-15.

3:55 pm A 10Gb/s 10mW 2-Tap Reconfigurable Pre-Emphasis Transmitter in 65nm LP CMOS, Y. Lu, K. Jung, Y. Hidaka*, E. Alon, University of California, Berkeley, *Fujitsu Laboratories of America

A low-power pre-emphasis voltage mode transmitter architecture with output swing control, pre-emphasis coefficient control, and online impedance calibration is proposed and demonstrated. A 65nm LP CMOS implementation of this architecture dissipates only ~10mW from a 1.2V supply when transmitting 10Gb/s 400mV differential peak-to-peak data with 2-tap pre-emphasis, achieving 1pJ/bit energy efficiency.

4:20 pm A 6b 1.6GS/s ADC with Redundant Cycle 1-Tap Embedded DFE in 90nm CMOS, E. Zhiyan Tabasy, A. Shafik, S. Huang, N. Yang, S. Hoyos, S. Palermo, Texas A&M University

Embedding partial equalization inside the front-end ADC of a serial link receiver can potentially result in a more energy-efficient system. This paper presents a 6b 1.6GS/s ADC with a redundant cycle loop-unrolled embedded DFE structure. Fabricated in an LP 90nm CMOS, the ADC with embedded DFE consumes 20mW total power.


A 3.2GS/s two-step subranging ADC is implemented in a 45nm SOI-CMOS technology. The measured ENOB is 4.55b at 1.6GHz. The IIP3 is -1.1dBm. The power consumption is 22mW from a 1.05V voltage supply for a FOM of 290fJ/conversion-step. The chip occupies an active area of 0.07mm².


A multi-drop bus system using a Dicode (1-D) partial response signaling transceiver is presented. Directional couplers with equi-energy and a newly developed impedance-matched design allow a speed of 12.5Gb/link to be reached. A pre-coder is placed in the transmitter to fit the signal to the channel. The transceiver ICs are fabricated in 90nm CMOS.

Session 8 – Radio Receiver Techniques
Monday, 9/10/2012, 1:30 pm
Pine Ballroom
Session Chair: Alberto Valdes-Garcia, IBM
Session Co-Chair: Ramesh Harjani, University of Minnesota

1:30 pm Introduction
This session reviews modern frequency impedance translation techniques, spatio-spectral beam forming, phased arrays and synchronization of IR radios.

1:35 pm Clock-Gated Harmonic Rejection Mixers (INVITED), A. Rafi, T. R. Viswanathan, Silicon

Harmonic-rejection mixing techniques using clock-gating are described. The conventional harmonic-rejection mixer (HRM) concept is generalized to reject higher order harmonics. An N-phase HRM robust to RF device mismatches is presented. A new technique that achieves 19dB better rejection of the previously un-rejected (N-1)th harmonic in 55nm CMOS is described.

A Full-Band Processor for Reduction of RF Mixer LO Harmonic Images, R. Gomez, Broadcom Corporation

A Harmonic Rejection Canceler processor is presented, including preamplifier, 2700Msamp/s 6-bit flash ADC, digital channelizers and adaptive cancelers. It removes interference caused by strong blockers in the TV band that may be inadvertently folded onto the desired channel by LO harmonics. The processor is embedded within an integrated DVB-T2 tuner/receiver.


A novel multi-frequency beamforming front-end is proposed. The proposed front-end allows for simultaneous and independent steering of multiple frequency beams. As proof of concept, an 8GHz 2-channel, 4-frequency phased array beamformer is designed and implemented in 65nm CMOS. The IF signal on each channel is frequency split using an all passive 4-point analog FFT. The orthogonal frequency outputs are then beam steered using an all passive I-Q vector combiner. The RF circuit draws a current of 22.8mA from a 1.2V supply while the analog baseband consumes 9pJ/conv. (135µW at 120MSps).

Impedance, Filtering and Noise in N-phase CMOS Passive Mixers (INVITED), A. Molnar, C. Andrews, Cornell University

We review a Linear Time-Invariant model of the impedance properties of CMOS passive mixers, capturing filtering and noise effects and compare this to measured results. We then suggest a simple metric for characterizing performance limits as a function of process and frequency. Finally we discuss how LO circuits constrain performance.

A Single-Chip X-Band Chirp Radar MMIC with Stretch Processing, Jianjun Yu, Feng Zhao, Joseph Cali, Desheng Ma, Xueyang Geng, Fa Foster Dai, J. David Irwin, and Andre Aklian*, Auburn University, *U. S. Army Research, RDECOM CERDEC Intelligence and Information Warfare Directorate

A single-chip X-band chirp radar transceiver with direct digital synthesis (DDS) for chirp generation is presented. The radar chip, including receiver, transmitter, quadrature DDS, phase-locked loop (PLL) and analog to digital converter (ADC), has been implemented in a 0.13μm BiCMOS technology. The stretch processing technique is employed to translate the time interval between the received and the transmitted chirp signals to a single tone at the baseband output with greatly reduced bandwidth, which allows for the use of a low-cost ADC with a reduced input bandwidth of 10MHz for digitizing the received RF chirp with a bandwidth of 150MHz. A Weaver receiver with a dc-offset is employed in order to use a single ADC for detecting the received quadrature signals with image rejection. A quadrature 1GHz DDS with an inverse sinc function for zero-order hold correction is implemented to provide the chirp signals for both receiver and transmitter. A wide-tuning PLL frequency synthesizer is integrated to generate the local oscillator (LO) signals as well as the clock signal for the DDS and ADC. The implemented radar-on-chip (RoC) MMIC occupies a die area of 3.5x2.5mm². With a 2.2V supply voltage for analog/RF and a 1.5V supply for digital, the chip consumes 326mW in the receive mode and 333mW in the transmit mode.

**Session 9 – Advances in 3D Design and Optimization**

Monday, 9/10/2012, 1:30 pm
Cedar Ballroom
Session Chair: Steve Wilton, University of British Columbia
Session Co-Chair: Visvesh Sathe, Advanced Micro Devices

1:30 pm **Introduction**

Various topics that illustrate the state-of-the-art in 3D integration including physical optimization and wireless power transfer will be presented.


This paper presents an overview of design and manufacturing readiness for silicon interposer based 3D integration. We present a field programmable gate array research and development vehicle to demonstrate the capabilities of 3D technology. The characterization results show minimal performance impact due to through silicon via (TSV) to 10Gbps transceivers and potential improvement in performance by integrating a metal-insulator-metal (MIM) capacitor on the silicon interposer.

2:25 pm **A 10.35 mW/GFlop Stacked SAR DSP Unit using Fine-Grain Partitioned 3D Integration,** T. Thorolfsson, S. Lipa*, P. Franzon*, Synopsys Inc., *North Carolina State University

In this paper we present a technique for implementing a fine-grain partitioned three-dimensional SAR DSP system using 3D placement of standard cells where only one of the 3D tiers is clocked to reduce clock power. We show how this technique was used to build the first fine-grain partitioned 3D integrated system to be demonstrated with silicon measurements in the literature, which is an ultra efficient floating-point synthetic aperture radar (SAR) DSP processing unit. The processing unit was fabricated in two tiers of GlobalFoundries' 1.5 V 130nm process that were 3D stacked face-to-face by Tezzaron. After fabrication the test chip was measured to consume 4.14 mW of power while running at 40 MHz, for an operating efficiency of 10.35 mW/GFlop.

2:50 pm 0.61W/mm² Resonant Inductively Coupled Power Transfer for 3D-ICs, S. Han, D. Wentzloff, University of Michigan

A high power density wireless inductive power link targeting 3D-ICs and wireless testing is implemented in 65nm CMOS. The link exploits high-Q inductors and resonant inductive coupling to boost the received voltage and maximize the delivered power. A power density of 0.61W/mm² is achieved with 0.12x0.12μm² coils and 50μm separation.

3:15 pm BREAK

Session 10 – Forum Session – Green Electronics
Monday, 9/10/2012, 3:30 pm
Cedar Ballroom
Session Chair: Patrick Chiang, Oregon State University

3:25 pm Introduction

The issues that surround the management of power -- generation, delivery, conversion, optimization, and even disposal -- are some of the most pressing concerns for next-generation integrated circuits and large-scale computing systems. What are some of the new innovative techniques at the integrated-circuit level that can help optimize power/performance at the system-level? What are some of the new power management applications that will affect future market consumption? This forum will address the state-of-the-art issues related to power management integrated circuits, both from cutting-edge academics as well as from industry leaders.

3:30 pm Future Trends in Power Management, Jonathan Audy, Analog Devices

There are 3 main categories of reasons for reducing power consumption: economic, environmental, and practical. Like the dilemma of "paper or plastic", being green can be a difficult thing to determine. It is not just about power-drain during a product lifespan. The entire impact from design, materials, production, shipping, installation, lifetime operation, removal, disposal and back to replacement must be taken into account. A comparison of all of these with that of the alternative solutions must ultimately be taken into account. Future power management solutions can expect to be pressured in several directions simultaneously.

3:55 pm Fully Integrated Switched-Capacitor DC-DC Conversion, Elad Alon, University of California, Berkeley

Although there is a clear current need to support multiple independent supply voltages on the die, realizing the integrated switching DC-DC converters needed to support this functionality has remained challenging. Inductor-based designs are the default choice for off-chip converters, but in standard CMOS processes, on-die capacitors have much better loss and energy density characteristics than on-chip inductors. In this talk I will therefore describe recent advances in the design and optimization of fully integrated switched-capacitor (SC) converters, focusing on their achievable efficiency and power density. As will be described in the talk, even in a standard, low-cost CMOS process, optimized experimentally demonstrated SC converters have achieved efficiency and power density high enough to enable them to be integrated into mobile parts at minimal area overhead.

4:20 pm **DVFS-Capable Cross-Layer Power Management Circuit Design**, Dongsheng Ma, University of Texas-Dallas

With the perpetual power increase in modern VLSI systems, efficient and effective power management has become critical to next-generation IC designs. To overcome this grand challenge, techniques such as dynamic voltage/frequency scaling (DVFS) have been proposed to jointly optimize power, energy and operating performance, leading to significantly improved system reliability, efficiency and battery lifetime. From both system-level and circuit-level perspectives, this talk overviews key design issues, control schemes, and circuit architectures. This talk will also discuss future research directions involved in the development of application-aware, multiple- and variable-output DC-DC power converters.

4:45 pm **Wide Bandgap Switches for Green Energy**, Carl-Mikael Zetterling, KTH-Sweden

Four times lower system losses have been demonstrated using commercial silicon carbide devices compared to silicon designs. This talk will discuss the tradeoffs between these different fundamental substrates, and will attempt to answer the following questions: How can immature wide bandgap switches such as silicon carbide and gallium nitride compete with a mature technology like silicon IGBTs? In which circuit designs can you take advantage of these new materials?

5:10 pm **Sensors Processing Systems – a New Wave of Environmentally Aware Lighting**, Sajol Ghosal, AMS

Today adequate daylight is available through windows and skylights; however, we do not have adequate sensor-driven systems that can autonomously detect the daylight and adjust lighting controls. This would save a significant amount of energy and would require the integration of sensor technology, sensor processing and power management for delivering "Green" solid-state LED lighting. Daylight harvesting in commercial buildings can save over 50% of energy costs for lighting. Sensors, enabling autonomous energy management, will drive the next level of energy conservation. This paper will cover an integrated sensor-driven lighting solution that adapts its energy consumption based on environmental sensing and lighting requirements.

---

**Poster Session**

Monday 9/10, 5:00 pm – 7:00 pm

Donner, Siskiyou, Cascade Ballrooms

M-1 A 150nW, 5ppm/°C, 100kHz On-Chip Clock Source for Ultra Low Power SoCs, A. Shrivastava, B. Calhoun, University of Virginia

This paper presents an ultra low power clock source using a 1µW temperature compensated on-chip digitally controlled oscillator (OscCMP) and a 100nW uncompensated oscillator (OscUCMP) with respective temperature stabilities of 5ppm/°C and 1.67%/°C. A fast locking circuit re-locks OscUCMP to OscCMP often enough to achieve a high effective temperature stability. Measurements of a 130nm CMOS chip show that this combination gives a stability of 5ppm/°C from 20°C to 40°C (14ppm/°C from 20°C to 70°C) at 150nW if temperature changes by 1°C or less every second. This result is 7X lower power than typical XTALs and 6X more stable than prior on-chip solutions.

M-2 An Uncalibrated 2MHz, 6mW, 63.5dB SNDR Discrete-Time Input VCO-Based ΔΣ ADC, J. Hamilton, S. Yan*, T. R. Viswanathan, UT Austin, *Silicon Labs

A 63.5dB, 2MHz ADC using two ICOs as an integrator/quantizer and a combined switched-capacitor V-I converter and feedback DAC in 0.18µm is presented. A novel high-linearity ring oscillator architecture provides a high resolution quantizer output, and a digital modulator truncates this for a 17-level feedback DAC.

M-3 A Current Reference Pre-charged Zero-crossing Pipeline-SAR ADC in 65nm CMOS, Jayanth Kuppambatti and Peter R. Kinget, Columbia University

Using a current reference pre-charge technique, the need for power hungry low impedance voltage reference buffers is eliminated in a zero-crossing pipeline-SAR ADC. The 40MS/s ADC prototype, implemented in a 65nm process, has an SFDR/SDR/SNDR of 70dB/66dB/59.5dB at Nyquist, while occupying 0.95mm² and consuming 4.5mW from a 1.35V supply, requiring no additional power for the reference buffers.

M-4 A 0.5V, 11.3-$\mu$W, 1-kS/s Resistive Sensor Interface Circuit with Correlated Double Sampling, Hyunsoo Ha, Yunjae Suh, Seon-Kyoo Lee, Hong-June Park, Jae-Yoon Sim, Pohang University of Science and Technology

This paper presents a low-power resistive sensor interface circuit with correlated double sampling which reduces the effect of amplifier offset. The fabricated chip in 0.13$\mu$m CMOS demonstrates a sampling rate of 1-kS/s and a dynamic range of 117dB with a maximum conversion error of 0.32-percent while consuming only 11.3-$\mu$W.

M-5 7.5Vmax Arbitrary Waveform Generator with 65nm Standard CMOS under 1.2V Supply Voltage, Toru Nakura, Yoshio Mita, Tetsuya Iizuka, Kunihiro Asada, The University of Tokyo

This paper presents a digitally controlled arbitrary waveform generator whose output voltage range is 0 to 7.5V under a 1.2V supply voltage. The high output voltage generation is realized using only 65nm standard MOS transistors. It consists of a voltage increasing block and a voltage decreasing block to realize stable high voltage output. Experimental results show that our waveform generator can generate arbitrary waveforms and directly drive a MEMS structure.

M-6 Design Of Organic Complementary Circuits For RFID Tags Application, M. Guerin, E. Bergeret, E. Bènevent, P. Pannier, IM2NP, France, A. Daami, S. Jacob, I. Chartier, R. Coppard, CEA LCEI, France

This paper presents organic complementary circuits that can be used in an all-organic sheet-to-sheet processed RFID tag on plastic foils. An integrated organic eight-stage rectifier reaching a 14 MHz maximum working frequency is presented, as well as a physical unclonable function generator able to generate a code depending on the organic process scattering.

Amorphous Silicon 5 Bit Flash Analog to Digital Converter, Aritra Dey and David Allee, Flexible Display Center at Arizona State University

A 5-bit fully flash A/D converter (ADC) is built using only n-channel amorphous silicon hydride (a-Si:H) thin film transistors (TFT), metal resistors and capacitors. The circuit is built on silicon using a low temperature process, compatible with flexible plastic substrates. The circuit consumes a power of 13.6 mW running at a speed of 2 k samples/sec. The measurements show reasonably good characteristics, achieving a DNL of less than ±1 LSB and INL of less than ±1.8 LSB without calibration.


We present a processor, the Digital Power Manager (DPM), providing power management and node/data/processing flow control for a 130nm battery-less power harvesting body sensor node. The DPM adjusts node power consumption, responding to available energy to support operation exclusively from harvested power. The DPM consumes 2.5pJ/instruction and 0.63pJ/cycle for NOPs.

A 0.791mm² Fully On-Chip Controller with Self-Error-Correction for Boost DC-DC Converter Based on Zero-Order Control, Tae-Hwang Kong, Sung-Wan Hong, Sungwoo Lee, Jong-Pil Im, Gyu-Hyeong Cho, KAIST

This paper introduces an on-chip controller without off-chip components to reduce controller size for use in a Zero-Order-Control (ZOC) converter. The DC offset error caused by adding a saw-tooth signal in ZOC is self-corrected using a new control scheme. The chip is fabricated in a 0.35µm BCD process, with a controller area of 0.791mm².


We present the design and implementation of high-performance soft-edge flip-flops (SEF) used in AMD microprocessors. Benefits of the SEF and a novel method for evaluating flip-flop designs in the presence of jitter are introduced, along with an area-efficient level-sensitive scan design. We compare different SEF topologies along with previous designs.


We have developed a power-gating technique for a mobile processor in 28-nm HKMG technology. The proposed EM-tolerant 1.8V I/O NMOS power switch reduces the standby power to 1/641× and achieves 79% channel utilization without weakening EM immunity. The active leakage power of the dual CPU cores can be reduced by 45 mW in a single core operation mode with a rapid 1.4-µs wakeup time to full core operation. A mobile processor is designed and fabricated with the proposed technique. The estimated standby power of the chip is 123 µW, an order of magnitude lower than with conventional techniques. The measured leakage power agrees well with the estimate.

A 148ps 135mW 64-bit Adder with Constant-Delay Logic in 65nm CMOS, P. Chuang, D. Li*, M. Sachdev, V. Gaudet, University of Waterloo, *Qualcomm Inc.

A 148ps 64-bit adder with Constant-Delay logic is fabricated in a 65nm, 1V CMOS process. The pre-evaluation and constant delay features of CD logic make it up to 2X faster than dynamic logic in realizing addition. At 1V supply, this adder’s worst-case and leakage power are 135mW and 0.22mW, respectively.


We propose a dynamic voltage-drop sensor, which is fully digital so that it is easy to design into products and use for testing. The 2.4K-gate GHz sensor exploits the difference in the voltage sensitivity between two paths composed of different types of standard cells. We have fabricated a test chip in a 28-nm HKMG process and confirmed its feasibility. This sensor can be used to evaluate optimal activity rates and peak power in scan testing.


This paper reports the first SONOS-based field-programmable ESD protection concept and structure. A prototype in 130nm CMOS demonstrates a wide ESD triggering tuning range of ~2V and an ultra-low leakage of 1.2pA. It enables post-Si on-chip/in-system ESD design programmability for complex ICs.


As the supply voltage (VDD) approaches the device threshold voltage (VT), the elevated temperature results in increased device current. This phenomenon is generally known as Inverse Temperature Dependence (ITD). In this paper, we propose a test structure with a built-in poly-resistor-based heater to characterize ITD in digital circuits. Our measurements from a 130nm test-chip show that the Zero-Temperature-Coefficient (ZTC) point varies by circuit type, and further fluctuates due to process variation. A more accurate ITD-sensitive thermal sensor is thus needed for better temperature tracking.

A 2.45-GHz 20-dBm 0.13 µm CMOS Class-E Power Amplifier with 52% PAE and a N/A Rise/Fall-Time Configurable Switch for TOA Ranging Applications, Z. Li, G. Torfs, J. Bauwelincx, X. Yin, J. Vandewege, P. Spiessens*, H. Tubbax*, F. Stubbe*, Ghent University - IMEC, *Essensium N. V.

This paper presents a 0.13 µm CMOS Class-E PA with 20.5 dBm maximum output power and 52.5% PAE. The PA covers a wide dynamic range from 1.5 dBm to 20.5 dBm. A novel configurable switch is applied for both ranging and communication. The measured fast rise time for ranging is merely 4.5 ns while the slow rise/fall time for communication is 90 ns, which allows a maximum data rate for return-to-zero BPSK modulation of 4 Mbps.

Multi-band, Multi-mode, Low-power CMOS Receiver Front-end for Sub-GHz ISM/SRD Band with Narrow Channel Spacing, C.-H. Yeh, H.-C. Hsieh, P. Xu, S. Chakraborty, Texas Instruments

An inductorless sub-GHz multi-band receiver front-end with channel spacing down to 12.5kHz consists of an LNA, a quadrature mixer, and an IFA, providing 39dB gain with 60dB variation, 6.5dB NF, and -14dBm IIP3 while consuming 4.1mW. The end-to-end receiver achieves 64dB selectivity and 91dB blocker rejection at 12.5kHz and 10MHz, respectively.


This paper presents a Digital PLL with a two-step closed-locking technique, which allows the use of a simple phase detector without a complex glitch compensation circuit. The proposed Digital PLL improves close-in noise by 17dB compared with conventional locking. The phase error is less than 3.0 degrees for various conditions.

M-19 A Supply-Voltage Scalable, 45 nm CMOS Ultra-Wideband Receiver for mm-wave Ranging and Communication, S. Kundu, A. Khairi, J. Paramesh, Carnegie Mellon University

A supply-voltage scalable, low-power, millimeter-wave UWB receiver fabricated in 45nm SOI CMOS is presented. All stages in the transformer-neutralized LNA and the quadrature mixers are limited to a single transistor in the stack. The 46-64GHz (48-68GHz) receiver achieves 24.5dB (20.7dB) gain and 5.3dB (7.8dB) NF from a 1.1V (0.6V) Vdd while consuming 18.4mW (7mW) power.

M-20 A Low-Power Highly Multiplexed Parallel PRBS Generator, M.-S. Chen, C.-K. K. Yang, University of California, Los Angeles

For high data rates, pseudo-random bit sequence (PRBS) patterns must be generated in parallel and then multiplexed. This paper introduces a design that reduces the number of XORs and DFFs to lower power dissipation and area. The maximum fan-out can be further constrained to improve gate delay and hence improve the output data rate. The procedure for applying the design to arbitrary PRBS lengths is provided and the design is suitable for standard-cell design flow. The design achieves 1.7-Gb/s data rate with 64-way multiplexing to support an output bandwidth of >100 Gb/s. The design is implemented in a 65-nm technology using 0.007 mm² area and dissipating 0.16 mW of power.
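
As a rough illustration of parallel PRBS generation in general (not the reduced-XOR design proposed in this paper), the sketch below assumes the common PRBS7 polynomial x^7 + x^6 + 1. The function names `prbs7_serial`, `prbs7_parallel`, and `multiplex` are hypothetical. It produces one 64-bit word per clock by unrolling the LFSR recurrence and shows that multiplexing the words reproduces the bit-serial sequence.

```python
def prbs7_serial(n_bits, seed=0x7F):
    """Bit-serial PRBS7 (assumed polynomial x^7 + x^6 + 1), 7-bit Fibonacci LFSR."""
    state = seed & 0x7F
    out = []
    for _ in range(n_bits):
        new_bit = ((state >> 6) ^ (state >> 5)) & 1  # XOR of the two oldest bits
        out.append(new_bit)
        state = ((state << 1) | new_bit) & 0x7F
    return out


def prbs7_parallel(n_words, width, seed=0x7F):
    """Emit `width` bits per clock by unrolling b[n] = b[n-7] XOR b[n-6]."""
    # history[0] is the oldest bit (LFSR bit 6), history[-1] the newest (bit 0).
    history = [(seed >> (6 - j)) & 1 for j in range(7)]
    words = []
    for _ in range(n_words):
        for _ in range(width):
            history.append(history[-7] ^ history[-6])
        words.append(history[-width:])  # the `width` bits produced this clock
        history = history[-7:]          # only 7 bits of state carry over
    return words


def multiplex(words):
    """Serialize parallel words, first bit of each word first."""
    return [bit for word in words for bit in word]
```

In hardware, each inner-loop iteration would correspond to one XOR in an unrolled network, which is why the XOR and flip-flop count becomes the quantity to minimize as the multiplexing width grows.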


This paper presents the first reported design of a forward error correction (FEC)-based high-speed serial link. A 4 Gb/s line rate transceiver in 90nm CMOS is designed with short block length BCH codes. FEC is shown to be effective for high code rates, high information rates and low SNR channels. Measurement results of the transceiver over an 18.2 dB Nyquist loss channel show a 45x reduction in minimum BER, and an increase in jitter tolerance at low transmit swings. For a BER \( \leq 10^{-12} \), the addition of FEC reduces the required transmit signal swing from approximately 0.75 Vppd to less than 0.5 Vppd.

A 5Gbps bi-directional RF-Interconnect with multi-drop and arbitration capabilities is designed in 65nm CMOS, with 4 drops along a 5.5mm TL ring. Data are ASK modulated with a 60GHz carrier. We insert directional couplers and MOS switches to reconfigure/arbitrate multi-cast and communication priority. Average energy consumption is 1.33pJ/b and 0.24pJ/b/mm, with 9ps/mm latency.
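
The quoted figures can be related with a few lines of arithmetic. The sketch below is a reader's cross-check, not a calculation from the paper: it assumes the per-millimeter energy is the per-bit energy divided by the 5.5mm ring length, and that average power is data rate times energy per bit. The function name `link_metrics` is hypothetical.

```python
def link_metrics(data_rate_bps, energy_per_bit_j, length_mm, latency_ps_per_mm):
    """Derive power, per-mm energy, and end-to-end latency from the quoted figures."""
    return {
        "power_w": data_rate_bps * energy_per_bit_j,           # W = (b/s) * (J/b)
        "energy_per_bit_per_mm_j": energy_per_bit_j / length_mm,
        "latency_ps": latency_ps_per_mm * length_mm,
    }


# Values quoted in the abstract: 5 Gb/s, 1.33 pJ/b, 5.5 mm ring, 9 ps/mm.
metrics = link_metrics(5e9, 1.33e-12, 5.5, 9.0)
```

Under these assumptions the per-millimeter energy rounds to the quoted 0.24pJ/b/mm, which suggests the two energy figures are consistent with the stated ring length.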

Session 11 – PLLs, VCOs, and Dividers
Tuesday, 9/11/2012, 9:00 am
Oak Ballroom
Session Chair: Fa Foster Dai, Auburn University
Session Co-Chair: Howard Luong, Hong Kong University of Science & Tech.

9:00 am  Introduction

This session presents various design techniques to achieve PLLs with low-power dissipation and low-phase noise, VCOs with ultra-wide tuning ranges, and frequency dividers with ultra-wide locking ranges.

9:05 am  A Quantization Noise Cancelling Fractional-N type \( \Delta \Sigma \) Frequency Synthesizer


A high-speed and low-power adaptable period SAR-based DAC gain calibration is presented for DSM quantization noise cancellation, which completes within 10µs while dissipating 0.2mW. The proposed frequency synthesizer shows more than 30-dB quantization noise suppression, resulting in a phase noise of \(-134.5\)dBc/Hz at 1.25-MHz offset at 877MHz while consuming 11mA from a 1.2-V supply.


A 30% frequency tuning range 23.5GHz 32nm SOI-CMOS PLL features an adaptively biased VCO. Adaptive biasing of the VCO lowers the average PLL power consumption from 34mW to 24mW, while keeping the jitter below 1.5° RMS across all frequency bands.

9:55 am  A 0.6V 2.2mW 58-to-73GHz Divide-by-4 Injection-Locked Frequency Divider, Liang Wu, Howard C. Luong, Hong Kong University of Science and Technology

A simple but effective locking range enhancement technique is proposed for LC-type divide-by-4 injection-locked frequency dividers (ILFDs) at millimeter-wave (mm-Wave) frequencies. By employing a 4th-order LC tank with the two frequency peaks properly designed at \( \omega_0 \) and \( 3\omega_0 \), the 3rd-order harmonic is boosted, which significantly enhances the injection efficiency and thus the locking range of divide-by-4 ILFDs. Implemented in 65-nm CMOS, the prototype measures a locking range of 21.9% from 58.53 to 72.92 GHz while consuming 2.2mW from a 0.6V supply, which corresponds to an FoM of 6.54.
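
The percentage locking range quoted here can be reproduced from the band edges. The sketch below assumes the usual definition, the frequency span divided by the center frequency; the abstract does not state the definition, and the function name `fractional_range` is hypothetical.

```python
def fractional_range(f_low, f_high):
    """Return the span (f_high - f_low) divided by the center frequency."""
    center = (f_low + f_high) / 2.0
    return (f_high - f_low) / center


# Band edges quoted in the abstract, in GHz.
locking_range_percent = 100.0 * fractional_range(58.53, 72.92)
```

With the quoted edges this evaluates to about 21.9%, matching the abstract, so the center-frequency definition appears to be the one in use.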

A 4.8mW Inductorless CMOS Frequency Divider-by-4 with more than 60% Fractional Bandwidth up to 70GHz, A. Ghilioni, U. Decanis, A. Mazzanti, F. Svelto, Università degli Studi di Pavia

We propose a divider-by-4 based on clocked differential amplifiers working as dynamic CML latches. The clock modulates both the tail current and the load resistance of the pair. 32nm CMOS prototypes operate between 14GHz and 70GHz demonstrating more than 60% fractional bandwidth, 4.8mW maximum power consumption and 55x18μm² occupied area.

A Distributed "Hybrid" Wave Oscillator Array for Millimeter-Wave Phased-Arrays, A. Moroni, R. Genesi*, D. Manstretta, Università degli Studi di Pavia, Pavia, Italy, *now with Maxim Integrated Products, Italy

An array of distributed oscillators for millimeter-wave phased-arrays combines LO generation and distribution. Phase-noise improves proportionally to the number of array elements. A standalone oscillator and a 4-element array have a measured phase-noise of -125dBc/Hz and -131dBc/Hz at 10MHz from the 52GHz carrier, corresponding to a FoM of -183.7dBc/Hz.

A 57.5-to-90.1GHz Magnetically-Tuned Multi-Mode CMOS VCO, Jun Yin, Howard C. Luong, The Hong Kong University of Science and Technology

A magnetically-tuned multi-mode VCO featuring an ultra-wide frequency tuning range is presented. By changing the magnetic coupling coefficient between the primary and secondary coils in the transformer tank, the frequency tuning range of a dual-band VCO is greatly increased to continuously cover the whole E-band. Fabricated in a 65-nm CMOS process, the presented VCO measures a tuning range of 44.2% from 57.5 to 90.1 GHz while consuming 7mA to 9mA at 1.2V supply. The measured phase noises at 10MHz offset from carrier frequencies of 72.2, 80.5 and 90.1 GHz are -111.8, -108.9 and -105 dBc/Hz, respectively, which corresponds to a FOMT between -192.2 and -184.2dBc/Hz.

A 65nm CMOS Current Controlled Oscillator with High Tuning Linearity for Wideband Polar Modulation, Yiwu Tang, Jianyun Hu, Jongmin Park, Jaehyouk Choi*, Lincoln Leung, Charn Narathong, Kamal Sahota, Qualcomm, *Ulsan National Institute of Science & Technology

A highly linear oscillator is presented for wideband polar modulation. It has both varactor voltage tuning for frequency locking and inductive current tuning for linear phase modulation. Implemented in 65nm CMOS, it achieves a gain variation of less than ±2% over a range of more than 32MHz, meeting the WCDMA polar modulation requirement.

Session 12 – Biomedical and Sensors
Tuesday, 9/11/2012, 9:00 am
Fir Ballroom
Session Chair: Emmanuel Quevy, Silicon Labs
Session Co-Chair: Pedram Mohseni, Case Western Reserve University

The state-of-the-art in biomedical electronics continues to advance on many fronts, including power-transfer, sensing and imaging.

A Feedback Controlled Coil Driver for Transcutaneous Power Transmission, E. Lee, Alfred Mann Foundation

A fully integrated feedback controlled coil driver for transcutaneous power transmission is proposed to power biomedical implants. For a normal power transmission operation, the voltage across the switch that energizes the coil is sampled and compared with ground, followed by an integration to obtain an optimal on-time for the switch such that the coil current is maximized for a given DC input power. The coil driver also provides ASK modulation on the coil current by changing the size of the switch according to the input data. In the normal power transmission operation, a peak-to-peak coil current of ~730mA is obtained with a power dissipation of 71.6mW at 5V.


Compressive sensing enables low-energy data reduction on sensors. Reconstruction costs, however, are severe, typically pushing signal analysis to a base station. We present a seizure-detection processor that directly analyzes compressively-sensed electroencephalograms. Besides circumventing reconstruction costs, it provides a power-management knob by reducing computational energy with the number of input samples.

A 46 μW Motion Artifact Reduction Bio-Signal Sensor with ICA Based Adaptive DC Level Control for Sleep Monitoring System, S. Hong, S. Lee, T. Roh, H. Yoo, KAIST

A motion artifact reduction multi-channel bio-signal sensor is proposed for a sleep monitoring system with the help of independent component analysis (ICA). To prevent signal saturation, adaptive DC level control (ADLC) is adopted to adjust the DC level of the signal. A current-controlled level shifter (CCLS) is used in the ADLC to provide a shifting voltage step as small as 5mV with low power consumption. The proposed 4-channel sleep monitoring sensor consumes only 46μW to 73μW. With the proposed ADLC, a motion artifact 16.7 times larger does not saturate the signal.


We propose an ultra-low power optical wake-up receiver with a novel front-end circuit and communication scheme suitable for miniature wireless sensor node applications. Named "FLOW" for Free-space Low-Power Optical Wake-up, the receiver consumes 695pW in standby mode, which is ~6,000× lower than previously reported RF and ultrasound wake-up radios. In active mode, it consumes 140pJ/bit at 91bps. A pulse width modulated communication encoding scheme is used, and chip-ID masking enables selective batch-programming and synchronization of multiple sensor nodes.

A. Wang, S. Sivaramakrishnan, A. Molnar, Cornell University

We demonstrate a 180nm CMOS image sensor which performs on-chip image compression. The pixels optically compute a 2D Gabor transform on visual scenes, and the readout back-end implements compression during digitization. The chip uses a 384x384 array of Gabor filters, consumes 2mW at 15fps, and achieves a 10:1 compression ratio.

Design of a Monolithic CMOS Image Sensor Integrated Focal Plane Wire-Grid Polarizer Filter Mosaic, X. Wu, M. Zhang, N. Engheta, J. Van der Spiegel, University of Pennsylvania

We report on an image sensor integrated focal plane wire-grid polarizer filter mosaic targeted at the visible spectrum and fabricated in a 65nm CMOS processing line, enabling the reconstruction of the polarization characteristic for each pixel. Experimental results show an extinction ratio of around 10, which is the best reported for a monolithic polarization image sensor design.

Session 13 – High Speed Data Converters
Tuesday, 9/11/2012, 9:00 am
Pine Ballroom
Session Chair: Eric Naviasky, Cadence
Session Co-Chair: Mohammad Ranjbar, Cirrus Logic Inc.

Introduction

This year, the High-Speed Data Converter session includes four ADC designs with speeds of 1 GS/s and above and an invited tutorial on timing-mismatch effects in interleaved ADCs.

A 10-bit 1-GS/s CMOS ADC with FOM=70 fJ/Conversion, S. Hashemi, B. Razavi

A pipelined ADC incorporates a precharged resistor-ladder DAC in a multi-bit front-end, achieving fast settling and allowing calibration of both dynamic and static gain errors. Using simple differential pairs with a gain of 5 as op amps and realized in 65-nm CMOS technology, the 10-bit ADC consumes 36 mW at a sampling rate of 1 GHz and exhibits an SNDR of 52.7 dB at an input frequency of 490MHz.


The offset drift suppression techniques for the dynamic comparator and preamplifier make the ADC robust against environmental variation. Once the ADC is calibrated at power up, no more calibration is necessary even under VDD or temperature variation. The developed 7b 1.4GS/s flash ADC occupies a small area of 0.085mm² and dissipates 33.24mW.


A 45nm CMOS 7b nonbinary 2b/cycle SAR ADC that operates up to 1GS/s with a 1.25V supply is presented. Use of a nonbinary decision scheme for decision error correction in a 2b/cycle structure not only increases the ADC speed with a relaxed DAC settling requirement but also makes the performance robust to reference fluctuation and signal-dependent comparator offset variation. Proposed dynamic registers and a register-to-DAC direct control scheme enhance the conversion speed by minimizing logic delay in the decision loop. At a sampling rate of 1GS/s, the chip achieves a peak SNDR of 41.6dB and maintains ENOB higher than 6b up to 1.3GHz signal frequency. The FoM is 80fJ/conversion-step at 1GS/s with a power consumption of 7.2mW.

A 240mW 2.1GS/s 12b Pipeline ADC Using MDAC Equalization, Jiangfeng Wu, Chun-Ying Chen, Tianwei Li, Wenbo Liu, Lin He, Shauhyuarn Sean Tsai, Binning Chen, Chun-Sheng Huang, Juo-Jung Hung, Wei-Ta Shih, Hing Hung, Steven Jaffe, Loke Tan, Hung Vu, Broadcom Corporation

This paper introduces MDAC equalization, a digital correction technique for pipeline ADCs that corrects MDAC gain and settling errors using successive digital FIR filters operating on sub-ADC output samples. This technique reduces the required MDAC residue amplifier (RA) bandwidth relative to the sampling frequency, thereby reducing ADC power. MDAC equalization is demonstrated in a 240mW 2.1GS/s 12b ping-pong pipeline ADC in 40nm CMOS, where MDAC RA power is reduced by 70%, from 175mW to 53mW.

10:45 am  BREAK

11:05 am  Problem of Timing Mismatch in Interleaved ADCs (INVITED), Behzad Razavi, University of California, Los Angeles

Time interleaving can relax the speed-power trade-off of analog-to-digital converters but at the cost of sensitivity to interchannel mismatches. This paper addresses the problem of timing mismatch, its detection, and its correction. A new frequency-domain analysis gives insight into the impact of the mismatch on random input signals and quantifies the resulting noise. A number of timing error calibration techniques are reviewed and a new approach is proposed.

Session 14 – Advanced IC Technologies I

Tuesday, 9/11/2012, 9:00 am
Cedar Ballroom
Session Chair: Alvin Loke, Advanced Micro Devices
Session Co-Chair: David Sunderland, Boeing Space and Intelligence Systems

9:00 am  Introduction

This all-invited session covers exciting technological advances, including the first production tri-gate devices, SiC devices for power management, ultra-thin SOI, and reliability challenges for scaled CMOS.

9:05 am  22-nm Fully-depleted Tri-Gate CMOS Transistors (INVITED), C. Auth, Intel Corporation

At the 22-nm technology node, fully-depleted tri-gate transistors were introduced for the first time on a high-volume manufacturing process. These transistors feature high-k metal-gates and channel strain techniques resulting in the highest drive currents yet reported. The use of tri-gate transistors provides steep subthreshold slopes and very low DIBL values that are critical for low voltage operation.

9:55 am  Reliability Challenges for the Continued Scaling of IC Technologies (INVITED), A.S. Oates, TSMC

The rapid evolution of Si process technologies presents significant challenges to the understanding of the physics of failure of circuits and the characterization of their reliability. Introduction of new materials at the 28 nm node and below, as well as new FinFET transistor structures, complicates the task of reliability assurance. Here we review the major reliability challenges for transistors, interconnect and circuits that can be foreseen with these scaling trends.

10:45 am  BREAK

11:05 am  Extremely Thin SOI for System-on-Chip Applications (INVITED), A. Khakifirooz, K. Cheng, Q. Liu, T. Nagumo*, N. Loubet, A. Reznicek, J. Kuss, J. Gimbert, R. Sreenivasan, M. Ponath, S. Luning**, and B. Doris, IBM, STMicroelectronics, LETI, *Renesas, **GLOBALFOUNDRIES

We review the basics of the extremely thin SOI (ETSOI) technology and how it addresses the main challenges of CMOS scaling at the 20-nm technology node and beyond. The possibility of VT tuning with backbias, while keeping the channel undoped, opens up new opportunities that are unique to ETSOI. The main device characteristics with regard to low-power and high-performance logic, SRAM, analog and passive devices, and embedded memory are reviewed.

11:30 am  Present and Future Applications of Silicon Carbide Devices and Circuits (INVITED), C-M Zetterling, KTH Royal Institute of Technology

Silicon Carbide (SiC) is a wide bandgap semiconductor now reaching maturity. Discrete high-voltage SiC devices are commercially available from several suppliers for low-loss power conversion. Future applications may include integrated circuits for high-temperature and radiation-hard applications. This paper introduces SiC material properties, processing, devices, and circuits.

Session 15 – Advanced Memory Topics
Tuesday, 9/11/2012, 2:00 pm
Oak Ballroom
Session Chair: Vikas Chandra, ARM
Session Co-Chair: Tom Andre, Everspin Technologies

2:00 pm  Introduction

This session covers the latest advances and future trends in phase-change memory, high-density embedded DRAM, mega-byte class SRAM, and memory-based physical unclonable functions.

2:05 pm  Phase Change Memory: Scaling and Applications (INVITED), R. Jeyasingh, J. Liang, M. A. Caldwell, D. Kuzum, H.-S. P. Wong, Stanford University

Phase Change Memory (PCM) technology is a promising candidate for future non-volatile memory applications. Scaling of PCM into the sub-10 nm regime has been demonstrated using novel applications of nanofabrication techniques. PCM devices using solution-processed GeTe nanoparticles with diameters of 1.8 – 3.4nm have been demonstrated. A highly scaled (<2nm) PCM cross-point device using a carbon nanotube as the electrode is fabricated, proving the scalability of PCM to ultra-small dimensions. The use of PCM as a nanoelectronic synapse for neuromorphic computation is also demonstrated as an illustration of PCM application beyond digital memory.

A 0.65V Embedded SDRAM with Smart Boosting and Power Management in a 45nm CMOS Technology, SS. Pyo, JS. Kim, JH. Kim, HT. Jung, TJ. Song, CH. Lee, GH. Kim, YK. Lee, KS. Kim, Samsung Electronics

An eSDRAM with smart boosting and power management (SB-PM) for low power operation has been designed. The SB-PM scheme reduces dynamic power by 40.3% and standby power by 69.1%. A 266Mb eSDRAM is designed with the SB-PM scheme in a 45nm CMOS technology, showing 51.2mW dynamic and 2.05mW standby power consumption.

A Write-Back-Free 2T1D Embedded DRAM with Local Voltage Sensing and a Dual-Row-Access Low Power Mode, W. Zhang, K. Chun, C. H. Kim, University of Minnesota

A gain-cell embedded DRAM (eDRAM) achieves 1.0GHz random read frequency by eliminating write-back. It also incorporates a local-sense-amplifier architecture to improve read-bitline swing, and a low-overhead dual-row-access mode for power reduction in partial utilization scenarios. Measurement results are presented from a 64kb eDRAM test chip in a 65nm process.

An Energy Efficient 32nm 20 MB L3 Cache for Intel® Xeon® Processor E5 Family, M. Huang, M. Mehalel, R. Arvapalli, S. He, Intel Corporation

A 20-way set associative 20MB energy efficient L3 cache for the Intel® Xeon® processor E5 family is presented. The design is manufactured in the 32nm second generation of high-K dielectric metal gate process. The proposed high density modular cache uses advanced power saving schemes and effective Vccmin design techniques.

Comparison of Bi-stable and Delay-based Physical Unclonable Functions from Measurements in 65nm bulk CMOS, M. Bhargava, C. Cakir, K. Mai, Carnegie Mellon University

Physical Unclonable Functions (PUFs) are security primitives used in a number of security applications. We compare bi-stable based PUFs (SRAM and sense amplifiers) and delay based PUFs (arbiter and ring oscillator) on their security characteristics (uniqueness, randomness, and reliability), as well as conventional VLSI design metrics (area, power, and performance) using measurements from a 65nm testchip.

Session Chair: William McIntyre, Texas Instruments
Session Co-Chair: Christoph Sandner, Infineon

2:00 pm Introduction

Effective power management requires innovative techniques to minimize system cost while maximizing efficiency. This session covers a wide array of advances in power management for emerging applications and nanometer CMOS.

2:05 pm 16-1 Flexible Solar-Energy Harvesting System on Plastic with Thin-film LC Oscillators


We present a flexible energy-harvesting system based on thin-film solar cells and transistors on plastic. An LC-oscillator-based power inverter operates beyond device ft, at a frequency greater than 2MHz. This improves the quality factor of patterned inductors, enabling wireless device charging with output power of 22mW and efficiency of 31%.

2:30 pm Reconfigurable Sleep Transistor for GIDL Reduction in Ultra-Low Standby Power Systems, Suyoung Bang*, David Blaauw*, Dennis Sylvester*, and Massimo Alioto*,**, *University of Michigan, **University of Siena

In this paper, we introduce the concept of reconfigurable sleep transistors, which take two different topologies in active and sleep mode. In active mode, transistors are stacked as in traditional power gating schemes. In sleep mode, sleep transistors are reconfigured to reduce GIDL leakage in addition to subthreshold leakage.

2:55 pm 16-3 EChO Power Management Unit with Reconfigurable Switched-Capacitor Converter in 65 nm CMOS, Massimo Alioto¹, Elio Consoli², Jan Rabaey³, ¹University of Siena, Italy and currently also with University of Michigan, Ann Arbor, MI, ²Maxim Integrated Products, Catania, Italy, ³Berkeley Wireless Research Center, EECS, University of California, Berkeley

A novel power management unit is introduced. Its reconfigurable switched capacitor array reduces the energy associated with sleep-to-active and active-to-sleep transitions by 64%. Energy reduction comes at small area overhead (<1%) and no penalty in active mode. Measurements on a 65-nm testchip demonstrate energy savings of 30%.

3:20 pm 16-4 Fully Integrated Capacitive Converter With All Digital Ripple Mitigation, Sudhir S. Kudva and Ramesh Harjani, University of Minnesota

This paper presents digital ripple mitigation for fully integrated capacitive converters achieved by varying the size and charge/discharge time modulation of the bucket capacitors. The converter implemented in IBM’s 130nm CMOS showed 65% lower ripple for a load of 0.4V/2mA with ripple control enabled without significantly impacting the core efficiency.

3:45 pm BREAK

4:00 pm A 100 MHz Two-Phase Four-Segment DC-DC Converter with Light Load Efficiency Enhancement in 0.18 µm CMOS Technology, Han Peng, David I Anderson*, Mona M. Hella**, GE Global Research, *Texas Instruments, **Rensselaer Polytechnic Institute

A two-phase four-segment DC-DC converter with positively-coupled inductors between segments and negatively coupled inductors between phases is designed in 0.18 µm CMOS technology with novel resonant gate drivers for light load efficiency enhancement. It maintains peak efficiency as the output current varies from 0.1 A to 1.86 A.

A 0.5V Start-up 87% Efficiency 0.75mm² On-Chip Feed-Forward Single-Inductor Dual-Output (SIDO) Boost DC-DC Converter for Battery and Solar Cell Operation Sensor Network Micro-Computer Integration, Yasunobu Nakase, Shinichi Hirose, Hiroshi Onoda, Yasuhiro Ido, Yoshiaki Shimizu*, Tsukasa Oishi*, Toshio Kumamoto, Toru Shimizu, Renesas Electronics, *Renesas Design Corp.

An on-chip single-inductor dual-output (SIDO) DC-DC converter is proposed for battery and solar cell operated sensor network systems. Using feed-forward control and forward back-bias, a test chip fabricated in 190nm CMOS achieves a high efficiency of 87%, a small area of 0.75mm², and 0.5V start-up without external compensation components or special process technologies.

Non-load-balance-dependent high efficiency single-inductor multiple-output (SIMO) DC-DC converters, Y. H. Ko, Y. S. Jang, S. K. Han, S. G. Lee, Korea Advanced Institute of Science and Technology

A single-inductor multiple-output DC-DC converter providing buck and boost outputs with a new control topology is presented. In the proposed switching sequence, energy delivery is always accomplished by flowing energy through an inductor, which leads to high conversion efficiency regardless of the balance between the buck and boost output loads.

An Integrated MESFET Voltage Follower LDO for High Power and PSR RF and Analog Applications, William Lepkowski*,**, Seth J. Wilk*,**, M. Reza Ghajar*, Bertan Bakkaloglu*, Trevor J. Thornton*,**, *Arizona State University, **SJT Micropower

A CMOS LDO with a MESFET based follower output stage was designed and fabricated on a commercial 45nm SOI CMOS technology. The proposed LDO demonstrates a dropout voltage of <170mV at 1A load current while occupying 0.245mm² of die area. The approach includes a novel depletion mode n-channel MESFET in a low output impedance source follower configuration. This enables the LDO to achieve stable operation under all line and load conditions without the need for generating higher internal voltage rails or external compensation. The compact structure and its inherent stability make it ideal for high powered analog, mixed signal and RF system-on-chip applications that require high PSR under different loading conditions.

Session 17 – Energy Efficient Architecture and Enabling Technology for Advanced SoCs

Tuesday, 9/11/2012, 2:00 pm
Pine Ballroom
Session Chair: Arif Rahman, Altera
Session Co-Chair: Lawrence Clark, Suvolta/Arizona State University

Introduction

This session highlights energy efficient architectures for SoC designs with papers that explore architectures for massively-parallel simulation of neural-networks and optimization of SoCs for cost-sensitive applications.

The modelling of large systems of spiking neurons is computationally very demanding in terms of processing power and communication. SpiNNaker is a massively-parallel computer system designed to model up to a billion spiking neurons in real time. The basic block of the machine is the SpiNNaker multicore System-on-Chip, a Globally Asynchronous Locally Synchronous (GALS) system with 18 ARM968 processor nodes residing in synchronous islands, surrounded by a light-weight, packet-switched asynchronous communications infrastructure. The MPSoC contains 100 million transistors in a 102 mm² die, provides a peak performance of 3.96 GIPS and has a power consumption of 1W at 1.2V when all processor cores operate at nominal frequency. SpiNNaker chips were delivered in May 2011, were fully operational, and met power and performance requirements.

A heterogeneous multi-core object recognition processor with a Reinforcement Learning (RL) NoC is proposed for efficient portable HD object recognition. The RL NoC automatically learns management policies in the network of the heterogeneous system without explicit modeling. By adopting the RL NoC, the throughput of feature detection and description is increased by 20.4% and 11.5%, respectively. As a result, the overall execution time of the object recognition is reduced by 38%. The implemented chip achieves 121mW power consumption with 1.24 TOPS/W power efficiency.

This paper presents the design and implementation of a 12GS/s fully differential data acquisition (DAQ) System-on-Chip (SoC) in a standard 130nm CMOS process. The 12GS/s DAQ system includes a 4-bit flash ADC and 4 channels of 1:32 DeMUX with on-chip custom registers. At a 12GS/s sampling rate, the DAQ SoC achieves an SNDR of 19.2 dB for a 2.9GHz input and 24.2 dB for low-frequency inputs. The flash ADC and each DeMUX channel consume 200 mA and 260 mA from a 1.3V supply, respectively. The active areas of the flash ADC and each DeMUX channel are 0.85 mm² and 0.70 mm², respectively. The DAQ SoC does not employ time-interleaving or calibration techniques. Moreover, no BW- or speed-enhancing inductors have been used in the design. The circuit achieves the highest sampling rate in a standard 130nm CMOS technology.

A unified media application processor is presented for image-based media contents processing on handheld devices. Homography-based disparity estimation and 3-stage frame-level-pipelined architecture achieve real-time performance in 3D-view augmented reality, and dynamic analog-digital reconfiguration based on a mixed-mode feature extraction engine reduces dynamic power dissipation, so 84.2% energy reduction is achieved.

A 2.3nJ/Frame Voice Activity Detector Based Audio Front-end for Context-Aware System-On-Chip Applications in 32nm CMOS, Arijit Raychowdhury, Carlos Tokunaga, Willem Beltman, Michael Deisher, James Tschanz, Vivek De, Intel Corporation

An audio front-end with Voice Activity Detection (VAD) hardware targeted for low-power embedded SoCs, featuring a 512pt FFT, programmable filters, noise floor estimator and a decision engine has been fabricated in 32nm CMOS. The dual-VCC, dual-frequency design allows the core datapath to scale to near-threshold voltage, where power consumption is less than 50uW. At peak energy efficiency, the core can process audio data at 2.3nJ/frame – a 9.4X improvement over nominal voltage conditions.


We present LIT, a low power, low cost audio processor for information dissemination among illiterate people in developing regions. The 265K gate, 8 million transistor, 23mm², ARM Cortex M0 uses a novel memory hierarchy with 128kB true LRU cache and off-chip flash designed for efficient operation on Carbon-Zinc batteries.


We present a novel driver circuit enabling electro-optic modulation with high extinction ratio from a co-designed silicon ring modulator. The driver circuit provides an asymmetric differential output at 10Gbps with a voltage swing up to 1.5Vpp from a single 1.0V supply, maximizing the resonance-wavelength shift of depletion type ring modulators while avoiding carrier injection.

Session 18 – Advanced IC Technologies II
Tuesday, 9/11/2012, 2:00 pm
Cedar Ballroom
Session Chair: Dinesh Somasekhar, GlobalFoundries
Session Co-Chair: Rajiv Joshi, IBM
2:00 pm  Introduction
This session of all invited papers covers the advent of fully-depleted devices, structural analysis of a commercial tri-gate CPU, and lithography/design interactions.

2:05 pm  Fully Depleted Devices for Designers: FDSOI and Finfets (INVITED), Terence B. Hook, IBM Corporation
Technologies featuring fully depleted transistors are now available for designers. The physical structure and features of the transistors vis-à-vis conventional planar devices are different. We discuss planar and three-dimensional fully depleted devices, comparing and contrasting them with one another and with dopant-controlled devices, and in both bulk and SOI manifestations.

2:55 pm  Intel Ivy Bridge Unveiled – the First Commercial Tri-Gate, High-k, Metal-Gate CPU (INVITED), Dick James, Chipworks Inc.
Intel was the first to use high-k/metal gate in its 45-nm product, and now the first 22-nm FinFET products have appeared – the Intel Ivy Bridge CPU. The paper discusses some of the different features we have seen within the chip, and illustrates the structure of the tri-gate transistors.

3:20 pm  Lithography and Design Integration – New Paradigm for the Technology Architecture Development (INVITED), J. Kye, Y. Ma, L. Yuan, Y. Deng, H. Levinson, GLOBALFOUNDRIES

3:45 pm  BREAK

Session 19 – Forum Session – Silicon-based THz Circuits, Systems and Applications
Tuesday, 9/11/2012, 3:30 pm
Cedar Ballroom
Chair: Alberto Valdes-Garcia, IBM T. J. Watson Research Center

Introduction
Silicon-based circuits in the sub-mmWave regime (>300GHz to 1THz) have become a reality and an increasingly popular topic for research. Are existing circuits ready for building practical systems? Are there compelling applications worth pursuing? This forum will address these questions; leading researchers in the area will review the current state of the art, discern the applications where silicon can make an impact, and outline the remaining challenges to tackle different usage scenarios.

4:00 pm  THz CMOS: Opportunities and Challenges, Ali Hajimiri, California Institute of Technology
Although "THz" has become the new trend in high frequency integrated circuit research, it is not clear how many of the proposed applications are real and which ones are just fads. We will discuss the applications of interest and the challenges associated with implementation. In particular, we will discuss the challenges of power generation and radiation from a silicon substrate and discuss potential solutions to these issues through several practical examples.

4:25 pm  The Next Frontier for Circuit Designers: CMOS THz Systems, Ehsan Afshari, Cornell University
There is an increasing interest in low cost THz systems for medical imaging, spectroscopy, and high data rate communication. Recent results in the lower THz frequencies (<600 GHz) suggest that a standard CMOS process can compete with compound semiconductors for some applications. In this talk, we first present a few "real" applications for CMOS THz systems as well as a few "fake" ones. Next, we discuss major challenges in realizing these systems in CMOS. Finally, we show several novel methods to overcome these challenges to generate mW-level powers above 300 GHz with relatively low noise.

CMOS and SiGe at THz Frequencies: What it can do and what it cannot, Gabriel M. Rebeiz, University of California, San Diego

There is a lot of interest in CMOS and SiGe at THz frequencies, with demonstrations of transmitters, receivers and imaging arrays up to 800 GHz. There is no doubt that SiGe (and some CMOS) can be competitive in the 120 GHz to 160 GHz frequency range, but what is not discussed is the limitation of these technologies for > 200 GHz applications. These limitations include relatively high noise figure or NEP (NF of 10-30 dB), low output power (mW or even microwatt level), and relatively low antenna efficiency (20-60%). This talk will present an honest look at these technologies in the THz range (> 200 GHz) and how system designers can overcome these limitations.

Design of Multipixel Terahertz Imagers Using Silicon Technologies, Hani Sherry, STMicroelectronics, Crolles, France, University of Wuppertal, Wuppertal Germany, ISEN/IEMN, Lille, France

Commercially viable THz systems will require portability, high integration levels, video-rate speeds, low power consumption, and room-temperature operation. Silicon technologies are therefore attractive system-on-chip alternatives to classical, expensive THz systems based on III-V compounds, micro-bolometers and others. In this talk we will discuss the capability to detect THz radiation well beyond the ft/fmax of standard silicon-based transistors. We will then address the key design challenges and techniques for designing, operating and characterizing efficient focal-plane arrays of direct power detectors for terahertz video-rate multi-pixel imaging, as well as the trade-offs involving bandwidth, sensitivity and power consumption, in view of various electrical and electromagnetic constraints. Full system integration capabilities will be demonstrated based on recently reported work on a 1 kpixel CMOS video camera for active THz imaging (0.6-1.1THz).

Poster Session

Tuesday, 9/11/2012, 5:00 pm – 7:00 pm
Donner, Siskiyou, Cascade Ballrooms


Switching noise cancellation (SNC) at the output of a pipelined ADC without noise sensors reduces noise errors stemming from digital switching noise coupling to analog circuits. A prototype 12b 40MS/s ADC plus digital test circuits was fabricated in 0.18um CMOS. With periodic and random noise, SNC cancels nearly all the noise errors.

A Wideband Ultra-Low-Current On-Chip Ammeter, J. Lu, J. Holleman, The University
A high-bandwidth ultra-low-current measurement circuit is presented. The circuit can measure an on-chip 75 fA current at a bandwidth up to 1 kHz with a noise floor of 0.235 fArms/√Hz. It occupies 0.065 sq.mm. of area in a 90 nm CMOS process and consumes 147 μW of power.

A 22dB PSRR Enhancement in a Two-Stage CMOS Opamp Using Tail Compensation, Paul M. Furth, Sri Harsh Pakala, Annajirao Garimella, Chaitanya Mohan, New Mexico State University

Novel "tail" compensation is established by connecting a capacitor between the output node and the source node of the input differential amplifier in a two-stage CMOS opamp. The proposed compensation increases fUGF by 60% and 25% and improves PSRR from the positive rail by 22dB and 26dB over Miller and cascode compensation, respectively.

A 5-300MHz CMOS Transceiver for Multi-Nuclear NMR Spectroscopy, Jaehyup Kim, Bruce Hammer and Ramesh Harjani, University of Minnesota

This paper presents a 5-300MHz multi-nuclei fully integrated NMR transceiver designed with spectroscopy for drug discovery in mind. The overall transceiver is implemented in 130nm CMOS, occupies an active area of 2mm² and consumes 12mA from a 1.5V supply. High spectral resolution was validated using a variety of chemical samples.

A 14.5 fJ/cycle/k-Gate, 0.33 V ECG Processor in 45 nm CMOS Using Statistical Error Compensation, R. Abdallah and N. Shanbhag, University of Illinois at Urbana-Champaign

A subthreshold ECG processor is designed. Statistical error compensation is employed to reduce the critical supply voltage while showing an improvement of 19X in beat-detection performance, 600X in error-rate tolerance, and 28% in minimum-energy over conventional systems. The prototype IC consumes 14.5fJ/cycle/1k-gate and exhibits 4.7X better energy efficiency than state-of-the-art.

A 200Msps, 0.6W eDRAM-based Search Engine Applying Full-Route Capacity Dedicated FIB Application, Yasuto Kuroda*, Yuji Yano*, Hisashi Iwamoto*, Koji Yamamoto, Kazunari Inoue and Masahiro Suzuki*, *Renesas Electronics Corporation, Renesas Design Corporation, Osaka University and Nara National College of Technology

Ternary content addressable memory (TCAM) is a popular LSI for use in high-throughput forwarding engines on routers. However, the unique structure of TCAM consumes huge amounts of power, which restricts its use for large lookup-table capacities in IP routers. In this paper, we propose a commodity-memory-based hardware architecture for the forwarding information base (FIB) application that solves the substantial problems of power and density. The proposed architecture is evaluated on a test chip fabricated in 40nm embedded DRAM (eDRAM) technology, and the verified power is 95% lower than that of a conventional TCAM-based design.

6T SRAM and 3T DRAM Data Retention and Remanence Characterization in 65nm bulk CMOS, C. Cakir, M. Bhargava, K. Mai, Carnegie Mellon University

Both data retention and remanence have been exploited by attackers to compromise system security. To measure retention and remanence in SRAM and DRAM, we implemented specially instrumented 6T SRAM and 3T DRAM test structures and tested them from -40°C to 85°C under accelerated aging conditions.


To reduce resonant supply noise, a simple, fully-digital and scalable technique based on staggering the activation time of the cores sharing the same power domain in a multi-core multi-power domain processor is presented. Measurement data from a 65nm test chip shows an Fmax improvement of 20% in a 3-core configuration.


A magnetic energy harvesting (MEH) circuit for system sustainability and a power monitoring system with a novel dual-wire current transformer (DWCT) are proposed in this paper. The MEH circuit simultaneously senses and harvests the magnetic energy. The DWCT offers non-invasive measurement and is easy to use. The designed direct AC-DC rectifier with maximum power extraction control fits the characteristics of the magnetic energy source. Thus, a 120% improvement in harvested power can be achieved under the same sensing current.


We present a non-iterative and physical five-step RF SPICE model extraction procedure. This procedure is applicable to any MOSFET compact model with all necessary RF related components in it. This methodology has been validated on silicon data from multiple technology nodes for a wide range of bias and frequency.

**A Unified Model and Direct Extraction Methodologies of Various CPWs for CMOS mm-Wave Applications**, Jun Luo, Lei Zhang, and Yan Wang, Institute of Microelectronics, Tsinghua University, China

A unified SPICE-compatible model for standard coplanar waveguides, grounded CPW, CPW with slotted shield, and corresponding direct extraction methodologies are proposed. The model is verified by 90nm CMOS processes with SLOT de-embedding techniques up to 67GHz. The direct extraction methodologies ensure the feasibility and availability of scalable modeling of CPWs.


The response of BB-PLLs depends on the phase error magnitude. This paper presents a modeling methodology that predicts the response of digital BB-PLLs to phase error perturbations in the locked state, indicating stability and settling time. An example BB-PLL is implemented and modeled. The model is verified by AMS simulations.

T-13  
**A 40-nm 168-mW 2.4×-Real-Time VLSI Processor for 60-kWord Continuous Speech Recognition**, G. He, T. Sugahara, S. Izumi, H. Kawaguchi, M. Yoshimoto, Kobe University

This paper describes a low-power VLSI chip for speaker-independent 60-kWord continuous speech recognition based on a context-dependent Hidden Markov Model (HMM). Our implementation includes a compression-decoding scheme to reduce the external memory bandwidth for Gaussian Mixture Model (GMM) computation and multi-path Viterbi transition units. We optimize the internal SRAM size using the max-approximation GMM calculation and by adjusting the number of look-ahead frames. The test chip, fabricated in 40 nm CMOS technology, occupies 1.77 mm × 2.18 mm and contains 2.52 M transistors for logic and 4.29 Mbit of on-chip memory. The measured results show that, compared to the previous work, our implementation reduces the required frequency by 34.2% (83.3 MHz) and the power consumption by 48.5% (74.14 mW) for 60-kWord real-time continuous speech recognition. This chip can process up to 2.4× faster than real time at 200 MHz and 1.1 V with a power consumption of 168 mW.

T-14  
**A Mixed-Mode FPAA SoC for Analog-Enhanced Signal Processing**, C. Schlottmann, S. Nease, S. Shapero, P. Hasler, Georgia Institute of Technology

We present the RASP 2.9v, an FPAA for mixed-signal computation with an emphasis on enhanced digital support. This 25mm², 350nm CMOS chip includes on-chip compilable DACs, dynamic reconfigurability and digital storage, and 76,000 programmable elements. We demonstrate an analog image-transform processor, an arbitrary waveform generator, and a mixed-mode FIR filter.

T-15  
**A 0.1~4GHz Receiver and 0.1~6GHz Transmitter with Reconfigurable 10~100MHz Signal Bandwidth in 65nm CMOS**, Xinwang Zhang, Yun Yin, Meng Cao, Zhigang Sun, Ling Fu, Zhaokang Xia, Hongxing Feng, Xing Zhang, Baoyong Chi, Ming Xu*, Zhihua Wang, Institute of Microelectronics, Tsinghua University, *Alcatel-Lucent Shanghai Bell Co., Ltd

An SDR transceiver with reconfigurable 10~100MHz signal bandwidth in 65nm CMOS is presented. The Rx features 2 LNAs, a 25% passive mixer, 3rd/5th-order baseband filtering and IIP2/Tuning/IQ calibration. It achieves an NF of 3~8dB over 0.1-4GHz and 21mA current consumption for 20MHz LTE at 2.3GHz. The Tx features a low-noise sub-path and a high dynamic range main-path. It achieves 1.7% EVM at 1.5dBm output for WCDMA, -31/-51 ACLR1/ACLR2 for 2.3GHz LTE, <-42dBc LO feedthrough and >51dBc image rejection.

T-16  
**A Near-Threshold, Multi-Node, Wireless Body Area Sensor Network Powered by RF Energy Harvesting**, Jiao Cheng, Lingli Xia, Chao Ma, Yong Lian*, Xiaoyuan Xu**, C. Patrick Yue***, Zhihong Hong****, Patrick Y. Chiang, Oregon State University, *NUS, **CVPL, ***HKUST, ****Fudan University

A wirelessly-powered, near-threshold, body area network SoC supporting synchronized multi-node TDMA operation is demonstrated in 65nm CMOS. For an energy-harvested local VDD=0.56V, measurements demonstrate full functionality over 1.4m between the base-station and four worn sensors, including two that are NLOS.

A Low-Phase-Noise Wide-Tuning-Range Quadrature Oscillator in 65nm CMOS, Guansheng Li, Ehsan Afshari, Cornell University

We present a low-phase-noise wide-tuning-range quadrature oscillator. It achieves low phase noise by employing passive coupling between multiple stages and it covers a wide tuning range by switching between two resonant modes. A prototype QVCO in a 65nm CMOS process covers 2.78GHz–5.00GHz with excellent FoM of 186dB.

A 5-Gbps 1.7 pJ/bit Ditherless CDR with Optimal Phase Interval Detection, M.-J. Park, H. Kim, S. Son, J. Kim, Seoul National University

A novel phase interval detector that looks for a phase interval enclosing the desired lock point is shown to find the optimal phase that minimizes the timing error without dithering. The prototype achieves low jitter of 4.1-mUIp-p with a coarse phase adjustment step of 0.11-UI, while dissipating only 8.4mW at 5Gbps.
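
The energy-per-bit figure in the title follows directly from the power and data rate quoted in the abstract. A minimal sketch of that arithmetic, assuming energy per bit is simply power divided by bit rate (the function name is hypothetical, not from the paper):

```python
def energy_per_bit_pj(power_w: float, bit_rate_bps: float) -> float:
    """Energy per bit in picojoules, taken as power divided by bit rate."""
    return power_w / bit_rate_bps * 1e12


# Values quoted in the abstract: 8.4 mW at 5 Gbps.
cdr_energy = energy_per_bit_pj(8.4e-3, 5e9)
print(f"{cdr_energy:.2f} pJ/bit")  # 1.68 pJ/bit, which rounds to the 1.7 pJ/bit in the title
```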

A 27-Gb/s, 0.41-mW/Gb/s 1-Tap Predictive Decision Feedback Equalizer in 40-nm Low-Power CMOS, Kambiz Kaviani, Masum Hossain, Meisam Honarvar Nazari*, Fred Heaton, Jihong Ren, Jared Zerbe, Rambus Inc., *California Institute of Technology

A new 1-tap predictive decision feedback equalizer (prDFE), implemented in a 40-nm CMOS LP process, achieves 27-Gb/s operation with 0.41-mW/Gb/s power efficiency. The prDFE employs a novel quad-data-rate sampling architecture to improve power efficiency while minimizing the critical feedback-path timing constraint of the equalizer, enabling post-cursor inter-symbol interference (ISI) cancellation at high data rates.

Session 20 – Design Solutions for 3D Integration and Signal Integrity

Wednesday, 9/12/2012, 9:00 am
Oak Ballroom
Session Chair: Yu Cao, Arizona State University
Session Co-Chair: Siva Mudanai, Intel

9:00 am  Introduction

Design robustness is increasingly challenging in large-scale system integration. This session presents solutions to 3D integration, ESD, and on-chip interconnect.

9:05 am  Slew-Aware Buffer Insertion for Through-Silicon-Via-Based 3D ICs (INVITED), Young-Joon Lee, Inki Hong*, Sung Kyu Lim, Georgia Institute of Technology, *Cadence Design Systems

Large parasitic capacitances of through-silicon-vias in 3D ICs cause signal slew and delay to increase. We propose a buffer insertion algorithm that further reduces delay by considering slew explicitly. Compared with the well-known van Ginneken algorithm and a commercial 2D tool, our algorithm improves full-chip timing with acceptable runtime overhead.


Calibrated pathfinding on silicon interposer is presented for exploring the impact of interconnect geometries on signal integrity. An ABCD matrix-based model and single bit method are used for the pathfinding by estimating the worst-case eye opening. Experiment-based eye-diagrams using measured S-parameters on the fabricated silicon interposer are compared with the pathfinding, showing 6% max difference.

Increase of Crosstalk Noise Due to Imbalanced Threshold Voltage between NMOS and PMOS in Sub-Threshold Logic Circuits, H. Fuketa, R. Takahashi, M. Takamiya, M. Nomura*, H. Shinohara*, T. Sakurai, University of Tokyo, *Semiconductor Technology Academic Research Center (STARC)

An abnormal increase of the crosstalk noise in sub-threshold logic circuits is found for the first time. The large crosstalk noise due to the imbalanced VTH between nMOS and pMOS is measured in a test chip with a 1.5-mm coupled wire fabricated in 40-nm CMOS, and a new crosstalk noise model is proposed and verified with SPICE simulations. In the worst-case fast-nMOS/slow-pMOS corner simulations, the noise amplitude increases by 1.5 times when the supply voltage is reduced from 1.1V to 0.3V, which is explained by the proposed model.

Electronic Design Automation (EDA) Solutions for ESD-Robust Design and Verification (INVITED), Michael G. Khazhinsky (1), Shuqing Cao (2), Harald Gossner (3), Gianluca Boselli (4), and Melanie Etherton (5), (1) Silicon Labs, Inc., (2) GLOBALFOUNDRIES Inc.; (3) Intel Mobile Communications; (4) Texas Instruments; (5) Freescale Semiconductor, Inc.

The paper describes the essential requirements of the ESD EDA verification flow, which offers a systematic approach to checking ESD robustness across all IC blocks during different design phases. This flow is substantiated by case studies of key ESD checks, demonstrating the advantages of EDA-tool-enabled verification.

Session 21 – Data Converter Techniques
Wednesday, 9/12/2012, 9:00 am
Fir Ballroom
Session Chair: Ron Kapusta, Analog Devices
Session Co-Chair: Yuji Nakajima, Renesas Electronics Corporation

Introduction
ADCs are critical blocks in a variety of systems. This session highlights advances in delta sigma, SAR and pipelined ADCs.

A Continuous-time ΔΣ Modulator with 87dB Dynamic Range in a 2MHz Signal Bandwidth Using a Switched-Capacitor Return-to-Zero DAC, T. Nandi, K. Boominathan, S. Pavan, Indian Institute of Technology, Madras

We introduce the Switched-Capacitor Return-to-Zero (SCRZ) DAC, which combines the low clock jitter sensitivity of a Switched-Capacitor DAC with the low distortion of a Return-to-Zero DAC. A single-bit continuous-time ΔΣ modulator that uses the SCRZ technique and opamp-assistance to improve DAC linearity and reduce jitter sensitivity achieves 87.1/84.5/82.3 dB DR/SNR/SNDR in a 2 MHz bandwidth. Operating at a sampling rate of 256 MHz in a 0.18 µm CMOS process, the CTDSM dissipates...

A 160mV 670nW 8-bit SAR ADC in 0.13μm CMOS, Xiong Zhou, Qiang Li*, University of Electronic Science and Technology of China, *Aarhus University, Denmark

An inverter-based amplifier and a dynamic latch with both gate- and bulk-driven inputs are proposed. An improved switching technique is exploited. The fabricated ADC works from 40kS/s to 400kS/s with a 160mV to 300mV supply, respectively. This work demonstrates the feasibility of sub-0.2V analog design with reasonable dynamic range.

A 12-bit 50-MS/s 3.3-mW SAR ADC with Background Digital Calibration, W. Liu, P. Huang*, Y. Chiu**, Broadcom Corp., *Analog Devices, **UT Dallas

This paper describes a background digital calibration technique based on bitwise correlation (BWC) to correct the capacitive digital-to-analog converter (DAC) mismatch error in successive-approximation-register (SAR) analog-to-digital converters (ADCs). Aided by a single-bit pseudorandom noise (PN) injected to the ADC input, the calibration engine extracts all bit weights simultaneously to facilitate a digital-domain correction. The analog overhead associated with this technique is negligible and the conversion speed is fully retained. A prototype 12-bit 50-MS/s SAR ADC fabricated in 90-nm CMOS measured a 66.5-dB peak SNDR and an 86.0-dB peak SFDR with calibration, while occupying 0.046 mm² and dissipating 3.3 mW from a 1.2-V supply. The calibration logic is estimated to occupy 0.072 mm² with a power consumption of 1.4 mW in the same process.

A 2.3mW 10-bit 170MS/s Two-Step Binary-Search Assisted Time-Interleaved SAR ADC, Si-Seng Wong, U-Fat Chio, Yan Zhu, Sai-Weng Sin, Seng-Pan U, R. P. Martins, University of Macau

A 10-bit 170MS/s two-step binary-search assisted time-interleaved SAR ADC is proposed, which is composed of a 5b binary-search front-end stage shared by two time-interleaved 6b SAR ADCs in the 2nd stage without an opamp. The ADC was fabricated in 65nm CMOS, achieving 54.6dB SNDR with 2.3mW of power, leading to a FoM of 30.8fJ/step.
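
The quoted FoM can be reproduced from the abstract's own numbers. The sketch below assumes the commonly used Walden figure of merit, FoM = P / (2^ENOB × fs) with ENOB = (SNDR − 1.76) / 6.02; the abstract does not state which definition the authors used, and the function names are hypothetical.

```python
def enob(sndr_db: float) -> float:
    """Effective number of bits from SNDR in dB (standard sine-wave formula)."""
    return (sndr_db - 1.76) / 6.02


def walden_fom_fj(power_w: float, sndr_db: float, fs_hz: float) -> float:
    """Walden figure of merit in fJ per conversion step (assumed definition)."""
    return power_w / (2 ** enob(sndr_db) * fs_hz) * 1e15


# Values quoted in the abstract: 2.3 mW, 54.6 dB SNDR, 170 MS/s.
fom = walden_fom_fj(2.3e-3, 54.6, 170e6)
print(f"{fom:.1f} fJ/step")  # 30.8 fJ/step, matching the quoted figure
```

Under the same assumed definition, other converters in this session can be compared by substituting their quoted power, SNDR, and sampling rate.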

A 0.015mm² 63fJ/conversion-step 10-Bit 220MS/s SAR ADC With 1.5b/step Redundancy and Digital Metastability Correction, R. Vitek, E. Gordon, S. Maerkovich, A. Beidas, Intel Corporation

A very low-power, small-area 10-bit 220MS/s SAR ADC is presented. The ADC employs a redundancy scheme that relaxes settling requirements as well as a metastability correction algorithm that exploits the redundancy as an error-correction code. A 65nm CMOS implementation achieves 63 and 43 fJ/(conversion-step) at 220MS/s and 160MS/s, respectively.


A 12-bit 200MS/s zero-crossing based pipeline ADC is presented. A coarse phase followed by a level-shifted fine phase is employed for higher accuracy. To enable high frequency operation, sub-ADC flash comparators are strobed immediately after the coarse phase. The ADC occupies 0.276mm² in 55nm CMOS and dissipates 28.5mW. 62.5dB SNDR and 78.6dBc SFDR with a 99.6MHz input signal at 200MS/s are achieved for a FOM of 131fJ/step. The reference buffer, bias circuitry, and digital error correction circuits are all implemented on chip.

Session 22 – Tutorial - Power Delivery: Droop, Jitter, Test and Debug Story

Wednesday, 9/12/2012, 9:00 am
Pine Ballroom
Session Chair: Mike Li, Altera

9:00 am  Introduction

9:05 am
22-1
Analyzing and Understanding Power and Power Distribution Integrity for High Performance Processors, Aaron Grenat and Sam Naffziger, AMD

This presentation provides an overview of the impact of the power distribution network and package resonance on the associated frequency loss in multi-core processors. It covers both the pre-silicon analysis techniques and the post-silicon measurement techniques to quantify the impact of power distribution integrity imperfections on achieved frequency. This includes the techniques to differentiate between the on-die power distribution effects, the package power distribution effects, and the systematic manufacturing process variations during speed debug in silicon.

9:30 am
22-2
Power Droops During Test - Beyond Filling in Scan Don’t Care Bits, T.M. Mak, Intel

Power droops during test have been in the news (at least in the test world) for quite some time now. Droops can arise from higher than normal activity during shift or from the glitch caused by the launch/capture activity that is central to the scan test operation. There has been a lot of work, mostly from CAD suppliers and academics, on scan "don't care bit fill" for the past 5-7 years. This is one way to minimize shift activity, but it conflicts with the goals of increasing test coverage and implementing test compression. Scan chain partitioning and selective activation are also supported by most CAD tools, but again, one has to trade off efficiency versus noise, and it is difficult to quantify the actual noise impact. We will examine all these issues and will also review some recent advances in the quantification of power droops and in supply droop mitigation strategies.

9:55 am
22-3

This talk will discuss the impact that time-mode signal processing has had on the field of analog/mixed-signal testing. It begins with a historical account of the first developments of time-mode techniques for electron emission measurements in nuclear science experiments and proceeds to present-day methods used for on-chip DFT/BIST purposes. Experimental data suggests that time-mode circuits have the ability to operate at much higher speeds than their voltage-mode counterparts, although resolution and noise issues have yet to be brought under design control. Time-mode signal processing is largely based on the control of the delay of a single digital inverter. As such, time-mode circuits generally follow a digital design methodology (synthesizable) and can be structurally tested using digital scan. It is the presenter's belief that time-mode circuits may offer the analog world a circuit methodology that is both synthesizable and testable, something that has been seriously lacking in the analog design world.

**How to Test & Debug On-Chip Jitter Using Neither High-Frequency Pin Nor Reference Clock, Takahiro Yamaguchi, Advantest**

Off-chip measurements require a high-frequency pin and an on-chip high-performance driver to deliver the on-chip jittery signal to the external instrument without distortion. The new architecture presented in this tutorial provides all-digital timing jitter measurement capabilities, which use the clock under test as the reference clock to directly measure timing jitter. Portability across multiple technologies was also validated using 65 nm and 40 nm CMOS technologies.