## Inductor based Circuit Techniques for Chip-to-Chip Interconnect and Standing Wave Clock Generation

Mamoru Sasaki, Bin Yan, Daisuke Arizono, Mitsuru Shiozaki, Atsushi Mori and Atsushi Iwata

Graduate School, Hiroshima University

Kagamiyama 1-4-1, Higashihiroshima-shi, Hiroshima, Japan

E-mail sasaki@dsl.hiroshima-u.ac.jp

## Introduction

There are few reports of applying on-chip inductors to high-speed digital circuits, although the inductor is a useful passive element in RF circuit design. An on-chip inductor helps a digital circuit operate faster and can reduce its power consumption, because the inductance can cancel the capacitive load of the active devices. In this manuscript, we present two notable cases where on-chip inductors have been applied effectively in digital circuit design. One is a spiral-inductor based wireless chip-to-chip interconnect. As an alternative to single-chip system LSI, 3D-IC fabrication is attracting attention, and wireless chip-to-chip interconnects using capacitive and inductive coupling are now under development [1],[2]. They are expected to overcome the process cost and low yield problems of through-silicon-via technologies [3]. The other is clock generation and distribution above 10GHz using an inductively loaded standing-wave network. Global clock distribution is becoming increasingly difficult for multi-GHz microprocessors, because skew and jitter are proportional to latency, which does not scale with the clock period in a conventional tree network. Resonant techniques are expected to overcome these clock distribution problems [4]-[6].

## Spiral-Inductor Based Wireless Chip-Interconnect

Figure 1 illustrates the inductive coupling between spiral inductors on stacked silicon chips. An nMOS FET excites the spiral inductor pair, as shown in Fig. 2. The spiral inductors are modeled as $\pi$-type equivalent circuits, and the magnetic coupling is introduced as a coupling coefficient "k". Simulation results are shown in Fig. 3. Consider first the wide pulse shown on the left. Two damped oscillations appear on the output node (Vout), one when the pulse rises and one when it falls. However, the amplitude of the latter is larger than that of the former. The difference is caused by the on-conductance of the nMOS FET. In the former case, the input inductor has a conductive connection to ground through the nMOS FET. In the latter case the switch is open, and the open condition is better for generating a damped oscillation. Next, consider the case shown on the right in Fig. 3. A narrow pulse is employed to enlarge the amplitude further by superposing the two damped oscillations. The narrow pulse also shortens the time during which current flows into the inductor, so the average inductor current $i_L$ becomes small and the power consumption is drastically reduced. Because the lower bound of the pulse width is 100-150ps in a typical CMOS technology, the self-resonance frequency of the spiral inductor has to be 3.3-5GHz in order to superpose the two waves. A stacked structure is therefore employed to obtain this relatively low self-resonance frequency. A coupling coefficient larger than 0.1 requires the diameter of the spiral inductor to be twice the distance between the inductor pair. This distance is nearly equal to the thickness of the silicon chip, so the diameter of the spiral inductor becomes 200µm, because the chip thickness is 100µm in the 3D-stacked structure of our prototype system.
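The frequency range quoted above is consistent with choosing the pulse width equal to half the ringing period, so that the oscillation launched at the falling edge adds in phase to the one launched at the rising edge. The following Python sketch reproduces that arithmetic together with the diameter rule. The half-period relation and the function names are assumptions made here for illustration; they are not expressions stated in the text.

```python
def self_resonance_for_pulse_width(pulse_width_s: float) -> float:
    """Frequency whose half period equals the pulse width (assumed relation)."""
    return 1.0 / (2.0 * pulse_width_s)


def inductor_diameter(separation_m: float, ratio: float = 2.0) -> float:
    """Diameter from the rule that k > 0.1 needs diameter = ratio x separation."""
    return ratio * separation_m


# Pulse-width bounds quoted in the text for a typical CMOS technology.
for pulse_width in (100e-12, 150e-12):
    f_self = self_resonance_for_pulse_width(pulse_width)
    print(f"pulse width {pulse_width * 1e12:.0f} ps -> "
          f"self-resonance {f_self / 1e9:.2f} GHz")

# Chip thickness (about equal to the inductor separation) from the text: 100 um.
print(f"diameter: {inductor_diameter(100e-6) * 1e6:.0f} um")
```

Under this assumption the 100ps and 150ps bounds map to 5GHz and about 3.3GHz, and the 100µm separation maps to a 200µm diameter, matching the figures in the paragraph.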











There is a concern that the conductivity of the silicon substrate may obstruct the inductive coupling between the spiral inductors. The following condition still holds for the spiral inductor coupling:

$$f_{self} \ll \frac{\sigma}{2\pi \varepsilon_{0}\varepsilon_{r}}$$

where $\sigma$ and $\varepsilon_r$ are the conductivity and relative permittivity of the silicon substrate, respectively, and $\varepsilon_0$ is the permittivity of free space. Hence, the attenuation of the magnetic field due to the silicon substrate can be estimated as illustrated in Fig. 4. For a silicon substrate of 100µm thickness, the attenuation factor is 0.96. This estimate has also been confirmed with a 3D electromagnetic field solver.
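When the condition above holds, conduction current dominates displacement current in the substrate, and one common way to estimate the field attenuation is the good-conductor skin depth. The Python sketch below evaluates both the relaxation frequency $\sigma/(2\pi\varepsilon_0\varepsilon_r)$ and a plane-wave skin-depth attenuation. The substrate conductivity (10 S/m), the relative permittivity (11.9), the 4GHz operating frequency, and the plane-wave model itself are assumptions for illustration; the analysis behind Fig. 4 may differ.

```python
import math

EPS0 = 8.854e-12         # permittivity of free space, F/m
MU0 = 4e-7 * math.pi     # permeability of free space, H/m


def relaxation_frequency(sigma: float, eps_r: float) -> float:
    """Right-hand side of the condition: sigma / (2*pi*eps0*eps_r)."""
    return sigma / (2 * math.pi * EPS0 * eps_r)


def skin_depth(freq: float, sigma: float) -> float:
    """Good-conductor skin depth sqrt(2 / (omega * mu0 * sigma))."""
    return math.sqrt(2.0 / (2 * math.pi * freq * MU0 * sigma))


def field_attenuation(thickness: float, freq: float, sigma: float) -> float:
    """Plane-wave field attenuation exp(-t / delta) through the substrate."""
    return math.exp(-thickness / skin_depth(freq, sigma))


# Assumed (hypothetical) substrate and operating values, not from the paper.
SIGMA = 10.0        # S/m, i.e. a 10 ohm-cm substrate
EPS_R = 11.9        # relative permittivity of silicon
F_SELF = 4e9        # Hz, inside the 3.3-5 GHz range quoted earlier
THICKNESS = 100e-6  # m, chip thickness from the text

print(f"relaxation frequency: {relaxation_frequency(SIGMA, EPS_R) / 1e9:.1f} GHz")
print(f"attenuation factor:   {field_attenuation(THICKNESS, F_SELF, SIGMA):.3f}")
```

With these assumed values the relaxation frequency comes out at roughly 15GHz, above the self-resonance range, and the attenuation factor is roughly 0.96, in line with the value quoted for the 100µm substrate.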



Fig. 4 Analysis of silicon substrate conductivity

The received wave is a damped oscillation, as shown in Fig. 3, and its frequency is determined by the self-resonance of the spiral inductor. In a synchronous scheme using a conventional latch comparator, the timing margin is very narrow, as illustrated in Fig. 5. The timing margin is set by the self-resonance frequency, regardless of the transmission rate. The self-resonance frequency becomes higher for a spiral inductor of smaller dimensions, which narrows the timing margin even further. In order to realize asynchronous communication without any timing clock, dynamic circuits and a self pre-charge technique are employed. Circuit schematics and simulation results are shown in Figs. 6 and 7, respectively. The received signal "$V_C$" is a damped oscillation. First, the signal is level-shifted by C1 and R1 to a bias voltage "Vbn" generated by the bias circuit; the shifted signal is "$V_G$". The node "$V_D$" is dynamically charged and discharged. M3 discharges the node according to the shifted received signal. After the discharge, M4 charges the node up again, triggered by the delayed pulse "$V_P$". This is called "self pre-charging". The communication scheme can transmit an NRZ signal using the "Pulse Generator" and the "Rec. Unit".
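The text does not spell out how the Pulse Generator and the Rec. Unit map an NRZ stream onto pulses. One plausible reading, consistent with power that scales with data activity, is transition encoding: a pulse is sent on every NRZ transition and the receiver toggles its output on every detected pulse. The Python sketch below is a behavioral model of that assumed scheme only; the function names and the known initial state are hypothetical.

```python
def pulse_generator(nrz_bits, initial=0):
    """Emit 1 (a pulse) whenever the NRZ level changes, else 0 (assumed scheme)."""
    pulses = []
    previous = initial
    for bit in nrz_bits:
        pulses.append(1 if bit != previous else 0)
        previous = bit
    return pulses


def recovery_unit(pulses, initial=0):
    """Rebuild the NRZ level by toggling the output on every received pulse."""
    level = initial
    recovered = []
    for pulse in pulses:
        if pulse:
            level ^= 1
        recovered.append(level)
    return recovered


nrz = [0, 1, 1, 0, 1, 0, 0]
pulses = pulse_generator(nrz)
print("pulses:   ", pulses)
print("recovered:", recovery_unit(pulses))
print("activity: ", sum(pulses) / len(pulses))
```

In this model no pulse is sent while the data is constant, so the number of inductor excitations tracks the transition activity of the data.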

In order to confirm the proposed concept, a test chip was designed and fabricated in a 0.18µm 6-metal CMOS technology. Fig. 8 shows a micrograph of the test chip, which integrates 12 transceivers. Two chips were mounted face to face on manipulators and the transfer characteristics were measured. Measured results are shown in Figs. 9 and 10. Fig. 9 shows transmitted and received waveforms of three neighboring channels, where pseudo-random data were transmitted at 1.0Gbps. A BER (Bit Error Ratio) $< 10^{-10}$ was achieved without crosstalk between neighboring spiral inductors. The delay time was 2.7ns (=1.2ns[transceiver] + 1.0ns[I/O] + 0.5ns[PCB]). Measured power consumption is shown in Fig. 10, where "Inductor (Tx)" and "Sensing (Rx)" denote the power of the spiral inductor in Tx and of the dynamic circuits with pre-charging operation in Rx, respectively. The data transmission activity was set to 0.5; owing to the asynchronous scheme, the power consumption decreases in proportion to the data transmission activity. The power consumption of the Pulse Generator (PG) and the Rec. Unit (RU) was relatively large. However, the power consumption of CMOS logic circuits is expected to decrease in scaled devices.



Fig. 5 Timing margin



Fig. 6 Transmitter and receiver.



Fig. 7 Operations of dynamic circuit and self pre-charging.




Fig. 8 Photomicrograph of test chip.





Fig. 10 Measured power consumption of each circuit block.

## Standing-Wave Clock Distribution

Figure 11 shows a long transmission line (TL) grounded at both ends and a short TL terminated with two inductors. The former has the conventional standing-wave resonance mode, and its first resonant standing wave is illustrated. The latter, on the other hand, has an interesting standing wave that corresponds to the conventional one with the low-amplitude segments cut away, while the resonance frequency is identical to that of the former. This standing wave has uniform phase and almost uniform amplitude. As shown in Fig. 11, it has achieved clock distribution with a fine grid mesh, and the depth of the clock tree became very shallow. This results in small latency, low jitter, and low skew. However, the mesh structure has a spiral inductor and a MOSFET cross-coupled pair, which are not illustrated in Fig. 11, at every grid point, and reducing this area overhead is one of the remaining issues.
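For a lossless line of length $l$ loaded by an inductor $L$ to AC ground at each end, symmetry places a current node at the centre, which gives the textbook resonance condition $\tan(\beta l/2) = Z_0/(\omega L)$, with $Z_0$ the characteristic impedance and $\beta$ the phase constant. The Python sketch below uses it to size the loading inductor and then solves the condition numerically as a check. It is an illustration under stated assumptions, not the paper's own expression: the lossless model, $Z_0 = 50\,\Omega$, and the phase velocity (inferred from the statement later in the text that the 1mm TL is 1/3 of the conventional length at 11.5GHz) are all assumed.

```python
import math


def loading_inductance(f_target, z0, length, v_phase):
    """Inductance satisfying tan(beta*l/2) = z0 / (omega*L) at f_target."""
    half_angle = math.pi * f_target * length / v_phase  # beta * l / 2
    return z0 / (2 * math.pi * f_target * math.tan(half_angle))


def resonance_frequency(inductance, z0, length, v_phase, iterations=200):
    """Lowest root of tan(beta*l/2) - z0/(omega*L) = 0, found by bisection."""
    def mismatch(f):
        return (math.tan(math.pi * f * length / v_phase)
                - z0 / (2 * math.pi * f * inductance))

    # The root lies below the unloaded half-wave resonance v / (2 * l).
    lo, hi = 1e3, v_phase / (2 * length) * (1 - 1e-9)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if mismatch(mid) > 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


# Assumed values for illustration only.
F_CLK = 11.5e9                  # Hz, measured clock frequency from the text
L_CONV = 3e-3                   # m, conventional half-wave line (3 x 1 mm)
L_SHORT = 1e-3                  # m, inductively loaded line from the text
V_PHASE = 2 * L_CONV * F_CLK    # m/s, implied by the half-wave assumption
Z0 = 50.0                       # ohm, hypothetical

L_LOAD = loading_inductance(F_CLK, Z0, L_SHORT, V_PHASE)
F_CHECK = resonance_frequency(L_LOAD, Z0, L_SHORT, V_PHASE)
print(f"loading inductance: {L_LOAD * 1e9:.2f} nH")
print(f"resonance check:    {F_CHECK / 1e9:.2f} GHz")
```

With these assumptions each end needs on the order of 1nH. As the inductance tends to zero the condition reduces to $\beta l = \pi$, the half-wave mode of a line grounded at both ends.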

In order to overcome this drawback, a new mesh structure of standing-wave oscillators is proposed. In this structure, the inductors at the ends of the TLs composing the oscillators are magnetically coupled to each other through a mutual inductance "M", as shown in Fig. 12. The magnetic coupling can synchronize the standing-wave oscillations. The synchronized oscillation frequency $f_{ck}$ can be expressed by almost the same equation as for the single inductively loaded standing-wave oscillator, as described in Fig. 12. $Z_0$ and $\beta$ are the characteristic impedance and the phase constant of the TL, and k (= M/L) is the coupling coefficient between the inductors. Figure 12 shows a low area-overhead clock distribution network built from the inductively coupled standing-wave oscillators. In the network, the standing-wave oscillators are placed in rows and columns and are inductively coupled in a circular form. This results in a mesh structure, but there is no electrical contact at the grid points. Spiral inductors and MOSFET cross-coupled pairs, which are not illustrated in Fig. 12, surround the mesh structure. Compared with the previous mesh structure in Fig. 11, the coupling technique and the elaborate placement reduce the area overhead caused by the spiral inductors and MOSFET cross-coupled pairs. The reduction becomes more effective as the grid pitch becomes finer and the grid size becomes larger. Clock buffers are uniformly distributed along the TL to supply the clock signal to the FFs, as shown in Fig. 12. The same design method can be used by changing the distributed TL unit capacitance $\Delta C$ into $\Delta C + (n C_{buf})/l$, where $C_{buf}$ and *n* are the input capacitance and the number of the clock buffers, respectively. The oscillation starts as follows. Immediately after the power supply is applied, many oscillation modes having small amplitude appear. However, only the stable oscillation grows out of these modes. Fortunately, as regards the fundamental frequency, the stable mode is unique, as described above.

Fig. 11 Inductively loaded standing-wave oscillator.

Fig. 12 Magnetic coupling synchronization.

Fig. 13 Chip micrograph.

Fig. 14 Measured frequency spectrum.

Fig. 15 Measured phase noise.

Fig. 16 Measured oscillation waveform.
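The buffer-loading substitution $\Delta C \rightarrow \Delta C + (n C_{buf})/l$ described above can be illustrated numerically. The Python sketch below computes the effective capacitance per unit length and the resulting change in the characteristic impedance and phase velocity of a lossless line. Reading $l$ as the TL length, and every numerical value other than the 1mm length, are assumptions for illustration.

```python
import math


def loaded_capacitance(c_per_m, n_buffers, c_buf, length):
    """Effective capacitance per unit length: dC + n * C_buf / l."""
    return c_per_m + n_buffers * c_buf / length


def line_parameters(l_per_m, c_per_m):
    """Characteristic impedance and phase velocity of a lossless line."""
    z0 = math.sqrt(l_per_m / c_per_m)
    v_phase = 1.0 / math.sqrt(l_per_m * c_per_m)
    return z0, v_phase


# Hypothetical line and buffer values; only the 1 mm length is from the text.
L_PER_M = 4.0e-7     # H/m
C_PER_M = 1.6e-10    # F/m, unloaded
N_BUFFERS = 20
C_BUF = 5e-15        # F, input capacitance of one clock buffer
TL_LENGTH = 1e-3     # m

C_LOADED = loaded_capacitance(C_PER_M, N_BUFFERS, C_BUF, TL_LENGTH)
for label, cap in (("unloaded", C_PER_M), ("loaded", C_LOADED)):
    z0, v_phase = line_parameters(L_PER_M, cap)
    print(f"{label}: Z0 = {z0:.1f} ohm, v = {v_phase:.3e} m/s")
```

The added buffer capacitance lowers both $Z_0$ and the phase velocity, which is why the line has to be designed with the loaded capacitance rather than the bare-wire value.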

The 12GHz magnetically coupled standing-wave clock distribution network has been prototyped in a 0.18µm CMOS technology with 6 Al metal layers. Figure 13 is a micrograph of the chip, which integrates a 5x5 mesh clock distribution network. The TL was configured as a coplanar structure in the 6th metal layer, and the power and ground lines were placed under the TL in the 1st and 2nd metal layers. The width and spacing of the coplanar lines were 5µm and 2µm, respectively. The length of the TL was 1mm. The inductor was implemented as a spiral inductor in the 5th and 6th metal layers, and it was magnetically coupled to the adjacent inductors on both sides by the stacked structure, as illustrated in Fig. 13. The outer and inner diameters of the spiral inductor are 70µm and 50µm, respectively. The TL has 2 spiral inductors at each end, and they are connected to the respective differential lines. A simple NMOSFET cross-coupled pair, shown in Fig. 13, was employed and placed between the 2 spiral inductors. The 0.9V supply is fed through the spiral inductors. The transistor size was $1.2\mu m/0.18\mu m$, and the finger count was 20. For measurement, G-S-S-G pads were placed outside the mesh network, and the signal pads were capacitively coupled to the differential TL, as shown in Fig. 13. The measured attenuation of the capacitively coupled pad was -21dB at 11.5GHz. This attenuation minimizes the influence of probing on the TL.

The measured spectrum is shown in Fig. 14; an 11.5GHz clock oscillates on the 5x5 mesh clock distribution network. The measured phase noise at 1MHz offset was -103dBc/Hz, as shown in Fig. 15, and the RMS clock jitter calculated from the phase noise was 0.86ps. Figure 16 shows the oscillation waveform. The period was 86.1ps, and the peak-to-peak jitter and the RMS jitter were 4.7ps and 0.81ps, respectively. The measured RMS jitter agrees with the RMS jitter calculated from the phase noise. The peak-to-peak jitter was 5.5% of the clock period. The oscillation swing of 0.6Vpp was obtained by correcting the measured waveform for the -21dB attenuation of the capacitively coupled measurement pad. The power consumption was 80mW at a 0.9V supply voltage. The length of the TL (1mm) was 1/3 of that of the conventional standing-wave oscillator. The mesh pitch was 200µm; this fine pitch was achieved through the elaborate placement. The supply voltage becomes the center voltage of the oscillation waveform, and it can be tuned to the threshold voltage of the inverters by an on-chip regulator. The 0.6Vpp swing is sufficient to compensate for the threshold-voltage mismatch among the inverters, so simple inverters can be used as the clock buffers.
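The measurement arithmetic above can be checked with a short Python sketch. It derives the clock frequency from the measured period, the peak-to-peak jitter as a fraction of the period, and the swing that would be seen at the pad before the -21dB correction. Treating the -21dB figure as a voltage ratio (20 log10) is an assumption, and the helper names are hypothetical.

```python
def frequency_from_period(period_s: float) -> float:
    """Clock frequency corresponding to a measured period."""
    return 1.0 / period_s


def jitter_fraction(jitter_s: float, period_s: float) -> float:
    """Jitter expressed as a fraction of the clock period."""
    return jitter_s / period_s


def db_to_voltage_ratio(db: float) -> float:
    """Voltage ratio for a dB value, assuming the 20*log10 convention."""
    return 10.0 ** (db / 20.0)


PERIOD = 86.1e-12        # s, measured period from the text
PK_PK_JITTER = 4.7e-12   # s, measured peak-to-peak jitter from the text
SWING_ON_TL = 0.6        # Vpp, swing on the TL after calibration
PAD_ATTEN_DB = -21.0     # dB, measured pad attenuation at 11.5 GHz

print(f"frequency:     {frequency_from_period(PERIOD) / 1e9:.2f} GHz")
print(f"pk-pk jitter:  {100 * jitter_fraction(PK_PK_JITTER, PERIOD):.1f} % of period")
print(f"swing at pad:  {1e3 * SWING_ON_TL * db_to_voltage_ratio(PAD_ATTEN_DB):.1f} mVpp")
```

The 86.1ps period corresponds to about 11.6GHz, close to the 11.5GHz spectral peak, and 4.7ps is about 5.5% of the period. Under the voltage-ratio assumption, the 0.6Vpp on the TL corresponds to a swing of a few tens of millivolts at the probe pad.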

## References

[1] K. Kanda et al., "1.27Gb/s/ch 3mW/pin Wireless Superconnect (WSC) Interface Scheme," *ISSCC Digest of Technical Papers*, pp.186-187, Feb. 2003.

[2] N. Miura et al., "A 1Tb/s 3W Inductive-Coupling Transceiver for Inter-Chip Clock and Data Link," *ISSCC Digest of Technical Papers*, pp.424-425, Feb. 2006.

[3] M. Koyanagi et al., "Neuromorphic Vision Chip Fabricated Using Three-Dimensional Integration Technology," *ISSCC Digest of Technical Papers*, pp.270-271, Feb. 2001.

[4] F. O'Mahony et al., "10GHz clock distribution using coupled standing-wave oscillators," *ISSCC Digest of Technical Papers*, pp.428-429, Feb. 2003.

[5] S. C. Chan et al., "Uniform-phase uniform-amplitude resonant-load global clock distributions," *IEEE J. Solid-State Circuits*, vol.40, pp.102-109, Jan. 2005.

[6] M. Sasaki et al., "17GHz Fine Grid Clock Distribution with Uniform-Amplitude Standing-Wave Oscillator," *Dig. Symp. VLSI Circuits*, pp.124-125, June 2006.