

# 20Gb/s 0.13um CMOS Serial Link



- Patrick Chiang (pchiang@stanford.edu)
- Bill Dally (billd@csl.stanford.edu)
- Ming-Ju Edward Lee (ed@velio.com)

Computer Systems Laboratory, Stanford University

# Outline



- Motivation
- Background
  - Static phase offset
  - Random/power supply induced jitter
- Proposed 20Gb/s transceiver
  - New Architecture
  - Circuit Blocks
  - Receiver Design
  - Preliminary Results
- Conclusion

# I/O Bandwidth is Limiting Factor



- Predicted off-chip bandwidth is growing more slowly than on-chip bandwidth
- Higher bit-rate I/Os are needed to close this gap

# 20Gb/s 0.13um CMOS Transceiver Goals

- Address systematic/static phase offset
- Address random/power-supply-induced jitter
- Not addressing channel equalization
- Reasonable power dissipation (200mW/link)
- Small area footprint (500um x 500um) for high integration on a single chip

#### Static Phase Offset—Ideal Transceiver



#### Timing Margin=12ps



#### Timing Margin=7ps (42% reduction)
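The 42% figure can be checked from the two margins quoted above. The sketch below is only a worked calculation: the 12ps and 7ps values come from the slides, the 50ps bit period follows from the 20Gb/s data rate, and the function names are illustrative rather than anything from the original design.

```python
def bit_period_ps(data_rate_gbps):
    """Bit period (unit interval) in picoseconds for a given data rate in Gb/s."""
    return 1000.0 / data_rate_gbps


def margin_reduction_percent(ideal_margin_ps, degraded_margin_ps):
    """Percentage of timing margin lost when going from the ideal to the degraded case."""
    return 100.0 * (ideal_margin_ps - degraded_margin_ps) / ideal_margin_ps


UI_PS = bit_period_ps(20.0)        # 20Gb/s link
IDEAL_MARGIN_PS = 12.0             # margin quoted for the ideal transceiver
DEGRADED_MARGIN_PS = 7.0           # margin quoted for the second case

reduction = margin_reduction_percent(IDEAL_MARGIN_PS, DEGRADED_MARGIN_PS)

print(f"bit period: {UI_PS:.0f} ps")
print(f"margin reduction: {reduction:.1f}%")
print(f"margin as share of bit period: "
      f"{100 * IDEAL_MARGIN_PS / UI_PS:.0f}% -> {100 * DEGRADED_MARGIN_PS / UI_PS:.0f}%")
```

Expressed against the bit period, the margin drops from roughly a quarter of the interval to about a seventh, which is why a few picoseconds of static offset matter at this data rate.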

# **Power Supply Induced Jitter**





#### 20Gb/s Transmitter Design Spaces





# **New Architecture**





### New Architecture Reduces Jitter/Phase Offset





# 20Gb/s Transmitter





# 20Gb/s Output Stage





# **10GHz Analog Latch**





- Full pass gates provide symmetric clock injection
- Gain loss of ½ from 10Gb/s input to output

# 4:1 10Gb/s Mux Design





#### **10GHz Clock Alignment Problem**



- How do you ensure the 10Gb/s data is in phase with the 10GHz clock?



# Phase Adjusting FSM





- Align zero crossings of the 10GHz clock with the 8 phases of the 2.5GHz clock
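A quick calculation shows why these two clocks can be aligned edge for edge. The sketch assumes the 8 phases of the 2.5GHz clock are evenly spaced, which the slide does not state explicitly; the function names are illustrative.

```python
def period_ps(freq_ghz):
    """Clock period in picoseconds for a frequency given in GHz."""
    return 1000.0 / freq_ghz


def phase_spacing_ps(freq_ghz, num_phases):
    """Spacing between adjacent phases, assuming they are evenly distributed over one period."""
    return period_ps(freq_ghz) / num_phases


def zero_crossing_spacing_ps(freq_ghz):
    """A clock crosses zero twice per period (rising and falling)."""
    return period_ps(freq_ghz) / 2.0


multiphase_step = phase_spacing_ps(2.5, 8)       # 2.5GHz clock, 8 phases
fast_clock_step = zero_crossing_spacing_ps(10.0)  # 10GHz clock

print(f"2.5GHz phase spacing:        {multiphase_step:.0f} ps")
print(f"10GHz zero-crossing spacing: {fast_clock_step:.0f} ps")
```

Under that assumption both grids have the same 50ps pitch, which is also the 20Gb/s bit period, so the alignment problem reduces to removing a single fixed offset between the two grids.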

# **Transmitter Outline**





# **Phase Interpolator**





# **10GHz LC Oscillator**



- Use passive L and C elements for frequency synthesis
  - 10x less jitter/power-supply sensitivity than ring-oscillator VCOs
  - Significantly less static phase offset
  - Higher frequency of oscillation
- Disadvantage: area is significantly larger than conventional techniques
  - Area disadvantage is mitigated by the higher frequency: inductor size shrinks by a factor of 4 for a 2x increase in frequency
  - A 130um x 130um 1nH inductor is deemed a reasonable area per I/O
- Tuning range is provided by inversion-mode PMOS capacitors
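The numbers above can be related through the standard LC tank resonance, f = 1/(2π√(LC)). The sketch below is a back-of-the-envelope calculation, not the authors' design flow: it assumes an ideal lossless tank, ignores parasitics, and reads the "factor of 4" claim as the inductance needed at a fixed tank capacitance.

```python
import math


def resonant_frequency_hz(inductance_h, capacitance_f):
    """Resonant frequency of an ideal LC tank."""
    return 1.0 / (2.0 * math.pi * math.sqrt(inductance_h * capacitance_f))


def capacitance_for(inductance_h, frequency_hz):
    """Tank capacitance needed to resonate a given inductor at a given frequency."""
    return 1.0 / (inductance_h * (2.0 * math.pi * frequency_hz) ** 2)


def inductance_for(capacitance_f, frequency_hz):
    """Inductance needed to resonate a given capacitance at a given frequency."""
    return 1.0 / (capacitance_f * (2.0 * math.pi * frequency_hz) ** 2)


L_TANK = 1e-9        # 1nH inductor, from the slide
F_OSC = 10e9         # 10GHz oscillation frequency, from the slide

c_tank = capacitance_for(L_TANK, F_OSC)
l_at_half_freq = inductance_for(c_tank, F_OSC / 2.0)

print(f"total tank capacitance at 10GHz: {c_tank * 1e15:.0f} fF")
print(f"inductance for 5GHz with the same C: {l_at_half_freq * 1e9:.1f} nH")
print(f"ratio: {l_at_half_freq / L_TANK:.1f}x")
```

With 1nH the tank needs roughly 250fF in total (varactors plus device and wiring parasitics), and halving the frequency at the same capacitance would call for four times the inductance, consistent with the scaling argument in the list above.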





< 3ps pk-pk jitter over 2000 cycles, with 20mV wideband Vdd noise

# **Receiver Design**





- Clock recovery done at reset time
  - Sampling clock swept across entire bit period at reset time
  - Bit error is measured for sampling instances, and optimum sampling time chosen at startup
  - Periodic retraining of receiver to compensate for slowly varying timing drift
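The sweep-and-select procedure described above can be sketched in a few lines. This is a hypothetical illustration, not the receiver's actual logic: it assumes the sweep yields one error count per sampling phase, and it takes "optimum" to mean the center of the widest error-free window, treating the phase axis as circular because the sweep wraps around the bit period. The function name `choose_sampling_phase` is invented for this example.

```python
def choose_sampling_phase(error_counts):
    """Pick a sampling phase index from per-phase bit-error counts.

    Returns the center of the longest circular run of zero-error phases,
    or the phase with the fewest errors if no phase is error-free.
    """
    n = len(error_counts)
    if n == 0:
        raise ValueError("error_counts must not be empty")
    if all(count == 0 for count in error_counts):
        return 0  # every phase works; any choice is equally good

    # Start scanning just after a failing phase so a run that wraps
    # around the end of the sweep is not split in two.
    first_bad = next(i for i, count in enumerate(error_counts) if count != 0)
    best_start, best_len = None, 0
    run_start, run_len = None, 0
    for step in range(1, n + 1):
        i = (first_bad + step) % n
        if error_counts[i] == 0:
            if run_len == 0:
                run_start = i
            run_len += 1
            if run_len > best_len:
                best_start, best_len = run_start, run_len
        else:
            run_len = 0

    if best_len == 0:
        return min(range(n), key=lambda i: error_counts[i])
    return (best_start + (best_len - 1) // 2) % n


# Example sweeps: errors pile up near the data transitions.
centered_eye = [5, 3, 0, 0, 0, 0, 0, 2, 7, 9]
wrapped_eye = [0, 0, 4, 6, 8, 5, 1, 0, 0, 0]

print(choose_sampling_phase(centered_eye))
print(choose_sampling_phase(wrapped_eye))
```

Periodic retraining would amount to re-running the same sweep and selection; centering in the error-free window leaves equal margin on both sides for the slow drift between retraining passes.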

### **Simulated Results**







Simulated 20Gb/s Output, with Clean Supply

| Parameter                  | Value                                      |
|----------------------------|--------------------------------------------|
| Data Rate                  | 20Gb/s                                     |
| Process                    | 1.2V, 0.13um Generic CMOS                  |
| Power                      | 200mW (transmitter & receiver) (PLL=20mW)  |
| Estimated Area             | 500um x 500um                              |
| Pk-Pk Jitter               | < 10ps, with 20mV Vdd Noise                |
| Output Swing               | 100mV                                      |
| Input Receiver Sensitivity | 40mV                                       |
| Tuning Range               | 10ps (10%)                                 |






# Conclusion

- A 20Gb/s CMOS I/O link has been designed
- Low power and low area enable high integration of these 20Gb/s I/O pads on a single chip

# Acknowledgements



- Velio Communications—Ramesh Senthinathan, Mark Kellam, John Poulton
- Jaeha Kim, Mark Horowitz, Niranjan Talwalkar for discussion

# **BW Numbers**



|                           | 1999     | 2000     | 2001     | 2002     | 2003     | 2004     | 2005     |
|---------------------------|----------|----------|----------|----------|----------|----------|----------|
| # of pins                 | 1600     | 1792     | 2007     | 2248     | 2518     | 2820     | 3158     |
| I/O bw/pin                | 1.92E+09 | 2.77E+09 | 3.20E+09 | 3.50E+09 | 3.70E+09 | 4.00E+09 | 4.07E+09 |
| total I/O bw              | 1.54E+12 | 2.77E+12 | 3.21E+12 | 3.94E+12 | 4.66E+12 | 5.64E+12 | 6.43E+12 |
| on-chip bw/wire           | 1.20E+09 | 1.40E+09 | 1.60E+09 | 1.72E+09 | 1.86E+09 | 2.00E+09 | 2.12E+09 |
| chip size                 | 1.76E-02 | 1.76E-02 | 1.76E-02 | 1.80E-02 | 1.84E-02 | 1.89E-02 | 1.93E-02 |
| minimum wiring width(161) | 1.44E-06 | 1.44E-06 | 1.04E-06 | 1.04E-06 | 1.04E-06 | 7.20E-07 | 7.20E-07 |
| # of wires                | 1.22E+04 | 1.22E+04 | 1.69E+04 | 1.73E+04 | 1.77E+04 | 2.63E+04 | 2.68E+04 |
| Total on-chip BW          | 1.46E+13 | 1.71E+13 | 2.72E+13 | 2.98E+13 | 3.30E+13 | 5.30E+13 | 5.68E+13 |
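The on-chip rows of this table appear to be derived from the rows above them. The sketch below tests one reading: number of wires = chip size / minimum wiring width, and total on-chip bandwidth = number of wires x bandwidth per wire. That relationship is inferred from the numbers, not stated in the slides, and the table gives no units, so none are assumed here. The last step subtracts the tabulated total I/O bandwidth to show how the absolute gap evolves.

```python
YEARS = [1999, 2000, 2001, 2002, 2003, 2004, 2005]

# Values copied from the table above.
CHIP_SIZE = [1.76e-2, 1.76e-2, 1.76e-2, 1.80e-2, 1.84e-2, 1.89e-2, 1.93e-2]
MIN_WIRE_WIDTH = [1.44e-6, 1.44e-6, 1.04e-6, 1.04e-6, 1.04e-6, 7.20e-7, 7.20e-7]
BW_PER_WIRE = [1.20e9, 1.40e9, 1.60e9, 1.72e9, 1.86e9, 2.00e9, 2.12e9]
TABLE_WIRES = [1.22e4, 1.22e4, 1.69e4, 1.73e4, 1.77e4, 2.63e4, 2.68e4]
TABLE_ON_CHIP_BW = [1.46e13, 1.71e13, 2.72e13, 2.98e13, 3.30e13, 5.30e13, 5.68e13]
TABLE_IO_BW = [1.54e12, 2.77e12, 3.21e12, 3.94e12, 4.66e12, 5.64e12, 6.43e12]


def derive_on_chip(chip_size, wire_width, bw_per_wire):
    """Assumed derivation: wires across one chip edge, times bandwidth per wire."""
    wires = chip_size / wire_width
    return wires, wires * bw_per_wire


derived = [
    derive_on_chip(size, width, bw)
    for size, width, bw in zip(CHIP_SIZE, MIN_WIRE_WIDTH, BW_PER_WIRE)
]

# Absolute gap between tabulated on-chip and off-chip totals.
gap = [on_chip - io for on_chip, io in zip(TABLE_ON_CHIP_BW, TABLE_IO_BW)]

for year, (wires, on_chip_bw), g in zip(YEARS, derived, gap):
    print(f"{year}: wires={wires:.3g}  on-chip BW={on_chip_bw:.3g}  gap={g:.3g}")
```

Under this reading the derived values land within about 1% of the tabulated ones, and the absolute on-chip minus off-chip gap grows across the period covered by the table.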