# ALL PROGRAMMABLE



Hot Chips Aug 21, 2016

# HBM Package Integration: Technology Trends, Challenges and Applications

Suresh Ramalingam



# Agenda

- Motivation
- HBM Packaging Options
- Interposer Design
- Supply Chain
- Application and Challenges
- Summary



Stacked Silicon Interconnect Technology refers to Xilinx 3D solutions


© Copyright 2016 Xilinx

## **FPGAs in the Data Center Today**

- The accelerator (FPGA or GPU) is used to offload only certain tasks
  - These tasks are called "workloads", and FPGAs are well suited to many workloads
  - Note: accelerators don't replace the CPU!
- Hard and soft is the basic approach
  - Hard is the IO, memory and PCIe interfaces
    - Does not change
  - Soft is the workload being accelerated
    - Is configured on the fly using partial reconfiguration (P.R.)
- APIs run on the CPU to reprogram the FPGA to accelerate the workload as needed
  - On average, P.R. happens every 15 minutes!

- Acceleration requires lots of memory bandwidth (BW)

#### **Convey-Xilinx Accelerator**














### Multi-Die Package Design Rule Comparison

| Design Rules for Die to Die<br>interconnection                | MCM (Substrate)<br>Integrated Fine Layers | EMIB<br>(Embedded Multi-die<br>Interconnect bridge) | Silicon Interposer<br>(65 nm BEOL) | WLFO<br>(Wafer Level Fan-out) |
|---------------------------------------------------------------|-------------------------------------------|-----------------------------------------------------|------------------------------------|-------------------------------|
| Minimum Bump pitch (um)                                       | 130 (C4)<br>40 (u-bump d2d interface)     | 130 (C4)<br>40 (u-bump) bridge                      | < 40 (u-bump)                      | 40 um RDL pad pitch           |
| Via size / pad size (um)                                      | 10 / 25                                   | 0.4 / 0.7                                           | 0.4 / 0.7                          | 10/25                         |
| Minimum Line & Space (um)                                     | 2/2                                       | 0.4 / 0.4                                           | 0.4 / 0.4                          | 2/2                           |
| Metal thickness (um)                                          | 2-5                                       | 1                                                   | 1                                  | 2-5                           |
| Dielectric thickness (um)                                     | ~5                                        | 1                                                   | 1                                  | < 5                           |
| # of die-to-die connections per layer + GND shield layer (2L) | 1000's                                    | 1000's (bridge interface length limited)            | 10,000's                           | 1000's                        |
| Minimum die to die spacing (um)                               | < 500                                     | <2500                                               | <100                               | < 250                         |
| # of High density layers feasible                             | Not a limitation<br>1-3L                  | Not a limitation                                    | Not a limitation                   | 1-3L layers                   |
| Die Sizes for assembly and # of assemblies                    | Not a concern<br>d2d interconnect only    | Size & # limitation?                                | Not a concern                      | Size limitation?              |
| In Production                                                 | No (2018)                                 | No (2017)                                           | Yes                                | No, not for 2/2 um L/S (2018) |
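
The "Minimum Line & Space" row largely explains the "# of die-to-die connections per layer" row. A minimal sketch of that arithmetic follows, assuming wires are routed at minimum pitch (line + space) along a die edge. The function names and the 10 mm edge length are hypothetical illustration values, not figures from the slide.

```python
def wires_per_mm(line_um: float, space_um: float) -> float:
    """Wires that fit in 1 mm of routing width at minimum line/space pitch."""
    pitch_um = line_um + space_um
    return 1000.0 / pitch_um


def wires_per_layer(edge_mm: float, line_um: float, space_um: float) -> int:
    """Wires crossing a die-to-die interface of the given edge length on one layer."""
    return int(edge_mm * wires_per_mm(line_um, space_um))


if __name__ == "__main__":
    EDGE_MM = 10.0  # hypothetical interface length, chosen only for illustration
    # Line/space values taken from the table above (um)
    for name, (line, space) in {
        "Silicon interposer (0.4/0.4)": (0.4, 0.4),
        "MCM or WLFO (2/2)": (2.0, 2.0),
    }.items():
        print(f"{name}: {wires_per_layer(EDGE_MM, line, space)} wires per layer")
```

Under these assumptions a 0.4/0.4 um rule gives on the order of 10,000 wires per layer and a 2/2 um rule gives a few thousand, matching the orders of magnitude in the table.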


## **HBM2 System Overview (JEDEC)**

- HBM2 system with SoC and DRAM on an interposer, with 3-6 mm interconnect length
- 24 signals across 55 um u-bump pitch across the interface
- Supports 2 Gb/s PHY (1 Tb/s bandwidth for 4-Hi)
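
The relation between the per-signal PHY rate and the aggregate bandwidth is simple multiplication. The sketch below shows it in both directions. The slide does not state the data-bus width, so the signal count is left as a parameter; the function names are hypothetical.

```python
import math


def aggregate_bandwidth_tbps(num_signals: int, gbps_per_signal: float) -> float:
    """Aggregate bandwidth in Tb/s for a parallel interface (1 Tb = 1000 Gb)."""
    return num_signals * gbps_per_signal / 1000.0


def signals_for_bandwidth(target_tbps: float, gbps_per_signal: float) -> int:
    """Minimum number of data signals needed to reach a target bandwidth."""
    return math.ceil(target_tbps * 1000.0 / gbps_per_signal)


if __name__ == "__main__":
    RATE_GBPS = 2.0  # per-signal PHY rate quoted on the slide
    n = signals_for_bandwidth(1.0, RATE_GBPS)
    print(f"{n} signals at {RATE_GBPS} Gb/s reach "
          f"{aggregate_bandwidth_tbps(n, RATE_GBPS)} Tb/s")
```

Bandwidth scales linearly with both the signal count and the per-signal rate, which is why a wide, fine-pitch interposer interface can reach Tb/s rates at a modest 2 Gb/s per signal.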





## **Interposer Design Tools & Methodology**

### Vertical Routing

A model from u-bump to package pin is generated and used by high-frequency designs (e.g. GT and IO).

### Horizontal Routing

### Die LVS and extraction

- Standard extraction
- u-bump is extracted as a subcircuit

### Interposer with box die LVS and extraction

- Interposer metal extraction
- TSV is extracted as a subcircuit

### Combine the extracted netlists from die and interposer for simulation



## **Interposer Design Tools & Methodology**

- 2.5D uses the same tool sets as single-die design, with a customized interposer / top-die PDK
- EDA vendor tools are validated by the TSMC design reference flow
- Package (PKG) design uses the same tool sets as flip chip (C4-to-BGA)
  - TSV budget is handled in the silicon design environment
  - Layout and PI tools must be capable of handling large data sets




## **Supply Chain – Silicon Interposer Approach**

- Xilinx is in production with its 2<sup>nd</sup> generation of products with TSMC CoWoS




## **HBM Integration – HPC Application**

[Figure: thermal simulation temperature map; color scale in Temperature [C] from 23.55 to 71.31]

- PCIe card: full length / full height
- Card power: 320 W
- Airflow: 15 CFM
- Typical ambient: 30 C

- HBM Power map provided by vendors
- Thermal modeling can be done in environments such as FloTHERM or Icepak

HBM can be 97 C and the HBM I/F 96 C at 30 C ambient; the HBM gradient is ~14 C (~2.5 C/layer)

### Air cooling can be a challenge! HBM 8-Hi needs to support > 95 C T<sub>j</sub>
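
The quoted temperatures allow a quick headroom check. The sketch below assumes, as a first-order simplification not stated on the slide, that the junction temperature rise above ambient stays constant for a fixed card power and airflow. The 95 C limit comes from the "> 95 C" remark; the function names and the 105 C alternative limit are hypothetical.

```python
def junction_rise_c(t_junction_c: float, t_ambient_c: float) -> float:
    """Temperature rise of the junction above ambient, in degrees C."""
    return t_junction_c - t_ambient_c


def max_ambient_c(tj_limit_c: float, t_junction_c: float, t_ambient_c: float) -> float:
    """Highest ambient that keeps the junction at or below its limit,
    assuming the rise above ambient does not change with ambient."""
    return tj_limit_c - junction_rise_c(t_junction_c, t_ambient_c)


if __name__ == "__main__":
    T_HBM_C = 97.0      # simulated HBM temperature from the slide
    T_AMBIENT_C = 30.0  # typical ambient from the slide
    for limit in (95.0, 105.0):  # 105 C is a hypothetical higher rating
        amb = max_ambient_c(limit, T_HBM_C, T_AMBIENT_C)
        print(f"Tj limit {limit} C -> max ambient {amb} C")
```

Under this assumption, a 95 C limit is already exceeded at the 30 C typical ambient, which is the air-cooling challenge the slide points to.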



### **Telecom Application with HBM**



### Summary

- Tb/s low-latency bandwidth and lower system power are driving the need for HBM adoption
- Silicon interposer (2.5D) is the incumbent technology of choice. Potentially lower-cost, fine-pitch interconnect technologies, both wafer-level and substrate-based, are emerging
- To drive broader adoption of HBM in cooling-limited applications and in higher-performance stacks (8-Hi), a higher HBM junction temperature (> 95 C) needs to be supported











