## M32Rx/D - A Single Chip Microcontroller with A High Capacity 4MB Internal DRAM

#### Toru Shimizu

Mitsubishi Electric Corporation, System LSI Division, 4-1 Mizuhara, Itami, Hyogo 664, Japan

MITSUBISHI ELECTRIC CORPORATION



## Overview

- Highlights
- M32Rx Architecture
  - M32Rx ISA, and Micro-Architecture
- Embedded RAM (eRAM) Technology
  - Fusion of a 32-bit RISC and a High Capacity DRAM
  - High bandwidth can be achieved by a wide internal bus
- Summary



# M32Rx Highlights

- A high capacity (4 MByte) DRAM is integrated with a 32-bit RISC core
- A dual-issue pipeline is implemented
- Fast data transfer to and from the external bus can be achieved using the wide (128-bit) internal bus



## M32Rx Features

- Simple 32-bit core coupled to a high capacity DRAM
  - Overcomes the memory access bottleneck in execution
  - Low power dissipation due to main memory integration
- Good performance for embedded systems
  - Target applications
    - Multimedia applications: Image compression, Audio, Speech recognition, Voice compression, etc.
    - Communications: Decode/Encode, Networking, Modem, etc.
  - Target Systems
    - Digital cameras, Internet terminals, Telephone, PDAs, etc.



## M32Rx Architecture

- M32Rx ISA (Instruction Set Architecture)
  - M32R ISA Upwards Compatible
    - 32-bit x 16 General Purpose Registers
    - 56-bit x 2 Accumulators utilized by DSP function instructions for multimedia applications
  - Variable Length Code Format: 32-bit / 16-bit
    - Two 16-bit instructions can be executed in parallel
    - High code efficiency due to 16-bit instructions
- Pipeline Structure
  - Dual-Issue, 6-Stage Pipeline
  - In-Order Issue, Out-of-Order Completion

#### M32Rx ISA (Instruction Set Architecture)

- M32R ISA Upwards Compatible
  - Total 95 instructions = 83 M32R-compatible instructions + 12 additional instructions (including 5 additional DSP function instructions)
- Variable Length Code Format: 32-bit / 16-bit
  - Two 16-bit instructions can be executed in parallel





## **M32Rx Pipelines**

- Dual issue is implemented by using two pipelines



## **Instruction Issuing**

- Available instruction categories for each pipeline
  - Arithmetical/Logical operations can be executed in both pipelines
  - Load/Store and Jump/Branch operations can be executed only in Pipeline1
  - Multiply and Accumulate operations can be executed only in Pipeline2

| Operation                   | Pipeline1 | Pipeline2 |
|-----------------------------|-----------|-----------|
| Arithmetic Op.              | Yes       | Yes       |
| Logical Op.                 | Yes       | Yes       |
| Load/Store Op.              | Yes       | No        |
| Jump/Branch Op.             | Yes       | No        |
| Multiply and Accumulate Op. | No        | Yes       |
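
The issuing table above can be read as a pairing rule: two instructions can issue together only if one of them fits Pipeline1 and the other fits Pipeline2. The following is a minimal sketch of that reading, not the chip's actual issue logic. The class names and the `can_pair` helper are hypothetical, and the sketch assumes that either instruction of a pair may be routed to either pipeline it supports, ignoring data dependencies and any encoding constraints.

```python
# Pipelines that accept each instruction class, copied from the issuing table.
# The class names are labels chosen for this sketch, not official mnemonics.
PIPELINES = {
    "arithmetic": {1, 2},
    "logical": {1, 2},
    "load_store": {1},
    "jump_branch": {1},
    "mac": {2},
}


def can_pair(first: str, second: str) -> bool:
    """Return True if the two classes can occupy different pipelines.

    ASSUMPTION: either instruction may go to either pipeline it supports;
    data dependencies and encoding constraints are ignored.
    """
    a, b = PIPELINES[first], PIPELINES[second]
    return (1 in a and 2 in b) or (2 in a and 1 in b)


if __name__ == "__main__":
    for pair in [("load_store", "mac"), ("load_store", "jump_branch"),
                 ("mac", "mac"), ("arithmetic", "logical")]:
        print(pair, can_pair(*pair))
```

Under these assumptions a load can pair with a multiply-accumulate, while two loads, a load and a branch, or two multiply-accumulates cannot.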

## **Pipeline Structure**

- Two 6-Stage Pipelines



#### **DSP Function Instructions**

- Multiply and Accumulate Instructions
- Rounding Instructions



#### M32Rx Block Diagram




#### eRAM: Embedded RAM Technology

- Logic and memory are integrated in one chip
  - Logic : CPU, ALUs, Multipliers, En/Decoders, etc.
  - Memory : DRAMs, SRAMs, Flash-ROMs, etc.
- High bandwidth can be achieved by connection via wide internal buses
- Total system performance can be increased
  - High performance
  - Low power consumption
  - Small package footprint on PCB

## **High Performance Memory System**

- Cache SRAM and Internal DRAM are interconnected by a 128-bit Internal Bus
  - High Bandwidth : 1.5GByte/s @ 100MHz
  - High speed cache-line replacement
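
As a quick check of the bandwidth figure, the sketch below computes the peak rate of a bus from its width and clock. It assumes one full-width transfer per clock cycle, which is an idealization; the function name is hypothetical. The external-bus line uses the 32-bit, 25 MHz (max) figures from the specification slide.

```python
def peak_bandwidth_bytes_per_s(bus_width_bits: int, clock_hz: float) -> float:
    """Peak rate, ASSUMING one full-width transfer on every clock cycle."""
    return bus_width_bits / 8 * clock_hz


internal = peak_bandwidth_bytes_per_s(128, 100e6)  # 128-bit internal bus
external = peak_bandwidth_bytes_per_s(32, 25e6)    # 32-bit external bus

print(internal)             # bytes per second on the internal bus
print(internal / 2**30)     # same figure in units of 2**30 bytes
print(internal / external)  # ratio of internal to external peak rate
```

The internal figure is 1.6 x 10^9 bytes/s, which is about 1.49 when expressed in units of 2^30 bytes. That appears consistent with the quoted 1.5 GByte/s if a binary gigabyte is meant.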

Applied to embedded systems

- Very good cost/performance memory system
  - Simple Cache can be employed due to high speed internal DRAM
    - Cache can be simplified to realize the same processing performance
    - Large and complex cache is expensive
  - Power dissipation is reduced



## **Internal Bus Organization**

- A modular design methodology has been employed



PSU: Power Saving Unit



#### **External Bus-Master Access**

- M32Rx/D chip can be accessed like a memory chip
  - Fast external bus-master accesses can be executed using the wide internal bus
  - CPU operations are almost undisturbed by external accesses



#### External Bus-Master Access (Cont.)

- Internal bus arbitration
  - An Operand Access (OA) request and an External Bus-Master Access (EA) request may happen at the same time
  - EA requests take priority over OA requests
  - To avoid deadlocks, the D-Cache can accept EA requests during miss operations as well
- To keep data coherency :
  - Data must be accessed through the D-Cache
  - Only data in the internal DRAM space is cacheable



#### **Cache Memories**

- Single cycle read, two cycle write
- Instruction Cache: 4KByte, Data Cache: 4KByte
  - Direct-mapped, separate I and D caches
  - Data cache is a write-back cache
- D-Cache is accessed during external bus-master accesses to internal DRAM so as to keep data coherency
- Data Buffers
  - D-Cache has *Write* and *Read Buffers* to enhance write and write-back performance
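
To illustrate what "direct-mapped, 4 KByte" implies for address decoding, the sketch below splits an address into tag, index, and offset. The line size is not given on the slide; 16 bytes (one 128-bit bus transfer) is an assumption made for this example, and `split_address` is a hypothetical helper.

```python
from typing import Tuple

CACHE_BYTES = 4 * 1024  # from the slide: 4 KByte per cache
LINE_BYTES = 16         # ASSUMPTION: one line = one 128-bit bus transfer

NUM_LINES = CACHE_BYTES // LINE_BYTES      # 256 lines under this assumption
OFFSET_BITS = LINE_BYTES.bit_length() - 1  # log2(16) = 4
INDEX_BITS = NUM_LINES.bit_length() - 1    # log2(256) = 8


def split_address(addr: int) -> Tuple[int, int, int]:
    """Return (tag, index, offset) for a direct-mapped lookup."""
    offset = addr & (LINE_BYTES - 1)
    index = (addr >> OFFSET_BITS) & (NUM_LINES - 1)
    tag = addr >> (OFFSET_BITS + INDEX_BITS)
    return tag, index, offset


if __name__ == "__main__":
    print([hex(field) for field in split_address(0x123456)])
```

In a direct-mapped cache, addresses that are a multiple of the cache size apart share an index and therefore evict each other, which is why fast line replacement from the internal DRAM matters for a cache this simple.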



## **BIU and Buffers**

- The BIU converts data between the 128-bit internal bus and the 32-bit external bus
- The BIU supports a **burst transfer** mode to realize fast data transfers to and from external devices
- The BIU has two *Read Buffers* and two *Write Buffers* 
  - Double buffering is employed to enable seamless burst transfers between the internal DRAM and external devices
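
The width conversion performed by the BIU can be pictured as splitting one 128-bit internal word into several external-bus beats and reassembling it in the other direction. The sketch below shows only that data reshaping; it does not model timing or the double buffers. The helper names are hypothetical, and sending the lowest-addressed bytes first is an assumption.

```python
from typing import List


def split_line(line: bytes, beat_bytes: int = 4) -> List[bytes]:
    """Split one 16-byte internal-bus word into external-bus beats.

    ASSUMPTION: beats are sent lowest address first.
    """
    if len(line) != 16:
        raise ValueError("expected one 128-bit (16-byte) word")
    return [line[i:i + beat_bytes] for i in range(0, 16, beat_bytes)]


def join_beats(beats: List[bytes]) -> bytes:
    """Reassemble external-bus beats into one internal-bus word."""
    return b"".join(beats)


if __name__ == "__main__":
    word = bytes(range(16))
    print(len(split_line(word)))                # beats on a 32-bit bus
    print(len(split_line(word, beat_bytes=2)))  # beats on a 16-bit bus
```

With a 32-bit external bus, each internal word takes four beats, so a second buffer can be filled from or drained to the DRAM while the first is busy on the external side.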



#### **External Bus-master Access** (Evaluation)

- CPU operations are almost undisturbed by external accesses
  - e.g. Dhrystone 2.1 + External Bus-Master Access (Burst Read)



## **Other M32Rx/D Features**

- Debugging Support
  - JTAG interface
- Multiprocessing Support
  - Master-Slave mode
- Power Saving Modes
  - Stand-by mode
    - Only the DRAM clock is supplied, other clock supplies are stopped
  - CPU sleep mode
    - CPU and Caches are stopped
    - D-Cache is woken up by external bus-master accesses



## M32Rx/D Specification

| Item                 | Sub-item     | Specification                                                                                                        |
|----------------------|--------------|----------------------------------------------------------------------------------------------------------------------|
| CPU Core             | Architecture | M32R architecture upwards compatible (12 additional instructions including 5 additional DSP function instructions) |
|                      | Pipeline     | 2-instruction parallel execution, 6 stages                                                                           |
|                      | DSP Function | MAC (32-bit x 16-bit + 56-bit), 1-cycle execution, 2 accumulators                                                    |
|                      | Cache        | Instruction: 4 KByte, Data: 4 KByte                                                                                  |
| Performance          |              | 110 MIPS (Dhrystone), 200 MOPS @ 100 MHz                                                                             |
| Internal Memory      |              | 4 MByte (32 Mbit), x 128-bit organization                                                                            |
| Peripheral Functions |              | JTAG interface / debug function                                                                                      |
| External Bus         |              | Address: 27 bit, Data: 16/32 bit, 25 MHz (max)                                                                       |
| Operating Clock      |              | 100 MHz (internal)                                                                                                   |
| Power Supply         |              | External: 3.3 V, Internal: 2.5 V                                                                                     |
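
The MAC entry in the table (32-bit x 16-bit + 56-bit) can be modelled in a few lines. This is a rough sketch under stated assumptions, not a description of any particular M32Rx instruction: operands are treated as signed two's-complement values, and the accumulator is assumed to wrap on overflow rather than saturate. The helper names are hypothetical.

```python
ACC_BITS = 56  # accumulator width from the specification table


def wrap_signed(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def mac(acc: int, a32: int, b16: int) -> int:
    """Return acc + a32 * b16, kept within a 56-bit signed accumulator.

    ASSUMPTIONS: signed operands; wraparound (not saturation) on overflow.
    """
    a = wrap_signed(a32, 32)
    b = wrap_signed(b16, 16)
    return wrap_signed(acc + a * b, ACC_BITS)


if __name__ == "__main__":
    # Dot product of two short vectors, one MAC step per element pair.
    acc = 0
    for x, c in zip([100, -200, 300], [3, 2, 1]):
        acc = mac(acc, x, c)
    print(acc)
```

A 32-bit by 16-bit signed product needs at most 48 bits, so a 56-bit accumulator leaves 8 bits of headroom for summing many products before overflow becomes a concern.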



## M32Rx Chip Layout

- Design Rule
  - 0.25 µm CMOS, 3 metal layers
- Chip Size
  - $9.7 \times 10.29\ \mathrm{mm}^2$



#### Summary

- M32Rx/D chip as a CPU with DRAM
  - A high performance microcontroller boosted by a high performance internal memory system
- M32Rx/D chip as a DRAM with CPU
  - An intelligent memory
  - Multi-processor/multi-memory systems

Mitsubishi M32R family chips provide many possibilities for new styles of computing. Enjoy!