# MULTICORE DESIGN SIMPLIFIED

## System-Level Automation Tools for MPSoC Designs

Peter Flake, August 14th 2006



- Introduction
- Trends and Challenges
- Exploration Tools
- Requirements for MPSoC Design
- Solution Providers
- Conclusion



## It's all your fault ...



- End consumer is very demanding ...
- Convergence
  - More performance
  - Less power
  - Lower cost
  - Shortest time to market
- Skyrocketing chip development cost
  - Flexibility
  - Longest time in market



Source: Anssi Vanjoki, Executive Vice President and General Manager, Nokia; Nokia Capital Market Days

## ... today's methods are running out of steam!



- The complex devices of the future will consist of heterogeneous parallel processor frameworks executing huge software applications.
- New key technologies and methodologies are required to automate and streamline Multi-Core IC design & programming




### **Observations ...**



- Prof. Kurt Keutzer, Berkeley
  - The ad hoc approach to SoC design simply cannot scale with Moore's Law because it does not sufficiently reduce the complexity of SoC design
  - The "software-development environment as afterthought" era of IC design is rapidly drawing to a close

## **Methodology Evolution**





© 2005-2006 Imperas, Inc.

8/14/2006

## The Multi-core IC Trend to MPSoCs



Multi-Core IC usage is rapidly increasing, and will take over as the dominant method of executing large design projects efficiently

**Use of IP in designs: embedded processors were the most commonly used IP**

| Do you use or reuse any third-party/internal intellectual property (IP) for your most recent system design? | Share |
|--------------------------------------------------------------------------------------------------------------|-------|
| Yes, use third-party/internal IP                                                                             | 45%   |
| No, do not use third-party/internal IP                                                                       | 55%   |

Which of the following functional blocks were provided via third-party or internal IP?

- Embedded processors (58%)
- Memory (RAM)
- Serializer/deserializer
- DSPs
- Analog-to-digital converter
- DDR DRAM controller
- Digital-to-analog converter
- Memory (non-volatile)
- PCI
- RF circuit

Embedded SW increasing: doubling annually

Source: ITRS 05



Dataquest: Use of processor based platforms growing 8-10% CAGR

Collett: ">60% of designs now contain more than one processor"

## The MPSoC Development Challenges



Software Complexity

- Design environments today still represent old thinking and are inappropriate for combined HW/SW design
- Power, performance, cost, and design time are all very difficult to optimize collectively
- Application-to-architecture mapping, including the selection of the most effective HW and SW architecture, is not addressed

Increasing MC IC Complexity




## **Requirements for MPSoC Design**





- How do I program it and express parallelism?
- How do I simulate this at reasonable speeds?
- How do I debug this?
- How do I optimize the software?
- How do I optimize across hardware and software?
- How do I deliver it to my software users?

## Requirements: Programming

#### <u>Today</u>

- Various languages
  - Do not express parallelism
  - Often limited to specific application domains
- Several incompatible programming models used for special applications
  - OpenMP
  - YAPI
  - DSOC
  - SMP
  - xUML
  - ...



#### **MPSoC Requirements**

- Appropriate Programming Models
  - Task level parallelism
  - Flexibility
  - Efficiency

### One Approach ...





- Communication structure is separated from tasks
  - Coordination language
- Various modes of communication can be supported
  - blocking, non-blocking
- Communication can be implemented in various ways
  - Depending on the platform

## **Requirements: Debug & Simulation**





Sources: ARM IQ Magazine



- Today, simulation speed is limiting
- Need faster simulation
  - Enabling trade offs
  - Flexibility: appropriate accuracy at appropriate speed
- Today, single core debugging approaches don't scale to MPSoC
- Need true multi processor debug
  - Focused on threads
  - Scaling to 10+ processors

## Today's Approaches with SystemC



#### SW development

- Run application code compiled for host
  - Fast
  - Not instruction accurate
  - May give different results
- Model peripherals and communication in SystemC
  - Special OS code
  - Not timing accurate
  - Performance bottleneck

#### SW verification

- Run application code on ISS
  - Slow
  - Instruction accurate or cycle accurate
  - May use vendor debugger
- Wrap ISS in SystemC
  - Memory inside or outside
  - Speed or accuracy
- Model peripherals and communication in SystemC
  - OS can run on ISS

## **Other Approaches ...**



#### Code Morphing

- Run application code compiled for ISS but translated into host instructions
  - Fast
  - Instruction accurate
  - May use vendor debugger
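The difference between a classic ISS and a translating simulator can be shown on a toy example. Everything here is an assumption made for illustration: the four-instruction target ISA, the function names, and the use of Python closures as a stand-in for host instructions. A production translator would typically emit host machine code for whole blocks of target instructions rather than one closure per instruction.

```python
def interpret(program, regs):
    """Classic ISS loop: decode and dispatch every instruction on every visit."""
    pc = 0
    while True:
        op, *args = program[pc]
        if op == "addi":
            rd, rs, imm = args
            regs[rd] = regs[rs] + imm
            pc += 1
        elif op == "add":
            rd, rs1, rs2 = args
            regs[rd] = regs[rs1] + regs[rs2]
            pc += 1
        elif op == "bnez":
            rs, target = args
            pc = target if regs[rs] != 0 else pc + 1
        elif op == "halt":
            return regs
        else:
            raise ValueError(f"unknown opcode {op!r}")


def translate(instr, pc):
    """Decode one target instruction once and return a host-side closure.

    The closure updates the registers and returns the next pc (None = halt).
    """
    op, *args = instr
    if op == "addi":
        rd, rs, imm = args
        def run(regs):
            regs[rd] = regs[rs] + imm
            return pc + 1
    elif op == "add":
        rd, rs1, rs2 = args
        def run(regs):
            regs[rd] = regs[rs1] + regs[rs2]
            return pc + 1
    elif op == "bnez":
        rs, target = args
        def run(regs):
            return target if regs[rs] != 0 else pc + 1
    elif op == "halt":
        def run(regs):
            return None
    else:
        raise ValueError(f"unknown opcode {op!r}")
    return run


def run_translated(program, regs):
    """Translate each instruction on first visit, then reuse the cached result."""
    cache = {}
    pc = 0
    while pc is not None:
        host_code = cache.get(pc)
        if host_code is None:
            host_code = cache[pc] = translate(program[pc], pc)
        pc = host_code(regs)
    return regs


# Sums 5 + 4 + 3 + 2 + 1 into r1 while counting r0 down to zero.
PROGRAM = [
    ("addi", 0, 0, 5),   # r0 = r0 + 5 (r0 starts at 0)
    ("add", 1, 1, 0),    # r1 = r1 + r0
    ("addi", 0, 0, -1),  # r0 = r0 - 1
    ("bnez", 0, 1),      # loop back to pc 1 while r0 != 0
    ("halt",),
]
```

Both `interpret(PROGRAM, [0, 0])` and `run_translated(PROGRAM, [0, 0])` leave the same register state, which is the sense in which translation stays instruction accurate; the speed gain comes from the loop body being decoded once instead of on every iteration.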

#### <u>Hardware</u>

- FPGA Development systems
  - Fast
  - Late in the flow ... a fair amount of implementation has to be done
- Emulation
  - Pretty fast ...
  - Sometimes painful to set up (order of weeks)
  - Also late in the flow

## Requirements: Software & Automation



#### <u>Today</u>

- Limited SW support
  - SystemC models slow for SW developers.
  - Models not well verified
  - Debugger integration poor
- Focus on Analysis
  - "here you go ... now fix it yourself manually and resimulate"

#### **MPSoC Requirements**

- True HW/SW Interaction
  - Higher levels of speed/accuracy trade-off
  - Easily verifiable models
- True HW/SW Automation
  - SW Mapping & Optimization
  - HW/SW Optimization
  - "here is the solution for your power/performance objectives"
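As a rough sketch of what automated SW mapping could look like, the snippet below exhaustively tries every task-to-core assignment and returns the lowest-energy one that meets a deadline. All task names, core names, and numbers are made up, and the cost model is an assumption: tasks on one core run back to back, cores run in parallel, and communication is free. A real tool would need far richer models and a smarter search.

```python
from itertools import product

# Hypothetical workload: cycles needed per task (made-up numbers).
TASKS = {"decode": 500, "filter": 300, "display": 100}

# Hypothetical cores: clock in cycles per microsecond (MHz), active power in mW.
CORES = {
    "risc": {"mhz": 100, "mw": 50},
    "dsp": {"mhz": 200, "mw": 150},
}


def evaluate(mapping):
    """Return (latency_us, energy_nj) for a task-to-core mapping."""
    busy_us = {core: 0.0 for core in CORES}
    for task, core in mapping.items():
        busy_us[core] += TASKS[task] / CORES[core]["mhz"]
    latency = max(busy_us.values())  # cores run in parallel
    energy = sum(busy_us[c] * CORES[c]["mw"] for c in CORES)  # mW * us = nJ
    return latency, energy


def best_mapping(deadline_us):
    """Exhaustive search: lowest-energy mapping that meets the deadline.

    Returns (mapping, latency_us, energy_nj), or None if nothing fits.
    """
    best = None
    for choice in product(CORES, repeat=len(TASKS)):
        mapping = dict(zip(TASKS, choice))
        latency, energy = evaluate(mapping)
        if latency <= deadline_us and (best is None or energy < best[2]):
            best = (mapping, latency, energy)
    return best
```

With these numbers the slower core is cheaper per cycle, so a loose deadline keeps everything on `risc`, while a tighter one pushes work onto `dsp`. Exhaustive search grows exponentially with the number of tasks, which is one reason this step calls for dedicated automation rather than manual iteration.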




## Different users have different needs ...



#### **Platform Designer**

- Ensure that selected applications can be run at required performance and efficiency
- Optimize platform architecture
- Programming of compute intensive portions of application
- Efficient modeling of platform options

#### Platform User

- Try new platform
- Add new applications to existing platform
- Check performance and power constraints
- Find optimal SW to HW mapping
- Optimize hardware parameters
- Platform independent IDE
- Platform models
- Exploration of various SW partitioning options

### ... which are not met today!



#### <u>Today</u>

- Lots of individual single and fixed core offerings
- Parallelism, configurability and multiplicity of processing not appropriately addressed
- "[...] the design of complex embedded systems with multiple configurable, extensible processors demands new ESL tool capabilities that go well beyond current offerings!"

Grant Martin Chief Scientist Tensilica

#### **MPSoC Requirements**

- Solutions with parallelism and multiplicity of processing in mind
  - Compilation
  - Simulation
  - Debug
  - Programming
  - SW/SW Optimization
  - SW/HW Optimization
- True System Design Automation across hardware and software




## Platform Eco-System: e.g. TI OMAP





Source: IEEE Computer

## Who can provide Solutions?



- Current tools provided by different parts of the Eco-System
  - Not well integrated
- Next generation System Design Automation tools
  - Will be provided by specialist suppliers
  - Close cooperation with hardware and software designers required
  - Will probably have to be funded by the hardware world
    - Software developers expect a state of the art software development environment supporting the platform
    - Otherwise they will simply switch platforms or remap the application
- New MPSoC Methodology will be supported




## The Future of IC design: MPSoC


Processor performance growth through improved technology is becoming exhausted, so the next phase is <u>multi-core</u> to provide additional processing capability



#### **Discontinuity:**

How will these devices be programmed? Can all the device developers provide good programming tools?