# Performance evaluation of a multi-core system using Systems development method utilizing Reverse modeling and Model-based Simulation

14th International Forum on Embedded MPSoC and Multicore July 7 - 11, 2014

> Yoshifumi Sakamoto, Ph.D., PMP, Digital Front Office - Industrial Services, Global Business Services, IBM Japan, Ltd.

# OUTLINE



- Introduction
- Proposed Method
- Subject to be solved
- Objectives
- Modeling
- Simulation
- Conclusions

### Introduction

Embedded systems play an important role in configuring and maintaining social infrastructure



#### In particular, ensuring dependability is a high-priority issue



Based on products that have already been released

### Introduction



Why do many embedded system development organizations apply derivational development?

#### To improve

- Development Efficiency
- Time-to-market
- Product Quality

### Introduction



#### What is the **biggest advantage** of applying derivational development?



# Reuse of existing design assets

The smaller the reuse cost, the greater the benefit of reuse

### Problems to be solved

Derivational development was applied, **but** the following could NOT be improved:

- Development efficiency
- Product quality
- Time-to-market

What happened?

- Is the development scale too large?
- Have the systems become too complicated?

Only a few cases have been investigated



All 20 teams are applying derivational development

### Survey result

- Reuse rate for specifications was only 12%
- Reuse rates for code and models were much higher than for specifications (model: 21%)

Creating code or models after creating specifications is rarely practiced



Deliverables for reusing

#### Survey result

- Requirements are used as an input for architectural design, detailed design and implementation

Code and models are created first; documents are created afterwards



## Survey result

- To specify design additions or modifications, the most popular method is face-to-face communication

Here too, specifications are not widely used



Figure: How to identify the differences (pie chart). Shares: 45%; 14%; an investigation category, 9%; source code investigation, 32%.

#### Issue measurement - what happened?

- Architecture design and system design activities are skipped
- Requirements are used as an input for detailed design and implementation
- The main development deliverable is limited to source code



### Issue measurement - Examine



Why does derivational development tend to take place in such a manner?

- Documentation is eliminated in order to reduce development cost and to meet the development schedule
- Customers don't require documents, because delivery and integration activities need only the source code
- There isn't a useful process that suits derivational development

### Objectives



- Resolve the identified issues that are unique to derivational development
- Propose a methodology to solve the issues of derivational development



# What is SRMS?

Systems development method utilizing Reverse modeling and Model-based Simulation

> A method to verify the design quality of an embedded system in the upstream phases of the system development process.

## SRMS - Proposed Method



#### **STEP1- Requirement Definition**

Clarify the functions, users of the system, system operations and boundaries

#### **STEP2- Reverse Modeling**

Create system configurations, logical sequence, parameters, constraints and preconditions

#### **STEP3- Model-based Simulation**

Model-based performance evaluation

#### STEP4- New System Development

Gradual development from upper to lower activities of the process

### Performance Evaluation for SoC



- Appropriate for estimating the performance of the SoC in the early stages of SoC development.
- Higher accuracy than conventional estimation methods.

### SRMS - Outline of the Reverse Modeling

#### Embedded System



# Observation technology used for dynamic behavior analysis

# Low invasiveness



Trace record format: `flag, name, timestamp [,ret(param,...]`

Example records begin with `Entry, funcA,` then `Entry, funcB,` then `Exit, funcB,`

10000000

## Dynamic Behavior Analysis

- Observe the dynamic behavior of the embedded system and collect the execution trace data.
- The execution trace data include:
  - the **timestamps** for function calls and returns, **processor usage** per unit time, and the **sizes of the data transfers** via the buses.
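As an illustration of how such trace records might be post-processed, the following minimal sketch pairs `Entry` and `Exit` records and accumulates the inclusive time per function. The timestamp values, the comma-separated parsing and the helper name `function_durations` are assumptions made for this example; the optional trailing fields of a record are ignored.

```python
from collections import defaultdict


def function_durations(lines):
    """Return inclusive time per function from Entry/Exit trace records.

    Each record is assumed to look like "flag, name, timestamp"; any
    optional trailing fields are ignored in this sketch.
    """
    stack = []                 # currently open calls: (name, entry timestamp)
    totals = defaultdict(int)  # function name -> accumulated inclusive time
    for line in lines:
        fields = [field.strip() for field in line.split(",")]
        flag, name, timestamp = fields[0], fields[1], int(fields[2])
        if flag == "Entry":
            stack.append((name, timestamp))
        elif flag == "Exit":
            open_name, started = stack.pop()
            if open_name != name:
                raise ValueError(f"unbalanced trace: {open_name} vs {name}")
            totals[name] += timestamp - started
    return dict(totals)


# Hypothetical timestamps, for illustration only.
trace = [
    "Entry, funcA, 100",
    "Entry, funcB, 120",
    "Exit, funcB, 180",
    "Exit, funcA, 250",
]
durations = function_durations(trace)
print(durations)
```

Durations of this kind are one plausible input for a parameter file that drives a simulation scenario.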

### Observation Environment









Each color represents a different task

#### Structure of the performance evaluation model

- The model is easy to change, so performance can be analyzed for several architecture candidates
- Adopts a structure that separates the Dynamic Behavior Model from the System Resource Model
- Execution trace data are stored in a Parameter File



Simulator

**Rational Simulation framework** 

### Variation of the System Resource Model

#### Time Resource Model

The model describes the progress of time

Job allocation

- AMP: the job queue allocates a specific job to a Hardware Resource Module **statically**
- SMP: the job queue allocates a job to a Hardware Resource Module **dynamically**
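The difference between static and dynamic job allocation can be sketched as below. This is a simplified illustration, not the simulator used in the study: all jobs are assumed to be ready at time zero with no dependencies, and the job names, durations and the function `simulate` are hypothetical.

```python
def simulate(jobs, n_cores, static_core=None):
    """Return the finish time of the last job.

    jobs: list of (name, duration) in queue order.
    static_core: dict name -> core index for AMP-style static allocation;
                 None selects SMP-style dynamic allocation.
    """
    free_at = [0] * n_cores  # time at which each core becomes free
    for name, duration in jobs:
        if static_core is not None:
            core = static_core[name]  # AMP: fixed job-to-core binding
        else:
            # SMP: take the core that becomes free first
            core = min(range(n_cores), key=lambda c: free_at[c])
        free_at[core] += duration
    return max(free_at)


# Hypothetical jobs and durations, for illustration only.
jobs = [("decode", 4), ("render", 6), ("compress", 3), ("transfer", 3)]
binding = {"decode": 0, "render": 0, "compress": 1, "transfer": 1}

amp_time = simulate(jobs, n_cores=2, static_core=binding)
smp_time = simulate(jobs, n_cores=2)
print(amp_time, smp_time)
```

With this particular binding the static allocation leaves one core idle while the other is still busy, which is the kind of effect a time resource model makes visible.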





### Variation of the System Resource Model

#### Memory Resource Model

- This model expresses the consumption of the memory resources assigned to each processing step.
- The dynamic behavior model shifts to the next processing step depending on the success or failure notification it receives.
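A minimal sketch of this success/failure handshake is shown below. The class `MemoryResourceModel`, the function `run_steps`, the capacity and the step sizes are all hypothetical, and the sketch assumes that each step keeps its memory until the end of the run.

```python
class MemoryResourceModel:
    """Tracks memory consumption and reports success or failure."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.used = 0
        self.peak = 0

    def allocate(self, size):
        """Return True if the request fits, otherwise False."""
        if self.used + size > self.capacity:
            return False
        self.used += size
        self.peak = max(self.peak, self.used)
        return True


def run_steps(memory, steps):
    """Advance through the steps while allocations succeed.

    Returns (completed step names, name of the blocked step or None).
    """
    completed = []
    for name, size in steps:
        if not memory.allocate(size):
            return completed, name  # failure notice: do not advance
        completed.append(name)      # success notice: go to the next step
    return completed, None


# Hypothetical capacity and sizes, for illustration only.
memory = MemoryResourceModel(capacity=100)
completed, blocked = run_steps(
    memory, [("receive", 40), ("render", 50), ("compress", 30)]
)
print(completed, blocked, memory.peak)
```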



# System Architectures

#### Single-Core Architecture

#### Multi-Core Architecture



## Performance Evaluation Model

#### Single-Core Architecture



#### Multi-Core Architecture



### **Execution Scenario**

Print quality evaluation image JEITA J12-P11



JEITA: Japan Electronics and Information Technology Industries Association

### Performance Evaluation – Model Simulations



Relative comparison with the single-core architecture

Reduction ratio of the print processing time: max. 22.0%, mean 8.3%

#### Memory Usage – Model Simulations



# Validity Verification

- Verified the whole process.
- Used an FPGA-based evaluation platform.
- Compared the model simulation with the evaluation platform.
- Difference: 1.1% to 6.0%





### Energy Estimation for SoC



- Appropriate for estimating the energy consumption of the SoC in the early stages of SoC development.
- Higher accuracy than conventional estimation methods.

## Model for Energy Estimation

The energy consumption of an SoC used in embedded systems is strongly affected by the dynamic behavior of the software.

- Dynamic behavior model
  - Describes the dynamic behavior of the software.
  - Executable UML model.
  - Created from the existing embedded system utilizing **Dynamic Behavior Analysis**.
- SoC behavior model
  - Describes the energy consumption and the delay time of the SoC.
  - Executable UML model.
  - Created from the energy consumption **specification** of the SoC and IP cores.

#### Dynamic behavior model

Task Manager Module
Control the SoC behavior model
described by a sequence diagram

#### Task Module

Parameter file

- Created from the execution trace data
- Loaded for a simulation scenario



#### SoC behavior Model

#### Power Module

- Controls the state of each IP core
- Calculates the IP core energy consumption
- Tracks the delay time of each behavior for the dynamic behavior model

#### **Power Sim Model**

- Accumulates the energy consumption of each power module
- Calculates the total power consumption



#### Dynamic behavior Model of MFP

The proposed method was applied to an actual embedded system in an MFP.

Execution trace data were collected from the actual MFP. The execution scenario calls for 4-page continuous printing.



### Internal structure of the SoC and energy-saving technology to be applied



#### Equations of Energy Consumption

SoC Total

$P = P_{\text{mpu\_rate}} + P_{\text{memc\_bus\_rate}} + P_{\text{acc\_A}} + P_{\text{acc\_B}}$

Processor

$P_{\text{mpu\_rate}} = (P_{\text{mpu\_max}} - P_{\text{mpu\_min}}) \cdot \text{MPU usage}(\%) + P_{\text{mpu\_min}}$

Bus and Memory Controller

$P_{\text{memc\_bus\_rate}} = \left\{ (P_{\text{mem\_max}} - P_{\text{mem\_min}}) + (P_{\text{bus\_max}} - P_{\text{bus\_min}}) \right\} \cdot U + P_{\text{mem\_min}} + P_{\text{bus\_min}}$

$U = \frac{\text{MemoryTransferSize}}{\text{EffectiveMemoryBandwidth}}$

Accelerator A and B

$$P_{\text{acc\_}X} = \begin{cases} \sum_{t \in T} P_{\text{acc\_op}}(t) + \sum_{t \in V} P_{\text{acc\_fr}}(t) & (\text{BaselineSoC}) \\ \sum_{t \in T} P_{\text{acc\_op}}(t) + \sum_{t \in V} P_{\text{acc\_cg}}(t) & (\text{ClockGating}) \\ \sum_{t \in T} P_{\text{acc\_op}}(t) + \sum_{t \in V} P_{\text{acc\_pg}}(t) & (\text{DynamicPowerGating}) \end{cases} \qquad (X = A, B)$$

#### TABLE I. ENERGY CONSUMPTION FOR EACH IP CORE IN THE SOC

| IP Core                           | Condition            | Name     | Energy Consumption (mJ) |
|-----------------------------------|----------------------|----------|-------------------------|
| Processor                         | Working (Max.)       | Pmpu_max | 615                     |
|                                   | Idle                 | Pmpu_min | 120                     |
| Memory Controller                 | Working (Max.)       | Pmem_max | 53                      |
|                                   | Idle                 | Pmem_min | 35                      |
| Bus and Interconnections          | Working (Max.)       | Pbus_max | 37                      |
|                                   | Idle                 | Pbus_min | 20                      |
| Accelerator A and Accelerator B   | Working (Max.)       | Pacc_op  | 165                     |
|                                   | Idle                 | Pacc_fr  | 140                     |
|                                   | Clock Gating         | Pacc_cg  | 84                      |
|                                   | Dynamic Power Gating | Pacc_pg  | 9                       |
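The equations and Table I can be combined in a short sketch such as the one below. It rests on several assumptions that are not stated in the slides: MPU usage is passed as a fraction between 0 and 1, the whole evaluation covers one unit time step, and the two sums in the accelerator equation reduce to weighting the operating and idle values by the fraction of time the accelerator is operating. All function names are hypothetical.

```python
# Values copied from Table I, in the table's units.
P = {
    "mpu_max": 615, "mpu_min": 120,
    "mem_max": 53, "mem_min": 35,
    "bus_max": 37, "bus_min": 20,
    "acc_op": 165, "acc_fr": 140, "acc_cg": 84, "acc_pg": 9,
}

# Idle-time term used by each SoC type in the accelerator equation.
IDLE_TERM = {
    "baseline": "acc_fr",
    "clock_gating": "acc_cg",
    "power_gating": "acc_pg",
}


def mpu_rate(usage):
    """Processor term; usage is a fraction in [0, 1] (assumption)."""
    return (P["mpu_max"] - P["mpu_min"]) * usage + P["mpu_min"]


def memc_bus_rate(transfer_size, effective_bandwidth):
    """Bus and memory controller term, with U = size / bandwidth."""
    u = transfer_size / effective_bandwidth
    span = (P["mem_max"] - P["mem_min"]) + (P["bus_max"] - P["bus_min"])
    return span * u + P["mem_min"] + P["bus_min"]


def accelerator(op_fraction, mode):
    """Accelerator term with the sums reduced to a time weighting."""
    idle = P[IDLE_TERM[mode]]
    return op_fraction * P["acc_op"] + (1.0 - op_fraction) * idle


def soc_total(usage, transfer_size, effective_bandwidth, op_a, op_b, mode):
    """Sum of the processor, bus/memory and both accelerator terms."""
    return (
        mpu_rate(usage)
        + memc_bus_rate(transfer_size, effective_bandwidth)
        + accelerator(op_a, mode)
        + accelerator(op_b, mode)
    )


# Hypothetical operating point, for illustration only.
for mode in IDLE_TERM:
    print(mode, soc_total(0.5, 50, 100, 0.25, 0.25, mode))
```

The operating point is invented, so the printed totals only show how the idle-state term changes the result between the three SoC types; they are not comparable with the measured values in the slides.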



#### Collaboration Diagram – Energy Evaluation Model









#### Comparison of Simulation Results



#### Validity Verifications

Comparison: proposed method, spreadsheet method and actual SoC.

- The proposed method is appropriate for estimating the energy consumption of the SoC in the early stages of SoC development
- The proposed method had higher accuracy than the standard spreadsheet-based method

| SoC type                     | (a) Proposed method (mJ) | (b) Spreadsheet (mJ) | (c) Actual SoC (mJ) | Error (a) vs (c) (%) | Error (b) vs (c) (%) |
|------------------------------|--------------------------|----------------------|---------------------|----------------------|----------------------|
| Baseline                     | 1,509                    | 1,620                | 1,361               | 10.9                 | 19.0                 |
| Clock Gating Applied         | 1,412                    | 1,488                | 1,250               | 13.0                 | 19.0                 |
| Dynamic Power Gating Applied | 1,282                    | 1,377                | 1,113               | 15.2                 | 23.7                 |
| Average                      | -                        | -                    | -                   | 13.0                 | 20.6                 |
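The error columns can be reproduced from the energy columns, assuming the error is the relative deviation from the actual SoC measurement, that is (estimate - actual) / actual. The sketch below uses the values from the table; the variable and function names are chosen for this example.

```python
# (proposed, spreadsheet, actual) energy in mJ, copied from the table.
results = {
    "Baseline": (1509, 1620, 1361),
    "Clock Gating Applied": (1412, 1488, 1250),
    "Dynamic Power Gating Applied": (1282, 1377, 1113),
}


def relative_error_percent(estimate, actual):
    """Relative deviation of an estimate from the measured value, in %."""
    return (estimate - actual) / actual * 100.0


proposed_errors = []
spreadsheet_errors = []
for soc_type, (proposed, spreadsheet, actual) in results.items():
    err_a = relative_error_percent(proposed, actual)
    err_b = relative_error_percent(spreadsheet, actual)
    proposed_errors.append(err_a)
    spreadsheet_errors.append(err_b)
    print(f"{soc_type}: (a) {err_a:.1f}%  (b) {err_b:.1f}%")

mean_a = sum(proposed_errors) / len(proposed_errors)
mean_b = sum(spreadsheet_errors) / len(spreadsheet_errors)
print(f"Average: (a) {mean_a:.1f}%  (b) {mean_b:.1f}%")
```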

#### Simulation – Tradeoff of Dynamic Power Gating

One tradeoff is between the startup waiting time and the energy consumption.



#### Simulation – Dynamic Power Gating with Inrush

- Surge current: occurs at the time of power supply startup
- Hold time: extends the time before the voltage-island power is cut off
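The tradeoff between startup waiting time and energy consumption can be illustrated with a simplified model of hold time. The sketch below is an assumption-laden illustration, not the simulation model from the study: the idle gaps, the wake-up latency and the function `gating_tradeoff` are hypothetical, the idle and gated values are borrowed from Table I (Pacc_fr and Pacc_pg) and treated as per-unit-time costs, and the inrush current itself is not modeled.

```python
def gating_tradeoff(idle_gaps, hold_time, p_idle, p_gated, wake_latency):
    """Return (idle energy, total startup waiting time).

    The power is cut off only after the block has been idle for
    hold_time; every cut-off costs one wake-up latency later.
    """
    energy = 0.0
    waiting = 0.0
    for gap in idle_gaps:
        if gap > hold_time:
            # Idle at full idle cost during the hold time, then gated.
            energy += hold_time * p_idle + (gap - hold_time) * p_gated
            waiting += wake_latency
        else:
            # Gap too short: the block never gets gated.
            energy += gap * p_idle
    return energy, waiting


# Hypothetical idle gaps and wake-up latency, for illustration only.
gaps = [2, 10]
for hold in (0, 4):
    energy, waiting = gating_tradeoff(
        gaps, hold, p_idle=140, p_gated=9, wake_latency=1
    )
    print(f"hold={hold}: energy={energy}, waiting={waiting}")
```

A longer hold time reduces the number of power-supply startups, and therefore the waiting time, at the cost of more energy spent idling.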



## Conclusions

- Proposed SRMS and verified its effectiveness by model-based simulation
- Showed its effectiveness by applying SRMS to the development of a real embedded system
- SRMS has the potential to be applied across a wide domain of embedded system development

# Thank you for your attention