MPSoC 2011

Agenda

Sunday 3rd July
Monday 4th July
Tuesday 5th July
Wednesday 6th July
Thursday 7th July
Friday 8th July

Session 1: Keynote (8:30 - 9:30)
Session 5: Keynote (8:30 - 9:30)
Session 9: Keynote (8:30 - 9:30)
Session 12: Keynote (8:30 - 9:30)
Session 15: Keynote (8:30 - 9:30)

Break

Session 2: Memory/Cache/Storage (10:00 - 11:12)
Session 6: 3D IC (10:00 - 11:18)
Session 10: Mobile Platform (10:00 - 11:48)
Session 13: Reconfigurable (10:00 - 11:36)
Session 16: Methodologies for MPSoC (10:00 - 11:30)

Panel (11:12 - 12:30)
Panel (11:18 - 12:30)
Panel (11:48 - 12:30)
Panel (11:36 - 12:30)
Panel (11:30 - 12:30)

Lunch (12:30 - 14:00)

Session 3: High Performance Computing (14:00 - 15:30)
Session 7: Power/Energy (14:00 - 15:24)
Session 11: Multicore Architecture & Panel (14:00 - 15:36)
Session 14: Architecture (14:00 - 16:06)

Panel (15:36 - 16:25)

Break

Session 4: Continuing Moore's Law (16:00 - 17:00)
Session 8: Software (16:00 - 16:36)
Session 11: Speaker's meeting (16:45 - 17:45)

Panel (17:00 - 18:00)
Registration (18:00 - 19:00)
Panel (17:00 - 18:00)
Panel (16:36 - 17:40)

Welcome reception (19:00)
Dinner (19:30)
Dinner (20:15)
Banquet in Clos de Vougeot (19:00; bus departs 18:15)
Dinner (19:30)

Keynote In-Depth Technical Presentation and Mini-Keynote

Lectures

Monday July 4

SESSION 1: Keynote

Kerry Bernstein, IBM, USA
Architectural Directions for Future Nanoscale Computing Systems
Sooner or later, CMOS Scaling will come to an end. What do we do next?
A number of very different switches have been proposed as replacements, some of which in fact do not even use electron charge as the state variable.
Instead, these switches pass tokens in the spin, excitonic, photonic, magnetic, quantum, or heat domains. The emergent physical behaviors and idiosyncrasies of these novel switches can complement the execution of specific task algorithms or workloads, and improve overall throughput in high performance computing. This talk will describe potential CMOS replacements, focusing upon their preferred circuits and architectures. Looking forward, emergent capabilities and system architectures will redefine the boundaries of what is normally considered the limits to computing.

SESSION 2: Memory/Cache/Storage
(In-Depth Technical Presentation and Mini-Keynote)

Sungjoo Yoo, POSTECH, Korea
Low power hybrid PRAM/DRAM main memory
Phase-change RAM (PRAM) is a strong candidate to be used as a part of main memory in the near future. In this talk, we will present a PRAM/DRAM hybrid main memory for low power. The hybrid main memory is utilized to reduce DRAM refresh energy by exploiting the non-volatility of PRAM. To reduce DRAM refresh energy further, we apply a DRAM decay method which manages data hotness based on a time-out scheme in order to minimize the energy consumption of both DRAM and PRAM. Experimental results show a significant reduction in the energy consumption of hybrid main memory. This is mainly because most programs can run with very short time-outs without noticeable performance degradation.
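As a rough illustration of the time-out idea in this abstract, the following is a minimal sketch, not the speaker's implementation. The class name `HybridMemory`, the page granularity, and the policy of moving a page to PRAM once it has been idle longer than `timeout` ticks are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class HybridMemory:
    """Toy PRAM/DRAM hybrid: pages idle longer than `timeout` decay out of DRAM."""

    timeout: int                                # idle ticks before decay (assumed policy)
    dram: dict = field(default_factory=dict)    # page -> tick of last access
    pram: set = field(default_factory=set)      # pages held only in non-volatile PRAM
    refreshes: int = 0                          # page refreshes performed so far

    def access(self, page: int, now: int) -> str:
        """Touch a page; report where it was found ('dram', 'pram' or 'new')."""
        if page in self.dram:
            where = "dram"
        elif page in self.pram:
            self.pram.discard(page)             # a re-used page returns to DRAM
            where = "pram"
        else:
            where = "new"
        self.dram[page] = now
        return where

    def tick(self, now: int) -> None:
        """One refresh interval: decay cold pages, then refresh what remains."""
        cold = [p for p, last in self.dram.items() if now - last > self.timeout]
        for page in cold:
            del self.dram[page]
            self.pram.add(page)                 # PRAM keeps the data without refresh
        self.refreshes += len(self.dram)


mem = HybridMemory(timeout=2)
mem.access(1, now=0)
mem.access(2, now=0)
for t in range(1, 5):
    mem.access(1, now=t)                        # page 1 stays hot, page 2 goes cold
    mem.tick(now=t)
```

In this toy run the cold page stops costing refreshes once it decays, which is the effect the abstract attributes to short time-outs. PRAM write energy and migration latency are deliberately left out.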

Hsien-Hsin Sean Lee, Georgia Institute of Technology, USA
Toward the design of robust and self-healing memories
A new class of memory called Storage Class Memory (SCM) is emerging to be considered an integral part of the main memory hierarchy. The salient features of these memory classes include non-volatility, MLC capability, better scalability, fast access speed, low supply voltage, and lower power without mechanical operations. Despite these prominent properties, their adoption is hindered by their operational reliability, primarily due to their low write endurance. The situation is exacerbated under the scenarios of deliberately designed malicious attacks. In this talk, I will discuss our recent research for improving their reliability via three architectural technologies: tamper-proof wear-leveling, self-healing capability, and hybrid memory architecture with multi-dimensional address classification. The first approach aims at randomizing and obfuscating the write patterns of these memories with one additional level of address remapping within the memory module. The second method presents a novel multi-bit error recovery mechanism that can self-repair multi-bit stuck-at faults and continue to use these memories even if some cells become faulty and unusable. The third technique advocates a new memory architecture to enhance the reliability by including a small SRAM cache and a low-cost classification and allocation policy. With these techniques, we will be able to extend the lifetime of these SCMs to approach the theoretical limit and even continue to operate them when faulty cells are present.
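The first technique, an extra level of address remapping that randomizes write patterns, can be pictured with a small sketch. The abstract does not give the actual scheme, so everything here is an assumption for illustration: the class `RemappedMemory`, the XOR-with-a-random-key mapping, and the fixed re-keying interval. Data migration on re-keying is ignored.

```python
import random


class RemappedMemory:
    """Toy memory module that XORs line addresses with a periodically changed key."""

    def __init__(self, n_lines: int, rekey_every: int, seed: int = 0):
        # A power-of-two size keeps `logical ^ key` a one-to-one mapping.
        assert n_lines > 0 and n_lines & (n_lines - 1) == 0
        self.n_lines = n_lines
        self.rekey_every = rekey_every
        self.rng = random.Random(seed)
        self.key = 0
        self.writes_since_rekey = 0
        self.wear = [0] * n_lines               # physical writes seen by each line

    def write(self, logical: int) -> int:
        """Record one write and return the physical line it landed on."""
        if self.writes_since_rekey >= self.rekey_every:
            # Hypothetical policy: draw a fresh random key. A real module would
            # also have to migrate stored data, which this sketch ignores.
            self.key = self.rng.randrange(self.n_lines)
            self.writes_since_rekey = 0
        physical = logical ^ self.key
        self.wear[physical] += 1
        self.writes_since_rekey += 1
        return physical


# An attacker hammering one logical line: without re-keying all wear lands on a
# single physical line; with re-keying it is spread over several lines.
fixed = RemappedMemory(16, rekey_every=10**9)
moving = RemappedMemory(16, rekey_every=10)
for _ in range(1000):
    fixed.write(3)
    moving.write(3)
```

Comparing `max(fixed.wear)` with `max(moving.wear)` shows how remapping lowers the worst-case wear of any single line under a repeated-write pattern.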

Hiroyuki Tomiyama, Ritsumeikan University, Japan
Challenges of Programming Embedded Many-Core SoCs with OpenCL

David Kleidermacher, Green Hills Software, USA
Multicore Embedded/Mobile Virtualization Update
Update on multicore system virtualization: techniques, use cases, challenges, and recent technological advances

Raphaël David, CEA LIST, France
Hardware support for online resources management

Soo-Ik Chae, Seoul National University, Korea
Optimization of an H.264 decoder using its communication-centric model
A full high-definition video decoder requires huge memory bandwidth, which heavily depends on its communication architecture. To find a better communication architecture, we propose to build a library of performance models in C for communication subcomponents. By building a performance model of the communication part of the decoder from the library components, we can easily explore the communication architecture space by changing the parameters of its components. Applying this modeling approach, we optimized the performance of an H.264 decoder from 720p 30 fps to 1080p 30 fps. We will briefly describe how its communication architecture was changed to improve performance.

SESSION 3: High Performance Computing
(In-Depth Technical Presentation)

Dongrui Fan, Chinese Academy of Sciences, China
Godson-T: A High-Efficient Many-Core Architecture for Parallel Program Executions
Moore’s law will grant computer architects ever more transistors for the foreseeable future, and the challenge is how to use them to deliver efficient performance and flexible programmability. We proposed a many-core architecture, Godson-T, to attack this challenge. On the one hand, Godson-T features a region-based cache coherence protocol, asynchronous data transfer agents and hardware-supported synchronization mechanisms, to enable highly efficient utilization of on-chip resources. On the other hand, Godson-T features a highly efficient runtime system, a Pthreads-like programming model, and versatile parallel libraries, which make this many-core design flexibly programmable. This cooperative hardware/software design methodology bridges high-end computing and mainstream programmers. Experimental evaluations are conducted on a cycle-accurate simulator of Godson-T. The results show that the proposed architecture has good scalability, fast synchronization, high computational efficiency, and flexible programmability.

Paolo Faraboschi, HP Labs, Spain
System-Level Integration for Scale-out Computing
In the last decade, System-on-Chip (SoC) designs have become the dominant technology in the high-end embedded, consumer electronic, and telecommunication markets. To date, SoCs remain relatively absent from the general purpose processor mainstream, where power efficiency and cost are traditionally sacrificed for higher performance. However, if we look at historical trends in general purpose processors, we can observe a slow but steady pace of introduction of system-level integration features: on-chip caches, memory controllers, GPUs and accelerators. At the same time, the mounting cost pressures in scale-out datacenters demand technologies that can decrease the total cost of ownership (TCO) and are creating a stronger case for SoC-based servers. This talk examines the system level integration design options for the server market, specifically targeting throughput computing warehouse-scale datacenter workloads. The talk evaluates the benefits and trade-offs of system level integration on in-order and out-of-order processors using novel tools to model the area and power of a variety of discrete and integrated server configurations. It shows that system integration has substantial benefits in reducing total chip area and dynamic power, highlights interesting architectural trends as technology nodes and die sizes change, and comes with important implications on future technologies that can reduce datacenter TCO.

Patrick Blouet, ST Ericsson, France
Mobile cloud computing trends and challenges
Mobile cloud computing is on the path to becoming a reality, but it brings with it many very complex technical challenges. Mobility pushes the requirements on the wireless link and on system autonomy very far. The cloud aspect brings new challenges in the partitioning of applications and could bring opportunities for new use cases. The computing dimension, combined with the huge increase in data volume, raises the technical problems in mobile systems to a very high level. Beyond all the very difficult issues to solve at the device level, there are also other problems that need to be addressed on the network and server side to really release the power of mobile cloud computing.

SESSION 4: Continuing Moore's Law
(In-Depth Technical Presentation)

John Goodacre, ARM, UK
Understanding what those 250 million transistors are doing

Michael Chang, Global Unichip, Taiwan
A 28nm Dual Cores SoC Design
The two most popular consumer products, the smartphone and the tablet PC, both need a high-performance CPU core and GPU to run many graphics-rich applications, yet must consume less power. To meet the requirement of running at a high clock frequency while consuming low power, we selected the most advanced 28nm TSMC process node for the SoC design.
The dual-core SoC consists of a dual-core Cortex A9, a Mali 400 GPU, a DDR2/3/LPDDR combo PHY and controller, and other peripheral ports. The presentation will share our experience in overcoming challenges in the CPU core, DDR, and package designs.

Tuesday July 5

SESSION 5: Keynote

Bing Sheu, Director of R&D, TSMC
Design and Technology for Future Computing Systems
The talk materials are obtained from various organizations inside TSMC R&D. Regular talks use a bottom-up approach, beginning with fabrication technology, then transistor design, etc. In view of the nature of this Forum, a top-down approach will be used. The talk begins with design trends for computing systems, focusing especially on speed performance and power consumption. Various sectors, such as server/desktop, graphics, and mobile, are covered.
Next, key design building blocks, such as standard cell libraries and SRAM, are addressed in detail. The comparison spans multiple generations of technology nodes: from the 40 nm node, through the 28 nm node, to the 20 nm node, and beyond.
Moore’s law continues to apply to the next several generations, possibly to below the 10 nm node. Meanwhile, innovations and breakthroughs in system-level packaging will enable and enhance future computing systems. Further scaling at the system level is best illustrated by recent advances in 3D ICs. On the other hand, further scaling at the SoC (system-on-chip) level is facilitated by advances in 3-dimensional transistors, such as FinFETs. In a FinFET transistor, effective conduction occurs on 3 sides: the top and 2 sides. Transistor performance for a given footprint can be significantly improved. In addition, the subthreshold slope of the transistor is improved, which facilitates low-voltage operation and can reduce the power consumption of computing systems.
The talk will stimulate discussions that can help shape future developments in architectures for MPSoC and Multicore.

SESSION 6: 3D IC
(In-Depth Technical Presentation and Mini-Keynote)

Koji Inoue, Kyushu University, Japan
Adaptive Execution on 3D Microprocessors
This talk introduces our research status focusing on 3D-implemented microprocessors. 3D-IC is one of the most interesting techniques to achieve high-performance, low-power VLSI systems. Stacking multiple dies makes it possible to implement microprocessor cores and main memory (or large memory) in the same chip. Although this kind of integration has great potential to bring a breakthrough in computer systems, its efficiency strongly depends on the characteristics of target application programs. Unfortunately, die stacking causes performance degradation in some cases. To tackle this issue, our architectural techniques for 3D-implemented microprocessors are introduced. They attempt to dynamically adapt to the varying behavior of application programs in order to compensate for the negative impact of the die stacking approach. This kind of run-time optimization is a key technique to make 3D-IC systems practical.

Yukoh Matsumoto, TOPS Systems Corporation, Japan
COOL System: Low-Power 3-D Heterogeneous Multi-Core/Multi-Chip Architecture
The heterogeneous 3-D multi-chip-stacking COOL System offers lower power consumption, higher scalability in functionality and performance as the number and type of chips increase, more flexible system development through chip assembly alone, and better cost performance than conventional 2-D SoC-based design. The COOL System consists of three core technologies: 1) low-power processor chips based on a heterogeneous multi-core architecture, 2) highly efficient distributed processing software for multi-core/multi-chip, and 3) low-capacitance TSV-based ultra-wide inter-chip connection. This presentation gives the COOL System concept, architecture, and design, which aim to realize highly energy-efficient information systems. In particular, we present the COOL System design for a next-generation digital-TV system that takes advantage of 3-D multi-chip stacking.

David Atienza, EPFL, Switzerland
System-Level Thermal Management of 3D MPSoCs with Active Cooling
Continuous progress in manufacturing technologies is enabling the development of powerful and compact 3D multi-processor systems-on-chip (MPSoCs). However, 3D stacking gives rise to higher power and heat densities, leading to degraded performance and large cooling costs if temperature management is not considered properly at all levels of abstraction in 3D MPSoC design. In this mini-keynote, I present a novel thermal-aware system-level design approach to design thermally-balanced 3D MPSoCs with inter-tier liquid cooling for energy-efficient datacenter design. This new design approach, which is developed in cooperation with IBM, combines thermal modeling with fuzzy-logic based dynamic thermal management at system level to control DVFS, task assignment, and the coolant flow rate in the tiers of high-performance 3D MPSoC architectures.

Yuan Xie, Pennsylvania State University, USA
3D NVM for Exascale Computing
Silicon Interposer and System-in-Package (SiP) provide a near-term solution to leverage 3D DRAM stacking for future high-performance microprocessor design, bringing more memory onto a microprocessor package to mitigate the “memory wall” problem. With a large amount of on-package DRAM, we compare the case of using it as last level cache or as part of main memory. We conclude that a heterogeneous main memory using both on- and off-package memories provides both fast, high-bandwidth on-package accesses and expandable, low-cost commodity off-package memory capacity.

Ahmed Jerraya, CEA LETI, France
3D-IC for HPC

SESSION 7: Power/Energy
(In-Depth Technical Presentation and Mini-Keynote)

Tohru Ishihara, Kyushu University, Japan
Energy Characterization of Embedded Processors for Software Energy Optimization
The presentation addresses our recent research activities and results on characterizing and reducing the energy consumption in embedded systems. Firstly, a technique for characterizing the energy consumption of embedded processors during an application execution is presented. The technique fits a per-processor linear approximation model to the energy consumption of the processor obtained by post-layout simulation. Secondly, based on the energy model mentioned above, the presentation shows techniques for reducing the energy consumption by optimally mapping program code, stack frames and data items to the scratch-pad memory (SPM) of the processor memory space.
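To make the linear-model step concrete, here is a minimal least-squares sketch. It is an illustration under stated assumptions, not the authors' characterization flow: the choice of features (instruction count and cache-miss count per run), the function name `fit_linear_energy_model`, and the sample numbers are all invented for the example.

```python
def fit_linear_energy_model(counts, energies):
    """Least-squares fit of energies ~ counts . coeffs via the normal equations."""
    k = len(counts[0])
    # Build A = X^T X and b = X^T y.
    a = [[sum(row[i] * row[j] for row in counts) for j in range(k)] for i in range(k)]
    b = [sum(row[i] * e for row, e in zip(counts, energies)) for i in range(k)]
    # Solve A c = b by Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(k):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [x - factor * y for x, y in zip(a[r], a[col])]
                b[r] -= factor * b[col]
    return [b[i] / a[i][i] for i in range(k)]


# Made-up training data: each row is (instructions, cache misses) for one run,
# and each energy stands in for a value obtained from post-layout simulation.
runs = [[100, 5], [200, 1], [150, 20], [50, 0]]
reference_energy = [250.0, 410.0, 500.0, 100.0]
coeffs = fit_linear_energy_model(runs, reference_energy)
```

The fitted coefficients can then be read as an estimated energy cost per event, which is the kind of model a memory-mapping optimization could consult. The solver assumes the feature columns are linearly independent.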

Yoshinori Takeuchi, Osaka University, Japan
Task Assignment Method for DVS based multiprocessor SoC
A multi-processor SoC (MPSoC) is one of the key solutions for low-energy systems. However, as application scale steadily grows, low energy consumption remains an important problem. In this talk, a task assignment method for DVS-based multiprocessor SoCs is presented.

Edith Beigne, CEA LETI, France
Fine-grain DVFS power-aware control
For years, CMOS technology scaling has led to improvements in cost, speed and power while allowing an exponential gain in complexity. However, as transistor dimensions are getting closer to atomic scale, new phenomena appear and cause functional and performance issues. One of the main problems is variability, causing an increasing spread in characteristics of digital circuits, like maximum frequency and power consumption. To reach performance targets, the traditional worst-case approach leads to increasing design margins. Dynamic methods based on threshold voltage or supply voltage scaling are an effective way to compensate both intrinsic and extrinsic variability, also called PVT (Process/Voltage/Temperature) variations, and have been proven to be the most efficient way to optimize power consumption for a given performance target. In the meantime, the trend for chip architecture is to integrate more and more functionalities in complex System on Chip or MPSoC (Multi-Processor SoC). Moreover, because mobile devices are the driving force for the industry, power efficiency is now one of the main constraints and the use of effective power management methods is mandatory. Adaptive Voltage and Frequency Scaling (AVFS) is a well-known method to improve power efficiency of systems by lowering the supply voltage and the clock frequency when applicative constraints make it possible. Using a monolithic AVFS scheme (only one V/F point for the full chip), the worst functional core limits the power gain. As the activity of the different cores (or blocks, or IPs) is not necessarily correlated, if at least one functional unit has to run at full speed, the achievable power gain is highly limited.
This is the reason why multi-domain AVFS and ultimately fine-grain AVFS (or Local AVFS) architectures, are proposed to maximize energy savings. Globally Asynchronous and Locally Synchronous (GALS) architectures are proposed as a key enabler for fine-grain AVFS systems: functional cores are still synchronous, but supply voltage and clock frequency are independently adjusted in each core domain.
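The difference between a monolithic V/F point and per-core V/F points can be illustrated with a toy calculation. This is a sketch under simplifying assumptions that are not from the abstract: dynamic power is modeled as C * V^2 * f, supply voltage is assumed to scale linearly with frequency down to a floor `v_min`, and leakage and idle time are ignored. All function names and numbers are hypothetical.

```python
def dynamic_power(freq, f_max=1.0, v_max=1.0, v_min=0.6, c_eff=1.0):
    """Dynamic power C * V^2 * f, with V assumed linear in f and floored at v_min."""
    v = max(v_min, v_max * freq / f_max)
    return c_eff * v * v * freq


def monolithic_power(required):
    """One V/F point for the whole chip: every core runs at the highest demand."""
    f = max(required)
    return len(required) * dynamic_power(f)


def fine_grain_power(required):
    """One V/F point per core: each core runs only as fast as it needs to."""
    return sum(dynamic_power(f) for f in required)


# Normalized frequency demand of four cores; one of them needs full speed.
demand = [1.0, 0.5, 0.5, 0.25]
chip_wide = monolithic_power(demand)
per_core = fine_grain_power(demand)
```

With one core pinned at full speed, the chip-wide scheme saves nothing in this model, while the per-core scheme still lowers voltage and frequency on the other three cores.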

Youn-Long Lin, National Tsing Hua University, Taiwan
Multiprocessor Scheduling taking into account Energy Harvesting and Storage
Scheduling is one of the most important tasks in high-level and system synthesis. Traditional time-constrained scheduling algorithms try to meet a performance target while minimizing hardware cost and energy consumption rate, assuming that the power supply is unlimited. As energy efficiency becomes more important, we should consider using the concept of energy harvesting for portable applications. We propose a time-constrained task scheduling algorithm for portable system synthesis. The targeted system is powered by an energy harvesting device together with a rechargeable battery. We show that a trade-off exists among hardware computational resources, the energy harvester, and the battery. We formulate the problem as an integer linear programming (ILP) problem. Experimental results show that it can find optimal solutions for practical-sized problem instances. With a battery, the total hardware cost can be reduced further than without one.

SESSION 8: Software
(In-Depth Technical Presentation and Mini-Keynote)

Yuichi Nakamura, NEC, Japan
A software development toolset for multi-core processors

Emil Matus, Technical University Dresden, Germany
Benchmarking of Dataflow Programming Models for MPSoC
Heterogeneity and parallelism in communications signal processing MPSoCs for 4G and beyond are inevitable. Optimized and flexible components will be preferred in order to meet stringent power constraints and performance requirements. The question arises of how to estimate performance/cost figures of future architectures for numerous application scenarios. High-level system exploration may be a promising approach for predicting system characteristics. One of the major challenges in high-level system exploration is the development of adequate and representative models of benchmark applications and system architectures. This contribution presents a methodology for semi-automated model generation of data-flow application benchmarks.

Jenq-Kuen Lee, National Tsing Hua University, Taiwan
Support of C++ Compiler for Embedded Multi-Core DSP Systems
As the number of processors grows in embedded multicore platforms, programming model and compiler support become critical issues in developing multicore applications. Increasingly, C++ compilers are in demand for DSP systems in addition to traditional C compilers. In our talk, we will present methods and discussions on supporting C++ compilers for multi-core DSP systems. As embedded DSP systems have tight resource constraints, we will address this issue in supporting C++ compilers. In addition, we will present how C++ language constructs can help with design patterns for parallel programming of embedded multi-core systems. PAC DSP systems will be used as a case study for our presentation. Experimental results with stereo-vision and image-processing applications on our platforms will be presented.

Wednesday July 6

SESSION 9: Keynote

Alain Artieri, VP of ST Ericsson, France
Technical challenges to be in the race of the exploding Smartphone and Tablet market
We are seeing the most advanced computing concepts developed for mainframes and PCs over the past 6 decades coming into the Smartphone arena. The keynote speech will discuss the technical challenges and paradigm shift the industry has to face to drive this fantastic revolution.

SESSION 10: Mobile Platform
(In-Depth Technical Presentation and Mini-Keynote)

Ruchir Puri, IBM, USA
Design and CAD challenges: 22nm and beyond
Technology scaling clearly has been the driver of the semiconductor and thereby the EDA industry. In the semiconductor industry today, 45nm CMOS designs are in full production and 32nm design rules and infrastructure are already in place for designs starting later this year. It will not be long before the beat of 22nm will be upon us. Due to the ever-increasing cost of doing design, design productivity, and more specifically the cost of design, has become a major bottleneck in large scale design projects. Due to this cost crunch, automated synthesis techniques have become increasingly important, and this is bound to become a major trend going into 22nm for high performance SoCs. In addition, in 22nm and beyond, 3D IC technology has the potential of easing the system performance challenge. In order to exploit the full potential of 3D technology, new challenges in the area of physical design, thermal analysis, system level design and analysis need to be addressed. 3D interconnects have the potential of significantly reducing critical path delays, which are typically between memory and the interfacing logic. In addition, now that the physical limits are beginning to impact scaling, the question is: how can we cost-effectively design with the complicated technology requirements presented by the 22nm node, and how can the design automation community help to achieve this goal? What are the challenges at 22nm and what would design look like going into 22nm and beyond? In this paper, we will focus on the major design and CAD challenges associated with 22nm and beyond.

Lasse Harju, ST Ericsson, France
Sensor processing and power management in smartphone platforms
Sensors play an important role in improving the user experience of future mobile devices. Applications such as gesture recognition, automatic scene detection, and augmented reality will be possible with more accurate and intelligent sensors. Advanced sensor applications require constant sensor monitoring and various forms of pre-processing. The always-on nature of the sensor processing introduces a special challenge for SoC power management. Typical power management methodologies, such as clock gating, power gating, and dynamic voltage and frequency scaling provide means to limit the power consumption. However, typically these methods require explicit control from the software. Employing implicit power management structures minimizes the software effort for obtaining low-power.

Rudy Lauwereins, IMEC, Belgium
BOA-ADRES: a scalable baseband processor template for Gbps radios
Wireless baseband modulation in mobile platforms is traditionally implemented in hardware for energy efficiency reasons. For base station infrastructure, it is mapped on general purpose DSP processors and FPGAs, because its market is too small to economically justify the development of a special purpose ASIC. In both domains, we observe a trend towards mapping baseband algorithms onto specialized processors. The need for cost efficient implementation of more than a dozen standards on a mobile device pushes us to software defined radios. In the cellular communication infrastructure domain, the need to avoid an exponential growth of the carbon footprint due to the exponential increase in communication capacity forces us to smaller cell sizes and hence to an increased number of femtocell base stations; it becomes economically viable indeed to develop specialized processors for this growing market. In this presentation, I will detail the properties of the baseband modulation processor BOA-ADRES designed to support radio standards above 1 Gbps for mobile devices and femtocells. The design combines compiler support for plain vanilla C code with Gbps throughput, close to hardware energy efficiency, flexible duty cycling, broad scalability and fast tunability for any subset of radio standards.

Chris Rowen, Tensilica, USA
Design of a 100GMAC/sec DSP Core for 4G Wireless
Two conflicting goals drive wireless baseband design. On one hand, the data-rates are so high (reaching 1Gbps) and power budgets so small (~200 mW for the PHY at full bandwidth) that designs must be highly optimized to the necessary DSP algorithms. On the other hand, the complexity of the individual standards, and the need to simultaneously support multiple standards, dictate migration to a more flexible, programmable multi-core platform approach. No one baseband subsystem will satisfy the full spectrum from handsets to macro basestations, but families of specialized and configurable cores can help. This talk briefly introduces the world’s highest-performance DSP core, the ConnX BBE64, reaching 100 GMACs per second, and shows how multiple cores – in homogeneous and heterogeneous multi-core configurations – are being deployed in 4G wireless platforms.

Kees van Berkel, ST Ericsson, Eindhoven Univ.
Multicore for 4G: 3GPP versus ITRS
Since 1970 CMOS feature size has scaled 0.5x every 5 years. Since 1990 cellular downlink bitrates have scaled 10x every 5 years. According to ITRS and 3GPP, both trends are to continue for another decade. Can both trends co-exist given the limited power dissipation budget for smart phones? What are the HW-SW and multi-core architecture implications and trade-offs?
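The two trends quoted in this abstract can be compared with a short worked calculation. The first two lines follow directly from the stated rates; the third line rests on an idealized assumption that is not in the abstract, namely that transistor density scales as the inverse square of feature size.

```latex
% Assumed idealization (not from the abstract): density ~ 1 / (feature size)^2.
\begin{align*}
\text{feature size after } t \text{ years:}\quad & L(t) = L_0 \cdot 0.5^{\,t/5}, & L(10) &= 0.25\,L_0 \\
\text{downlink bitrate after } t \text{ years:}\quad & R(t) = R_0 \cdot 10^{\,t/5}, & R(10) &= 100\,R_0 \\
\text{density (assumed } \propto L^{-2}\text{):}\quad & D(t) = D_0 \cdot 4^{\,t/5}, & D(10) &= 16\,D_0
\end{align*}
```

Under that assumption, a decade brings 100x the bitrate but only 16x the transistors per unit area, a gap of roughly 6x that would have to come from architecture, software, or a larger power budget.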

Yankin Tanurhan, Synopsys, USA
MPSoC Subsystems: A New Reuse Paradigm
Modern MPSoCs integrate a diverse set of system functions like audio processing, video processing, wired and wireless connectivity, and application processing. Increasingly, these MPSoCs are built in a modular way from coarse-grain subsystems, as this helps to manage design complexity and to organize the design activities among different design teams.
We see an upcoming shift from IP reuse to subsystem reuse, where suppliers provide pre-integrated subsystem solutions including middleware and application layer software for well-recognized system functions. This approach solves a range of integration issues for the SoC integrator and gives a level of configurability and programmability to achieve optimum performance density and power consumption.

SESSION 11: Multicore Architecture
(Mini-Keynote)

Martin Schoeberl, Technical University of Denmark
A Time-predictable Microprocessor: the Patmos Approach
Current processors are optimized for average case performance, often leading to a high worst-case execution time (WCET). Many architectural features that increase the average case performance are hard to model in WCET analysis. We present Patmos, a processor optimized for low WCET bounds rather than high average case performance. Patmos is a dual-issue, statically scheduled RISC processor. The instruction cache is organized as a method cache and the data cache is organized as a split cache in order to simplify the cache WCET analysis. To fill the dual-issue pipeline with enough useful instructions, Patmos relies on a customized compiler. The compiler also plays a central role in optimizing the application for the WCET instead of average case performance.

Kees Goossens, Eindhoven University of Technology, Netherlands
Architecture Requirements for Composability and Predictability

Off-chip DRAM memory bandwidth is the scarcest resource in modern (multi-processor) systems on chip, and must be managed with high efficiency. Moreover, diverse applications with different quality of service (bandwidth, latency) requirements share the SDRAM. This is challenging, as most current SDRAM controllers and arbiters are quite unpredictable, in the sense that average and worst case behaviour are very different. We outline several strategies and architectures to offer real-time guarantees, with high efficiency.

Martti Forsell, VTT, Finland
MCPA -- MultiCore Portability Abstraction

Application portability between different architecture-paradigm/programming tool pairs for MP-SOCs is a big problem nowadays, often leading to a complete rewrite of an application when switching from one architecture-paradigm pair to another. This is caused by a wide variety of architectural properties requiring different optimization techniques for different architectures, typically hiding the essence of parallel computing defined by the application.

In this presentation, we introduce the MultiCore Portability Abstraction (MCPA) simplifying portability and implementation of parallel applications. It abstracts away typical architecture dependent effects caused by latency, synchronization, and partitioning and acts as an executable intermediate abstraction/reference implementation as well as a tool for analyzing the intrinsic parallelism of the application and relative goodness of architectures in executing it. We give a short application example with performance measurements.

Interestingly, the MCPA appears to be architecturally directly implementable via our advanced configurable emulated shared memory architecture (CESM), which we are currently prototyping in our recently launched REPLICA project. If successful, this promises to simplify MP-SOC application programming radically.

Charlie Janac, Arteris Inc., USA
Interchip Link Technology
Network on Chip (NoC) interconnect IPs can be applied to the problem of inter-die and inter-chip communications. The driver for using interchip link IPs is the implementation of System in Package (SiP) type SoCs with reduced Bill of Material cost as well as improved design/manufacturing flexibility. This presentation will cover the application of the MIPI standard based FlexLLI™ interchip link to the problem of allowing a wireless modem to use the memory of the wireless application processor. Other potential applications will be mentioned in the presentation. The presentation will cover architecture approach, performance metrics and system benefits.

Gerhard P. Fettweis, TU Dresden, Germany
Exploration of NoC Design & Management Concepts for MPSoC

Marcello Coppola, ST, France
SoC interconnect: future directions and challenges

Pieter van der Wolf, Synopsys, Netherlands
Audio Subsystem Solutions for Consumer SOCs
SoC integrators that want to implement audio processing on consumer SoCs for mobile phones or digital TVs need to shop around for a range of components. They typically need processor cores, a variety of audio processing software, digital peripherals like I2S and SPDIF, and an analog front-end with ADCs, DACs and drivers. Integration of these components into a working audio subsystem is a significant task that includes e.g. clock management, hardware / software integration, integration with the software on the host CPU, etc.
We illustrate how SoC integrators can benefit from pre-integrated audio subsystem solutions from specialized suppliers. These solutions integrate the various hardware and software components, while maintaining flexibility for integration of proprietary audio processing functions by the customer. The key benefit for the SoC integrator is a significantly reduced integration effort, freeing up engineering resources while improving time to market.

Drew Wingard, Sonics, US
TBD

Thursday July 7

SESSION 12: Keynote

Prof. Kasahara Hironori, Waseda Univ., Japan
Homogeneous and Heterogeneous Multicore / Manycore Processors, Parallelizing Compiler and Multiplatform API for Green Computing
Multicore and manycore processors have been attracting much attention in a wide variety of fields, such as consumer electronics including mobile phones, tablet computers, cameras and games, PCs, robots, medical servers, cloud servers and supercomputers, to attain high performance and low power. This talk introduces the OSCAR (Optimally Scheduled Advanced Multiprocessor) low-power multicore chips, such as the 8-core homogeneous multicore RP2 and the 15-core heterogeneous multicore RPX, together with the OSCAR parallelizing compiler and the OSCAR API, which provide high software productivity and low power consumption on multiple platforms. These chips, the compiler and the API have been developed in METI/NEDO national projects with Japanese IT and semiconductor companies such as Hitachi, Renesas Electronics, Fujitsu, Toshiba, Panasonic and NEC. The OSCAR compiler automatically parallelizes sequential programs written in Fortran or "Parallelizable C" with multigrain parallel processing, using coarse-grain task parallel processing and data localization for caches and local memories with DMA data transfer through the OSCAR API. The compiler and API also enable automatic power reduction using DVFS, clock gating and power gating. Performance evaluation shows that the compiler and API give scalable performance improvement on IBM, Intel, AMD, SGI, Fujitsu and Hitachi servers, on the 4-core RP1 and 8-core RP2 SH4A homogeneous multicores, the 15-core RPX heterogeneous multicore, the ARM/NEC 4-core MPCore and the Fujitsu 4-core FR1000. In addition, the OSCAR compiler and API achieved an automatic power reduction of 70% during single-application execution on RP2 and RPX. Furthermore, they enable automatic software coherence control for non-coherent-cache manycores, which is required for connecting hundreds of cores with low cost and power consumption.
Currently, applications of the multicores and manycores, compiler and API to real products such as automobiles, cameras, medical image processing systems, robotics and servers are being researched with industry at the Waseda University "Green Computing Systems Research & Development Center" supported by METI.

SESSION 13: Reconfigurable Architecture
(In-Depth Technical Presentation and Mini-Keynote)

Paul Heysters, Recore Systems, Netherlands
A Glimpse into Future Reconfigurable Many-cores for Embedded Stream Processing

Kees Vissers, Xilinx, USA
Programming for performance in FPGAs using multiple processors and accelerators with C/C++ programming
Modern SoC designers are exposed to a large number of complex tools and IP blocks. They combine multiple processors, advanced interconnect, advanced memory subsystems and multiple IP blocks to achieve the required total system performance. Modern FPGA programmers have the technology to combine multiple processors, advanced interconnect, advanced memory systems and multiple accelerators into a complete system on a single FPGA. In this presentation I will show the design and challenges of modern systems in FPGAs that combine processors, accelerators derived with C-to-FPGA tools, and external memory. I show an example from the wireless domain for C-to-FPGA tools, and an example from the medical imaging domain for the combination of processors and accelerators designed with C-to-FPGA tools.

Kiyoung Choi, Seoul National University, Korea
Virtualized Processor Power Management
A coarse-grained reconfigurable array (CGRA) can dramatically speed up compute-intensive kernels of applications, offloading the burden of the main processor. However, the communication overhead between the CGRA and the main processor offsets the speedup obtained by the CGRA. Communication through a shared scratchpad memory is an effective way of reducing the communication overhead. However, for transparent binary acceleration, such a setup poses a significant challenge for the main processor, which must now manage data on the scratchpad memory explicitly, often resulting in superfluous data copies. This talk presents an enhancement to the shared scratchpad memory, called configurable range memory (CRM), which reduces the need for explicit management and thus reduces data copies and promotes data reuse in the shared memory.

Ian O'Connor, Lyon Institute of Nanotechnology, France
Nanofabrics for reconfigurable computing cores
It is widely recognized that CMOS transistor scaling, as a vector for the pursuit of performance levels predicted by Moore's Law and required by future applications, will not last through the next decade. Alternatives must be found, at both architectural and device levels. In this context, the emergence of new research devices based on nanotubes (CNFET) or nanowires (NWFET), offers the opportunity to provide novel logic building blocks, to explore new possibilities for digital design and consequently to reconsider the paradigms of computing architectures to achieve orders of magnitude improvements in conventional figures of merit. In this talk, I will look at the emergence of technologies capable of building large regular structures out of silicon nanowires or carbon nanotubes, and how logic functions can be mapped onto them, particularly in the context of reconfigurable applications. Some pointers to the future evolution of these technologies and associated architectures will be given, as well as the issues that must be solved before nanoscale computing fabrics become a viable alternative to CMOS.

Omar Hammami, ENSTA ParisTech, France
NMPSOC Synthesis: Combining NOC Synthesis with Multiobjective Design Space Exploration on large Scale Emulator
The increasing complexity of SOC design is pushing for increased automation. MPSOC synthesis automatically generates an MPSOC but does not capture the advantages of multiobjective design space exploration. We combine NOC synthesis with multiobjective design space exploration in order to propose a fully operational MPSOC synthesis flow that exploits a large-scale emulator. Both exact methods and heuristics are used for this goal.

SESSION 14: Architecture
(In-Depth Technical Presentation)

Kunio Uchiyama, Hitachi, Ltd., Japan
Heterogeneous Multicore Processor Technologies for Embedded Systems
For embedded systems in the digital-convergence era, various functions such as communication, security, audio, video, and recognition are required in a single device. However, improvement of the operating frequency of an embedded LSI has saturated because of significantly increasing power consumption. To solve this difficulty, heterogeneous parallelism on an SoC has been studied. A power-thrifty architecture, which combines embedded CPUs and special processing cores such as dynamically reconfigurable processors, has been proposed, targeting a superior performance-per-power ratio and functional flexibility. From the viewpoint of programming, a parallelizing compiler and an Application Program Interface (API) suitable for heterogeneous parallelism have been developed. The evaluation results of various applications tested using prototype chips and programs will also be discussed.

Pierre Paulin, ST, Canada
Exploring H/W and S/W solutions to MP-SOC platform mapping: An Industrial Perspective
In this talk, we will describe the MultiFlex platform mapping technology and its use with S/W-dominated or H/W-dominated platform variants of STMicroelectronics' "Platform 2012" MP-SoC fabric. The MultiFlex platform mapping technology includes:
- High-level capture and simulation of applications expressed using a range of parallel programming models.
- Mapping of the application onto target H/W and/or S/W SoC platforms.
- Programming model-aware debug, trace and analysis tools.
The use of the MultiFlex mapping tools will be demonstrated using an HD video high-quality rescaling application, mapped onto two platform variants:
1) A purely programmable MP platform, using SIMD and SPMD.
2) A platform with data processing implemented as H/W-based processing units, with flexible S/W-based control and communication.
We compare the capture and optimization of both implementation styles, starting from a common high-level reference HQR algorithm expressed using a streaming-style programming model.

Nakajima Masaitsu, Panasonic, Japan
Next Generation Multi-Processor Architecture for "Network Era UniPhier"
UniPhier is the Panasonic common integrated platform for digital consumer electronics and is widely adopted in SoCs for DTV, BDR, mobile phones, camcorders, etc. Toward the Smart CE Era, we introduce a next generation multi-processor architecture for “Smart CE Era UniPhier”. It consists of a heterogeneous multi-processor with ARM processor(s) and newly developed triadic multithreaded processor(s), Ashura, and a shared system-level L2 cache.

Norbert Wehn, University of Kaiserslautern, Germany
Hardware Accelerators for Financial Mathematics - Methodology, Results and Benchmarks
Financial markets are livelier than ever before. In modern electronic markets, stock prices may change several times within a few milliseconds. Moreover, the evaluation of the corresponding mathematical models, e.g. stochastic differential equations, consumes more and more computational power and energy. Finding energy-efficient algorithms and implementations to accelerate these calculations is therefore a very active area of research. In this talk we give an overview of hardware accelerators with special emphasis on methodology and benchmark aspects. The European pricing model based on the state-of-the-art Heston model is taken as an example.

Takashi Miyamori, Toshiba, Japan
Heterogeneous Multi & Many Core Processors for Multimedia Applications
In this presentation, we will introduce multi-core and many-core processors for multimedia applications such as image recognition and video codecs. These applications inherently contain a great deal of parallelism at various levels. Hardwired engines can provide good performance efficiency for fixed functions. Accelerators like SIMD-type array processors are suited for functions with high data parallelism. Multi-core processors are used for parallel execution of task-level processing. Furthermore, we will propose a many-core processor architecture composed of low-power embedded processors to achieve high performance within the constraints of consumer or automotive applications.

Tsuyoshi Isshiki, Tokyo Institute of Technology, Japan
Trace-Driven Workload and Bus Traffic Simulation for MPSoC Architecture Evaluation
This talk focuses on an MPSoC architecture evaluation framework for estimating the performance of MPSoC applications, with emphasis on bus traffic simulation. Our trace-driven workload model, automatically generated from the application code, accurately reflects the timing behavior of each processor in addition to the bus access timing. A variety of bus architectures with multiple memories, processors and DMACs can be simulated with near cycle accuracy at a speed comparable to native SW execution.

Friday July 8

SESSION 15: Keynote

Philippe Magarshack, VP of ST-Microelectronics
Gaining 10x in power efficiency in the next Decade in Consumer Products
Analog/RF design is becoming digital, through accurate CAD modeling of RF effects and compensation in digital. At the same time, energy-efficient digital is becoming analog, thanks to elaborate CAD-enabled design techniques (well-biasing, power switches, over-voltage, under-voltage, very-high-frequency clocks, power distribution networks, timing margin reduction, process/voltage/temperature compensation). In Analog/RF, improving the energy efficiency of consumer systems will be based on continuously sensing the system environment and tailoring the emitted power dynamically. Similarly, in digital, very-fine-grain closed-loop Dynamic Voltage and Frequency Scaling (DVFS) will make it possible to consume power only when and where required. Concurrently, heterogeneous co-design CAD methods will allow a holistic approach to energy efficiency, taking into consideration not only the IC with its system design architecture, but also its packaging and power source, and possibly the antenna. Finally, a tailored advanced CMOS process will provide tuned transistors and passive components to enable the digital and analog design solutions needed, in conjunction with TSVs enabling optimized heterogeneous 3D stacking capabilities. A new generation of EDA tools, models and methods, providing a holistic view of the system, will enable complete system optimization and a 10x power efficiency gain.

SESSION 16: Methodologies for MPSOC
(Mini-Keynote)

Bart Kienhuis, Compaan Design, Netherlands
Using C-to-Dataflow for portable and efficient mapping on Heterogeneous MultiCore designs

Joachim Kunkel, Synopsys, USA
TBD

Rolf Ernst, Technical University of Braunschweig, Germany
MPSoC for safety critical applications – from multicore to manycore
Last year we gave a short introduction to requirements and design methods for MPSoC in safety critical applications. The focus was on interference between safety critical and non-critical applications via shared resources and the corresponding requirements imposed by safety standards. In many-core systems, interference is even stronger due to multi-hop NoCs and memory hierarchies. The talk gives an overview of first results of a research platform under development as part of the European ARTEMIS project RECOMP.

Frédéric Pétrot, Tima Laboratory, INP-Grenoble, France
An analytical model for Many-Functionally Asymmetric Core SoC Architectures
Amdahl's law is a fundamental tool for understanding the evolution of performance as a function of parallelism. Following a recent trend on the timing and power analysis of general purpose many-core chips using this law, we carry out an analysis aiming at many-core SoCs integrating processors sharing the same core instruction set but each potentially having additional extensions. For SoCs targeting well defined classes of applications, higher performances can be achieved by adding application specific extensions either through the addition of instructions in the core instruction set or through coprocessors leading to architectures with functionally asymmetric processors. This kind of architecture is becoming technically viable and advocated by several groups, but the theoretical study of its performance properties is yet to be performed. Using Amdahl's law, this short talk aims at showing the performance and cost advantage of using extensions for many-core SoCs.
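As a rough illustration of the kind of reasoning this abstract refers to, the sketch below evaluates the classic form of Amdahl's law and one simple, hypothetical way of accounting for an application-specific extension. It is not the analytical model presented in the talk: the function names, the parameters accel_fraction and accel_factor, and the assumption that every core carries the same extension are all illustrative choices.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Classic Amdahl's law: speedup over one core when a fraction
    of the work is perfectly parallelizable across `cores` cores."""
    serial = 1.0 - parallel_fraction
    return 1.0 / (serial + parallel_fraction / cores)


def extended_speedup(parallel_fraction, cores, accel_fraction, accel_factor):
    """Hypothetical variant (an assumption, not the talk's model):
    within the parallel part, a share `accel_fraction` of the work runs
    `accel_factor` times faster thanks to an instruction-set extension
    or coprocessor present on every core."""
    serial = 1.0 - parallel_fraction
    # Normalized time of the parallel work on one extended core.
    per_core = (1.0 - accel_fraction) + accel_fraction / accel_factor
    return 1.0 / (serial + parallel_fraction * per_core / cores)


if __name__ == "__main__":
    # 90% parallel work on 10 cores, with and without an extension
    # that speeds up half of the parallel work by a factor of 4.
    print(amdahl_speedup(0.9, 10))
    print(extended_speedup(0.9, 10, 0.5, 4.0))
```

With these illustrative numbers the extension raises the speedup, but the serial fraction still bounds the achievable gain, which is the basic trade-off an Amdahl-style analysis makes visible.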

Sri Parameswaran, University of New South Wales, Australia
Security and Reliability in an MPSoC environment.
Security and Reliability of MPSoCs is an emerging area of concern in embedded systems. Security is jeopardized by software attacks and side channel attacks. This talk examines two solutions: first, a solution to thwart software attacks in MPSoCs; and second, an MPSoC solution for side channel attacks. We have implemented these solutions and will discuss the efficacy of these methods.

Koichiro Yamashita, Fujitsu, Japan
A software centric system for OS scheduling scheme in the upstream phase
Software parallelization in embedded systems that does not take specific hardware overhead into account can decrease performance. To address this problem, we introduce an evaluation methodology on a real-product-based platform with an OS, using an ESL model. We also report one interesting result: a scheduling approach opposite to the conventional one is effective for performance and power consumption.

------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Biography

Kerry Bernstein
Kerry Bernstein is a Research Staff Member at the IBM T.J. Watson Research Center. Mr. Bernstein received the B.S. degree in electrical engineering from Washington University in St. Louis in 1978, and has been with IBM for 32 years. He holds 105 US Patents, and is a co-author of 3 college textbooks and multiple papers on high speed / low power CMOS. His research interests are in the areas of emerging device / circuit architectures for future high performance computing; 3D integration; radiation effects in CMOS; and Silicon-on-Insulator (SOI) transistors. Mr. Bernstein is an IEEE Fellow.

Sungjoo Yoo
Sungjoo Yoo is currently an assistant professor in the Department of EE, POSTECH, Korea. He received his Ph.D. from Seoul National Univ. in 2000. He worked as a researcher at the TIMA laboratory, Grenoble, France, from 2000 to 2004. He was also with Samsung System LSI from 2004 to 2008, where he led the system-on-chip architecture design team and was involved in architecture designs for mobile application processors and solid state disks. He joined POSTECH in 2008. His research interests include memory and storage hierarchy from cache, DRAM and phase-change RAM to solid state disk. He received the Best Paper Award at the International SoC Conference (ISOCC) in 2006 and Best Paper nominations at the Design Automation Conference (DAC) in 2011 and Design Automation and Test in Europe (DATE) in 2002 and 2009.

Hsien-Hsin Sean Lee
Dr. Hsien-Hsin S. Lee is an Associate Professor in the School of Electrical and Computer Engineering at Georgia Institute of Technology. He has a Ph.D. degree in Computer Science and Engineering from the University of Michigan, Ann Arbor. His current research interests include computer architecture, memory hierarchy, low-power VLSI, cyber security, and 3D-IC technology. Prior to joining Georgia Tech in 2002, he spent 6 years as a senior processor architect and a research staff member at Intel Corporation, designing the Pentium III processor and conducting research for the Itanium architecture, and one year at Agere Systems as an architecture manager for their StarCore DSP architecture. Dr. Lee received the Horace H. Rackham Distinguished Dissertation Award from the University of Michigan, an NSF CAREER Award, a Department of Energy Early CAREER Award, and the Georgia Tech ECE Outstanding Jr. Faculty Award. He has co-authored 3 papers that won Best Paper Awards and one paper selected for the 2010 IEEE MICRO Top Picks, and holds 4 U.S. patents. He is a senior member of both the ACM and the IEEE.

Hiroyuki Tomiyama
Hiroyuki Tomiyama received his Ph.D. degree in computer science from Kyushu University in 1999. From 1999 to 2001, he was a visiting postdoctoral researcher with the Center of Embedded Computer Systems, University of California, Irvine. Then, he worked for Institute of Systems & Information Technologies/KYUSHU as a researcher and Nagoya University as an associate professor. In 2010, he moved to Ritsumeikan University as a full professor. His research interests include system-level design methodology for embedded systems and MPSoC. He was General Co-Chair of MPSoC 2010, and is now Editor-in-Chief of IPSJ Transactions on System LSI Design Methodology.

David Kleidermacher
David Kleidermacher is Chief Technology Officer at Green Hills Software where he is responsible for technology strategy, platform planning, and solutions design. Kleidermacher is a leading authority in systems software and security, including secure operating systems, virtualization technology, and the application of high robustness security engineering principles to solve computing infrastructure problems. Kleidermacher earned his bachelor of science in computer science from Cornell University and has been with Green Hills Software since 1991.

Raphaël David

Soo-Ik Chae
Ph. D. Electrical Engineering from Stanford University, Stanford, California, in 1987
“Defect Detection and Classification for VLSI Pattern Inspection”
M. S. Electrical engineering from Seoul National University, Seoul, Korea, in 1978
B. S. Electrical engineering from Seoul National University, Seoul, Korea, in 1976

Dongrui Fan
Dongrui Fan received his Ph.D. degree in computer architecture in 2005 from the Institute of Computing Technology, Chinese Academy of Sciences, where he has been an Associate Professor since 2006. Dongrui has participated in the Godson-1 and Godson-2 micro-architecture designs since 2000. Currently, his research interests focus on multi-core/many-core architecture and low-power embedded micro-architecture design. He leads the AMS research group and designed the new processor models Godson-X and Godson-T, which are research designs for the new generation of Godson series chips. Dongrui is a Technical Committee Member of Computer Architecture and System Software of the China Computer Federation (CCF) and a HiPEAC/IEEE/ACM member. He served as a Program Committee Member of the International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT) in 2009 and 2010. He also serves as the Program Vice-Chair of Multi-Core and Parallel Systems in the International Conference on Parallel Processing (ICPP) in 2011, and for several workshops. Dongrui leads or participates in many Chinese national science projects and an EU FP7 project. He has published papers in MICRO, IPDPS, CF, EuroPar, Trans. on HiPEAC, etc.

Paolo Faraboschi
Paolo is a Distinguished Technologist in the Intelligent Infrastructure Lab of HP Labs. From 2004 to 2010, Paolo led a research group in Barcelona (Spain) focused on system-level simulation and modeling of next-generation computing systems (the COTSon simulator was released as open source in 2010). From 1995 to 2003 Paolo was the technical lead of the Custom-Fit Processors Project at HP Labs in Cambridge (MA). In that role, he was the principal architect of the Lx/ST200 family of VLIW embedded processor cores (in partnership with STMicro), which, as of 2009, had shipped in over 70 million embedded devices. Paolo holds a Ph.D (Dottorato) in EECS (1993) and an M.S. (Laurea) in Electrical Engineering (1989) from the University of Genoa (Italy). He is an active member of the computer architecture community, and regularly serves on program and organization committees. He was Program Chair for HiPEAC'10, MICRO'41, MICRO'34, CASES'03; General Chair for MICRO'38 and CASES'05. He is a co-author of the book "Embedded Computing: a VLIW approach to architecture, compilers and tools", serves on the industrial advisory board of the HiPEAC European network of excellence, and is currently an Associate Editor of ACM TACO.

Patrick Blouet
Patrick Blouet is an electronic and computer science engineer. He received a Master's degree from ENSERB, France, in 1981. He started working in a large telecommunication company, where he designed a number of medium-size systems in the domain of private PABX. He then moved to a startup doing system engineering, where he developed numerous systems in the domains of hard real time, telecommunications, large multi-processing, image processing and storage. During this period, he ran large projects with multiple partners. He then took the position of BU manager for all the imaging products, in charge of marketing and technical strategy. He joined STMicroelectronics, where he took responsibility for development tools and applications for DSPs. He developed these activities and was heavily involved in the creation of a large DSP R&D centre in Singapore. In addition, he took on the marketing responsibility for DSP products for telecom applications. He then moved to the application processor division, where he held the Architecture director position in charge of defining all the mobile products. He then moved to the Corporate partnerships and public affairs team at ST-Ericsson, where he is in charge of building and running collaborative projects at European, national and regional level.

John Goodacre

Michael Chang
VP of Engineering
Global Unichip Corp.
Mr. Chang has over twenty-five years of ASIC and SoC design experience, and has served in many key R&D positions.
Prior to joining GUC, Mr. Chang served as Sr. Director of ASIC Design at ESS, VP of VLSI design at Divio, and VP of R&D at Prolific.

Bing Sheu
Bing SHEU obtained his B.S. degree in EE from National Taiwan University, and his Ph.D. degree in EE from the University of California, Berkeley. He was on the EE faculty at the University of Southern California (Los Angeles, CA) during 1985 – 1998, and was promoted to Full Professor in 1997 in Electrical Engineering with an adjunct appointment in Biomedical Engineering. He moved to the microelectronics and design automation industry in early 1999 and joined TSMC in 2006 as Director at R&D Design and Technology Platform. He served as Editor-in-Chief of IEEE Transactions on VLSI Systems in 1997 & 98, as Founding Editor-in-Chief of IEEE Transactions on Multimedia in 1998 & 99, as Society Vice President for Conferences in 1998, as President of the IEEE Circuits and Systems Society in 2000, and on the Editorial Board of Proceedings of the IEEE during 2005 – 2010. Dr. Sheu is recipient or co-recipient of the IEEE Transactions on VLSI Systems Best Paper Award in 1995, the IEEE Guillemin-Cauer Award in 1997, the IEEE CAS Society Golden Jubilee Award in 2000, the IEEE CAS Society Meritorious Service Award in May 2004, and the Education Contribution Award from the Ministry of Education (Taiwan) in 2006. Dr. Bing Sheu is a 1996 IEEE Fellow, and a 1998 Senior Fulbright Scholar from the US Information Agency. He was named Honorary Chair Professor by National Chiao Tung University in 2003 and Honorary Chair Professor by National Taiwan University of Science and Technology in 2011.

Koji Inoue
Koji Inoue was born in Fukuoka, Japan in 1971. He received the B.E. and M.E. degrees in computer science from Kyushu Institute of Technology, Japan in 1994 and 1996, respectively. He received the Ph.D. degree from the Department of Computer Science and Communication Engineering, Graduate School of Information Science and Electrical Engineering, Kyushu University, Japan in 2001. He is currently an associate professor in the Department of Advanced Information Technology, Kyushu University. His research interests include low-power cache architectures, power-aware computing, high-performance computing, and secure computing.

Yukoh Matsumoto
Dr. Yukoh Matsumoto is the chief architect, and president and CEO of TOPS Systems Corp. Currently, he leads the “3-D stacked heterogeneous Multi-Processor chip project” funded by NEDO, as well as the “Ultra-Android project”, an embedded software platform to utilize heterogeneous Multi-Core processors, funded by METI. In his 25-year career, he has architected and designed over 10 advanced Multi-Core processors, x86 microprocessors, and DSPs. He founded TOPS Systems Corp. in 1999, and received the Takeda Techno-Entrepreneurship Award in 2001. Prior to that, he held several positions within the Texas Instruments Research and Development organization and within V.M. Technology, a microprocessor start-up in Japan. He received the Dr. of Information Sciences (Ph.D.) degree from the Graduate School of Tohoku University, Sendai, Japan, in 2007 and participated in the MOT (Management of Technology) program at the Graduate School of Engineering in Tokyo University from 2004 through 2005.

David Atienza
David Atienza received his MSc and PhD degrees in Computer Science and Engineering from Complutense University of Madrid (UCM), Spain, and Inter-University Micro-Electronics Center (IMEC), Belgium, in 2001 and 2005, respectively. Currently, he is Professor and Director of the Embedded Systems Laboratory (ESL) at EPFL, Switzerland, and Adjunct Professor at the Computer Architecture and Automation Department of UCM. His research interests focus on design methodologies for low-power embedded systems and high performance Systems-on-Chip (SoC), including new thermal management techniques for 2D and 3D Multi-Processor SoCs, design methods and architectures for wireless body sensor networks, dynamic memory management and memory hierarchy optimizations, as well as novel architectures for logic and Network-on-Chip (NoC) interconnects. In these fields, he is co-author of more than 150 publications in prestigious journals and international conferences. He has received a Best Paper Award at the IEEE/IFIP VLSI-SoC 2009 Conference, and two Best Paper Award Nominations at the ICCAD 2006 and DAC 2004 conferences. He is an Associate Editor of IEEE Transactions on CAD (in the area of System-Level Design), IEEE Letters on Embedded Systems and Elsevier Integration: The VLSI Journal. He has also been an elected member of the Executive Committee of the IEEE Council of Electronic Design Automation (CEDA) since 2008 and a GOLD member of the Board of Governors of the IEEE Circuits and Systems Society (CASS) since 2010.

Yuan Xie
Yuan Xie received the B.S. degree in electronic engineering from Tsinghua University, Beijing, in 1997, and the M.S. and Ph.D. degrees in electrical engineering from Princeton University in 1999 and 2002, respectively. He is currently an Associate Professor in the Computer Science and Engineering department at the Pennsylvania State University. Before joining Penn State in Fall 2003, he was with IBM Microelectronic Division's Worldwide Design Center. Prof. Xie is a recipient of the National Science Foundation Early Faculty (CAREER) award, the SRC Inventor Recognition Award, the IBM Faculty Award, and several Best Paper Awards and Best Paper Award Nominations at IEEE/ACM conferences. He has published more than 100 research papers in journals and refereed conference proceedings, in the areas of EDA, computer architecture, VLSI circuit design, and embedded systems. His current research projects include: three-dimensional integrated circuits (3D ICs) design, EDA, and architecture; emerging memory technologies; low power and thermal-aware design; reliable circuits and architectures; and embedded system synthesis. He is currently an Associate Editor for ACM Journal of Emerging Technologies in Computing Systems (JETC), IEEE Transactions on Very Large Scale Integration Systems (TVLSI), IEEE Transactions on Computer Aided Design of Integrated Circuits (TCAD), IEEE Design and Test of Computers, and IET Computers and Digital Techniques (IET CDT).

Ahmed Jerraya

Tohru Ishihara
Tohru Ishihara received the B.E., M.E., and D.E. degrees in computer science from Kyushu University, Fukuoka, Japan, in 1995, 1997, and 2000, respectively. From 1997 to 2000, he was a Research Fellow of the Japan Society for the Promotion of Science. For the next three years, he was a Research Associate in the VLSI Design and Education Center, University of Tokyo. From 2003 to 2005, he was with Fujitsu Laboratories of America as a member of the research staff of an Advanced CAD Technology Group. From 2005 to 2011, he was with Kyushu University as an Associate Professor. In April 2011 he joined Kyoto University, where he is currently with the Dept. of Communications and Computer Engineering. His research interests include low-power design methodologies and power management techniques for embedded systems. Dr. Ishihara is a member of the IEEE, ACM, IPSJ and IEICE. He has been an executive committee member of the DATE conference since 2009 and was an OC member of ASP-DAC 2001, 2008, 2009 and 2011. He was on the TPC of DATE 2007, 2008, and 2009, and ISQED 2008 and 2009, and has been on the TPC of ISLPED since 2009.

Yoshinori Takeuchi
Yoshinori Takeuchi is an Associate Professor at the Graduate School of Information Science and Technology at Osaka University. He received his B.E., M.E. and Dr. Eng. degrees from Tokyo Institute of Technology in 1987, 1989 and 1992, respectively. Since 1996, he has been with Osaka University. He was a visiting scholar at the University of California, Irvine from 2006 to 2007. His research interests include System Level Design, VLSI design and VLSI CAD. He is a member of ACM, and of the Computer, CAS, SSC, and SP Societies of IEEE.

Edith Beigne
Edith Beigne received an M.S. in microelectronics engineering from the National Polytechnic Institute of Grenoble in 1998. She joined CEA-Leti in 1998, working on asynchronous NoC and mixed-signal circuits with a focus on high energy efficiency systems. She is now in charge of low power and variability research activities in advanced CMOS technologies, specifically focusing on dynamically adaptive MPSoC architectures.

Youn-Long Lin
Dr. Youn-Long Lin is a professor with the Department of Computer Science, National Tsing Hua University, Taiwan. His research interests include high-level synthesis, video coding architecture design, and SOC design methodology. He received a B.S. from National Taiwan Institute of Technology in 1982 and a Ph.D. from the University of Illinois at Urbana-Champaign in 1987. He is a co-founder of Global UniChip Corp.

Yuichi Nakamura

Emil Matus
Dr. Emil Matus is a senior scientist at the Vodafone Chair Mobile Communication Systems, where he leads the HW research group. He received his MS and PhD degrees in Electrical Engineering from the University of Technology in Kosice. Prior to joining the Vodafone Chair in 2003, he was a research associate at the University of Technology in Kosice, focusing on wavelet transforms and image compression. His current research interests include algorithms and many-core programmable architectures for communication signal processing.

Jenq-Kuen Lee
Jenq Kuen Lee received the B.S. degree in computer science from National Taiwan University in 1984. He received a Ph.D. in computer science from Indiana University in 1992, where he also received a M.S. (1991) in computer science. He was a key member of the team who developed the first version of the pC++ language and SIGMA system while at Indiana University. He was also a recipient of the most original paper award in ICPP '97 with the paper entitled "Data Distribution Analysis and Optimization for Pointer-Based Distributed Programs". In 2005, he received Taiwan MOEA funding to lead a research team to develop compilers for PAC VLIW DSP processors with distributed register files by collaborating with ITRI STC. The efforts were renewed in 2008 for another three years focusing on embedded multi-core compilers and applications. He is also a recipient of Google Research Award (Mountain View), 2009. He has also been a director for Taiwan MOE ESW (embedded system software) consortium since 2008. In 2010, he received a Taiwan MOEA economic contribution award (Deep Plow Award) for his contribution in embedded compiler research. His research interests are in optimizing compilers, embedded compilers, and computer architectures.

Alain Artieri
Alain Artieri is a Senior Fellow at ST-Ericsson, a 50/50 joint venture by STMicroelectronics and Ericsson, where he is responsible for the technology roadmap for application processors. Prior to taking on this role, he was a Senior Director at Qualcomm, in charge of the Multimedia & Graphics cores development. Before this assignment, he worked for more than two decades for STMicroelectronics, where he founded the Set Top Box SoC product family in the early 90’s and the Nomadik application processor product family in 2001. During his 26 years of experience in the semiconductor industry, he has developed advanced ICs and SoCs across 12 CMOS generations, with key contributions to multimedia architecture, power management architecture, low power design solutions and SoC architecture. He is now embarked on the exciting role of defining the right technologies for Mobile Computing. Alain Artieri graduated from “ENST Paris” in 1984.

Ruchir Puri
Ruchir Puri received an M.Tech. degree in electrical engineering from the Indian Institute of Technology (IIT), Kanpur, India in 1990, and a Ph.D. degree in electrical and computer engineering from the University of Calgary, Alberta, Canada in 1994. From 1994 to 1995 he was a Member of Scientific Staff with the Advanced System Design Tools group at NORTEL Research (BNR). He joined the VLSI Design Automation group at IBM T. J. Watson Research Center, Yorktown Heights, NY in 1995, where he manages and leads a research group focused on Physical and Logic Synthesis. He has been working on design and automated synthesis solutions for IBM's high-performance and power-efficient microprocessors and ASICs in advanced CMOS technologies and has received several IBM awards for his work, including an IBM Outstanding Technical Achievement award and an IBM Execute Now award. He has also been an adjunct assistant professor in the Department of Electrical Engineering at Columbia University, New York, where he taught VLSI design and Circuits.
Dr. Puri received the 1993 ACM/IEEE Design Automation Scholarship and the 1992 and 1993 Alberta Microelectronics research scholarships for his doctoral research. He has served on program committees of most major VLSI Design Automation conferences and on NSF and SRC panels, and has been an invited speaker at numerous conferences such as ISSCC, DAC, and ICCAD. He is the inventor of 21 U.S. patents and has authored over 75 publications on the design and synthesis of low-power and high-performance circuits. He currently serves as Associate Editor of IEEE Transactions on Circuits and Systems I and in the past has served as Associate Editor of the Transactions on Circuits and Systems II. He serves on the ACM SIGDA Physical Design Technical Committee. Ruchir was elected an IEEE Fellow in 2006 for contributions to automated logic and physical design of electronic circuits.

Lasse Harju
Dr. Lasse Harju is a SoC architect at ST-Ericsson. His current responsibilities cover SoC system control and power management topics, ranging from low-level circuit technologies to firmware implementations. Dr. Harju earned his PhD degree in 2006 from Tampere University of Technology in Finland. His academic work focused on programmable wireless baseband implementations.

Rudy Lauwereins
Rudy Lauwereins is vice president of imec, which performs world-leading research and delivers industry-relevant technology solutions through global partnerships in nano-electronics, ICT, healthcare and energy. He is responsible for imec’s Smart Systems Technology Office, covering energy efficient green radios, vision systems, (bio)medical and lifestyle electronics as well as wireless autonomous transducer systems and large area organic electronics. He is also a part-time Full Professor at the Katholieke Universiteit Leuven, Belgium, where he teaches Computer Architectures in the Master of Science in Electrotechnical Engineering program.
Before joining imec in 2001, he had held a tenured professorship in the Faculty of Engineering at the Katholieke Universiteit Leuven since 1993. He obtained a Ph.D. in Electrical Engineering in 1989. Professor Lauwereins has authored and co-authored more than 380 publications in international journals, books and conference proceedings. He is a senior member of the IEEE.

Kees van Berkel
Kees van Berkel received an M.Sc. degree (cum laude) in EE from the Delft University of Technology in 1980 and a PhD degree in CS from the Eindhoven University of Technology (TU/e) in 1992. He is a fellow at ST-Ericsson and was previously a fellow at Philips Research, NXP Research, and ST-NXP Wireless. He has been a part-time professor in Computing Science at the TU/e since 1996. He has published about 50 papers and has about 25 patents or patent applications. He pioneered asynchronous VLSI during the 90’s, published a book on Handshake Circuits, and contributed to their industrial application. He co-architected the EVP, a vector DSP for modem and SDR applications that is currently in production. He currently researches software defined radio, digital wireless communication, multi-core architectures, vector processors, and low power.

Chris Rowen
Dr. Chris Rowen is the founder, chief technical officer, and a member of the board of directors of Tensilica, Inc. He founded Tensilica in July 1997 to develop automatic generation of application-specific microprocessors for high-volume communication and consumer systems. He was a pioneer in the development of RISC architecture at Stanford in the early 1980s and helped start MIPS Computer Systems Inc. in 1984, where he served in a variety of functions, including as vice president for microprocessor development and as the manager for MIPS' European operations. When Silicon Graphics purchased MIPS, he became the technology and market development leader for Silicon Graphics Europe. In 1996, he was hired by Synopsys to be vice president and general manager of the Design Reuse Group. This experience helped him realize the limitations of current microprocessors for embedded design, which led him to found Tensilica. He received a B.A. in physics from Harvard University and M.S. and Ph.D. degrees in electrical engineering from Stanford University. He is well known as a speaker on complex technology and business issues, has authored the book "Engineering the Complex SOC" (published by Prentice Hall in 2004) as well as numerous technical articles and conference papers, and holds over two dozen US and international patents.

Yankin Tanurhan
Dr. Yankin Tanurhan is Vice President of Engineering for DesignWare® ARC Processor Cores and Non-Volatile Memory IP solutions at Synopsys. Before joining Synopsys, Dr. Tanurhan was Vice President and General Manager of Virage Logic's Processors, SoC Infrastructure and NVM Solutions business units. Prior to this he served as Vice President of Actel's Advanced Applications and System Solutions, where he led Actel's new architecture design, IP and MPU business units, system and hardware tools and product validation departments. He was also responsible for leading Actel’s embedded FPGA, embedded processor and DSP activities.
Previously, Dr. Tanurhan served as the director of the department of electronic systems and microsystems of FZI (Forschungszentrum Informatik), and held senior positions at the Institute of Computer Aided Circuit Design and Informatik Forum in Germany, where he led international hardware/software co-design projects.
Dr. Tanurhan has authored more than 100 papers in refereed publications. He holds a B.S. and M.S. in Electrical and Computer Engineering from Rheinisch Westfaellische Technische Hochschule (RWTH) in Aachen, Germany and a Dr. Ing. degree summa cum laude in Electrical Engineering from the University of Karlsruhe (TH) in Karlsruhe, Germany.

Martin Schoeberl
Martin Schoeberl is an associate professor at the Technical University of Denmark, in the Department of Informatics and Mathematical Modelling. His research focus is on time-predictable computer architectures and on Java for hard real-time systems. He developed the time-predictable Java processor JOP and led the research on a chip-multiprocessor version of JOP. This platform was developed within the EU project JEOPARD (Java Environment for Parallel Realtime Development). His current research focus is on time-predictable chip-multiprocessors for hard real-time systems.

Kees Goossens
Kees Goossens received his PhD from the University of Edinburgh in 1993 on hardware verification using embeddings of formal semantics of hardware description languages in proof systems. He worked for Philips/NXP Research from 1995 to 2010 on networks on chip for consumer electronics, where real-time performance and low cost are major constraints. He was a part-time full professor at the Delft University of Technology from 2007 to 2010, and is currently a full professor at the Eindhoven University of Technology, where his research focusses on composable (virtualised), predictable (real-time), low-power embedded systems.

Martti Forsell
Martti Forsell is a Chief Research Scientist of Computer Architecture and Parallel Computing at VTT, Oulu, Finland, as well as an Adjunct Professor in the Department of Electrical and Information Engineering at the University of Oulu. He received M.Sc., Ph.Lic., and Ph.D. degrees in computer science from the University of Joensuu, Finland in 1991, 1994, and 1997 respectively. Prior to joining VTT, he acted as a lecturer, researcher, and acting professor in the Department of Computer Science, University of Joensuu. Dr. Forsell has a long background in parallel and sequential computer architecture and parallel computing research. He is the inventor of the first scalable high-performance CMP architecture armed with an easy-to-use general-purpose parallel application development scheme (consisting of a computational model, programming language, experimental optimizing compiler, and simulation tools) exploiting the PRAM-model, as well as a number of other TLP and ILP architectures, architectural techniques and development methodologies and tools for general purpose computing. On the application-specific front, he has acted as the main architect of the Silicon Hive CSP 2500 processor and programming methodology aimed at low-power digital front-end radio signal processing. He is a co-organizer of the Highly Parallel Processing on a Chip (HPPC) workshop series. His current research interests are processor and computer architectures, chip multi-processors, networks on chip, models of parallel computing, functionality mapping techniques, parallel languages, compilers, simulators, and performance, silicon area and power consumption modeling. He has published 90 scientific publications, holds one patent on processor architectures and programming methodology, and has participated in various research and development projects in cooperation with academia and industry.
Recently he has been named as the leader of a large VTT funded project, REPLICA, aiming to remove the performance and programmability limitations of chip multiprocessor architectures with the help of a strong PRAM model of computation.

K. Charles Janac
K. Charles Janac is the Chairman, President and Chief Executive Officer of Arteris Holdings. Arteris has pioneered the market for Network on Chip (NoC) interconnect IP and tools for on-chip communications in System on Chip (SoC) type semiconductors.
Charlie has over 20 years of experience building technology companies. He started his technology career as employee number two of Cadence Design Systems (originally SDA Inc.), a NYSE traded company. Subsequently, he served as CEO of HLD Systems, Smart Machines and Nanomix. Charlie also served as Entrepreneur-in-Residence at Infinity Capital, an early stage venture capital firm in Palo Alto, California and has worked for Exxon Chemical Company in technical and sales positions.
Born in Prague, Czech Republic, he holds both B.S. and M.S. degrees in Organic Chemistry from Tufts University and an MBA from Stanford Graduate School of Business. He holds a patent in polymer film technology. Charlie, his wife Lydia, and their two children reside in Los Altos Hills, California.

Gerhard P. Fettweis

Marcello Coppola
Marcello Coppola graduated in Computer Science from the University of Pisa, Italy in 1992. He joined the Transputer architecture group at INMOS, Bristol (UK), doing research in multi-core communication together with the key technical people of Transputer technology, with special focus on the C104 router. Later, he moved to the Advanced System Technology R&D group of STMicroelectronics, where he started and led several research programs. The first one, on modeling, was completed with the SystemC 2.0 language definition, OSCI standardization and SystemC deployment within STMicroelectronics. The last one, in which he and his team developed the first industrial multiple-die Network-on-Chip, called Spidergon STNoC, ended with the company-wide deployment of the technology and its first integration in different 32nm multimedia and mobile SoCs. Currently, he is a Director in the Home Entertainment & Displays Group of STMicroelectronics in Grenoble (France), and he is in charge of the advanced R&D for SoC interconnect, verification and modeling. His research interests include several aspects of design technologies for System-on-Chip, with particular emphasis on modeling, verification, network-on-chip, multi-core architecture and programming models. He is co-author and/or co-editor of several books and of over 50 technical articles. He is serving or has served as program and/or organizing member in numerous top international conferences and workshops. He has also served as reviewer for international conferences as well as journals and holds a number of patents with both the European and US patent offices.

Pieter van der Wolf
Pieter van der Wolf is a Senior Staff Product Architect at Synopsys. He received his MSc and PhD degrees in Electrical Engineering from Delft University of Technology. He was an Associate Professor at Delft University of Technology before joining Philips Research in 1996. In 2006 he joined NXP Semiconductors when it was spun out of Philips Electronics. In 2009 he joined VirageLogic, which was subsequently acquired by Synopsys. He has worked on a broad range of topics including (multi-) processor architectures and system design methodologies.

Kasahara Hironori
Dr. Hironori Kasahara is a Professor at the Department of Computer Science and Engineering and Director of the Advanced Multicore Processor Research Institute, Waseda University, Tokyo, Japan, and a member of the IEEE Computer Society Board of Governors. He received a Ph.D. degree from Waseda University in 1985, and was a visiting scholar at the University of California at Berkeley in 1985, a fulltime assistant professor in 1986, associate professor in 1988 and professor in 1997 at Waseda University. Also, he was a visiting researcher at the University of Illinois at Urbana-Champaign, Center for Supercomputing R&D in 1989-90.
He led several Japanese National Projects such as METI/NEDO Advanced Parallelizing Compiler, Multicore for Real-time Consumer Electronics, and Leading Research for Low Power Manycores. He served as a member of the MEXT Earth Simulator Architecture Advisory Board, Next Generation Supercomputer Evaluation Committee, High Performance Computing Infrastructure Committee and so on. He has published more than 180 reviewed papers, 28 symposium papers, 129 technical reports, and 154 Annual Convention Papers, with 97 invited talks, 8 patents (25 patent applications) and 400 articles in newspapers, TV, magazines, web news and so on. Also, he has served as a PC chair, a PC or a Publication Chair of many conferences supported by IEEE, ACM, IPSJ, such as SC, ICS, ASPLOS, PPoPP, ICPP, IPDPS, ICPADS, CONPAR, JSPP, LCPC and so on. He has received the IFAC World Congress Young Author Prize, the IPSJ Sakai Special Research Award, the Grand Prix runner-up prize at the 2008 LSI of the Year, and the Best Research Award at the Intel Asia Academic Forum, and is an IEEE Computer Society Golden Core Member. His research interests include parallelizing compilers, multicore and manycore architectures.

Paul Heysters
Dr. Paul M. Heysters is CEO and co-founder of Recore Systems. He has more than 7 years of experience working in the field of reconfigurable computing. In his career, he has worked for high-technology companies in both Europe and the USA, including Ericsson, Philips and Chameleon Systems. Before joining Recore Systems, he conducted PhD research on coarse-grained reconfigurable computing at the University of Twente (The Netherlands) and worked collaboratively with industry organizations.
Paul is coordinator of the EU funded Cutting-edge Reconfigurable ICs for Stream Processing (CRISP) research consortium (www.crisp-project.eu). He is a member of the coordination board of the Sensor Technology Applied in Reconfigurable Systems (STARS) project (www.starsproject.nl). Moreover, he is a board member of the Dutch Shared EDA association.
Paul received his MSc degree in Computer Science from the University of Twente in 1998 and his doctorate in 2004 for his PhD thesis entitled “Coarse-Grained Reconfigurable Processors – Flexibility meets Efficiency.”

Kees Vissers
Kees Vissers graduated from Delft University in the Netherlands. He worked at Philips Research on processors, image processing, HDTV processing and Hardware – Software co-design. He was a visiting industrial fellow at CMU, working on High Level Synthesis, and a visiting industrial fellow at UC Berkeley, working on streaming models of computation. He was a director of Architecture at Trimedia and CTO at Chameleon Systems, and consulted for Intel, Nvidia and Xilinx. He is today a distinguished engineer in the CTO office of Xilinx, building teams that work on designing and programming systems that consist of processors and reconfigurable fabric. He has a quantitative approach to tools and architectures.

Kiyoung Choi
Kiyoung Choi is a professor in the Department of Electrical Engineering and Computer Science, Seoul National University. He received a B.S. degree in electronics engineering from Seoul National University in 1978 and an M.S. degree in electrical and electronics engineering from KAIST in 1980. He received a Ph.D. degree in electrical engineering from Stanford University in 1989. He worked for Cadence Design Systems from 1989 to 1991. His research interests are in computer architecture, embedded systems design, low power design, and design automation.

Ian O'Connor
Ian O'Connor (IEEE S'95-M'98-SM'07, IEE S'87-M'98) is Professor for Heterogeneous and Nanoelectronics Systems Design in the Department of Electronic, Electrical and Control Engineering at Ecole Centrale de Lyon, France. He is currently head of the Heterogeneous Systems Design group at the Lyon Institute of Nanotechnology, of which he is also one of the vice-directors. Since 2008, he has also held a position of Adjunct Professor at Ecole Polytechnique de Montréal, Canada. His research interests include design methods and tools for physically heterogeneous systems on chip, and their application to novel system architectures based on non-conventional devices. He has authored or co-authored over 100 book chapters, journal publications and conference papers and has been workpackage leader or scientific coordinator for several national and European projects. He also serves as an expert with the French Observatory for Micro and Nano Technologies (OMNT), IFIP WG10.5 and Allistene.

Omar Hammami
Omar Hammami is Professor at ENSTA ParisTech/DGA.
His research interests are MPSOC automatic design methodologies (MPSOC synthesis, NOC synthesis), Applied mathematics and optimization for MPSOC, System engineering and embedded systems. He has been involved in numerous research and industrial projects. He is currently involved in 3D IC EDA tools development, 3D Multicore design and Multi-FPGA based Multicore. He is a consultant for several companies and startups involved in multicore SOC EDA and ASIC 2D chip designs.

Kunio Uchiyama
Kunio Uchiyama, Corporate Officer and Chief Scientist of Hitachi, Ltd., received the B.S., M.S. and Ph.D. degrees from Tokyo Institute of Technology, Japan. Since 1978 he has been working for the Central Research Laboratory, Hitachi, Ltd., Tokyo, Japan, on design automation, mainframe computers, and microprocessors. He has been leading the research and development of SuperH microprocessors since the beginning of the 1990s. He also serves as a visiting professor at Waseda University. He was awarded the national Medal of Honor with Purple Ribbon in 2004 for his contributions to high-performance low-power microprocessor development for digital consumer products.

Pierre Paulin
Dr. Pierre G. Paulin is director of System-on-Chip Platform Automation at STMicroelectronics, Ottawa, Canada. He is responsible for the platform programming tools of a large-scale multi-processor SoC fabric in ST. Previously, he was director of Embedded Systems Technologies for ST in Grenoble, France. Before this, he managed embedded software tools and high-level synthesis R&D with BNR, the research lab of Nortel Networks. His research interests include design automation technologies for multi-processor systems, embedded systems and system-level design. He obtained a Ph.D. from Carleton University, Ottawa, and B.Sc. and M.Sc. degrees from Laval University, Quebec. He won the best presentation award at DAC in 1986, and won the best paper award at ISSS-Codes in 2004. He is a member of the IEEE.

Nakajima Masaitsu
Dr. Masaitsu Nakajima received a B.S. degree from Tokyo Institute of Technology in 1985 and joined Matsushita Electric Industrial Co., Ltd. (currently Panasonic Corporation), where he worked on the development of a 64-bit RISC and a superscalar processor for a parallel computer system, the ASICs for the 3DO interactive multi-media player, Panasonic's proprietary 32-bit embedded AM33 series CPU core, and the IPP (Instruction Parallel Processor) for the UniPhier Platform. He received a Ph.D. degree from Kobe University in 2007. He is interested in low power and high performance processor architecture, processor implementation, circuit design and design methodology. Currently, he is a general manager of the processor core technology group, Digital Core Development Center, Panasonic, and he is responsible for general purpose CPUs, media processors, and low power techniques for UniPhier based SoCs.

Norbert Wehn
Norbert Wehn holds the chair for Microelectronic System Design in the department of Electrical Engineering and Information Technology at the University of Kaiserslautern. He has more than 200 publications in various fields of microelectronic system design and holds several patents. He is chairman of the European Design Automation Association, Chairman of the Research Center “Ambient Systems” at TU Kaiserslautern, associate editor of various journals and member of several scientific advisory boards. He served as program chair for DATE 2003 and as general chair for DATE 2005. His special research interests are VLSI-architectures for mobile communication, forward error correction techniques, low-power, advanced SoC architectures and reliability issues in SoC.

Takashi Miyamori
Takashi Miyamori received the B.S. and M.S. degrees in electrical engineering from Keio University, Japan, in 1985 and 1987, respectively. In 1987, he joined Toshiba Corporation, where he was engaged in the research and development of microprocessors. He is currently a Chief Specialist and working on the development of configurable processor cores, media processors, image signal processing processors, and multi-core processors.

Tsuyoshi Isshiki
Tsuyoshi Isshiki received B.E. and M.E. degrees from Tokyo Institute of Technology in 1990 and 1992, respectively, and received a PhD in Computer Engineering from the University of California at Santa Cruz in 1996. He is currently an Associate Professor at Tokyo Institute of Technology, Dept. of Communications and Integrated Systems. His research interests include multimedia SoC designs and Multiprocessor SoC design methodology and its design tools.

Philippe Magarshack
Philippe Magarshack is Technology Group Vice-President and Central CAD & Design Solutions General Manager at STMicroelectronics in Crolles, France. He started his career at AT&T Bell Labs in Murray Hill, NJ, in 1984, as a designer for the first 32-bit microprocessor family. In 1989 he joined Thomson-CSF in Grenoble, France, in charge of Military ASIC design methods and libraries. In 1994, he joined the Central R&D group at STMicroelectronics in Crolles, France, where he took on several responsibilities in advanced CMOS design platform management. Magarshack now oversees ST's EDA and libraries strategy, enabling products in advanced CMOS and derivatives, and Smart Power.

Bart Kienhuis
Bart Kienhuis, PhD, received an MSEE from Delft University of Technology in 1994 and his Ph.D. from Delft University of Technology in 1999. During his Ph.D., he worked at Philips Research in Eindhoven on a design methodology for high performance video architectures for consumer products. This resulted in the Y-chart approach and the abstraction pyramid concepts. Both concepts had a clear impact on the embedded system design community. From 1999 until 2000, Bart Kienhuis was a Post Doc in the group of Prof. Edward A. Lee at the University of California at Berkeley, where he worked on the Ptolemy system. At Berkeley, he also started the Compaan project, which has led to the innovative technology behind Compaan Design BV. Bart is the founder and director of Compaan Design. He is also affiliated with Leiden University as an assistant professor in the Computer Science department. He has served on many technical program committees of leading conferences in embedded system design and compilers, such as CODES, CASES, DATE, EUROPAR, and SCOPES.

Joachim Kunkel

Rolf Ernst
Rolf Ernst received a diploma in CS and a Dr.-Ing. in EE from the University of Erlangen-Nuremberg, Germany, in 1981 and 1987. From 1988 to 1989, he was with Bell Laboratories, Allentown, PA. Since 1990, he has been a professor of electrical engineering at the Technische Universität Braunschweig, Germany, where he chairs a university institute of 65 researchers and staff. He was Head of the Department of Electrical Engineering from 1999 to 2001.
His research activities include embedded system design and design automation. The activities are currently supported by the German "Deutsche Forschungsgemeinschaft" (corresponds to the NSF), by the German BMBF, by the European Union, and by industrial contracts, such as from Intel, Daimler, Ford, Bosch, and Volkswagen. He gave numerous invited presentations and tutorials at major international events and contributed to seminars and summer schools in the areas of hardware/software co-design, embedded system architectures, system modeling and verification.
He is an IEEE Fellow and served as an ACM-SIGDA Distinguished Lecturer. He is a member of the German Academy of Science and Engineering, acatech.

Frédéric Pétrot
Frédéric Pétrot received the PhD degree in Computer Science from Université Pierre et Marie Curie (Paris VI), Paris, France, in 1994, where he was Assistant Professor in Computer Science until September 2004. From 1989 to 1996, F. Pétrot was one of the main contributors to the open source Alliance VLSI CAD system, whose working team received the French Seymour Cray award in 1994. From 1996, he headed the work on the definition and implementation of the Disydent environment, oriented toward the specification and implementation of multiprocessor SoCs. He joined TIMA in September 2004, and holds a professor position at the Ensimag, Institut Polytechnique de Grenoble, France. Since 2007, he has headed the System Level Synthesis group of TIMA, where his main focus is the architectural enhancement of MPSoCs and their programming.

Sri Parameswaran
Sri Parameswaran is a Professor in the School of Computer Science and Engineering at the University of New South Wales. He also serves as the Program Director for Computer Engineering. His research interests are in System Level Synthesis, Low power systems, High Level Systems and Networks on Chip. He serves on the editorial boards of ACM Transactions on Embedded Computing Systems, the Eurasip Journal on Embedded Systems and Design Automation for Embedded Systems. He has served on the Program Committees of the Design Automation Conference (DAC), Design, Automation and Test in Europe (DATE), the International Conference on Computer Aided Design (ICCAD), the International Conference on Hardware/Software Codesign and System Synthesis (CODES-ISSS), and the International Conference on Compilers, Architectures and Synthesis for Embedded Systems (CASES).

Koichiro Yamashita
Koichiro Yamashita received the M.E. degree in computer science from Waseda University and joined Fujitsu LTD in 1995. He worked on the parallel operating system for the vector-parallel supercomputing system (VPP series) for 5 years, and moved to the electric device group (EDG) of Fujitsu LTD in 2001. In 2006, he moved to Fujitsu Laboratories. Since 2009, he has worked concurrently as manager of the mobile phone BU of Fujitsu LTD and senior researcher at the platform technology labs of Fujitsu Laboratories. In 2009, he assumed the position of chairman of the SMP Working Group of the Symbian Foundation.