MPSoC_Lectures

Lectures

Monday August 14--- Software day

Keynote:

Software and the Concurrency Revolution
Herb Sutter, Microsoft, USA
Although driven by the industry-wide hardware shift to multicore hardware architectures, concurrency is primarily a software revolution. We are now seeing the initial stages of the next major change in software development, as over the next few years the software industry brings concurrency pervasively into mainstream software development, just as it has done in the past for objects, garbage collection, generics and other technologies. This talk summarizes the issues involved, gives an overview of the impact, and describes what to expect over the coming decade.

Mini-keynotes:

Semantic Anchoring of Domain Specific Modeling Languages
Janos Sztipanovits, ISIS-Vanderbilt University, USA
Model analysis and model-based code generation require the precise specification of DSMLs. This is partly achieved by metamodeling languages and metamodels describing the abstract syntax (concepts, relationships and well-formedness rules) of DSMLs. While metamodeling and metaprogrammable tools have proved to be quite effective in software and systems engineering, it has become clear that the lack of formally specified semantics of DSMLs creates potential risk in a wide range of applications. The talk describes an infrastructure for semantic anchoring of DSMLs. The infrastructure includes a set of "semantic units" that provides reference semantics of basic behavioral categories and models of computations using the Abstract State Machines framework and the Model Integrated Computing (MIC) tool suite.
Modular Communication-Centric MPSoC Architectures
Pieter van der Wolf, Philips Research Laboratories, The Netherlands
Philippe Kajfasz, Thales, Land and Joint Systems, France
Rolf Ernst, TU Braunschweig, Germany
Programming models and Software Architecture for MPSoC
Ahmed Jerraya, TIMA Laboratory, France
For the design of classic computers, the parallel programming concept is used to abstract HW/SW interfaces during high-level specification of application software. The software is then adapted to an existing multiprocessor platform using a low-level software layer that implements the programming model. Unlike the design of classic computers, the design of heterogeneous MPSoC also includes the building of the processors and other kinds of hardware components required to execute the software. In this case, the programming model hides both hardware and software interfaces that may include sophisticated communication and synchronization concepts to handle parallel programs running on different processors. When the processors are heterogeneous, multiple software stacks may be required. Additionally, when specific hardware peripherals are used, the development of Hardware dependent Software (HdS) requires a long, tedious and error-prone development and debug cycle. This presentation deals with parallel programming models to abstract both hardware and software interfaces in the case of heterogeneous MPSoC design. Different abstraction levels will be needed. For the long term, the use of higher-level programming models will open new vistas for optimization and architecture exploration, such as CPU/RTOS tradeoffs.
Metaphors for Concurrent Computation
Steven P. Levitan, University of Pittsburgh, Department of Electrical and Computer Engineering, USA
We review the traditional methods for decomposing problems for parallel execution (e.g., by Dimension, by Time, by Function and by Instance) and propose the use of higher-level metaphors such as Interpretation, Transformation, Simulation, and Optimization to guide the software system architect for multi-core applications.

In-depth presentations:

System-level exploration tools for MPSoC designs
Peter Flake, Imperas Inc., USA
We will present a unified, system design automation approach for the design and programming of multi-processor integrated circuits (MPICs). The proposed methodology combines hardware and software specializations, enabling a unified process to accelerate the overall production of next generation devices. We will present results from practical experiments with an early implementation of this approach and discuss avenues for future work for the MP design tool community.
Bandwidth, Bandwidth, Bandwidth
Paul Franzon, NC State University, Raleigh NC, USA
There are many applications where system performance is largely determined by bandwidth to, from and within the core computational SoCs. Examples include DSP, cognition, networking, graphics and supercomputing. The key metrics to consider are Gbps/mW.$ and effective-Gops/Gbps. The first is optimized through the appropriate choice of technology, while the second is mainly dependent on the hardware algorithm chosen. In this talk, both angles on maximizing system performance will be illustrated, including the use of emerging interconnect technologies and bandwidth-effective hardware algorithms.
Maximizing parallelism in NP
Ran Giladi, EZchip technologies Ltd. & Ben-Gurion University, Israel
Software Defined Radio - A High Performance Embedded Challenge
Scott Mahlke, University of Michigan, USA
Wireless communication protocols today have a computationally demanding workload that has to be supported by mobile terminals with limited energy budgets. Traditionally these goals were satisfied by an ASIC solution. However, the need to interoperate between a wide range of protocols has led to research into programmable hardware. This talk will analyze the computational requirements of wireless protocols. Several of the key characteristics that emerge from this analysis include small data types and high degrees of vector and task parallelism. We conclude by proposing a strawman programmable architecture that takes advantage of these characteristics.
Reconfigurable Multiprocessor System-on-Chip for Embedded Applications
Thierry Collette, Head of Architectures and Design Unit, CEA LIST, France
For many years, instruction-level parallelism has been driving the design of embedded system architectures. However, only a limited amount of such parallelism is currently possible in VLIW or superscalar microprocessors, and this also limits system performance. Use of more coarse-grained parallelism is likely to be the main trend of the future. The coarse-grained approach calls for executing multiple threads simultaneously within a single execution unit (SMT), or dispatching threads to separate logical processing units (CMP). Thread-level parallelism is also becoming necessary in embedded systems, to meet demand for greater computational efficiency. Within this framework, our talk will present a new CMP architecture that can support dynamic migration and pre-emption of tasks, thanks to a concurrent, prefetched configuration mechanism, while offering a specific data sharing solution. Our heterogeneous and reconfigurable multiprocessor platform also supports real-time execution and power management. Its tasks are controlled by a dedicated HW-RTOS that allows on-line dynamic scheduling of independent real-time and non-real-time tasks. Performance of this MPSOC architecture will be illustrated on embedded applications such as an MPEG-4 AVC encoder.
Networks-on-Chip (NoC) for 3D Architectures
Vijaykrishnan Narayanan, Computer Science and Engineering Department, Pennsylvania State U., USA
Three-dimensional chips and Networks-on-Chip (NoC) are two solutions aimed at addressing the growing interconnect design complexity. This talk will present the design of a hybrid network-on-chip customized for a 3D chip-multiprocessor memory system. The talk will also briefly present other techniques that are currently being explored for the interconnect fabric of 3D chips.

Tuesday August 15--- Hardware day

Keynote:

Reinventing the Microprocessor for MPSOC
Chris Rowen, Tensilica, USA
The microprocessor has been with us for more than 30 years, and has evolved in response to available silicon technology and to electronic system requirements. Now, shifts in basic silicon scaling and embedded systems are forcing a significant shift in the architecture, the use and the design process for microprocessors. This talk outlines the evolution of architectural and micro-architectural features of processors over the past three decades and sketches the (often modest) performance benefit and the (often severe) silicon cost of generic processor innovations. Automatic processor generation offers four dimensions to the configuration of the microprocessor: instruction set, memory system, processor interface, and processor control functions. Together, these enable quick creation of an enormous range of optimized processors at various degrees of application specialization. Moreover, these processors incorporate versatile inter-processor communications channels that permit order-of-magnitude improvement in communications bandwidth and energy efficiency. This shift from generic processors, often with hardwired logic accelerators, to configurable processor-based system design appears fundamental to improved design productivity and end-product efficiency for a wide range of embedded systems.

Mini-keynotes:

Bus Architecture Optimization Method Based on System-Level Profiling
Masaharu Imai, Graduate School of Information Science and Technology, Osaka University, Suita, Japan
Bus architecture optimization is one of the most important issues in IP-based SoC design because bus architecture heavily affects the system performance. This paper proposes an efficient bus architecture optimization method based on system-level profiling of the target system. The problem is formalized as a combinatorial optimization problem, and a design space exploration method is proposed to find optimum solutions.
Automatic Instruction-Set Extensions
Paolo Ienne, EPFL, Switzerland
In the quest for the best compromise between the flexibility of software and the area and energy efficiency of dedicated hardware, most embedded processors for ASICs and FPGAs now offer designers the possibility to add special instructions dedicated to the specific application they will be required to run. In practically all cases, it is up to the designer to identify the best instructions to add for a given set of applications. This talk will quickly survey a few advances of recent years in trying to determine automatically what parts of a software application are best implemented in a special dedicated functional unit: from the basic selection of simple instructions, to the inclusion of local memory elements visible to the programmer, up to some issues in efficient implementation. The talk will also address another advantage of automatically identifying instruction-set extensions: automatic customization enables quick feedback to algorithm designers on the different potentials for microarchitectural optimization that variants of functionally equivalent algorithms offer.
Securing Next-generation Mobile Platforms: The User-to-Device Authentication Issue
Srivaths Ravi, NEC Laboratories America, USA
User authentication, which refers to the process of verifying the identity of a user, is becoming an important security requirement in mobile appliances such as cell phones, personal digital assistants, etc. This talk will motivate the need for biometrics based user authentication, highlight challenges involved in deploying biometric solutions in mobile appliances from a SoC designer's perspective, and outline a holistic HW/SW solution that we are evolving at NEC Labs to address them.
Emerging Challenges for MP-SoC Platforms
Pierre G. Paulin, Director, SoC Platform Automation, Advanced System Technology, STMicroelectronics, Ottawa, Canada
This presentation will address the key challenges for the SoC platforms of the future, based on emerging requirements for low-power, and increased system- and component-level latencies. We also touch on design for manufacturability and fault-tolerance issues which are becoming increasingly important for the 45nm process node and beyond.
Application-level Memory Optimization for MPSOC
Gabriela Nicolescu, Ecole Polytechnique de Montréal, Canada
Application-Specific Instruction-set Processors as a Cornerstone of Heterogeneous MPSoCs: What, Why, and How?
Gert Goossens, CEO, Target Compiler Technologies, Belgium
SoCs will soon integrate many tens of complex system functions, each with their own optimal balance of performance, flexibility, energy consumption, communication, and design time. The traditional model of a (configurable) general-purpose processor core with a number of hardware accelerators may no longer suffice. This presentation will discuss the importance of application-specific instruction-set processor (ASIP) cores as computational nodes in heterogeneous MPSoCs. To enable the successful introduction of multi-ASIP SoCs, new retargetable processor design tools are emerging, offering fast architectural exploration, hardware synthesis, software compilation, inter-ASIP communication, and verification. The tools must support the broadest possible range of architectures, from small microprocessors, over DSP dominated cores, to VLIW and vector processors.

In-depth presentations:

DaVinci (TM) technology for digital video applications
Deepu Talla, System Architect, Texas Instruments, USA
Digital video innovation is now possible with the ever-increasing capability of MPSoCs. DaVinci (TM) technology is the first complete offering to enable the rapid growth of digital video for both portable and wall-plugged devices. This talk will present the architecture of the first DaVinci (TM) products - TMS320DM644x processors built around dual processor cores, multiple accelerators, advanced memory controller and bus architecture, optimized peripheral set, and minimal power consumption.
Concurrent Exploration of Memory and Communication Architecture for MPSoCs
Nikil Dutt, Center for Embedded Computer Systems, UC Irvine, USA
Memory and communication architectures critically affect the power, performance and cost of MPSoC designs. Furthermore, the memory and communication architectures also critically affect each other, shaping the volume, sequencing and timing of on-chip memory traffic. Existing MPSoC exploration efforts typically decouple the memory and communication exploration steps, leading to inferior, or worse yet, infeasible MPSoC designs. We present an approach for concurrent exploration of memory and communication architectures and show the efficacy of this approach on some industrial MPSoC designs.
Dependability of VLSI Systems
Hiroto Yasuura, Director of System LSI Research Center, Kyushu University, Japan
Since SoCs are used in various fields of social systems that are directly related to human lives, property and privacy, the dependability of SoCs, including security and reliability, is an important research issue for MPSoC. In this talk, several new features of SoCs related to dependability are summarized.
Balancing Systems Architectures Empirically: Optimizing Computation and Communication Structures
Graham Hellestrand, VaST Systems Technology Corporation, USA
Embedded Parallel Processors for Cell Phones
Ulrich Ramacher, Infineon Technologies AG, Germany
Multiprocessors in Wireless Multimedia Devices
Mika Kuulusa, Nokia, Finland

Wednesday August 16--- Video/Communication day

Keynote:

Programming modern FPGA platforms
Ivo Bolsens, Vice President and Chief Technical Officer, Xilinx, USA
Modern FPGA platforms have capabilities that are well suited to assume a central role in the implementation of complex embedded systems. To date, the design flow for FPGAs has been largely characterized by a hardware-centric approach. We will argue that there are many additional opportunities for mapping complex applications to these FPGA platforms. The requirement is the exposure of the high computational efficiency of FPGAs matched by high-bandwidth concurrent memory access and rich on-chip interconnectivity, combined with complete programmability. These requirements make FPGAs well suited for efficient implementation of signal processing, packet processing and high performance computing applications. We will discuss proposals for different domain-specific programming environments that rely on flexible and soft template architectures, represented by APIs that match the characteristics of a specific application domain. Next, high-level design flows compile high-level programming constructs on these soft architectures that efficiently harness the intrinsic hardware capabilities of the FPGA platform. The soft architecture abstracts the detailed hardware implementation and facilitates scalable solutions, matching differently resourced FPGA devices to differing performance and sophistication requirements for packet processing or signal processing or high performance computing. This approach leads to a new breadth of system-centric programming tools for FPGAs.

Mini-keynotes:

Architectures for next-generation machine vision
Wayne Wolf, Princeton University, USA
We are working with Verificon Corporation and Yokogawa Electric on a next-generation computer vision system that performs several algorithms. The most time-consuming algorithm is optical flow, which is considerably more complex than the motion estimation algorithms used in video compression. This talk will describe the challenges introduced by this project and the techniques used to design a high-performance, low-cost architecture for this application.
An H.264/AVC Main Profile Video Codec Accelerator in a MPSoC Platform
Youn-Long Steve Lin, National Tsing Hua University & Global Unichip Corp., Taiwan
A Configurable Processor for Outer Modem Application
Norbert Wehn, University of Kaiserslautern, Germany
Channel coding is key for reliable transmission in wireless communication systems, and many advanced channel coding techniques (e.g. convolutional codes, Turbo codes, LDPC codes) exist. There is a great demand for flexibility due to varying QoS requirements and multi-standard support. On the other hand, wireless communication is a cost-sensitive market segment in which area, energy and performance are key issues. Hence we have to trade off flexibility against implementation costs. In this talk we will present a configurable application-specific processor for outer modem applications which is capable of performing various channel coding schemes of existing and emerging standards.
Scalable processing through software threading
John Goodacre, ARM, UK
Next-generation microprocessor performance challenges
Olivier Franza, Central Technology Development Group, Intel Massachusetts, Inc., USA
Microprocessor performance improvements have relied on process technology scaling, aggressive design frequency increase, and evolutionary architectural advancements - like superscalar, out-of-order, and simultaneous multi-threading architectures. In recent years, however, the advent of critical power constraints - "hitting the power wall" within the realm of mainstream air-cooling techniques - and the growing complexity of new process technology benefit exploitation - challenging interconnects, constraining design rules, increasing variability, escalating leakage - have constrained architects and designers to be more creative and explore new ways to improve microprocessor performance. Dual and multiple core designs, voltage and frequency scaling (VFS), and more complex power management designs are pioneering, but where will performance increase reside for next-generation microprocessors?
Evolving MPSoC solutions
Jan Madsen, Technical University of Denmark, Department of Informatics and Mathematical Modelling, Denmark
A key challenge of implementing an embedded systems application on a heterogeneous multiprocessor SoC platform is to find the right partitioning of the application onto the platform architecture. The right partitioning is dependent on the characteristics of the processors and the network connecting them, as well as the application. We present an evolutionary approach to solve the problem of mapping a set of task graphs onto a heterogeneous multiprocessor platform. The objective is to meet all real-time deadlines while minimizing system cost and power consumption and staying within bounds on local memory sizes and interface buffer sizes. Our approach allows exploring the mapping onto a fixed platform architecture as well as onto a flexible platform architecture where architectural changes are explored during the mapping. We demonstrate our approach through an exploration of a smart phone application.

In-depth presentations:

The SB3011 Multiprocessor System on a Chip for Software Defined Radio Handsets
John Glossner, CTO & EVP, Sandbridge Technologies Inc., USA
This presentation describes the Sandbridge Sandblaster real-time software defined radio platform. Specifically we describe the SB3011 system on a chip multiprocessor. We describe the software development system that enables real-time execution of communications and multimedia applications. We provide results for a number of interesting communications and multimedia systems including UMTS, DVB-H, WiMAX, WiFi, and NTSC video decoding. All results presented are from completely implemented systems from RF through baseband.
Challenges of MPSOC Communication, Computation and Design Flow
Jari Nurmi, Institute of Digital and Computer Systems, Tampere University of Technology, Finland
In this talk, means to address the application-specific computation requirements in MPSOC computation are reviewed, emphasizing the role of coarse-grain reconfigurable accelerators. Our latest solution is called BUTTER, merging floating-point capabilities into a processor array. On the communication side, the key is how to combine Network-on-Chip (NoC) and computation efficiently. Our solution is to use a hierarchical approach to squeeze both global and local traffic into a single NoC scheme. The role of reconfigurability in NoCs is also addressed. Throughout the talk, some issues regarding how MPSOCs should be designed will be raised. The future challenges of processor architectures capable of tolerating the increasing latencies are also briefly visited. The talk concludes with a dream MPSOC architecture.
Standardized APIs Facilitate Software and Hardware Development of Embedded Multicore Designs
Markus Levy, President, Embedded Microprocessor Benchmark Consortium, & President, The Multicore Association, USA
The number of cores per chip will roughly double with each processor generation. Furthermore, chips will exhibit increasingly higher degrees of heterogeneity in terms of cores, interconnect, hardware acceleration, and memory hierarchies. The industry must determine how to efficiently harness this processing capability. The necessity to move beyond parallel computing and SMP paradigms and towards heterogeneous embedded distributed systems will likely drive changes in how embedded software will be created. Thus, it will drive changes in development tools, run-time software, and languages. Programming such systems effectively will require new approaches. A number of barriers must be addressed to enable a better software paradigm. This session covers a variety of multicore development topics that will help address these barriers, including standardized APIs for resource management, communication, and debug. In particular, the discussion will explain the current approaches being adopted by the Multicore Association. Another important topic in this area relates to the performance analysis of multicore platforms. Therefore, this discussion will also go into detail and provide an update on EEMBC's approach to multicore benchmarking.
Benchmarking System applications on Virtex5 FPGA platforms
Kees Vissers, Head of the DSP Systems and Applications research group, Xilinx Labs, USA
Today's FPGAs are a combination of processors, embedded memory, configurable I/O, hard blocks, and conventional lookup tables. In this talk I will introduce some of the novel technology of the new Virtex5 family. Virtex5 is implemented in new 65nm technology, combined with some architectural improvements. In conventional processor design there is a well-established benchmark methodology. However, in novel architectures that include new logic, multiple processors and novel memory subsystems, this is not easy. In this presentation I will show the results of HDTV system application benchmarks and routing benchmarks. I will show that the new Virtex5 fabric has improved performance for logic of 35% and that the synthesized clock speed increased 23% over Virtex4. I will show a complete breakdown of the actual memory sizes used in relevant applications and how this drives the choices in embedded memory. I will show the 36% improvement in the implementation of large logic multiplexers over Virtex4 and I will illustrate how this improves the implementation of the soft RISC processor called MicroBlaze. Towards the end of the talk I will illustrate how benchmarking will drive novel synthesis and architecture choices for future Xilinx architectures. Results of actual power measurements will be correlated with the estimations.
Design challenges for wireless smart cameras
Marc Heijligers, Philips, The Netherlands
High-performance low-power video processing poses particular challenges in terms of system architecture, memory organisation, IC implementation, programming, protocols and data fusion techniques. The presentation will give an overview of a system architecture based on an SIMD processor template, and trends for the future.
A 90nm CMOS Low-Power GSM/EDGE Multimedia-Enhanced Baseband Processor with 380MHz ARM9 Core and Mixed-Signal Extensions
Steffen Buch, Infineon Technologies AG, Germany

Thursday August 17--- Interconnect/Networking day

In-depth presentations:

Profiling based architecture optimization for heterogeneous MPSoC
Rainer Leupers, RWTH Aachen University, Software for Systems on Silicon, Germany
Configurable/extensible processor cores are key building blocks of today's heterogeneous MPSoC architectures. In the first part of this talk, we present a profiling-driven automated design flow for processor ISA configuration and performance estimation. Specifically, we address design tool requirements for forthcoming reconfigurable processor architectures. In the second part, we propose a high-level SW performance estimation methodology that allows optimizing the spatial and temporal task-to-processor mapping in a complete MPSoC/NoC virtual prototyping context. Finally, we give results from case studies and briefly discuss important future challenges in MPSoC compilation and simulation.
MP-SoC Software Development with Virtual Hardware
Eshel Haritan, CoWare Inc., USA

Mini-keynotes:

Quality of Service in an Uncertain World
Kees Goossens, Embedded Systems Architectures on Silicon (ESAS) group, IC Design sector, Natuurkundig Laboratorium (NatLab), Philips Research, The Netherlands
Embedded systems must be robust, i.e. not fail unexpectedly or undesirably. Moreover, they must offer the highest possible (video, audio, etc.) quality to the user, given constraints arising from operating conditions (power, etc.). However, application demands, operating conditions, and manufacturing results vary increasingly. This talk discusses how we may offer quality of service to the user, given the variations in requested (application) and offered (hardware) performance.
Algorithms and Architecture for Next Generation GPUs
Donald S. Fussell, Department of Computer Sciences, The University of Texas at Austin, USA
We describe an approach to ray tracing dynamic scenes in real time that is intended to form the basis for a new generation of graphics processing units that can provide far greater realism than that available on current systems. To make this possible, we are developing a new ray tracing algorithm with a number of novel features. Our algorithm is intended to run on a fine-grained, multithreaded highly parallel graphics processing unit for which efficient cache utilization is crucial. We will describe our dynamic ray intersection scheduler and show how it allows us to make highly efficient use of small caches to support very high performance ray tracing on our target machine. Unlike previous means of scheduling ray-object intersections, our approach involves explicit sharing of cache space between ray and object information and explicit management of the loading of both types of information to optimize performance.
A fully scalable and optimised MPSoC AVC/H.264 HD Video Codec
Ian Walsh, MnD, France
NoC is the answer! (What was the question?)
Drew Wingard, CTO, Sonics Inc., USA
The academic interest in NoC seems to grow exponentially. However, real-world commercial applications seem late in arriving. This talk investigates key architectural issues that dominate MPSoC designs, and attempts to describe which NoC techniques should be applied as MPSoC applications and architectures evolve.
Hannu Tenhunen, School of Information Technology, Royal Institute of Technology (KTH), Sweden
QoS: is or isn't the main differentiation point for a NoC?
Marcello Coppola, STMicroelectronics, France

In-depth presentations:

Statistical Interconnect Design for MPSOC
Wayne Burleson, ECE Dept., University of Massachusetts Amherst, USA
Interconnects play an increasing role in all aspects of MPSOC design, ranging from critical timing paths, to significant aspects of the area/power/energy budget, reliability and security issues, and an increasing portion of the overall design and verification effort. With technology advances has come increasing uncertainty in the form of process, temperature, voltage and workload variations. Statistical approaches have become necessary in most aspects of design in order to predict costs, performance and reliability measures.
This talk reviews recent advancements in this area focusing on on-chip interconnects in MPSOCs. New unified methods of analysis are proposed as well as architectural and circuit-level methods for mitigating the impact of statistical variation. The talk will be tailored to provide higher-level abstractions for MPSOC system designers.
Will the NoC swallow the SoC?
Ran Ginosar, Head of VLSI Systems Research Center Technion, Israel Institute of Technology, Israel
The NoC paradigm shift has already taken place, and increasingly many SoCs employ NoCs or NoC-like schemes. However, the transition should be made with caution: not every networking concept is suitable for on-chip implementation. Effective NoCs are probably irregular, blocking, provide no service guarantee, employ minimal buffering, use simple routing, and do not balance traffic. They should be optimized for area and power, and the result may not be very elegant.
Recent QNoC research results and work-in-progress will be presented: Hot spots, irregular routing, fast serial links, caches on NoC, and sync/async router design.
Designing Network-On-Chip implementing a cost efficient QoS methodology to guarantee Multimedia Application performance
Alain Fanet, Arteris, France
Designing a System-On-Chip (SoC) for a critical consumer application such as multimedia requires Quality-Of-Service (QoS) techniques to provide throughput guarantees. Often QoS is viewed as a global Network-On-Chip (NoC) characteristic, when it is in fact a property attached to a few specific data flows, such as the video stream in the multimedia example. Designing a SoC with this fundamental difference as a guideline enables guaranteed performance in a very cost-efficient implementation.
In the proposed presentation, ARTERIS will describe the methodology called “adaptive-QoS”. Some concepts and principles will be reviewed to explain the idea behind this methodology, followed by a demonstration of designing a NoC for a multimedia application as an example. Then, ARTERIS will describe its QoS toolbox and strategy, which are part of its commercial offering and are used by R&D universities and the SoC industry.
A Reconfigurable NOC Platform for 4G Telecom applications
Jean-René Lèquepeys, Head of ASIC Design Department, LETI/DCIS/SCME, CEA Grenoble, France
Multi-core platforms are a reality... but where is the software support?
Rudy Lauwereins, IMEC, Belgium
The presentation starts by observing that both manufacturing non-recurring engineering (NRE) cost and design NRE cost are increasing rapidly. This leads to a need for flexible platforms accompanied by improved application mapping tools.
I then discuss the architectural evolution. Individual processors evolve towards supporting more instruction-level parallelism (e.g. two-dimensional VLIW processors) and more data-level parallelism (e.g. extreme-SIMD processors with very wide data words). Platforms become more flexible and heterogeneous, and are more power efficient. Mounting variability, leakage and interconnect problems call for disruptive technologies, using for example run-time calibration and monitoring to “live with what Mother Nature gives us”.
Next, I discuss why today's industrial design flows fall short for mapping complex applications onto heterogeneous multi-core platforms. I introduce three additional steps that are required on top of current design flows: (1) software washing, consisting of platform-independent source-to-source code optimizations, (2) designer-guided functional and data parallelization, and (3) platform-dependent mapping. I motivate the need for co-development of algorithm, platform and mapping.
Finally, I present two industrial case studies. The first is a software defined radio baseband platform, covering 802.11a, b, g, n, UMTS, 802.16e, WiBro, 4G cellular, Bluetooth, Zigbee, DAB, DMB, DVB-H and GPS in an area of 10 mm² in 90 nm CMOS, with peak power consumption of 300 mW for 802.11n 2-antenna MIMO WLAN. The second is a multi-format multimedia platform, covering MPEG-4, AVC and SVC encoding/decoding at up to HDTV quality in a scalable way. For both case studies, I present the architectural choices and the design technology used.
The presentation ends with a summary of the key observations.
The Diopsis Multi-Processor Tile of SHAPES
Pier Stanislao Paolucci, ATMEL Roma and INFN Roma, Italy
Diopsis is a family of RISC + VLIW DSP multiprocessor systems on chip combining high-performance floating-point numerical processing, advanced control capability and system interfaces. By adding a Distributed Network Processor, we will create the elementary tile of SHAPES, a Scalable Software Hardware Architecture Platform for Embedded Systems aiming at petaflops performance. Promising results achieved with the C compilation system for the mAgicV VLIW DSP provide solid foundations for a model-based and communication-aware programming environment.

Friday August 18--- Business day: Understanding the Value Chain for MPSoC

Keynote:

A New Business Model to Face the Challenges in the Ubiquitous Era
Masao Nakaya, Executive General Manager of Product Technology Unit, Renesas, Japan
The revolution in the semiconductor industry has been driven by improvements in performance. The increased integration level has exceeded most customers' requirements. Customers' requirements are therefore shifting toward price reduction of LSI products, and LSI manufacturers are striving to reduce silicon and design costs. To increase profit further, Renesas seeks a source of value not only in the LSI hardware itself, but also in software, systems, and the value chain.

Mini-keynotes:

Processors in FPGAs - Quo Vadis?
FPGA representative: Yankin Tanurhan, Sr. Director, Applications and IP Solutions, Actel Corporation, USA
From Tape-out to the Fab
EDA representative: Raul Camposano, CTO, Sr. VP and GM, Synopsys, USA
To be manufacturable at advanced technology nodes, a design has to undergo several steps after tape out. Mask synthesis is accomplished by a series of complex tools including RET (resolution enhancement technologies such as optical proximity correction), lithography or mask verification, fracturing and mask data preparation. In addition, designs have to fulfill a series of increasingly complex "manufacturing" constraints, such as CMP (Chemical Mechanical Polishing) rules and critical area rules. This talk gives a brief overview of what needs to happen after tape out and how this is changing the traditional design flow (before tape out).
The Critical Role of Algorithmic Engines in Consumer SoCs
Analyst: Jacques Benkoski, US Venture Partners, USA