Trends, Challenges and Solutions for Neural Processing in Edge Applications
The neural network architectures used in embedded real-time applications are evolving quickly. Among other key innovations, transformers have become a leading deep learning approach for natural language processing and other sequential, time-series applications. Now, transformer-based deep learning architectures are also being applied to vision applications, with state-of-the-art results compared to CNN-based solutions. In the first part of the presentation, we introduce transformers and contrast them with the CNNs commonly used for vision tasks today. We will examine the key features of transformer model architectures and show performance comparisons between transformers and CNNs.
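As background for that contrast: the operation that distinguishes transformers from CNNs is attention, which weights every input position against every other position, rather than applying a fixed local filter window as a convolution does. A minimal NumPy sketch of scaled dot-product attention (illustrative only, not Synopsys code):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V.

    Each output row is a convex combination of the rows of V,
    with weights determined by query/key similarity -- a global
    receptive field, unlike a convolution's local window.
    """
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)
    # Numerically stable row-wise softmax.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

# Tiny example: 3 positions, 3-dim keys/queries, 2-dim values.
Q = K = np.eye(3)
V = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
out = scaled_dot_product_attention(Q, K, V)
```

In a full transformer this core is wrapped in learned projections, multiple heads, and feed-forward layers, but the quadratic all-pairs interaction above is what drives both its modeling power and its memory-bandwidth demands on embedded hardware.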
Advanced driver assistance systems (ADAS), surveillance, digital TVs, cameras, and other emerging AI applications that implement complex neural network models are putting greater demands on compute and memory resources, often for safety-critical functions. In the second part of the presentation, we introduce the new NPX6 Neural Processing Unit, a scalable and flexible architecture that addresses a wide range of application requirements:
- Scales up to 96K MACs in a single NPU, delivering up to 250 tera operations per second (TOPS) at 1.3 GHz on 5nm processes in worst-case conditions, or up to 440 TOPS by using new sparsity features, which increase performance and decrease energy consumption when executing a neural network
- Integrates hardware and software connectivity features that enable implementation of multiple NPU instances to achieve up to 3,500 TOPS of performance on a single SoC
- Supports a wide range of neural network models – from CNNs and RNNs/LSTMs to transformers and recommenders
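As a rough sanity check on the headline figures (assuming 96K = 96 × 1,024 MACs and two operations per multiply-accumulate): 98,304 × 2 × 1.3 GHz ≈ 256 TOPS, consistent with the quoted 250 TOPS; the 440 TOPS figure comes from skipping operations on zero-valued weights. The announcement does not describe the NPX6 sparsity scheme; the sketch below illustrates one common form, 2:4 structured weight sparsity, in which the two smallest-magnitude weights in every group of four are zeroed so hardware can skip half the multiplies:

```python
import numpy as np

def prune_2_of_4(w):
    """Zero the two smallest-magnitude weights in each group of four.

    Generic illustration of 2:4 structured sparsity -- not the
    (undisclosed) NPX6 scheme. A sparsity-aware MAC array can skip
    the zeroed multiplies, roughly doubling effective throughput.
    """
    groups = w.reshape(-1, 4).copy()
    # Indices of the two smallest-magnitude weights per group of 4.
    smallest = np.argsort(np.abs(groups), axis=1)[:, :2]
    np.put_along_axis(groups, smallest, 0.0, axis=1)
    return groups.reshape(w.shape)

w = np.array([-3.5, -2.5, -1.5, -0.5, 0.5, 1.5, 2.5, 3.5])
sparse = prune_2_of_4(w)  # exactly 2 non-zeros survive per group of 4
```

In practice the pruned network is usually fine-tuned afterwards to recover the accuracy lost to zeroing, which is why sparsity support involves software tooling as well as hardware.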
We highlight some of the most advanced hardware and software optimizations developed to reduce bandwidth and latency for leading-edge NN models mapped automatically to the NPX6 NPU, and use industry benchmarks to illustrate the benefits.
Dr. Pierre G. Paulin is Senior Director of R&D for AI and Vision Processors at Synopsys. He is responsible for system-level applications, architecture design and software programming tools for NPUs and DSPs supporting classical and deep-learning-based solutions. Prior to this, he was director of System-on-Chip Platform Automation at STMicroelectronics in Canada, working on platform programming tools for multi-processor systems-on-chip targeting computer vision, video codecs and network processors.
This followed his previous positions as director of Embedded Systems Technologies for STMicroelectronics in Grenoble, France, and manager of embedded software and high-level synthesis tools with Nortel Networks in Canada. His interests include AI, embedded vision, video processing, multi-processor systems, and system-level design.
He obtained a Ph.D. from Carleton University, Ottawa, and B.Sc. and M.Sc. degrees from Laval University, Quebec. He won the best paper award at CODES+ISSS in 2004. He is a member of the IEEE.