Speaker's Profile

Song Yao

DeePhi Tech, China

Bandwidth-Centric Deep Learning Processing through Software-Hardware Co-Design

Abstract

Bandwidth matters. The performance of a deep learning computing platform depends largely on its memory system and available bandwidth. Sparsity and low precision are necessary to achieve high energy efficiency for deep learning inference. However, exploiting them cannot rely on hardware alone.

In this talk, we will introduce a software-hardware co-design methodology for accelerating deep learning algorithms, which consists of compression, compilation, and hardware acceleration. A full-stack software development kit called DNNDK is proposed for developing compressed sparse neural networks. Two customized architectures, Aristotle and Descartes, are also proposed to accelerate compressed neural networks. With the proposed methodology, it is possible, even on an FPGA, to achieve more than 10x the energy efficiency of the latest GPU products.

Biography

17th INTERNATIONAL FORUM ON MPSoC

for software-defined hardware

MPSoC 2017

For further information, please send email to Frédéric Pétrot
