Lennart Bamberg
NXP
Rethinking AI/ML Acceleration at the Edge: Beyond TOPS
Abstract
Today’s AI/ML engines are predominantly assessed by their peak tera operations per second (TOPS). This talk argues that, especially at the extreme edge, TOPS do not correlate well with the acceleration that AI/ML inference engines actually provide. We investigate what a computer architecture requires beyond high TOPS to deliver best-in-class acceleration of modern AI/ML models on resource-constrained devices. Rather than treating latency/speed as the only measure of an AI/ML engine's quality, we examine which other metrics, such as accuracy, ease of use, or cost, are relevant for assessing edge-AI engines quantitatively and qualitatively. Through this talk, we want to trigger a discussion in the community on which simple yet relevant metric should replace TOPS as the standard for comparing embedded AI/ML inference engines. Is a CoreMark equivalent the answer for edge AI?
Biography
Lennart Bamberg is a Senior-Principal AI/ML Processor Architect and leads the Advanced Computer Architecture Group at NXP Semiconductors in Hamburg, Germany. He is in charge of the computer architecture for NXP's high-end Neural Processing Unit (NPU) portfolio. Alongside his industry role, Lennart is a Lecturer at the University of Bremen and the Technical University of Hamburg, where he teaches courses on edge-AI hardware, software, and algorithms. Before joining NXP, Lennart was a Principal Processor Architect at GrAI Matter Labs, an Invited Researcher at Georgia Tech, and a Researcher at the University of Bremen. He earned his Ph.D. from the University of Bremen with highest honors in 2020.