Gi-Ho Park
Sejong University, Korea
Lightweight Yet Powerful: Transforming Large AI Models with Model Compression Techniques for On-Device AI — Focus on Sparse Matrix Representation and Quantization
Abstract
Recently, large Artificial Intelligence (AI) models, such as Large Language Models (LLMs) and Visual Language Models (VLMs), have become prevalent across many applications in everyday life. Following the scaling law, ever-larger AI models have been developed to achieve better performance, and today typical LLMs have over 100 billion parameters. While the performance of LLMs is impressive in various application areas, the computational requirements of such large models make them unsuitable for mobile devices, where on-device AI could have the greatest impact on daily life. To address this gap, model compression techniques such as pruning, quantization, and knowledge distillation have been actively investigated. This talk presents a new sparse matrix representation scheme for pruning and quantization developed for lightweight AI models.
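To make the two techniques named above concrete, the sketch below shows a generic baseline, not the speaker's new scheme: a pruned weight matrix stored in standard Compressed Sparse Row (CSR) form, followed by symmetric int8 quantization of the surviving values. The function names and the threshold parameter are illustrative assumptions.

```python
import numpy as np

def to_csr(dense, threshold=0.0):
    """Convert a dense matrix to CSR (values, column indices, row pointers).

    CSR stores only the nonzero entries, so a pruned weight matrix whose
    small weights were zeroed occupies far less memory than the dense form.
    """
    values, col_idx, row_ptr = [], [], [0]
    for row in dense:
        for j, v in enumerate(row):
            if abs(v) > threshold:
                values.append(v)
                col_idx.append(j)
        row_ptr.append(len(values))  # cumulative nonzero count per row
    return np.array(values), np.array(col_idx), np.array(row_ptr)

def quantize_int8(values):
    """Symmetric linear quantization of float values to int8."""
    scale = np.max(np.abs(values)) / 127.0 if len(values) else 1.0
    q = np.clip(np.round(values / scale), -127, 127).astype(np.int8)
    return q, scale  # dequantize with q * scale

# A small "pruned" weight matrix: most entries already zeroed out.
W = np.array([[0.0, 1.5, 0.0, 0.0],
              [0.0, 0.0, -2.3, 0.0],
              [0.7, 0.0, 0.0, 0.0]])

vals, cols, ptrs = to_csr(W)          # 3 stored values instead of 12
q_vals, scale = quantize_int8(vals)   # 1 byte per value instead of 8
```

Combining the two compresses both the index structure (pruning plus sparse storage) and the numeric precision (quantization), which is the pairing the talk's representation scheme targets.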
Biography
Gi-Ho Park received the B.S., M.S., and Ph.D. degrees in Computer Science from Yonsei University, Seoul, Korea, in 1993, 1995, and 2000, respectively. He is currently a Professor in the Department of Computer Science and Engineering at Sejong University, Korea. Before joining Sejong University, he worked for Samsung Electronics from 2002 to 2008 as a Senior Engineer in the Processor Architecture Lab, System LSI Division. His research interests include advanced computer architectures, AI accelerator design, memory system design, System-on-Chip (SoC) design, and low-power edge system design.
Download
gi-ho-park.pdf