Gi-Ho Park
Sejong University, Korea
Lightweight Yet Powerful: Transforming Large AI Models with Model Compression Techniques for On-Device AI — Focus on Sparse Matrix Representation and Quantization
Abstract
Recently, large Artificial Intelligence (AI) models, such as Large Language Models (LLMs) and Visual Language Models (VLMs), have become prevalent across many applications in everyday life. Following the scaling law, ever-larger AI models have been developed to achieve better performance, and today typical LLMs have over 100 billion parameters. While the performance of LLMs is impressive in various application areas, the computational requirements of such large models make them unsuitable for mobile devices, where on-device AI could have the greatest impact on daily life. To address this gap, model compression techniques such as pruning, quantization, and knowledge distillation have been actively investigated. This talk presents a new sparse matrix representation scheme for pruning and quantization developed for lightweight AI models.
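To make the two techniques named above concrete, the sketch below shows a generic baseline, not the speaker's new scheme: a pruned weight matrix stored in standard Compressed Sparse Row (CSR) form, followed by symmetric int8 quantization of the surviving values. The function names and the threshold parameter are illustrative assumptions.

```python
import numpy as np

def to_csr(dense, threshold=0.0):
    """Convert a dense matrix to CSR (values, column indices, row pointers).

    CSR stores only the nonzero entries, so a pruned weight matrix whose
    small weights were zeroed occupies far less memory than the dense form.
    """
    values, col_idx, row_ptr = [], [], [0]
    for row in dense:
        for j, v in enumerate(row):
            if abs(v) > threshold:
                values.append(v)
                col_idx.append(j)
        row_ptr.append(len(values))  # cumulative nonzero count per row
    return np.array(values), np.array(col_idx), np.array(row_ptr)

def quantize_int8(values):
    """Symmetric linear quantization of float values to int8."""
    scale = np.max(np.abs(values)) / 127.0 if len(values) else 1.0
    q = np.clip(np.round(values / scale), -127, 127).astype(np.int8)
    return q, scale  # dequantize with q * scale

# A small "pruned" weight matrix: most entries already zeroed out.
W = np.array([[0.0, 1.5, 0.0, 0.0],
              [0.0, 0.0, -2.3, 0.0],
              [0.7, 0.0, 0.0, 0.0]])

vals, cols, ptrs = to_csr(W)          # 3 stored values instead of 12
q_vals, scale = quantize_int8(vals)   # 1 byte per value instead of 8
```

Combining the two compresses both the index structure (pruning plus sparse storage) and the numeric precision (quantization), which is the pairing the talk's representation scheme targets.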
Biography
Gi-Ho Park received the B.S., M.S., and Ph.D. degrees in Computer Science from Yonsei University, Seoul, Korea, in 1993, 1995, and 2000, respectively. He is currently a Professor in the Department of Computer Science and Engineering at Sejong University, Korea. Before joining Sejong University, he worked for Samsung Electronics from 2002 to 2008 as a Senior Engineer in the Processor Architecture Lab, System LSI Division. His research interests include advanced computer architectures, AI accelerator design, memory system design, System-on-Chip (SoC) design, and low-power edge system design.
Download
gi-ho-park.pdf