
Efficient Generative AI

 

28 Apr 2025, Monday, 6:15 PM - 7:45 PM (GMT +8:00) Kuala Lumpur, Singapore

 

32 Carpenter Street, 059911


Overview

Generative AI, particularly large language models (LLMs) and diffusion models, has transformed how we interact with technology, enabling breakthroughs in text, image, and video generation. However, these models often come with significant computational and memory demands, making them challenging to deploy in real-time or on resource-constrained devices like edge platforms.

In this talk, we explore cutting-edge techniques designed to enhance the efficiency of generative AI without sacrificing performance. Through these advancements, we’ll highlight key design trade-offs, practical insights, and best practices for deploying next-generation generative AI models efficiently. Whether you’re a developer, researcher, or simply curious about the future of AI, this talk will provide a comprehensive look at how we’re pushing the boundaries of what’s possible with generative AI.

Schedule

Date: 28 Apr 2025, Monday
Time: 6:15 PM - 7:45 PM (GMT +8:00) Kuala Lumpur, Singapore
Location: 32 Carpenter Street, 059911

Speakers

Speaker's Profile:

Song Han, Associate Professor, Massachusetts Institute of Technology (Electrical Engineering & Computer Science)

Song Han earned his PhD from Stanford, where he pioneered efficient AI computing techniques such as “Deep Compression” (pruning, quantization) and the “Efficient Inference Engine.” The latter first introduced weight sparsity to modern AI chips and is one of the top-5 most cited papers in the 50-year history of ISCA (1973-2023). His innovations, including TinyML and hardware-aware neural architecture search (Once-for-All Network), have advanced AI model deployment on resource-constrained devices. His recent work on LLM quantization and acceleration (SmoothQuant, AWQ, StreamingLLM) has improved the efficiency of LLM inference and has been adopted by NVIDIA TensorRT-LLM. He co-founded DeePhi (now part of AMD) and OmniML (now part of NVIDIA) and developed the open lecture series EfficientML.ai to share advances in efficient ML research. https://hanlab.mit.edu/songhan
