Transferable Neural Architecture Search with Diffusion Models for the Real World


Hayeon Lee

Korea Advanced Institute of Science and Technology (KAIST)


Date: Friday, Sep 15, 2023, 14:00 - 15:00

Abstract:

Neural Architecture Search (NAS) is a promising AutoML technique for automating neural architecture design. However, its widespread adoption in real-world scenarios has been hindered by either excessive search time or limited generalization across different tasks. In this keynote, we present novel transferable NAS frameworks that shift the AutoML/NAS paradigm from a compute-intensive approach to a lightweight one by transferring knowledge from prior NAS tasks. In particular, this talk will introduce a transferable task-guided Neural Architecture Generation (NAG) framework that leverages diffusion models. By exploiting prior knowledge of previous tasks and neural architecture distributions, the proposed NAG framework can efficiently generate task-optimal architectures for a wide range of tasks, including tasks unseen during training. These capabilities allow us to rapidly search for computationally efficient and accurate AI models optimized for various edge devices and user datasets, enhancing the accessibility and practicality of AI in real-world use cases.
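The abstract does not spell out the mechanism, but the core idea of task-guided architecture generation with a diffusion model can be sketched briefly: a denoiser learns to reconstruct architecture encodings conditioned on a task embedding, and sampling from the reverse process yields candidate architectures for a new task. The following is a minimal, hypothetical PyTorch sketch, not the speaker's actual implementation; all names and design choices here (TaskConditionedDenoiser, a flat continuous architecture encoding, a standard DDPM noise schedule) are illustrative assumptions.

import torch
import torch.nn as nn

class TaskConditionedDenoiser(nn.Module):
    """Predicts the noise added to a continuous architecture encoding,
    conditioned on a task embedding (e.g., from a dataset encoder)."""
    def __init__(self, arch_dim=32, task_dim=16, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(arch_dim + task_dim + 1, hidden), nn.SiLU(),
            nn.Linear(hidden, hidden), nn.SiLU(),
            nn.Linear(hidden, arch_dim),
        )

    def forward(self, x_t, t, task_emb):
        # t is the timestep normalized to [0, 1], appended as a scalar feature.
        return self.net(torch.cat([x_t, task_emb, t], dim=-1))

T = 1000
betas = torch.linspace(1e-4, 0.02, T)      # linear DDPM noise schedule
alphas = 1.0 - betas
alpha_bars = torch.cumprod(alphas, dim=0)

def train_step(model, opt, x0, task_emb):
    """One DDPM training step: noise x0 at a random timestep, predict the noise."""
    b = x0.shape[0]
    t = torch.randint(0, T, (b,))
    noise = torch.randn_like(x0)
    ab = alpha_bars[t].unsqueeze(-1)
    x_t = ab.sqrt() * x0 + (1 - ab).sqrt() * noise
    pred = model(x_t, t.float().unsqueeze(-1) / T, task_emb)
    loss = ((pred - noise) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()

@torch.no_grad()
def sample(model, task_emb, arch_dim=32):
    """Reverse diffusion: start from noise, denoise conditioned on the task."""
    x = torch.randn(task_emb.shape[0], arch_dim)
    for i in reversed(range(T)):
        t = torch.full((task_emb.shape[0], 1), i / T)
        eps = model(x, t, task_emb)
        a, ab = alphas[i], alpha_bars[i]
        x = (x - (1 - a) / (1 - ab).sqrt() * eps) / a.sqrt()
        if i > 0:
            x = x + betas[i].sqrt() * torch.randn_like(x)
    return x  # in practice, decoded back to a discrete architecture

# Example usage (shapes only):
# model = TaskConditionedDenoiser()
# opt = torch.optim.Adam(model.parameters(), lr=1e-3)
# loss = train_step(model, opt, x0=torch.randn(8, 32), task_emb=torch.randn(8, 16))
# archs = sample(model, task_emb=torch.randn(4, 16))

At test time, one would embed a new, unseen task and call sample to draw candidate architectures directly, with no per-task search; this is the efficiency argument the abstract makes, under the assumption that the denoiser has learned transferable structure from prior tasks and architecture distributions.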

Bio:

Hayeon Lee received her Ph.D. in computer science from KAIST, South Korea, in August 2023, under the supervision of Prof. Sung Ju Hwang. During her Ph.D., her research received four spotlight presentations and one oral presentation at ICLR and NeurIPS. She was awarded the Google Ph.D. Fellowship in 2022 and the AI/CS/EE Rising Stars Award from Google Explore Computer Science Research in 2022 and 2023. In September 2023, she will join the FAIR team at Meta AI as a postdoctoral researcher under the supervision of Yuandong Tian. Hayeon’s research interests include NAS, meta-learning, generative models, and LLM efficiency.