😊 About me
Resume: Resume (Updated on 6 Jan, 2026)
Contact: vanzl3386 [at] gmail.com (preferred), vanzl [at] u.nus.edu, 121090525 [at] link.cuhk.edu.cn
Hi! I am Zhenglin Wan (万政霖), a first-year CS Ph.D. student in the HPC-AI Lab at the National University of Singapore (NUS), advised by Prof. Yang You and supported by the NUS Research Scholarship. Previously, I worked as a full-time researcher at Nanyang Technological University (NTU) with Prof. Bo An. I received my B.Sc. from The Chinese University of Hong Kong (CUHK) in Fall 2025 (I completed my undergraduate studies at the Shenzhen campus). I have also spent a long time working with Prof. Ivor Tsang at the Centre for Frontier AI Research (CFAR), IHPC, A*STAR in Singapore, and have interned at the Hong Kong Generative AI Research & Development Center (HKGAI), led by Prof. Yike Guo, Provost of HKUST.
I am an RL (Reinforcement Learning) believer. Previously, I mainly worked on improving RL and inverse RL from the algorithmic side, for example by improving policy diversity (as in EBC) and policy expressiveness (as in GoRL). Recently, I have been focusing on RL and LLM agents. Specifically:
- The synergy and relationship between RL post-training and world models in LLM agents.
- Efficiency in LLM agent systems and RL post-training infrastructure.
I am grateful to have received help and guidance for my career from so many people, including Flint, Song, David, Ivor, Bo, Xingrui, and many others. I also understand that many truly talented students may not have the opportunity to reach their full potential. So, if you are an undergraduate or master's student and believe I could offer advice, information, or opportunities that might help with your career, feel free to reach out to me by email. I am also a student mentor at CFAR, IHPC, A*STAR and at the HPC-AI Lab at NUS. If you are looking for visiting, internship, or RA opportunities, or for unofficial Ph.D. application consultation, you are welcome to drop me an email for a chat.
It is my honor to have mentored and worked with these talented individuals:
Mentees (research-based, from CFAR, IHPC, A*STAR or NUS HPC-AI Lab):
- Yaxin Zhou (Master's student @ CMU, America; Author of [[CaveAgent]])
- Jingxuan Wu (Master's student @ UNC, America; Author of [[OSCAR]])
- Xu Pan (Master’s student @ Wuhan University)
- Chubin Zhang (Master’s student @ Beijing University of Posts and Telecommunications; Author of [[GoRL]])
- Dongchu Xie (Undergrad @ CUHK(SZ))
- Xin Yan (Undergrad @ Beijing Normal University)
🔥 News
- 2026.01 🎉 We released our new work CaveAgent (a new product-inspired function calling paradigm for LLM agents) ([Paper], [Source Code])!
- 2025.12 🎉 We released our new work GoRL for online Reinforcement Learning with generative policies. ([Paper], [Code])
- 2025.10 🎉 Our paper FM-IRL (Flow Matching for inverse RL) is on arXiv. ([Paper], [Code])
- 2025.10 🎉 OSCAR (a training-free technique for diverse image generation) is on arXiv. ([Paper], [Code])
- 2025.08 🎉🎉 Received my B.Sc. from The Chinese University of Hong Kong with first-class honors. Looking forward to a new journey!
- 2025.05 🎉🎉 EBC is accepted by ICML 2025 (Code available).
- 2025.03 🎉🎉 One paper accepted by ICLR 2025 (generative models for robot learning workshop).
- 2024.12 🎉🎉 One paper accepted by AAMAS 2025 (oral).
- 2024.12 🎉🎉 One paper accepted by AAAI 2025 (oral).
- 2024.09 🎉🎉 One invention patent is published.
- 2024.09 🎉🎉 Awarded the Academic Performance Scholarship (for top 5% students) for two consecutive years.
- 2024.05 🎉🎉 One invention patent is officially granted.
- 2023.12 🎉🎉 Co-founded the company “Metasequoia Intelligence”, based in Shenzhen, China, as technical co-founder.
- 2021.09 🎉🎉 Awarded the Zhejiang Guolong Inspirational Scholarship and the Diligentia Bowen Scholarship to support my undergraduate studies at CUHK(SZ).
- 2019.09 🎉🎉 Lucky to win first prize in the provincial Chinese Mathematics Olympiad (CMO). Grateful for this intellectually rewarding experience.
📝 Selected Publications
Conference papers and Preprints
Please scroll down to view all publications. * denotes joint-first-author and equal contribution.
-
Technical Report
Core product technology adopted by HKGAI and InnoHK. [website]
A new function calling paradigm for LLM agents featuring Stateful Runtime Management.
-
ICML 2025
Forty-Second International Conference on Machine Learning
Enables agents to efficiently learn diverse and high-performing policies via Quality Diversity (Inverse) Reinforcement Learning.
-
preprint
Generic framework for online Reinforcement Learning with generative policy classes (e.g., Diffusion Policy and Flow-Matching Policy).
-
preprint
Enables Flow-Matching models to generate diverse but alignment-respecting images in a training-free manner.
-
preprint
Reinforcing Self-Compression for Optical Agent Memory.
-
AAMAS 2025 (oral)
The 24th International Conference on Autonomous Agents and Multiagent Systems
Oral presentation
A new paradigm for imitation learning.
-
preprint
Equips Flow-Matching policies with exploratory strength when learning from demonstrations.
-
AAAI 2025 (oral)
The 39th Annual AAAI Conference on Artificial Intelligence
Oral presentation (4.65%)
Recommends the next point of interest from sparse and noisy data via inverse RL.
-
ICLR 2025 (workshop)
The Thirteenth International Conference on Learning Representations
Quality Diversity Imitation Learning techniques that let robots acquire a variety of skills.
-
preprint
A brain-inspired framework for embodied agentic learning.
-
preprint
Graph-based Reinforcement learning Approach for influential Node Detection in airport delay networks
RL to solve combinatorial optimization problems in transportation systems.
-
ADC 2024 (oral & best paper runner-up)
Australasian Database Conference
Oral presentation & Best Paper Runner-up
Map-matching techniques with strong awareness of spatial-temporal information.
Invention Patents
As these works are patented in China, all these names are directly translated from Chinese.
-
A Method, System, Terminal Device, and Storage Medium for Air Quality Spatial Inference (Granted)
Inventors: Jun Song, Yibo Xu, Yiwen Pan, Maohao Ran, Zhenglin Wan, Xiaoyun Yan, Yike Guo
-
A Single-UAV Atmospheric Pollutant Source Tracing Method Based on Gradient Ascent and Physical Kinematics (Published)
Inventors: Zhenglin Wan, Jun Song, Yibo Xu, Maohao Ran, Yike Guo
White Paper
As this work was presented in China, its name is directly translated from Chinese.
- White Paper on Cross-Border Economic Large Language Model
📖 Education
-
Doctor of Philosophy (Ph.D)
National University of Singapore (NUS)
- Affiliation: Department of Computer Science, School of Computing
- Advisor: Prof. Yang You
2026.01 -

-
Bachelor of Science (B.Sc)
The Chinese University of Hong Kong (CUHK)
- First-class honors
- Major: Statistics & Data Science, GPA: 3.85/4.0, Rank: top 7%
- Completed my undergraduate studies at the CUHK Shenzhen campus; the degree is conferred by CUHK.
2021.09 - 2025.08

💻 Internships and Work Experiences
-
Full-time Research Staff
Nanyang Technological University, Singapore
- Affiliation: College of Computing and Data Science (CCDS)
- Advisor: Prof. Bo An
2025.09 - 2026.01

-
Intern (Remote)
Hong Kong Generative AI Research & Development Center (HKGAI), HKUST
- Affiliation: HKGAI
- Director: Prof. Yike Guo
- Proposed a new product-inspired Agentic Function Calling paradigm ([[CaveAgent]]).
2025.05 - 2025.09

-
Intern Researcher
Agency for Science, Technology and Research (A*STAR), Singapore
- Affiliation: Centre for Frontier AI Research (CFAR), Institute of High Performance Computing (IHPC)
- Advisor: Prof. Ivor Tsang, Dr. Xingrui Yu
2024.07 - 2025.09 (3 months full-time + 11 months part-time)

-
Research Assistant
The Chinese University of Hong Kong, Shenzhen
- Affiliation: School of Data Science
- Advisor: Prof. Jianfeng Mao
2023.06 - 2024.06

-
Machine Learning Algorithm Engineer Intern
HUIYINTONG, Shenzhen, China
- Mentor: Dr. Jun Song
2023.07 - 2024.01
🎈 Services
- Reviewer for AAAI, ICLR, ICML, NeurIPS
🎖 Honors, Awards and Scholarships
-
NUS Research Scholarship (approx. ¥250000/year plus full tuition fee subsidy, for the entire Ph.D. program at NUS)
-
First-class honors, awarded by The Chinese University of Hong Kong
-
Yearly Academic Scholarship: B Class (for GPA Top 3%, ¥40000)
-
Yearly Academic Scholarship: C Class (for GPA Top 5%, ¥20000)
-
Dean's List Award (outstanding first-class performance, awarded yearly for 3 years)
-
Diligentia Bowen Scholarship (¥120000, Undergraduate Admission Scholarship for first prize in the provincial CMO)
-
Zhejiang Guolong Inspirational Scholarship (¥120000, Undergraduate Admission Scholarship for top 0.5% students in the Chinese College Entrance Exam)
-
First prize in the Chinese Mathematics Olympiad (CMO), Chongqing Province
💬 Press/Media

- Co-author of the first White Paper on Cross-Border Economic Large Language Model in Shenzhen, China. Shenzhen Satellite TV: Shenzhen releases its first white paper on cross-border economic large language models
Miscellaneous
- In my spare time, I’m a music enthusiast. I’ve been playing guitar for more than 10 years and began teaching myself the piano when I was 15. During my undergraduate studies, I played music in two bands: “Minor Blue” and “Major Pink.” See our photos:
-
I have also played chess for 15 years and hold the title of “National Level-3 Athlete”. I love the process of comprehensive planning, logical thinking, and reasoning. Visit my Lichess profile.
-
I love playing basketball 🏀. Sports keep me energized.
-
I play video games like League of Legends, where my highest rank to date is “Diamond”. I also play AAA games like Elden Ring, Dark Souls, NieR: Automata, and The Elder Scrolls.
-
I have a deep interest in philosophy of mind, particularly Buddhism and Taoism, as paths to explore the fundamental nature of human existence. I am also intrigued by the potential integration of these philosophical insights with modern artificial intelligence.