😊 About me
Resume: Resume (Updated on 8 Oct, 2025)
Contact: vanzl3386 [at] gmail.com (preferred), vanzl [at] u.nus.edu, zhenglin.wan [at] ntu.edu.sg, 121090525 [at] link.cuhk.edu.cn
Zhenglin Wan (万政霖) received his B.Sc. from The Chinese University of Hong Kong in Fall 2025. He is currently a research staff member at Nanyang Technological University, advised by Prof. Bo An, and an incoming CS Ph.D. student in the Department of Computer Science, National University of Singapore (NUS). Previously, he spent a long time working with Prof. Ivor Tsang and collaborated with Prof. Ong Yew Soon at the Centre for Frontier AI Research (CFAR), IHPC, A*STAR in Singapore.
I am an RL (Reinforcement Learning) believer. Previously, I mainly worked on improving RL and inverse RL from the algorithmic side, for example improving policy diversity (as in EBC) and policy expressiveness (as in GoRL). Recently, I have been focusing on Agentic RL: using RL to empower LLM agents with long-horizon, human-like reasoning, planning, and decision-making skills.
I am grateful to have received help and guidance for my career from so many people, including Flint, David, Ivor, Bo, Xingrui, and many others. I also understand that many truly talented students may not have the opportunity to reach their full potential. So, if you are an undergraduate or master's student and believe I could offer advice, information, or opportunities that might help with your career, feel free to reach out to me by email. I am also a student mentor at CFAR, IHPC, A*STAR in Singapore. If you are interested in visiting or internship opportunities, you are welcome to drop me an email for a chat.
Mentees (research-based; most are co-mentored with scientists at CFAR, IHPC, A*STAR):
- Yaxin Zhou (Master's student @ CMU, America)
- Jingxuan Wu (Master's student @ UNC, America)
- Xu Pan (Master’s student @ Wuhan University)
- Chubin Zhang (Master’s student @ Beijing University of Posts and Telecommunications)
- Dongchu Xie (Undergrad @ CUHK(SZ))
- Xin Yan (Undergrad @ Beijing Normal University)
- He Ma (Undergrad @ CUHK(SZ))
I am also ALWAYS open to collaborations, networking, and internship opportunities. Feel free to contact me by email! (vanzl3386 [at] gmail.com)
🔥 News
- 2025.12 🎉 Our paper GoRL, for online Reinforcement Learning with generative policies, is on arXiv and GitHub!
- 2025.10 🎉 Our paper FM-IRL (Flow Matching for inverse RL) is on arXiv and GitHub.
- 2025.10 🎉 OSCAR (a training-free technique for diverse image generation) is on arXiv and GitHub.
- 2025.09 🎉🎉 Awarded the NUS Research Scholarship to support my Ph.D. studies at the National University of Singapore, starting in January 2026.
- 2025.09 🎉🎉 Joined Nanyang Technological University as a research staff member.
- 2025.08 🎉🎉 Received my B.Sc. from The Chinese University of Hong Kong with first-class honors. Looking forward to a new journey!
- 2025.05 🎉🎉 EBC is accepted by ICML 2025 (Code available).
- 2025.03 🎉🎉 One paper accepted by ICLR 2025 (Generative Models for Robot Learning workshop).
- 2024.12 🎉🎉 One paper accepted by AAMAS 2025 (oral).
- 2024.12 🎉🎉 One paper accepted by AAAI 2025 (oral).
- 2024.09 🎉🎉 One invention patent is published.
- 2024.09 🎉🎉 Awarded the Academic Performance Scholarship (for top 5% students) for two consecutive years.
- 2024.05 🎉🎉 One invention patent is officially granted.
- 2023.12 🎉🎉 Co-founded the company “Metasequoia Intelligence”, based in Shenzhen, China, as its technical co-founder.
- 2021.09 🎉🎉 Awarded the Zhejiang Guolong Inspirational Scholarship and the Diligentia Bowen Scholarship to support my undergraduate studies at CUHK(SZ).
- 2019.09 🎉🎉 Lucky to win first prize in the provincial Chinese Mathematics Olympiad (CMO). Thankful for this intellectually rewarding experience.
📝 Selected Publications
Conference papers and Preprints
-
ICML 2025
Forty-Second International Conference on Machine Learning
Enables agents to efficiently learn diverse and high-performing policies via Quality Diversity (Inverse) Reinforcement Learning.
-
Preprint
Generic framework for online Reinforcement Learning with generative policy classes (e.g., Diffusion Policy and Flow-Matching Policy).
-
Preprint
Equips Flow-Matching policies with exploratory strength when learning from demonstrations.
-
Preprint
Enables Flow-Matching models to generate diverse but alignment-respecting images in a training-free manner.
-
AAMAS 2025 (oral)
The 24th International Conference on Autonomous Agents and Multiagent Systems
Oral presentation
A new paradigm for imitation learning.
-
AAAI 2025 (oral)
The 39th Annual AAAI Conference on Artificial Intelligence
Oral presentation (4.65%)
Recommends the next point of interest from sparse and noisy data via inverse RL.
-
ICLR 2025 (workshop)
The Thirteenth International Conference on Learning Representations
Quality Diversity Imitation Learning techniques that let robots acquire a variety of skills.
-
Preprint
A brain-inspired framework for embodied agentic learning.
-
Preprint
Graph-based Reinforcement learning Approach for influential Node Detection in airport delay networks
RL to solve combinatorial optimization problems in transportation systems.
-
ADC 2024 (oral & best paper runner-up)
Australasian Database Conference
Oral presentation & Best Paper Runner-up
Map-matching techniques with strong awareness of spatial-temporal information.
Invention Patents
As these works are patented in China, all these names are directly translated from Chinese.
-
A Method, System, Terminal Device, and Storage Medium for Air Quality Spatial Inference (Granted)
Inventors: Jun Song, Yibo Xu, Yiwen Pan, Maohao Ran, Zhenglin Wan, Xiaoyun Yan, Yike Guo
-
A Single-UAV Atmospheric Pollutant Source Tracing Method Based on Gradient Ascent and Physical Kinematics (Public)
Inventors: Zhenglin Wan, Jun Song, Yibo Xu, Maohao Ran, Yike Guo
White Paper
As these works are presented in China, these names are directly translated from Chinese.
- White Paper on Cross-Border Economic Large Language Model
📖 Education
-
Doctor of Philosophy (Ph.D)
National University of Singapore (NUS)
- Affiliation: Department of Computer Science, School of Computing
- Advisor: TBD
2026.01 -

-
Bachelor of Science (B.Sc)
The Chinese University of Hong Kong (CUHK)
- First-class honors
- Major: Statistics & Data Science, GPA: 3.85/4.0, Rank: 7%
- Completed my undergraduate studies at the CUHK-Shenzhen campus; the degree is conferred by CUHK.
2021.09 - 2025.08

💻 Internships and Work Experiences
-
Full-time Research Staff
Nanyang Technological University, Singapore
- Affiliation: College of Computing and Data Science (CCDS)
- Advisor: Prof. Bo An
2025.09 - 2026.01

-
Intern Researcher
Agency for Science, Technology and Research (A*STAR), Singapore
- Affiliation: Centre for Frontier AI Research (CFAR), Institute of High Performance Computing (IHPC)
- Advisor: Prof. Ivor Tsang, Dr. Xingrui Yu
2024.07 - 2025.09 (3 months full-time + 11 months part-time)

-
Research Assistant
The Chinese University of Hong Kong, Shenzhen
- Affiliation: School of Data Science
- Advisor: Prof. Jianfeng Mao
2023.06 - 2024.06

-
Machine Learning Algorithm Engineer Intern
HUIYINTONG, Shenzhen, China
- Mentor: Dr. Jun Song
2023.07 - 2024.01
🎈 Services
- Reviewer for AAAI, ICLR, ICML, NeurIPS
🎖 Honors, Awards and Scholarships
-
NUS Research Scholarship (approx. ¥250000/year plus full tuition fee subsidy, for the entire Ph.D. studies at NUS)
From January 2026
-
First-class honors (undergraduate), awarded by The Chinese University of Hong Kong
August 2025
-
Yearly Academic Scholarship: B Class (for GPA Top 3%, ¥40000)
September 2024
-
Yearly Academic Scholarship: C Class (for GPA Top 5%, ¥20000)
September 2023
-
Yearly Dean's List Award (Outstanding first-class Performance)
Three consecutive years: September 2022 - September 2025
-
Diligentia Bowen Scholarship (Undergraduate Admission Scholarship, ¥120000)
September 2021
-
Zhejiang Guolong Inspirational Scholarship (Undergraduate Admission Scholarship, ¥120000)
September 2021
-
First prize in the Chinese Mathematics Olympiad (CMO), Chongqing Province
September 2019
💬 Press/Media

- Co-author of the first White Paper on Cross-Border Economic Large Language Model in Shenzhen, China. Shenzhen TV: Shenzhen releases its first white paper on a cross-border economic large model
Miscellaneous
- In my spare time, I’m a music enthusiast. I’ve been playing guitar for more than 10 years and began teaching myself the piano when I was 15. During my undergraduate studies, I played music in two bands: “Minor Blue” and “Major Pink.”
-
I have also played chess for 15 years and hold the title of “National Level-3 Athlete”. I love the process of comprehensive planning, logical thinking, and reasoning. Visit my Lichess profile.
-
I love playing basketball 🏀. Sports keep me energetic.
-
I play video games like League of Legends, where my highest rank to date is “Diamond”. I also play AAA games like Elden Ring, Dark Souls, NieR: Automata, and The Elder Scrolls.
-
I have a deep interest in philosophy of mind, particularly Buddhism and Taoism, as paths to explore the fundamental nature of human existence. I am also intrigued by the potential integration of these philosophical insights with modern artificial intelligence.