Haodong Yan Homepage

Biography

I am a PhD student in Intelligent Transportation at The Hong Kong University of Science and Technology (Guangzhou), where I work with Prof. Haoang Li in the IRPN Lab. My research focuses on foundation models for embodied intelligence, with current interests in Video-Action Models, World Models, and Video Generation.

Before starting my PhD, I received an M.Eng. in Mechanical Engineering and a second bachelor's degree in Computer Science and Technology from Xi'an Jiaotong University. I am interested in building models that can understand dynamics, predict future interactions, and support downstream reasoning and decision making for robotic systems.

News

2026 · Our paper SCAR was accepted to CVPR 2026.
2025 · Started a project on hand-object interaction video generation at IRPN Lab, HKUST (GZ).
2024 · Began my PhD in Intelligent Transportation at HKUST (Guangzhou).
2024 · GazeMoDiff was published at Pacific Graphics 2024, and our work on photometric bundle adjustment appeared at IROS 2024.
2023 · Completed research internships with the University of Stuttgart and the Technical University of Munich.

Selected Publications

Open-world Hand-Object Interaction Video Generation Based on Structure and Contact-aware Representation

Haodong Yan, Hang Yu, Zhide Zhong, Weilin Yuan, Xin Gong, Zehang Luo, Chengxi Heyu, Junfeng Li, Wenxuan Song, Shunbo Zhou, Haoang Li

CVPR 2026

A scalable structure- and contact-aware representation for generating realistic hand-object interaction videos that generalize to open-world scenarios.

Paper Project Code

ReconVLA: Reconstructive Vision-Language-Action Model as Effective Robot Perceiver

Wenxuan Song, Ziyang Zhou, Han Zhao, Jiayi Chen, Pengxiang Ding, Haodong Yan, Yuxin Huang, Feilong Tang, Donglin Wang, Haoang Li

AAAI 2026 · Outstanding Paper Award

A reconstructive vision-language-action model that improves robot perception by reconstructing task-relevant visual regions for downstream manipulation.

Paper Project Code

FlowVLA: Visual Chain of Thought-based Motion Reasoning for Vision-Language-Action Models

Zhide Zhong, Haodong Yan, Junfeng Li, Xiangchen Liu, Xin Gong, Tianran Zhang, Wenxuan Song, Jiayi Chen, Xinhu Zheng, Hesheng Wang, Haoang Li

arXiv 2025

A visual chain-of-thought framework for motion reasoning in VLAs that predicts future dynamics before generating the final action.

Paper Project

GazeMoDiff: Gaze-guided Diffusion Model for Stochastic Human Motion Prediction

Haodong Yan, Zhiming Hu, Syn Schmitt, Andreas Bulling

Pacific Graphics 2024

A multimodal diffusion framework that uses gaze to improve stochastic human motion prediction.

Paper Project Code

Physically-Based Photometric Bundle Adjustment in Non-Lambertian Environments

Cheng Lei, Junpeng Hu, Haodong Yan, Mariia Gladkova, Tianyu Huang, Yun-Hui Liu, Daniel Cremers, Haoang Li

IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2024

Photometric bundle adjustment with material and illumination awareness for challenging non-Lambertian scenes.

Paper Project

A Graph Embedded in Graph Framework with Dual-sequence Input for Efficient Anomaly Detection of Complex Equipment under Insufficient Samples

Haodong Yan, Fudong Li, Jinglong Chen, Zijun Liu, Jun Wang, Yong Feng, Xinwei Zhang

Reliability Engineering & System Safety, 2023

A graph-based anomaly detection framework designed for complex equipment under limited-data settings.

Paper

Memory-augmented Skip-connected Autoencoder for Unsupervised Anomaly Detection of Rocket Engines with Multi-source Fusion

Haodong Yan, Zijun Liu, Jinglong Chen, Yong Feng, Jun Wang

ISA Transactions, 2023

An unsupervised anomaly detection model that combines memory augmentation and multi-scale skip connections for rocket engine monitoring.

Paper Code

Virtual Sensor-based Imputed Graph Attention Network for Anomaly Detection of Equipment with Incomplete Data

Haodong Yan, Jun Wang, Jinglong Chen, Zijun Liu, Yong Feng

Journal of Manufacturing Systems, 2022

A graph-based framework that imputes missing sensor readings and performs anomaly detection on incomplete multi-sensor equipment data.

Paper

Education

PhD in Intelligent Transportation, The Hong Kong University of Science and Technology (Guangzhou)

Sep 2024 - Present

M.Eng. in Mechanical Engineering, Xi'an Jiaotong University

Sep 2021 - Jul 2024 · GPA 3.73/4.0 · Rank 1/281

B.Eng. in Mechanical Engineering, Xi'an Jiaotong University

Sep 2017 - Jan 2021

Second Bachelor's Degree in Computer Science and Technology, Xi'an Jiaotong University

Jul 2019 - Jul 2021

Experience & Awards

Internship, Institute for Visualisation and Interactive Systems, University of Stuttgart

Jul 2023 - Oct 2023

Remote Internship, Computer Vision Group, Technical University of Munich

Apr 2023 - Dec 2023

National Scholarship, Ministry of Education, China

2022 and 2023

National First Prize, Chinese University Students Mechanical Innovation Competition

2020