I am currently serving as an Algorithm Expert at DAMO Academy, Alibaba Group, based in Hangzhou, where I have had the privilege of receiving guidance from Prof. Qixing Huang and Prof. Ping Tan.
I earned my Bachelor’s degree (2010–2015) from Chongqing Medical University and completed my Ph.D. (2015–2020) at Fudan University, specifically at the Shanghai Key Lab of Medical Image Computing and Computer-Assisted Intervention, where I was supervised by Prof. Zhijian Song and co-supervised by Prof. Chenxi Zhang and Prof. Manning Wang.
My research passions lie in the domains of Computer Vision, 3D Vision, and Medical Image Processing. I have contributed more than 10 publications to prestigious international AI conferences and journals, including CVPR, SIGGRAPH, NeurIPS, ECCV, ICRA, and TIP.
🔥 News
- 2025.02: 🎉 One paper accepted to IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2025).
- 2025.01: 🎉 Two papers accepted to International Conference on Learning Representations (ICLR 2025).
📝 Publications

LAM: Large Avatar Model for One-shot Animatable Gaussian Head
Yisheng He, Xiaodong Gu, Xiaodan Ye, Chao Xu, Zhengyi Zhao, Yuan Dong#, Weihao Yuan#, Zilong Dong, Liefeng Bo
- A large avatar model that generates an animatable Gaussian head avatar from a single image, allowing animation and rendering without an additional network.

Atlas Gaussians Diffusion for 3D Generation
Haitao Yang*, Yuan Dong*, Hanwen Jiang, Dejia Xu, Georgios Pavlakos, Qixing Huang
- A new 3D representation that can efficiently decode a sufficiently large and theoretically infinite number of 3D Gaussians for high-quality 3D generation.

Ctrl-Room: Controllable Text-to-3D Room Meshes Generation with Layout Constraints
Chuan Fang*, Yuan Dong*, Kunming Luo, Xiaotao Hu, Rakesh Shrestha, Ping Tan
- A two-stage method for 3D room generation from pure text input, which separates the geometric layout generation and appearance generation.

GPLD3D: Latent Diffusion of 3D Shape Generative Models by Enforcing Geometric and Physical Priors
Yuan Dong*, Qi Zuo*, Xiaodong Gu, Weihao Yuan, Zhengyi Zhao, Zilong Dong, Liefeng Bo, Qixing Huang
- A novel latent diffusion shape-generative model regularized by a quality checker that outputs a score of a latent code.

PanoContext-Former: Panoramic Total Scene Understanding with a Transformer
Yuan Dong*, Chuan Fang*, Liefeng Bo, Zilong Dong, Ping Tan
- A new fully 3D method for total scene understanding from a single RGB panorama in an end-to-end fashion, which better preserves the intrinsic structure of the indoor scene.

High-Fidelity 3D Textured Shapes Generation by Sparse Encoding and Adversarial Decoding
Qi Zuo*, Xiaodong Gu*, Yuan Dong*, Zhengyi Zhao, Weihao Yuan, Lingteng Qiu, Liefeng Bo, Zilong Dong
- A novel sparse encoding and dense decoding paradigm that can achieve both high-fidelity single-class generation.

Yuan Dong, Chenxi Zhang, Dafeng Ji, Manning Wang, Zhijian Song
- A new framework for selecting an appropriate scan resolution and scan mode in image-guided neurosurgery.

2D-3D Point Set Registration Based on Global Rotation Search
Yinlong Liu*, Yuan Dong*, Zhijian Song, Manning Wang
- New bounds for a global rotation search in 2D-3D registration.
More Publications:
- AniGS: Animatable Gaussian Avatar from a Single Image with Inconsistent Gaussian Reconstruction, Lingteng Qiu, Shenhao Zhu, Qi Zuo, Xiaodong Gu, Yuan Dong, Junfei Zhang, Chao Xu, Zhe Li, Weihao Yuan, Liefeng Bo, Guanying Chen, Zilong Dong, CVPR 2025
- LaMP: Language-Motion Pretraining for Motion Generation, Retrieval, and Captioning, Zhe Li, Weihao Yuan, Yisheng He, Lingteng Qiu, Shenhao Zhu, Xiaodong Gu, Weichao Shen, Yuan Dong, Zilong Dong, Laurence Tianruo Yang, ICLR 2025
- MoGenTS: Motion Generation based on Spatial-Temporal Joint Modeling, Weihao Yuan, Yisheng He, Weichao Shen, Yuan Dong, Xiaodong Gu, Zilong Dong, Liefeng Bo, NeurIPS 2024
- An Optimization Framework to Enforce Multi-View Consistency for Texturing 3D Meshes Using Pre-Trained Text-to-Image Models, Zhengyi Zhao, Chen Song, Xiaodong Gu, Yuan Dong, Qi Zuo, Weihao Yuan, Zilong Dong, Liefeng Bo, Qixing Huang, ECCV 2024
- VideoMV: Consistent Multi-View Generation Based on Large Video Generative Model, Qi Zuo, Xiaodong Gu, Lingteng Qiu, Yuan Dong, Zhengyi Zhao, Weihao Yuan, Rui Peng, Siyu Zhu, Zilong Dong, Liefeng Bo, Qixing Huang, 2024
- PanoViT: Vision Transformer for Room Layout Estimation from a Single Panoramic Image, Weichao Shen, Yuan Dong, Zonghao Chen, Zhengyi Zhao, Yang Gao, Zhu Liu, 2022
📦 Datasets

PanoContext-Former: Panoramic Total Scene Understanding with a Transformer
Yuan Dong*, Chuan Fang*, Liefeng Bo, Zilong Dong, Ping Tan
- ReplicaPano is a new panoramic dataset that offers various ground truths, including photo-realistic panorama, depth maps, real-world 3D room layouts and 3D oriented object bounding boxes, and object meshes.

G-buffer Objaverse: High-Quality Rendering Dataset of Objaverse
Chao Xu, Yuan Dong, Qi Zuo, Junfei Zhang, Xiaodan Ye, Wenbo Geng, Yuxiang Zhang, Xiaodong Gu, Lingteng Qiu, Zhengyi Zhao, Qing Ran, Jiayi Jiang, Zilong Dong, Liefeng Bo
- High-Quality Rendering Dataset of Objaverse.
🎖 Honors and Awards
- 2024.05, GPLD3D was selected as one of 90 oral presentations (3.3% of accepted papers) at CVPR 2024
- 2019.09, National Scholarship (Top 1%)
📖 Education
- 2015.06 - 2020.09, PhD, MICCAI, Fudan University, Shanghai.
- 2010.09 - 2015.06, Undergraduate, Chongqing Medical University, Chongqing.