I am a second-year Ph.D. student at the University of Virginia, advised by Prof. Zezhou Cheng. My research is in 3D computer vision.
Before joining UVA, I completed my master's degree in Computer Science and Engineering at the University of Michigan, where I was a member of the SLED lab. I obtained my bachelor's degree in Applied and Computational Mathematical Sciences at the University of Washington, where I worked with Prof. Yunhe Feng on Responsible AI.
My research centers on scalable approaches to 3D computer vision, language-grounded robotics, and visual world models. I am broadly interested in how visual foundation models can acquire persistent, dynamic, and geometry-aware world representations from real-world multi-view, video, and 3D data.
I currently focus on two themes:
Developing vision-centric 3D/4D foundation models. I study how large-scale visual models can learn spatially grounded representations of static and dynamic environments, with an emphasis on multi-view perception, novel view synthesis, 3D scene understanding, and persistent visual memory.
Grounding language in 3D environments and robotic actions. I explore how language can be connected to 3D scenes, object affordances, spatial relations, and embodied actions, enabling agents to reason, navigate, and interact in complex physical environments.