Vision-Language-Action Navigation for Humanoids

Adapting and deploying VLA models for embodied humanoid navigation.

This work explores how to adapt large-scale vision-language-action (VLA) models for navigation on a real humanoid robot.

Project highlights:

  • Leveraged state-of-the-art VLA policies as expert priors for humanoid navigation.
  • Improved navigation performance through RL fine-tuning and iterative policy distillation (see the distillation sketch after this list).
  • Integrated the updated policies into the robot's software pipeline and deployed them via ROS (see the node sketch below).
  • Evaluated policy robustness under changing scene geometry and task objectives.

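One way to realize the iterative distillation step is a DAgger-style loop: roll out the compact student policy, relabel the states it visits with actions from the frozen VLA expert, and regress the student onto the aggregated labels. The sketch below is a minimal, simplified illustration of that recipe, not the project's actual code; `env`, `expert`, the MLP student, and all hyperparameters are placeholder assumptions (a real student would consume images and language, not a flat observation vector).

```python
# DAgger-style iterative distillation sketch (all names are illustrative).
import torch
import torch.nn as nn

class StudentPolicy(nn.Module):
    """Small MLP stand-in for a compact student policy."""
    def __init__(self, obs_dim: int, act_dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, 256), nn.ReLU(),
            nn.Linear(256, 256), nn.ReLU(),
            nn.Linear(256, act_dim),
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        return self.net(obs)

def distill(env, expert, student, iterations=10, rollout_len=200, epochs=5):
    """Each iteration: roll out the *student*, relabel visited states with
    the expert's actions, then regress the student onto the growing dataset."""
    opt = torch.optim.Adam(student.parameters(), lr=3e-4)
    dataset_obs, dataset_act = [], []
    for _ in range(iterations):
        obs = env.reset()  # hypothetical env returning observation tensors
        for _ in range(rollout_len):
            with torch.no_grad():
                expert_act = expert(obs)      # expert prior provides the label
                student_act = student(obs)
            dataset_obs.append(obs)
            dataset_act.append(expert_act)
            obs = env.step(student_act)       # follow the student's own actions
        obs_batch = torch.stack(dataset_obs)
        act_batch = torch.stack(dataset_act)
        for _ in range(epochs):               # behavior-cloning regression
            loss = nn.functional.mse_loss(student(obs_batch), act_batch)
            opt.zero_grad()
            loss.backward()
            opt.step()
    return student
```

Rolling out the student (rather than the expert) is the key design choice: it exposes the student to its own state distribution, which is what lets distillation correct compounding errors.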
The goal is to combine foundation-model priors with task-specific control refinement for better embodied autonomy.
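For the ROS deployment, one common pattern is to wrap the trained policy in a node that subscribes to sensor topics and publishes velocity commands on a fixed control loop. The sketch below assumes ROS 2 (rclpy); the topic names (`/camera/image_raw`, `/cmd_vel`), the 10 Hz rate, and the `policy` callable are illustrative assumptions, not the project's actual interfaces.

```python
# Minimal rclpy deployment sketch (topic names and policy are placeholders).
import rclpy
from rclpy.node import Node
from sensor_msgs.msg import Image
from geometry_msgs.msg import Twist

class PolicyNode(Node):
    def __init__(self, policy):
        super().__init__('vla_nav_policy')
        self.policy = policy
        self.latest_image = None
        self.sub = self.create_subscription(
            Image, '/camera/image_raw', self.on_image, 10)
        self.pub = self.create_publisher(Twist, '/cmd_vel', 10)
        self.timer = self.create_timer(0.1, self.step)  # 10 Hz control loop

    def on_image(self, msg: Image) -> None:
        # Cache the most recent observation; inference runs on the timer.
        self.latest_image = msg

    def step(self) -> None:
        if self.latest_image is None:
            return
        # Policy maps the latest observation to (linear, angular) velocity.
        v, w = self.policy(self.latest_image)
        cmd = Twist()
        cmd.linear.x = float(v)
        cmd.angular.z = float(w)
        self.pub.publish(cmd)

def main():
    rclpy.init()
    node = PolicyNode(policy=lambda img: (0.2, 0.0))  # stub policy for testing
    rclpy.spin(node)

if __name__ == '__main__':
    main()
```

Decoupling image callbacks from the control timer keeps command publishing at a steady rate even when inference or sensor frames arrive irregularly.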