Vision-Language-Action Navigation for Humanoids
Adapting and deploying VLA models for embodied humanoid navigation.
This work explores how to adapt large-scale vision-language-action (VLA) models for navigation on a real humanoid robot.
Project highlights:
- Leveraged state-of-the-art VLA policies as expert priors for humanoid navigation.
- Improved navigation performance with RL fine-tuning and iterative policy distillation (see the objective sketch after this list).
- Integrated the refined policies into the robot software pipeline and deployed them via ROS (a minimal node sketch follows the list).
- Evaluated policy robustness under changes in scene geometry and task objectives.
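The exact fine-tuning objective isn't spelled out above; the sketch below shows one common way to combine the two ingredients named in the highlights, a policy-gradient term and a KL penalty that keeps the student close to a frozen VLA expert prior. It assumes PyTorch and a discretized action space; `distillation_loss`, `beta`, and all tensor shapes are illustrative, not the project's actual code.

```python
import torch
import torch.nn.functional as F


def distillation_loss(student_logits: torch.Tensor,
                      expert_logits: torch.Tensor,
                      actions: torch.Tensor,
                      advantages: torch.Tensor,
                      beta: float = 0.5) -> torch.Tensor:
    """RL surrogate plus KL-to-expert regularizer.

    student_logits, expert_logits: (batch, num_actions) over a
    discretized action space; actions: (batch,) actions taken during
    rollout; advantages: (batch,) detached advantage estimates.
    beta trades off imitating the VLA prior against maximizing reward.
    """
    log_p_student = F.log_softmax(student_logits, dim=-1)
    p_expert = F.softmax(expert_logits.detach(), dim=-1)  # expert stays frozen

    # KL(expert || student): pulls the student toward the VLA prior.
    kl = F.kl_div(log_p_student, p_expert, reduction="batchmean")

    # Standard policy-gradient surrogate on the rollout actions.
    logp_a = log_p_student.gather(1, actions.unsqueeze(1)).squeeze(1)
    pg = -(logp_a * advantages).mean()

    return pg + beta * kl
```

Under one reading of "iterative" in the highlights, the fine-tuned student is periodically promoted to serve as the next round's expert and the process repeats.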
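Deployment details aren't given either; the following is a minimal ROS 1 (rospy) node sketch of the camera-in, velocity-command-out wiring such a policy could use, assuming the humanoid's walking controller accepts standard geometry_msgs/Twist commands. The topic names, `DummyPolicy`, and the `act(rgb, instruction)` interface are placeholders for this sketch, not the project's API.

```python
#!/usr/bin/env python
import rospy
from cv_bridge import CvBridge
from geometry_msgs.msg import Twist
from sensor_msgs.msg import Image


class DummyPolicy:
    """Stand-in for the distilled navigation policy (placeholder)."""

    def act(self, rgb, instruction):
        return 0.2, 0.0  # constant forward velocity, for wiring tests only


class VLANavNode:
    """Runs the policy on each camera frame and publishes velocity commands."""

    def __init__(self, policy, instruction):
        self.policy = policy
        self.instruction = instruction
        self.bridge = CvBridge()
        self.cmd_pub = rospy.Publisher("/cmd_vel", Twist, queue_size=1)
        rospy.Subscriber("/camera/image_raw", Image, self.on_image, queue_size=1)

    def on_image(self, msg):
        rgb = self.bridge.imgmsg_to_cv2(msg, desired_encoding="rgb8")
        v, w = self.policy.act(rgb, self.instruction)  # policy forward pass
        cmd = Twist()
        cmd.linear.x = float(v)   # forward velocity (m/s)
        cmd.angular.z = float(w)  # yaw rate (rad/s)
        self.cmd_pub.publish(cmd)


if __name__ == "__main__":
    rospy.init_node("vla_nav")
    VLANavNode(policy=DummyPolicy(), instruction="go to the kitchen")
    rospy.spin()
```

Keeping the policy behind a small `act()` interface means retrained checkpoints can be dropped in without touching the ROS plumbing.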
The goal is to combine foundation-model priors with task-specific control refinement for better embodied autonomy.