🧪 LabVLA: Vision-Language-Action Model for Scientific Laboratories

LabVLA is the first VLA foundation model designed specifically for scientific laboratory environments. It combines a Qwen3-VL-4B vision-language backbone with a Diffusion Transformer (DiT) flow-matching action expert that predicts robot action trajectories from laboratory camera views and language instructions.
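
To make the flow-matching idea concrete, here is a minimal sketch of how a flow-matching policy can turn Gaussian noise into an action trajectory by integrating a velocity field with Euler steps. It is not LabVLA's implementation: `toy_velocity`, `sample_actions`, the horizon of 8, the action dimension of 7, and the 10 integration steps are all illustrative assumptions. In a real model the velocity would come from the action expert conditioned on the backbone's image and language features; here a closed-form straight-line flow toward a known target stands in for it.

```python
import numpy as np


def toy_velocity(x_t, t, target):
    """Hypothetical stand-in for a learned action expert.

    Returns the velocity of a straight-line flow that reaches `target`
    at t = 1. A trained model would predict this from observations
    instead of being handed the target.
    """
    return (target - x_t) / (1.0 - t)


def sample_actions(velocity_fn, target, horizon=8, action_dim=7, steps=10, seed=0):
    """Integrate dx/dt = v(x, t) from t = 0 (noise) to t = 1 (actions)."""
    rng = np.random.default_rng(seed)
    # Start from Gaussian noise with the shape of an action chunk (assumed sizes).
    x = rng.standard_normal((horizon, action_dim))
    dt = 1.0 / steps
    for k in range(steps):
        t = k * dt  # t stays below 1, so the toy velocity never divides by zero
        x = x + dt * velocity_fn(x, t, target)
    return x


# Toy target trajectory: hold every action dimension at zero.
target = np.zeros((8, 7))
actions = sample_actions(toy_velocity, target)
print(actions.shape)
```

With this toy field the final Euler step lands on the target trajectory, which makes the sketch easy to check. A learned velocity field gives no such guarantee, and the number of integration steps becomes a trade-off between latency and accuracy.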

📄 Paper • 💻 GitHub • 🤗 Model

Examples