Delves into the training and applications of Vision-Language-Action models, emphasizing the role of large language models in robotic control and the transfer of web knowledge to robotic tasks. Highlights experimental results and future research directions.
Presents an all-analog photoelectronic chip for high-speed vision tasks, addressing the speed and energy limitations of conventional digital computation and proposing a hybrid optical-electrical computing framework.