Low-Cost Exoskeletons for Learning Whole-Arm Manipulation in the Wild

ICRA 2024

¹Shanghai Jiao Tong University, ²Shanghai Artificial Intelligence Laboratory

Abstract

While humans can use parts of their arms other than the hands for manipulations like gathering and supporting, whether robots can effectively learn and perform the same type of operations remains relatively unexplored. As these manipulations require joint-level control to regulate the complete poses of the robots, we develop AirExo, a low-cost, adaptable, and portable dual-arm exoskeleton, for teleoperation and demonstration collection. As collecting teleoperated data is expensive and time-consuming, we further leverage AirExo to collect cheap in-the-wild demonstrations at scale. Under our in-the-wild learning framework, we show that with only 3 minutes of teleoperated demonstrations, augmented by diverse and extensive in-the-wild data collected by AirExo, robots can learn a policy that is comparable to or even better than one learned from teleoperated demonstrations lasting over 20 minutes. Experiments demonstrate that our approach enables the model to learn a more general and robust policy across the various stages of the task, enhancing the success rates in task completion even in the presence of disturbances.

AirExo

We introduce AirExo, an open-source, portable, adaptable, inexpensive (approximately $300 per arm), and robust exoskeleton system. The system was initially developed for Flexiv Rizon arms, and it can be quickly modified for other robotic arms, such as UR5, Franka, and Kuka.

After calibration with a dual-arm robot, AirExo enables precise joint-level teleoperation of the robot for collecting teleoperated demonstrations.
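As a rough illustration of what joint-level teleoperation after calibration could look like in software, the sketch below maps exoskeleton encoder readings to robot joint targets. It is not the AirExo implementation: the `JointCalibration` fields (a zero offset, a mounting direction, and joint limits) and the function names are assumptions made for this example.

```python
import math
from dataclasses import dataclass


@dataclass
class JointCalibration:
    """Hypothetical per-joint calibration record (field names are illustrative)."""
    encoder_zero_deg: float  # encoder reading when the robot joint is at 0 rad
    direction: int           # +1 or -1, depending on how the encoder is mounted
    lower_rad: float         # robot joint lower limit
    upper_rad: float         # robot joint upper limit


def encoder_to_joint(reading_deg: float, cal: JointCalibration) -> float:
    """Map one encoder reading in degrees to a clamped robot joint angle in radians."""
    # Wrap the offset into [-180, 180) so readings that cross 0/360 stay continuous.
    delta_deg = (reading_deg - cal.encoder_zero_deg + 180.0) % 360.0 - 180.0
    angle = cal.direction * math.radians(delta_deg)
    # Clamp to the robot's joint limits before sending the command.
    return min(max(angle, cal.lower_rad), cal.upper_rad)


def exo_to_robot(readings_deg, calibrations):
    """Apply the one-to-one joint mapping to a full arm."""
    if len(readings_deg) != len(calibrations):
        raise ValueError("expected one encoder reading per calibrated joint")
    return [encoder_to_joint(r, c) for r, c in zip(readings_deg, calibrations)]
```

Because each exoskeleton joint corresponds to exactly one robot joint in this sketch, the mapping needs no inverse kinematics; the clamp is a simple safeguard against commanding poses outside the robot's range.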

Moreover, thanks to its portability, AirExo enables in-the-wild data collection for dexterous manipulation without needing a robot. Humans can wear AirExo, perform manipulation in the wild, and collect demonstrations at scale. The one-to-one joint mapping also lowers the barrier to transferring policies trained on human-collected data to robots.

This breakthrough capability not only simplifies data collection but also extends the reach of whole-arm manipulation into unstructured environments, where robots can learn and adapt from human interactions. In the future, we are excited to see our AirExo collecting large-scale demonstrations in unstructured environments and facilitating robot learning.

Learning in the Wild

AirExo serves as a natural bridge for the kinematic gap between humans and robots. To address the domain gap between images, our approach involves a two-stage training process. In the first stage, we pre-train the policy using in-the-wild human demonstrations and actions recorded by the exoskeleton encoders. During this phase, the policy primarily learns the high-level task execution strategy from the large-scale and diverse in-the-wild human demonstrations. Subsequently, in the second stage, the policy undergoes fine-tuning using teleoperated demonstrations with robot actions to refine the motions based on the previously acquired high-level task execution strategy.
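The two-stage schedule can be sketched with a toy example. The code below is only an illustration of "pre-train on a large dataset, then fine-tune on a small one": it uses a one-dimensional linear policy and synthetic data rather than images or the policies evaluated here, and the dataset sizes, learning rates, and the offset standing in for the domain gap are all assumed values.

```python
import random


def train(params, data, epochs, lr):
    """Fit action = w * obs + b by per-sample gradient descent on squared error."""
    w, b = params
    for _ in range(epochs):
        for obs, action in data:
            err = (w * obs + b) - action
            w -= lr * err * obs
            b -= lr * err
    return w, b


def mse(params, data):
    """Mean squared error of the linear policy on a dataset."""
    w, b = params
    return sum((w * o + b - a) ** 2 for o, a in data) / len(data)


rng = random.Random(0)

# Synthetic stand-in for in-the-wild demonstrations: large, with a shifted bias.
wild = [(x, 2.0 * x + 0.5) for x in (rng.uniform(-1, 1) for _ in range(200))]
# Synthetic stand-in for teleoperated demonstrations: few, from the target domain.
teleop = [(x, 2.0 * x + 0.2) for x in (rng.uniform(-1, 1) for _ in range(10))]

# Stage 1: pre-train from scratch on the large in-the-wild set.
pretrained = train((0.0, 0.0), wild, epochs=20, lr=0.1)
# Stage 2: fine-tune the pre-trained weights on the small teleoperated set.
finetuned = train(pretrained, teleop, epochs=20, lr=0.05)

print("teleop MSE after stage 1:", mse(pretrained, teleop))
print("teleop MSE after stage 2:", mse(finetuned, teleop))
```

In this toy setting, stage 1 recovers the shared structure (the slope) from plentiful data, and stage 2 only has to correct the residual offset using a handful of target-domain samples.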

Experimental Results

We evaluate the performance of different methods on the "Gather Balls" task. With our in-the-wild learning framework, ACT trained on just 10 teleoperated demonstrations, assisted by in-the-wild demonstrations, reaches the same level of performance as ACT trained on 50 teleoperated demonstrations. This shows that in-the-wild demonstrations make the policy more sample-efficient with respect to teleoperated demonstrations.

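Links to full videos in this experiment.

| # Teleoperated Demonstrations | # In-the-Wild Demonstrations | Method | Video |
| --- | --- | --- | --- |
| 50 | - | VIP + NN | Link |
| 50 | - | VC-1 + NN | Link |
| 50 | - | MVP + NN | Link |
| 50 | - | VINN | Link |
| 50 | - | ConvMLP | Link |
| 50 | - | BeT | Link |
| 50 | - | ACT | Link |
| 10 | - | VINN | Link |
| 10 | - | ACT | Link |
| 10 | 50 | ACT | Link |
| 10 | 100 | ACT | Link |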

We also evaluate the performance of different methods on the "Grasp from the Curtained Shelf" task. After training with our in-the-wild learning framework, ACT exhibits a significant improvement in success rates in the "grasp" and "throw" stages. Using only 10 teleoperated demonstrations lasting approximately 3 minutes, it surpasses the success rates obtained with the original set of 50 teleoperated demonstrations lasting more than 20 minutes. This highlights that our in-the-wild framework enables the policy to learn a better strategy, improving success rates in the later stages of multi-stage tasks.

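Links to full videos in this experiment.

| # Teleoperated Demonstrations | # In-the-Wild Demonstrations | Method | Video |
| --- | --- | --- | --- |
| 50 | - | VINN | Link |
| 50 | - | ACT | Link |
| 10 | - | VINN | Link |
| 10 | - | ACT | Link |
| 10 | 50 | ACT | Link |
| 10 | 100 | ACT | Link |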

We then evaluate policy performance after adding disturbances to the experimental environment. The results demonstrate that our in-the-wild learning framework can leverage diverse in-the-wild demonstrations to make the learned policy more robust and generalizable to various environmental disturbances.

Links to full videos in this experiment.

| # Teleoperated Demonstrations | # In-the-Wild Demonstrations | Method | Video |
| --- | --- | --- | --- |
| 10 | - | ACT | Link |
| 10 | 100 | ACT | Link |

BibTeX

@article{fang2023low,
    title = {Low-Cost Exoskeletons for Learning Whole-Arm Manipulation in the Wild},
    author = {Fang, Hongjie and Fang, Hao-Shu and Wang, Yiming and Ren, Jieji and Chen, Jingjing and Zhang, Ruo and Wang, Weiming and Lu, Cewu},
    journal = {arXiv preprint arXiv:2309.14975},
    year = {2023}
}