|
Robot Parkour Learning
Ziwen Zhuang*, Zipeng Fu*, Jianren Wang, Chris Atkeson, Sören Schwertfeger, Chelsea Finn, Hang Zhao
CoRL 2023 (Oral)
Best Systems Paper Award Finalist (top 3)
webpage |
pdf |
abstract |
bibtex |
arXiv |
code |
video
Parkour is a grand challenge for legged locomotion that requires robots to overcome various obstacles rapidly in complex environments. Existing methods can generate either diverse but blind locomotion skills or vision-based but specialized skills by using reference animal data or complex rewards. However, autonomous parkour requires robots to learn generalizable skills that are both vision-based and diverse to perceive and react to various scenarios. In this work, we propose a system for learning a single end-to-end vision-based parkour policy of diverse parkour skills using a simple reward without any reference motion data. We develop a reinforcement learning method inspired by direct collocation to generate parkour skills, including climbing over high obstacles, leaping over large gaps, crawling beneath low barriers, squeezing through thin slits, and running. We distill these skills into a single vision-based parkour policy and transfer it to a quadrupedal robot using its egocentric depth camera. We demonstrate that our system can empower two different low-cost robots to autonomously select and execute appropriate parkour skills to traverse challenging real-world environments.
@inproceedings{zhuang2023robot,
author = {Zhuang, Ziwen and Fu, Zipeng and
Wang, Jianren and Atkeson, Christopher and
Schwertfeger, Sören and Finn, Chelsea and
Zhao, Hang},
title = {Robot Parkour Learning},
booktitle = {Conference on Robot Learning ({CoRL})},
year = {2023}
}
|
|
Deep Whole-Body Control: Learning a Unified Policy for Manipulation and Locomotion
Zipeng Fu*, Xuxin Cheng*, Deepak Pathak
CoRL 2022 (Oral)
Best Systems Paper Award Finalist (top 4)
webpage |
pdf |
abstract |
bibtex |
arXiv |
OpenReview |
video
An attached arm can significantly increase the applicability of legged robots to several mobile manipulation tasks that are not possible for their wheeled or tracked counterparts. The standard control pipeline for such legged manipulators is to decouple the controller into that of manipulation and locomotion. However, this is ineffective and requires immense engineering to support coordination between the arm and legs, and errors can propagate across modules, causing non-smooth, unnatural motions. It is also biologically implausible, given the evidence for strong motor synergies across limbs. In this work, we propose to learn a unified policy for whole-body control of a legged manipulator using reinforcement learning. We propose Regularized Online Adaptation to bridge the Sim2Real gap for high-DoF control, and Advantage Mixing, which exploits the causal dependency in the action space to overcome local minima while training the whole-body system. We also present a simple design for a low-cost legged manipulator, and find that our unified policy can demonstrate dynamic and agile behaviors across several task setups.
@inproceedings{fu2022deep,
author = {Fu, Zipeng and Cheng, Xuxin and
Pathak, Deepak},
title = {Deep Whole-Body Control: Learning a Unified Policy
for Manipulation and Locomotion},
booktitle = {Conference on Robot Learning ({CoRL})},
year = {2022}
}
|
|
Coupling Vision and Proprioception for Navigation of Legged Robots
Zipeng Fu*, Ashish Kumar*, Ananye Agarwal, Haozhi Qi, Jitendra Malik, Deepak Pathak
CVPR 2022
Best Paper at Multimodal Learning Workshop
webpage |
pdf |
abstract |
bibtex |
arXiv |
code |
video
We exploit the complementary strengths of vision and proprioception to develop a point-goal navigation system for legged robots, called VP-Nav. Legged systems are capable of traversing more complex terrain than wheeled robots, but to fully utilize this capability, we need a high-level path planner in the navigation system to be aware of the walking capabilities of the low-level locomotion policy in varying environments. We achieve this by using proprioceptive feedback to ensure the safety of the planned path by sensing unexpected obstacles like glass walls, terrain properties like slipperiness or softness of the ground, and robot properties like extra payload that are likely missed by vision. The navigation system uses onboard cameras to generate an occupancy map and a corresponding cost map to reach the goal. A fast marching planner then generates a target path. A velocity command generator takes this as input to generate the desired velocity for the walking policy. A safety advisor module adds sensed unexpected obstacles to the occupancy map and environment-determined speed limits to the velocity command generator. We show superior performance compared to wheeled robot baselines and to ablations with disjoint high-level planning and low-level control. We also show the real-world deployment of VP-Nav on a quadruped robot with onboard sensors and computation.
@inproceedings{fu2021coupling,
author = {Fu, Zipeng and Kumar, Ashish and
Agarwal, Ananye and Qi, Haozhi and
Malik, Jitendra and Pathak, Deepak},
title = {Coupling Vision and Proprioception
for Navigation of Legged Robots},
booktitle = {{CVPR}},
year = {2022}
}
|
|
Minimizing Energy Consumption Leads to the Emergence of Gaits in Legged Robots
Zipeng Fu, Ashish Kumar, Jitendra Malik, Deepak Pathak
CoRL 2021
webpage |
pdf |
abstract |
bibtex |
arXiv |
OpenReview |
video
Legged locomotion is commonly studied and expressed as a discrete set of gait patterns, such as walk, trot, and gallop, which are usually treated as given and pre-programmed in legged robots for efficient locomotion at different speeds. However, fixing a set of pre-programmed gaits limits the generality of locomotion. Recent animal motor studies show that these conventional gaits are only prevalent in ideal flat terrain conditions, while real-world locomotion is unstructured and more like bouts of intermittent steps. What principles could lead to both structured and unstructured patterns across mammals, and how can we synthesize them in robots? In this work, we take an analysis-by-synthesis approach and learn to move by minimizing mechanical energy. We demonstrate that learning to minimize energy consumption is sufficient for the emergence of natural locomotion gaits at different speeds in real quadruped robots. The emergent gaits are structured in ideal terrains and look similar to those of horses and sheep. The same approach leads to unstructured gaits in rough terrains, which is consistent with the findings in animal motor control. We validate our hypothesis in both simulation and real hardware across natural terrains.
@inproceedings{fu2021minimizing,
author = {Fu, Zipeng and Kumar, Ashish and Malik, Jitendra and Pathak, Deepak},
title = {Minimizing Energy Consumption Leads to the Emergence of Gaits in Legged Robots},
booktitle = {Conference on Robot Learning ({CoRL})},
year = {2021}
}
|
|
RMA: Rapid Motor Adaptation for Legged Robots
Ashish Kumar, Zipeng Fu, Deepak Pathak, Jitendra Malik
RSS 2021
webpage |
pdf |
abstract |
bibtex |
arXiv |
video
Successful real-world deployment of legged robots would require them to adapt in real time to unseen scenarios like changing terrains, changing payloads, and wear and tear. This paper presents the Rapid Motor Adaptation (RMA) algorithm to solve this problem of real-time online adaptation in quadruped robots. RMA consists of two components: a base policy and an adaptation module. The combination of these components enables the robot to adapt to novel situations in fractions of a second. RMA is trained completely in simulation without using any domain knowledge like reference trajectories or predefined foot trajectory generators and is deployed on the A1 robot without any fine-tuning. We train RMA on a varied terrain generator using bioenergetics-inspired rewards and deploy it on a variety of difficult terrains including rocky, slippery, and deformable surfaces in environments with grass, long vegetation, concrete, pebbles, stairs, sand, etc. RMA shows state-of-the-art performance across diverse real-world as well as simulation experiments.
@inproceedings{kumar2021rma,
author = {Kumar, Ashish and Fu, Zipeng and Pathak, Deepak and Malik, Jitendra},
title = {RMA: Rapid Motor Adaptation for Legged Robots},
booktitle = {Robotics: Science and Systems ({RSS})},
year = {2021}
}
Media Coverage: Facebook AI |
TechCrunch |
Forbes |
Washington Post |
CNet |
Synced Review |
UC Berkeley |
CMU
|
 |
Emergence of Theory of Mind Collaboration in Multiagent Systems
Luyao Yuan, Zipeng Fu, Linqi Zhou, Kexin Yang, Song-Chun Zhu
Emergent Communication Workshop
NeurIPS 2019
pdf |
abstract |
bibtex |
arXiv |
code
Currently, in the study of multiagent systems, the intentions of agents are usually ignored. Nonetheless, as pointed out by Theory of Mind (ToM), people regularly reason about others’ mental states, including beliefs, goals, and intentions, to obtain a performance advantage in competition, cooperation, or coalition. However, due to its intrinsic recursion and the intractable modeling of distributions over beliefs, integrating ToM into multiagent planning and decision making is still a challenge. In this paper, we incorporate ToM into the multiagent partially observable Markov decision process (POMDP) and propose an adaptive training algorithm to develop effective collaboration between agents with ToM. We evaluate our algorithm on two games, where it surpasses all previous decentralized execution algorithms that do not model ToM.
@article{yuan2019emergencecollaboration,
author = {Yuan, Luyao and Fu, Zipeng and Zhou, Linqi and Yang, Kexin and Zhu, Song-Chun},
title = {Emergence of Theory of Mind Collaboration in Multiagent Systems},
journal = {Emergent Communication Workshop at NeurIPS},
year = {2019}
}
|
 |
Emergence of Pragmatics from Referential Game between Theory of Mind Agents
Luyao Yuan, Zipeng Fu, Jingyue Shen, Lu Xu, Junhong Shen, Song-Chun Zhu
Emergent Communication Workshop
NeurIPS 2019
pdf |
abstract |
bibtex |
arXiv |
code
Pragmatics studies how context can contribute to language meanings. In human communication, language is never interpreted out of context, and sentences can usually convey more information than their literal meanings. However, this mechanism is missing in most multi-agent systems, restricting the communication efficiency and the capability of human-agent interaction. In this paper, we propose an algorithm with which agents can spontaneously learn the ability to “read between the lines” without any explicit hand-designed rules. We integrate theory of mind (ToM) in a cooperative multi-agent pedagogical situation and propose an adaptive reinforcement learning (RL) algorithm to develop a communication protocol. ToM is a profound cognitive science concept, claiming that people regularly reason about others’ mental states, including beliefs, goals, and intentions, to obtain a performance advantage in competition, cooperation, or coalition. With this ability, agents consider language not only as messages but also as rational acts reflecting others’ hidden states. Our experiments demonstrate the advantage of pragmatic protocols over non-pragmatic protocols. We also show that the teaching complexity under the pragmatic protocol empirically approximates the recursive teaching dimension (RTD).
@article{yuan2019emergencepragmatics,
author = {Yuan, Luyao and Fu, Zipeng and Shen, Jingyue and Xu, Lu and Shen, Junhong and Zhu, Song-Chun},
title = {Emergence of Pragmatics from Referential Game between Theory of Mind Agents},
journal = {Emergent Communication Workshop at NeurIPS},
year = {2019}
}
|
 |
Unsupervised Incremental Structure Learning of Stochastic And-Or Grammars with Monte Carlo Tree Search
Luyao Yuan, Jingyue Shen, Zipeng Fu, Song-Chun Zhu
Preprint 2019
pdf |
abstract |
bibtex |
code (And-Or-Graph Learning Library)
Stochastic And-Or grammars form a compact representation of probabilistic context-free grammars. They explicitly model compositionality and reconfigurability in a hierarchical manner and can be utilized to understand the underlying structures of different kinds of data (e.g., language, image, or video). In this paper, we propose an unsupervised And-Or grammar learning approach that iteratively searches for better grammar structure and parameters to optimize the grammar compactness and data likelihood. To handle the complexity of grammar learning, we develop an algorithm based on Monte Carlo Tree Search to effectively explore the search space. Our method also enables incremental grammar learning. Experimental results show that our approach significantly outperforms previous greedy-search-based approaches, and our incremental learning results are comparable to previous batch learning results.
@article{yuan2019stochastic,
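author = {Yuan, Luyao and Shen, Jingyue and Fu, Zipeng and Zhu, Song-Chun},
title = {Unsupervised Incremental Structure Learning of Stochastic And-Or Grammars with Monte Carlo Tree Search},
journal = {Preprint},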
year = {2019}
}
|
 |
Machine Learning for Glass Science and Engineering: A Review
Han Liu, Zipeng Fu, Kai Yang, Xinyi Xu, Mathieu Bauchy
Journal of Non-Crystalline Solids 2019
pdf |
abstract |
bibtex
The design of new glasses is often plagued by inefficient Edisonian “trial-and-error” discovery approaches. As an alternative route, the Materials Genome Initiative has largely popularized new approaches relying on artificial intelligence and machine learning for accelerating the discovery and optimization of novel, advanced materials. Here, we review some recent progress in adopting machine learning to accelerate the design of new glasses with tailored properties.
@article{liu2019machine,
author = {Liu, Han and Fu, Zipeng and Yang, Kai and Xu, Xinyi and Bauchy, Mathieu},
title = {Machine Learning for Glass Science and Engineering: A Review},
journal = {Journal of Non-Crystalline Solids},
year = {2019}
}
|
 |
Adversarial Attack Against Scene Recognition System for Unmanned Vehicles
Xuankai Wang, Mi Wen, Jinguo Li, Zipeng Fu, Rongxing Lu
ACM TURC 2019
pdf |
abstract |
bibtex
Unmanned scene recognition means that unmanned vehicles can collect environmental data from equipped sensors and make decisions through algorithms, in which deep learning has become one of the key technologies. In particular, with the discovery of adversarial examples against deep learning, research on offense and defense against adversarial examples illustrates that deep learning models for unmanned scene recognition also have safety vulnerabilities. However, as far as we know, few studies have tried to explore adversarial example attacks in this field. Therefore, we try to address this problem by generating adversarial examples against a scene recognition classification model through experiments. In addition, we try to improve the model's adversarial robustness through adversarial training. Extensive experiments have been conducted, and experimental results show that adversarial examples have an efficient attack effect on the neural network for scene recognition.
@article{wang2019adversarial,
author = {Wang, Xuankai and Wen, Mi and Li, Jinguo and Fu, Zipeng and Lu, Rongxing},
title = {Adversarial Attack Against Scene Recognition System for Unmanned Vehicles},
journal = {ACM TURC},
year = {2019}
}
|
 |
Energy Theft Detection With Energy Privacy Preservation in the Smart Grid
Donghuan Yao, Mi Wen, Xiaohui Liang, Zipeng Fu, Kai Zhang, Baojia Yang
IEEE IoT Journal 2019
pdf |
abstract |
bibtex
As a prominent early instance of the Internet of Things in the smart grid, the advanced metering infrastructure (AMI) provides real-time information from smart meters to both grid operators and customers, exploiting the full potential of demand response. However, the newly collected information without security protection can be maliciously altered and result in huge losses. In this paper, we propose an energy theft detection scheme with energy privacy preservation in the smart grid. Specifically, we use combined convolutional neural networks (CNNs) to detect abnormal behavior of the metering data from a long-period pattern observation. In addition, we employ the Paillier algorithm to protect the energy privacy. In other words, the users’ energy data are securely protected in transmission and data disclosure is minimized. Our security analysis demonstrates that in our scheme data privacy and authentication are both achieved. Experimental results illustrate that our modified CNN model can effectively detect abnormal behaviors with an accuracy of up to 92.67%.
@article{yao2019energy,
author = {Yao, Donghuan and Wen, Mi and Liang, Xiaohui and Fu, Zipeng and Zhang, Kai and Yang, Baojia},
title = {Energy Theft Detection With Energy Privacy Preservation in the Smart Grid},
journal = {IEEE Internet of Things Journal},
year = {2019}
}
|
|