NVIDIA Developer, September 29
NeRD: A Neural Network Approach to Improving Robot Simulation

 

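This article introduces Neural Robot Dynamics (NeRD), a neural-network approach to robot simulation. NeRD targets the limitations of classical analytic dynamics in handling complex contacts, kinematic loops, and non-differentiable models. The framework uses differentiable models to predict stable states over long horizons, captures complex contact-rich physics, and generalizes across tasks, environments, and controllers, narrowing the sim-to-real gap. NeRD can serve as a drop-in backend for existing physics engines such as Newton, and fine-tuning on real data further improves its accuracy, providing a platform for continually improving robot dynamics.

💡 **NeRD’s core advantage is its differentiable neural network model**: Unlike classical analytic dynamics, which relies on simplified contacts and omits kinematic loops, NeRD can predict the states of complex robotic systems accurately and stably over long horizons. By learning a data-driven dynamics model, it avoids the difficulties that non-differentiable models pose for traditional methods, laying the groundwork for more realistic robot simulation.

🚀 **Strong generalization and a narrower sim-to-real gap**: Once trained, a NeRD model generalizes to different tasks, environments, and low-level controllers for the same robot, including ones it did not see during training. Fine-tuning on real-world data narrows the gap between simulation and reality, so policies trained in simulation transfer more reliably to the physical robot.

🔌 **Drop-in integration that reuses existing environments**: NeRD can replace the solver backend of an existing physics engine such as Newton through a simple configuration change. Researchers and engineers do not need to rebuild their simulation environments; switching the physics solver is enough to use NeRD, which improves development efficiency and workflow flexibility.

🔄 **Continually improving robot dynamics models**: The article envisions a future simulation lifecycle in which each robot is equipped with a pretrained neural dynamics model that is fine-tuned continually as the robot interacts with the real world. This lets the model account for robot wear and tear and environmental change, providing a steadily improving digital twin for training more versatile robot skills.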

Modern robotics demands more than classical analytic dynamics can provide: analytic models simplify contacts, omit kinematic loops, and are often non-differentiable. Neural Robot Dynamics (NeRD) tackles these hurdles by:

- Using expressive, differentiable models that predict stable states over long horizons.
- Capturing complex contact-rich physics.
- Generalizing across tasks, environments, and controllers, narrowing the sim-to-real gap.
- Fine-tuning on real data.

Unlike task-specific neural simulators, NeRD serves as a drop-in backend within physics engines like Newton, enabling teams to reuse existing policy-learning environments by simply switching the physics solver. This hybrid of analytical modules with robot-centric neural modeling paves the way for robots whose dynamics continually improve through both simulation and real-world experience.

In this post, we explore how NeRD overcomes longstanding simulation challenges, providing the foundation for modern robotics in physics engines like Newton.

What is NeRD?

NeRD is a neural simulation framework. A NeRD model is a learned stand-in for a specific dynamics model: it predicts the future states of articulated rigid bodies (e.g., robots with multiple joints) in contact with the environment.

Once trained, NeRD models can:

- Provide stable and accurate predictions over hundreds to thousands of simulation steps.
- Generalize to different tasks, environments, and low-level controllers for a particular robot.
- Be fine-tuned from real-world data to bridge sim-to-real gaps.

NeRD models may be trained on data from any simulator. Once trained, they can be deployed as drop-in replacements for analytic solvers such as those found in modular frameworks like Newton. This enables users to reuse existing policy-learning environments and activate NeRD as a new physics backend through a single-line switch. 

Start using NeRD in Newton. View our research on arXiv or explore our project page.  

Vision for the future of robotic simulation

As robotic technologies advance, we envision a lifecycle where each robot is equipped with a neural dynamics model pretrained from analytical simulations. Such a neural dynamics model can be continuously fine-tuned as the robot interacts with the real world, enabling it to account for wear-and-tear of the robot and environmental changes. 

The neural dynamics model of the robot can be embedded into a hybrid simulation system, where neural dynamics simulate the robot while analytical dynamics are used for other parts of the scene (e.g., obstacles). These continuously-improved neural robot dynamics provide a better replica of real-world dynamics for facilitating the learning of versatile robotic skills in a digital twin powered by this continuously updated simulator. 

Figure 1. An envisioned lifecycle of a robot in the future

How does neural robot dynamics work?

NeRD is characterized by two key innovations that achieve generalizability and long-horizon prediction accuracy: a hybrid prediction framework and a robot-centric input parameterization. NeRD models replace the time integration (solver) portion of a traditional simulator. In frameworks like Newton, where collision detection is decoupled from the solver, we can combine analytic collision detection with our learned model.

First, the hybrid prediction framework lets NeRD use intermediate simulation quantities (i.e., robot state, contact information, and joint-space torques) to describe the full simulation state. These quantities provide the information needed to evolve the robot dynamics regardless of the application (task, scene, or controller). This is in contrast with previous approaches that take only the robot state and task-specific actions as inputs and therefore overfit to the tasks used for training.
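To make the division of labor concrete, here is a minimal sketch of one hybrid step, assuming a toy one-dimensional point mass. The names `Contact`, `SolverInput`, `hybrid_step`, `toy_detect_contacts`, and `toy_solver` are hypothetical and are not part of NeRD or Newton; `toy_solver` is a hand-coded placeholder standing where a trained network would go.

```python
from dataclasses import dataclass
from typing import List, Tuple

DT = 0.01        # time step in seconds (illustrative)
GRAVITY = 9.81   # m/s^2


@dataclass
class Contact:
    """One contact reported by the analytic collision detector."""
    point: Tuple[float, float, float]
    normal: Tuple[float, float, float]
    depth: float  # penetration depth in meters


@dataclass
class SolverInput:
    """Hypothetical bundle of the three quantities the post names."""
    robot_state: List[float]    # here: [height, vertical velocity]
    contacts: List[Contact]     # output of analytic collision detection
    joint_torques: List[float]  # joint-space torques, whatever produced them


def hybrid_step(state, torques, detect_contacts, solver):
    """Run analytic collision detection first, then the (learned) solver."""
    contacts = detect_contacts(state)
    return solver(SolverInput(list(state), contacts, list(torques)))


def toy_detect_contacts(state):
    """Analytic stand-in: a ground plane at height 0."""
    height = state[0]
    if height <= 0.0:
        return [Contact((0.0, 0.0, 0.0), (0.0, 0.0, 1.0), -height)]
    return []


def toy_solver(inputs):
    """Placeholder for a trained network: a unit point mass, hand-coded."""
    height, velocity = inputs.robot_state
    velocity = velocity + DT * (inputs.joint_torques[0] - GRAVITY)
    if inputs.contacts and velocity < 0.0:
        velocity = 0.0  # the ground stops downward motion
    return [height + DT * velocity, velocity]


falling = hybrid_step([1.0, 0.0], [0.0], toy_detect_contacts, toy_solver)
resting = hybrid_step([0.0, -1.0], [0.0], toy_detect_contacts, toy_solver)
```

Because the solver receives joint-space torques rather than task-specific actions, any controller that ultimately produces torques can drive the same solver, which is the property the post credits for generalization across controllers.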

Second, NeRD uses a robot-centric parameterization of inputs to enable the learned dynamics model to spatially generalize. Specifically, the robot state and contact-related quantities are transformed into the robot’s base frame before they are passed as input to the NeRD model, as shown in Figure 2(c). 

Such a robot-centric state representation enables NeRD to perform reliable predictions at unseen spatial robot locations encountered during robot motion, enhancing the long-horizon accuracy of the model.
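The effect of expressing inputs in the base frame can be illustrated with a planar example. This is a minimal sketch that assumes a 2D base pose `(x, y, yaw)`; the function name `world_to_base` and the poses are hypothetical, and NeRD itself works with full 3D robot states.

```python
import math


def world_to_base(point_xy, base_pose):
    """Express a world-frame 2D point in the robot's base frame.

    base_pose is (x, y, yaw): the base position and heading in the world frame.
    """
    base_x, base_y, yaw = base_pose
    dx, dy = point_xy[0] - base_x, point_xy[1] - base_y
    cos_yaw, sin_yaw = math.cos(yaw), math.sin(yaw)
    # Apply the inverse rotation R(yaw)^T to the translated point.
    return (cos_yaw * dx + sin_yaw * dy, -sin_yaw * dx + cos_yaw * dy)


# The same physical situation (a contact 0.3 m ahead of the base)
# at two very different world locations and headings.
contact_a = world_to_base((0.3, 0.0), (0.0, 0.0, 0.0))
contact_b = world_to_base((10.0, -3.7), (10.0, -4.0, math.pi / 2))
```

Both calls yield approximately `(0.3, 0.0)`, so a model fed base-frame inputs sees the two situations as the same input and does not need training data from every world location.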

Training dataset and network architecture

The training datasets for NeRD are generated in a task-agnostic manner from simulator data. For each robot instance, we collect 100K random trajectories of 100 timesteps each. The trajectories are generated from randomized initial robot states, random joint-torque sequences within the robot’s motor torque limits, and, optionally, randomized environment configurations (shown in Figure 3). We model NeRD with a causal transformer, specifically a lightweight implementation of the GPT-2 transformer, which takes the simulation states from the most recent 10 steps as input.
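The data-collection recipe described above can be sketched as follows. The trajectory count, trajectory length, and 10-step window come from the post; everything else is an assumption: `toy_step` is a hypothetical single-joint simulator, the torque limit of 2.0 is made up, and the windows here contain only states, whereas the actual model inputs also include contact information and torques.

```python
import random

NUM_TRAJECTORIES = 100_000  # per robot, from the post
STEPS_PER_TRAJECTORY = 100  # from the post
HISTORY = 10                # input window, from the post
DT = 0.01                   # illustrative time step


def toy_step(state, torques):
    """Hypothetical simulator: one unit-inertia joint, semi-implicit Euler."""
    position, velocity = state
    velocity = velocity + DT * torques[0]
    return [position + DT * velocity, velocity]


def collect_trajectory(step_fn, torque_limits, num_steps, rng):
    """Random initial state plus random torques inside the motor limits."""
    state = [rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)]
    states, torques = [state], []
    for _ in range(num_steps):
        tau = [rng.uniform(-limit, limit) for limit in torque_limits]
        state = step_fn(state, tau)
        states.append(state)
        torques.append(tau)
    return states, torques


def history_windows(states, history=HISTORY):
    """Pair each run of `history` consecutive states with the state that follows."""
    return [(states[t - history:t], states[t]) for t in range(history, len(states))]


rng = random.Random(0)
states, torques = collect_trajectory(toy_step, [2.0], STEPS_PER_TRAJECTORY, rng)
pairs = history_windows(states)
total_transitions = NUM_TRAJECTORIES * STEPS_PER_TRAJECTORY
```

At the stated scale, one robot’s dataset holds 100,000 × 100 = 10 million simulated transitions. No task reward or policy appears anywhere in the loop, which is what makes the data task-agnostic.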

If you’d like to use NeRD, check out our open source code available on GitHub.

Once a model is trained, we integrate it into a modular physics engine such as Newton. It serves as an interchangeable solver for the simulator, replacing the existing analytical dynamics and contact solvers. Developers can then use this NeRD-integrated simulator the same way they have before and reuse existing policy-learning environments.
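The interchangeable-solver idea can be sketched with a small interface. This is not Newton’s API: `Solver`, `AnalyticSolver`, `NeuralSolver`, `Environment`, and `stand_in_model` are hypothetical names, and the stand-in model simply mimics the analytic update so that the example can be checked.

```python
from typing import List, Protocol


class Solver(Protocol):
    """Anything that can advance the state by one time step."""

    def step(self, state: List[float], torque: float, dt: float) -> List[float]:
        ...


class AnalyticSolver:
    """Semi-implicit Euler for a single unit-inertia joint."""

    def step(self, state, torque, dt):
        position, velocity = state
        velocity = velocity + dt * torque
        return [position + dt * velocity, velocity]


class NeuralSolver:
    """Wraps any callable model; a trained network would go here."""

    def __init__(self, model):
        self.model = model

    def step(self, state, torque, dt):
        return self.model(state, torque, dt)


class Environment:
    """Environment code depends only on the Solver interface."""

    def __init__(self, solver: Solver):
        self.solver = solver

    def rollout(self, state, torques, dt=0.01):
        for torque in torques:
            state = self.solver.step(state, torque, dt)
        return state


def stand_in_model(state, torque, dt):
    """Not a neural network: mimics the analytic update for this demo."""
    position, velocity = state
    velocity = velocity + dt * torque
    return [position + dt * velocity, velocity]


torques = [1.0, 0.5, -0.25, 0.0]
analytic_env = Environment(AnalyticSolver())
neural_env = Environment(NeuralSolver(stand_in_model))  # the only line that changes
```

Because `Environment` never inspects which solver it holds, policy-learning code written against it runs unchanged after the swap.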

Figure 3. Dataset used for training a NeRD model for an ANYmal quadruped robot 


What are the benefits of training robots with NeRD?

Training robots with NeRD enables highly stable, accurate, and generalizable simulation, accelerating policy learning and bridging the sim-to-real gap for reliable real-world deployment.

Stability and accuracy

The trained NeRD model can accurately predict the dynamics of a chaotic system, such as a double pendulum, over a hundred time steps. A single NeRD model is also capable of simulating different contact configurations (e.g., different heights and orientations of the ground plane). Figure 4 shows a side-by-side comparison of the NeRD-integrated simulator and a ground-truth analytical simulator with a Featherstone solver.

Figure 4. Comparison of the analytical simulator and NeRD on a double pendulum with various configurations of the ground plane
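Long-horizon stability matters because a learned solver is rolled out autoregressively: each prediction becomes part of the next input. The sketch below assumes a scalar state and a 10-step history window as stated in the post; `exact_model` and `biased_model` are hypothetical stand-ins for a trained network, and the reference system is a toy decay, not a double pendulum.

```python
from collections import deque

HISTORY = 10  # window length stated in the post


def autoregressive_rollout(model, warmup, horizon, history=HISTORY):
    """Predict `horizon` steps, feeding each prediction back as input."""
    window = deque(warmup[-history:], maxlen=history)
    predictions = []
    for _ in range(horizon):
        next_state = model(list(window))
        predictions.append(next_state)
        window.append(next_state)  # the oldest state drops out automatically
    return predictions


def max_abs_error(predicted, reference):
    """Largest per-step deviation from the reference trajectory."""
    return max(abs(p - r) for p, r in zip(predicted, reference))


def exact_model(window):
    """Stand-in that reproduces the reference dynamics exactly."""
    return 0.99 * window[-1]


def biased_model(window):
    """Stand-in with a small one-step error."""
    return 0.989 * window[-1]


# Reference trajectory from a toy "analytic" system: x[t+1] = 0.99 * x[t].
reference = [1.0]
for _ in range(1009):
    reference.append(0.99 * reference[-1])

warmup, target = reference[:HISTORY], reference[HISTORY:]
exact_error = max_abs_error(autoregressive_rollout(exact_model, warmup, 1000), target)
biased_error = max_abs_error(autoregressive_rollout(biased_model, warmup, 1000), target)
```

The exact stand-in tracks the reference with zero error, while the biased one drifts because its one-step error is fed back into every later prediction. Comparing rollouts against a ground-truth simulator in this way is one simple means of quantifying long-horizon accuracy.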

Learning robotic policies exclusively in a NeRD-integrated simulator

NeRD’s efficiency and generalizability across tasks, controllers, and space enable large-scale robotic policy learning for diverse downstream tasks. We pre-train a NeRD model for an ANYmal robot and then train a forward-walking policy and a sideways-walking policy using the PPO reinforcement-learning algorithm inside the NeRD-integrated simulator, without access to the ground-truth analytical simulator. 

The learned policies can then be transferred zero-shot to the ground-truth analytical simulator with minimal performance loss (less than 0.1% error in accumulated reward over 1,000-step trajectories). Figures 5 and 6 show a side-by-side visualization of NeRD-trained policies executed in both the NeRD-integrated simulator and the ground-truth analytical simulator.

Figure 5. Comparison of an analytical simulator and a NeRD model for an ANYmal robot with an RL policy for forward walking at 1 m/s
Figure 6. Comparison of an analytical simulator and a NeRD model for an ANYmal robot with an RL policy for sideways walking at 1 m/s
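The post does not spell out how the reward error is computed. One plausible reading is the relative difference between the rewards accumulated over a full trajectory in the two simulators, sketched below. The function names and the reward values are made up for illustration and are not results from the NeRD experiments.

```python
def accumulated_reward(rewards):
    """Sum of per-step rewards over one trajectory."""
    return sum(rewards)


def relative_error_percent(test_value, reference_value):
    """Relative difference between two accumulated rewards, in percent."""
    return 100.0 * abs(test_value - reference_value) / abs(reference_value)


# Made-up per-step rewards for a 1,000-step trajectory in each simulator.
neural_rewards = [1.0] * 1000
analytic_rewards = [1.0] * 500 + [0.999] * 500

error = relative_error_percent(
    accumulated_reward(neural_rewards), accumulated_reward(analytic_rewards)
)
```

With these illustrative numbers, the two totals differ by roughly 0.5 out of about 1,000, so the error lands near 0.05%, below the 0.1% threshold quoted above.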

Zero-shot sim-to-real transfer

The accuracy of the NeRD model was also validated on a 7-DoF Franka robot arm, where we performed zero-shot sim-to-real transfer for a go-to-pose (reach) policy trained exclusively in the NeRD-integrated simulator (Figure 7).

Figure 7. Zero-shot sim-to-real transfer of a go-to-pose policy trained exclusively in a NeRD-integrated simulator

Fine-tuning NeRD models from real-world data

The inherent differentiability of NeRD models enables them to be fine-tuned rapidly from real-world data. We fine-tune a pre-trained NeRD model for a cube-tossing task using a real-world cube-tossing dataset. The fine-tuned NeRD model significantly improves dynamics accuracy compared to the analytical simulator (shown in Figure 8).

Figure 8. Fine-tuning a NeRD model on real-world data better matches real-world cube-tossing dynamics. The light-green cubic frames illustrate the real-world cube trajectory
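Differentiability is what makes this fine-tuning a plain gradient-descent problem. The sketch below assumes a one-parameter toy model with a hand-derived gradient; a real NeRD model is a transformer whose gradients would come from automatic differentiation. The "real" measurements are synthetic, and the names and values are hypothetical.

```python
def predict(decay, x):
    """Toy differentiable dynamics: the next state is `decay` times the current one."""
    return decay * x


def loss_and_grad(decay, xs, ys):
    """Mean squared one-step error and its derivative with respect to `decay`."""
    n = len(xs)
    loss = sum((predict(decay, x) - y) ** 2 for x, y in zip(xs, ys)) / n
    grad = sum(2.0 * (predict(decay, x) - y) * x for x, y in zip(xs, ys)) / n
    return loss, grad


def fine_tune(decay, xs, ys, learning_rate=0.5, steps=50):
    """Gradient descent on the one-step prediction error."""
    for _ in range(steps):
        _, grad = loss_and_grad(decay, xs, ys)
        decay -= learning_rate * grad
    return decay


# Synthetic stand-in for real measurements: the "real" system decays at 0.90.
xs = [1.0, 0.8, 0.5, 0.2]
ys = [0.90 * x for x in xs]

pretrained = 0.95  # value "learned in simulation" (made up for illustration)
tuned = fine_tune(pretrained, xs, ys)
loss_before, _ = loss_and_grad(pretrained, xs, ys)
loss_after, _ = loss_and_grad(tuned, xs, ys)
```

Starting from the pretrained value rather than from scratch means only the mismatch between simulation and reality has to be learned, which is consistent with the post’s claim that fine-tuning is rapid.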

Summary

Neural Robot Dynamics (NeRD) is a neural-network-based robotic simulation framework designed to accurately predict the dynamics of complex, articulated robots over long periods. Unlike traditional robotic simulators that use simplified models and struggle with modern robot complexities, NeRD learns robot-specific dynamics directly from data, enabling stable, generalizable, and precise simulations. 

A single trained NeRD model generalizes to diverse tasks, environments, and controllers for a given robot and can be fine-tuned with real-world data to reduce the simulation-to-reality gap, making it a highly adaptable and advanced solution for robotic simulation.

Future directions

Developing effective neural simulators for modeling complex real-world robot dynamics is an active area of research. To achieve generalizable and fine-tunable neural dynamics models for robotics, this research can be extended in several exciting directions:

Robots with more complex structures and higher degrees of freedom 

Learning a neural simulator for more complicated robots (e.g., humanoid robots) can significantly improve simulation efficiency and accelerate downstream applications (e.g., learning whole-body controllers for humanoids).

Fine-tuning from partially-observable real-world data

Real-world robot data is often only partially observable due to sensor limitations. For example, contact points may not be precisely known. Investigating methods to fine-tune pre-trained NeRD models from partially observable real-world data can improve the accuracy of predicting real-world dynamics, thereby better bridging sim-to-real gaps.

Simulating robotic manipulation

Our development of the NeRD framework has thus far focused primarily on locomotion tasks. Supporting the simulation of manipulation tasks is a natural extension of this work that can further broaden its applicability.

Get started using NeRD

The NeRD models are trained using the simulation module available in Newton. View the GitHub README.md for instructions on how to use NeRD.

- Start by downloading Newton, an open-source, extensible physics engine to write GPU-accelerated, kernel-based programs for simulation, AI, robotics, and machine learning.
- Download NeRD’s open source code and view the README for instructions.
- Learn more about the details of Neural Robot Dynamics from the NeRD paper on arXiv.

Stay tuned for the release of the training and inference code for NeRD, enabling you to simulate a dynamic robot using a neural physics solver.

Learn more about the research being showcased at CoRL and Humanoids, happening September 27-October 2 in Seoul, Korea.
