DeepMind is mostly known for its work in deep reinforcement learning, especially in mastering complicated games and predicting protein structures. And now, it is taking its next step in robotics research.
According to a blog post on DeepMind’s website, the company has acquired the rigid-body physics simulator MuJoCo and has made it freely available to the research community. MuJoCo is now one of several open-source platforms for training artificial intelligence agents used in robotics applications. Its free availability will have a positive impact on the work of scientists who are struggling with the costs of robotics research. It can also be an important factor for DeepMind’s future, both as a science lab seeking artificial general intelligence and as a business unit of one of the largest tech companies in the world.
Simulating the real world
Simulation platforms are a big deal in robotics. Training and testing robots in the real world is expensive and slow. Simulated environments, on the other hand, allow researchers to train multiple AI agents in parallel and at speeds that are much faster than real life. Today, most robotics research teams do the bulk of their AI model training in simulated environments. The trained models are then tested and further fine-tuned on real physical robots.
The past few years have seen the launch of several simulation environments for reinforcement learning and robotics.
MuJoCo, which stands for Multi-Joint Dynamics with Contact, is not the only game in town. There are other physics simulators such as PyBullet, Roboschool, and Isaac Gym. But what makes MuJoCo stand out from the others is the fine-grained detail that has gone into simulating contact surfaces. MuJoCo models the laws of physics more accurately, which shows in the emergence of physical phenomena such as Newton’s Cradle.
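To see why Newton’s Cradle is a useful test of contact modeling, here is a minimal Python sketch of the idealized textbook case: a row of equal-mass balls undergoing perfectly elastic collisions in one dimension. It is an illustration only and says nothing about how MuJoCo’s contact solver works internally; the function names are hypothetical.

```python
def elastic_collision(m1, v1, m2, v2):
    """Post-collision velocities for a perfectly elastic 1D collision."""
    total = m1 + m2
    u1 = ((m1 - m2) * v1 + 2 * m2 * v2) / total
    u2 = ((m2 - m1) * v2 + 2 * m1 * v1) / total
    return u1, u2


def cradle(velocities, mass=1.0):
    """Resolve one left-to-right pass of collisions between touching balls."""
    v = list(velocities)
    for i in range(len(v) - 1):
        # Neighbors collide only if the left ball is moving faster than the right one.
        if v[i] > v[i + 1]:
            v[i], v[i + 1] = elastic_collision(mass, v[i], mass, v[i + 1])
    return v


# The first ball swings in at 1.0 m/s; the other four are at rest.
final = cradle([1.0, 0.0, 0.0, 0.0, 0.0])
print(final)
```

With equal masses, each collision simply swaps the two velocities, so the motion passes down the row and only the last ball moves. A simulator with imprecise contact handling tends to smear that momentum across several balls instead.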
MuJoCo also has built-in features that support the simulation of musculoskeletal models of humans and animals, which is especially important in bipedal and quadruped robots.
The increased accuracy of the physics environment can help reduce the differences between the simulated environment and the real world. Called the “sim2real gap,” these differences cause a degradation in the performance of the AI models when they are transferred from simulation to the real world. A smaller sim2real gap reduces the need for adjustments in the physical world.
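A toy Python example can make the sim2real gap concrete. In this sketch, a controller picks the push speed needed to slide a block exactly one meter, based on the friction coefficient its simulator assumes. The same push is then replayed on a “real” surface with slightly higher friction. Both friction values and all function names are assumptions chosen for illustration.

```python
def slide_distance(v0, mu, g=9.81, dt=0.001):
    """Integrate a block sliding on a flat surface until friction stops it."""
    x, v = 0.0, v0
    while v > 0.0:
        v = max(0.0, v - mu * g * dt)  # Coulomb friction decelerates the block
        x += v * dt
    return x


def push_speed_for(target, mu, g=9.81):
    """Closed-form push speed: stopping distance is v0**2 / (2 * mu * g)."""
    return (2 * mu * g * target) ** 0.5


MU_SIM = 0.30   # friction the simulator assumes (hypothetical value)
MU_REAL = 0.36  # friction of the real surface (hypothetical value)
TARGET = 1.0    # meters

v0 = push_speed_for(TARGET, MU_SIM)  # "policy" tuned purely in simulation
err_sim = abs(slide_distance(v0, MU_SIM) - TARGET)
err_real = abs(slide_distance(v0, MU_REAL) - TARGET)
print(f"error in simulation: {err_sim:.3f} m, error in reality: {err_real:.3f} m")
```

The push that lands almost exactly on target in simulation stops roughly 17 centimeters short on the real surface. The closer the simulator’s physics is to reality, the smaller that error and the less fine-tuning is needed on the physical robot.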
Making MuJoCo available for free
Before DeepMind open-sourced MuJoCo, many researchers were frustrated with its license costs and opted to use the free PyBullet platform. In 2017, OpenAI released Roboschool, a license-free alternative to MuJoCo, for Gym, its toolkit for training deep reinforcement learning models for robotics and other applications.
“After we launched Gym, one issue we heard from many users was that the MuJoCo component required a paid license… Roboschool removes this constraint, letting everyone conduct research regardless of their budget,” OpenAI wrote in a blog post at the time.
A more recent paper by researchers at Cardiff University states, “The cost of a Mujoco institutional license is at least $3000 per year, which is often unaffordable for many small research teams, especially when a long-term project depends on it.”
DeepMind’s blog refers to a recent article in PNAS that discusses the use of simulation in robotics. The authors recommend better support for the development of open-source simulation platforms and write, “A robust and feature-rich set of four or five simulation tools available in the open-source domain is critical to advancing the state of the art in robotics.”
“In line with these aims, we’re committed to developing and maintaining MuJoCo as a free, open-source, community-driven project with best-in-class capabilities,” DeepMind’s blog post states.
It is worth noting, however, that license fees account for a very small part of the costs of training AI models for robots. The computational costs of robotics research tend to rise along with the complexity of the application.
MuJoCo only runs on CPUs, according to its documentation. It hasn’t been designed to leverage the power of GPUs, which have many more computation cores than traditional processors.
A recent paper by researchers at the University of Toronto, Nvidia, and other organizations highlights the limits of simulation platforms that work on CPUs only. For example, Dactyl, a robotic hand developed by OpenAI, was trained on a compute cluster comprising around 30,000 CPU cores. These kinds of costs remain a challenge with CPU-based platforms such as MuJoCo.
DeepMind’s view on intelligence
DeepMind’s mission is to develop artificial general intelligence (AGI), the flexible kind of innate and learned problem-solving capabilities found in humans and animals. While the path to AGI (and whether we will ever reach it or not) is hotly debated among scientists, DeepMind has a clearly expressed view on it.
In a paper published earlier this year, some of DeepMind’s top scientists suggested that “reward is enough” to reach AGI. According to DeepMind’s scientists, if you have a complex environment, a well-defined reward, and a good reinforcement learning algorithm, you can develop AI agents that will acquire the traits of general intelligence. Richard Sutton, who is among the co-authors of the paper, is one of the pioneers of reinforcement learning and describes it as “the first computational theory of intelligence.”
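The three ingredients named here (an environment, a reward, and a reinforcement learning algorithm) can be sketched in a few lines of Python. The example below uses tabular Q-learning on a five-cell corridor as a stand-in. It is not the method from DeepMind’s paper, and the environment, constants, and function names are all assumptions made for illustration.

```python
import random

N_STATES = 5        # corridor cells 0..4; cell 4 is the goal
ACTIONS = (-1, +1)  # index 0 = step left, index 1 = step right
GAMMA, ALPHA = 0.9, 0.5


def step(state, action_index):
    """Toy environment: move along the corridor; reward 1.0 only at the goal."""
    next_state = min(max(state + ACTIONS[action_index], 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train(episodes=500, seed=0):
    """Tabular Q-learning driven only by the reward signal."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            action = rng.randrange(2)  # explore uniformly; Q-learning is off-policy
            nxt, reward, done = step(state, action)
            target = reward + (0.0 if done else GAMMA * max(q[nxt]))
            q[state][action] += ALPHA * (target - q[state][action])
            state = nxt
    return q


q_table = train()
# Greedy action in each non-goal cell (1 means "step right").
greedy_policy = [max((0, 1), key=lambda a: q_table[s][a]) for s in range(N_STATES - 1)]
print(greedy_policy)
```

The agent is never told where the goal is. It only receives a reward on arrival, yet the learned greedy policy steps right in every cell. The open question in the “reward is enough” debate is whether this principle scales from toy corridors to environments rich enough to demand general intelligence.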
The acquisition of MuJoCo can provide DeepMind with a powerful tool to test this hypothesis and gradually build on top of its results. By making it available to small research teams, DeepMind can also help nurture talent it will hire in the future.
MuJoCo can also boost DeepMind’s efforts to turn a profit for its parent company, Alphabet. In 2020, the AI lab recorded its first profit after six years of sizeable costs for Alphabet. DeepMind is already home to some of the brightest scientists in AI. And with autonomous mobile robots such as Boston Dynamics’ Spot slowly finding their market, DeepMind might be able to develop a business model that serves both its scientific goal and its owner’s interests.