How the “bigger is better” mentality damages AI research

This article is part of Demystifying AI, a series of posts that (try to) disambiguate the jargon and myths surrounding AI.

Something you’ll hear a lot is that the increasing availability of compute resources has paved the way for important advances in artificial intelligence. With access to powerful cloud computing platforms, AI researchers have been able to train larger neural networks in shorter timespans. This has enabled AI to make inroads in many fields such as computer vision, speech recognition, and natural language processing.

But what you’ll hear less about are the darker implications of the current direction of AI research. Currently, advances in AI are mostly tied to scaling deep learning models and creating neural networks with more layers and parameters. According to artificial intelligence research lab OpenAI, “since 2012, the amount of compute used in the largest AI training runs has been increasing exponentially with a 3.4-month doubling time.” By OpenAI’s count, the metric has grown by a factor of more than 300,000 since 2012.
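To get a feel for what a 3.4-month doubling time implies, the arithmetic can be sketched in a few lines of Python. This is only an illustration that assumes perfectly steady exponential growth; the function names are made up for this example and do not come from OpenAI.

```python
import math

DOUBLING_TIME_MONTHS = 3.4  # doubling time quoted from OpenAI


def growth_factor(months, doubling_time=DOUBLING_TIME_MONTHS):
    """Multiplicative growth after `months` of steady doubling."""
    return 2 ** (months / doubling_time)


def months_to_reach(factor, doubling_time=DOUBLING_TIME_MONTHS):
    """Months of steady doubling needed to grow by `factor`."""
    return doubling_time * math.log2(factor)


# How many doublings, and how many months, does a 300,000x increase take?
doublings = math.log2(300_000)
months = months_to_reach(300_000)
print(f"{doublings:.1f} doublings, about {months:.0f} months")
```

Under that simplifying assumption, a 300,000-fold increase corresponds to roughly 18 doublings, or a little over five years of sustained growth at that rate.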

This requirement imposes severe limits on AI research and can also have other, less savory repercussions.

For the moment, bigger is better

“Within many current domains, more compute seems to lead predictably to better performance, and is often complementary to algorithmic advances,” OpenAI’s researchers note.

We can witness this effect in many projects where the researchers have concluded they owed their advances to throwing more compute at the problem.

In June 2018, OpenAI introduced an AI that could play Dota 2, a complex battle arena game, at a professional level. Called OpenAI Five, the bot entered a major e-sports competition but lost to human players in the finals. The research lab returned this year with a revamped version of OpenAI Five and was able to claim the championship from humans. The secret recipe, as the AI researchers put it: “OpenAI Five’s victories on Saturday, as compared to its losses at The International 2018, are due to a major change: 8x more training compute.”

There are many other examples like this, where an increase in compute resources has resulted in better results. This is especially true in reinforcement learning, which is one of the hottest areas of AI research.

The financial costs of training large AI models

The most direct implication of the current state of AI is the financial costs of training artificial intelligence models. According to a chart OpenAI has published on its website, it took more than 1,800 petaflop/s-days to train AlphaGoZero, DeepMind’s historic Go-playing AI.

The computation costs of training AI models (source: OpenAI)

A FLOP is a floating-point operation. A petaflop/s-day (pfs-day) is one day of computing at 10^15 operations per second, which amounts to about 10^20 operations in total. A Google TPU v3 processor, specialized for AI tasks, performs 420 teraflops (or 0.42 petaflops) and costs $2.40-8.00 per hour. At that rate, 1,800 pfs-days correspond to roughly 4,300 TPU-days, or about 103,000 TPU-hours. This means that it would cost around $246,800-822,800 to train the AlphaGoZero model. And that is just the compute costs.
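The estimate above can be reproduced with a short calculation. The sketch below uses only the figures quoted in this article and assumes the processor runs at its full rated throughput for the whole training run, which real workloads rarely achieve; the function names are illustrative.

```python
SECONDS_PER_DAY = 86_400
HOURS_PER_DAY = 24


def pfs_days_to_operations(pfs_days):
    """Total floating-point operations in a number of petaflop/s-days."""
    return pfs_days * 1e15 * SECONDS_PER_DAY


def training_cost_usd(pfs_days, device_petaflops, usd_per_device_hour):
    """Compute cost, assuming the device always runs at its rated speed."""
    device_days = pfs_days / device_petaflops
    device_hours = device_days * HOURS_PER_DAY
    return device_hours * usd_per_device_hour


# Figures quoted in the article: 1,800 pfs-days, 0.42 petaflops per TPU v3,
# and a price range of $2.40 to $8.00 per TPU-hour.
cost_low = training_cost_usd(1_800, 0.42, 2.40)
cost_high = training_cost_usd(1_800, 0.42, 8.00)
print(f"one pfs-day is about {pfs_days_to_operations(1):.2e} operations")
print(f"estimated cost: ${cost_low:,.0f} to ${cost_high:,.0f}")
```

Because the calculation assumes perfect utilization, the result is better read as a lower bound on the compute bill than as a precise figure.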

Other notable achievements in the field have similar costs. For instance, according to figures released by DeepMind, its StarCraft-playing AI consisted of 18 agents. Each AI agent was trained with 16 Google TPU v3 processors for 14 days. This means that at the top of that price range ($8.00 per TPU-hour), the company spent about $774,000 for the 18 AI agents.
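The same back-of-the-envelope method applies here. The sketch below multiplies out the figures quoted above and prices them at both ends of the TPU v3 hourly range mentioned earlier; it assumes every TPU was billed for the full 14 days, and the names are illustrative.

```python
AGENTS = 18
TPUS_PER_AGENT = 16
TRAINING_DAYS = 14
HOURS_PER_DAY = 24


def total_tpu_hours(agents, tpus_per_agent, days):
    """TPU-hours consumed if every TPU runs for the whole training period."""
    return agents * tpus_per_agent * days * HOURS_PER_DAY


tpu_hours = total_tpu_hours(AGENTS, TPUS_PER_AGENT, TRAINING_DAYS)
cost_low = tpu_hours * 2.40   # lower hourly price quoted in the article
cost_high = tpu_hours * 8.00  # upper hourly price quoted in the article
print(f"{tpu_hours:,} TPU-hours: ${cost_low:,.0f} to ${cost_high:,.0f}")
```

The upper end of that range is where the figure of about $774,000 comes from.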

The commercialization of AI research

The compute requirements of AI research pose serious constraints on who can enter the field.

Popular UK-based AI lab DeepMind owes its success to the vast resources of Google, its parent company. Google acquired DeepMind in 2014 for $650 million, giving it much-needed financial and technical backing. According to documents filed earlier this year with the UK’s Companies House registry, DeepMind incurred $570 million in losses in 2018, up from $341 million in 2017. DeepMind also has £1.04 billion in debts due this year, which includes an £883-million loan from Alphabet.

DeepMind headquarters in London (source: Wikipedia)

OpenAI, which started out as a nonprofit AI research lab in late 2015 with $1 billion in pledged funding from backers including Sam Altman and Elon Musk, converted into a for-profit earlier this year to absorb funding from investors. The lab was running out of financial resources to support its research. Microsoft later declared that it would invest $1 billion in the lab.

As the current trends show, due to the costs of AI research, especially reinforcement learning, these labs are becoming increasingly dependent on wealthy companies such as Google and Microsoft.

This trend threatens to commercialize AI research. As commercial organizations become more and more pivotal in funding AI research labs, they can also influence the direction of their activities. For the moment, companies like Google and Microsoft can tolerate bearing the financial costs of running AI research labs like DeepMind and OpenAI. But they also expect a return on investment in the near future.

The problem is, both OpenAI and DeepMind pursue scientific projects such as artificial general intelligence (AGI), a goal that we have yet to understand, let alone achieve. Most scientists agree that we are at least a century away from achieving AGI, and that kind of timeframe will test the patience of even the wealthiest companies.

One possible scenario is for the AI research labs to gradually shift from long-term academic and scientific research toward commercially oriented projects that have a short-term yield. While this will make their wealthy funders happy, it will be to the detriment of AI research in general.

“We’re very uncertain about the future of compute usage in AI systems, but it’s difficult to be confident that the recent trend of rapid increase in compute usage will stop, and we see many reasons that the trend could continue. Based on this analysis, we think policymakers should consider increasing funding for academic research into AI, as it’s clear that some types of AI research are becoming more computationally intensive and therefore expensive,” the OpenAI researchers write.

The carbon footprint of AI research

The compute resources required to train large AI models consume huge amounts of energy, which creates a carbon emission problem.

According to a paper by researchers at the University of Massachusetts Amherst, training a transformer AI model (often used in language-related tasks) with 213 million parameters can emit as much carbon as five vehicles do over their entire lifetimes. Google’s famous BERT language model and OpenAI’s GPT-2 have 340 million and 1.5 billion parameters, respectively.

Given that current AI research is largely dominated by the “bigger is better” mantra, this environmental concern is only going to become worse. Unfortunately, AI researchers seldom report or pay attention to these aspects of their work. The University of Massachusetts researchers recommend that AI papers be transparent about the environmental costs of their models and provide the public with a better picture of the implications of their work.

Some hard lessons for the AI industry

A final concern about the interest in bigger neural networks is the negative effect it can have on the direction of AI research. For the moment, barriers in AI are usually dealt with by throwing more data and compute at the problem. Meanwhile, the human brain, which is still much better at some of the simplest tasks that AI models struggle with, consumes only a fraction of the power that AI does.

Being too infatuated with increasing compute resources can blind us to new solutions and more efficient AI techniques.

One interesting line of work in the field is the development of hybrid AI models that combine neural networks and symbolic AI. Symbolic AI is the classical, rule-based approach to creating intelligence. Unlike neural networks, symbolic AI does not scale by increasing compute resources and data. It is also terrible at processing the messy, unstructured data of the real world. But it is terrific at knowledge representation and reasoning, two areas where neural networks are sorely lacking. Exploring hybrid AI approaches might open new pathways for creating more resource-efficient AI.

There are several scientists who are interested in finding alternatives to neural network–only approaches. Rebooting AI, a new book by Gary Marcus and Ernest Davis, explores some of these concepts. The Book of Why, written by award-winning computer scientist Judea Pearl, also explores some of the fundamental problems that plague current AI systems.

Unfortunately, the current excitement surrounding deep learning has marginalized these conversations. It shouldn’t take another AI winter for the science community to start thinking about them in earnest and finding ways to make AI more resource-efficient.
