
The Wizard of Oz: How bad AI marketing created human bots

July 23, 2018

This article is part of Demystifying AI, a series of posts that (try to) disambiguate the jargon and myths surrounding AI.

The Wall Street Journal recently ran a piece that detailed how companies that provide email-based services scan the inboxes of millions of Gmail users. By itself, this isn’t a secret. Their service agreements make it clear that they require access to your email so that their artificial intelligence algorithms can provide you with smart features such as price comparisons, automated calendar scheduling and more.

What they don’t tell you is that in some cases, their employees read your emails too, because their AI just can’t perform as promised and it needs humans to fill the gap where it falls short. One of the companies presented in the Wall Street Journal article uses AI to add a “smart reply” feature to your email, which can make a big difference if you’re managing your account from a mobile device.

However, the problem is that creating an AI that can understand the context of conversations and come up with relevant answers is very difficult. You just need to take a look at Skype’s recently added smart reply feature to see how hard it is for AI to become useful in human dialog, especially when it’s not focused on a narrow topic. Even Gmail’s own smart reply feature, which is backed by Google’s huge data stores and AI capabilities, works in limited ways.

That’s why the company mentioned in the article uses humans in the loop. Its engineers read customer emails and correct the AI when its replies are not relevant to the conversation.

Those corrections serve as labeled examples for supervised learning, the process through which the AI learns from its mistakes. It’s very common to use supervised learning in the development and testing phases of creating an AI. But using it in real time on live customer data is another matter.
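To make the idea concrete, here is a minimal sketch of a human-in-the-loop reply pipeline. It is not the implementation of any company mentioned in the article: the `ReplyLoop` class, the `toy_model` and `toy_reviewer` stand-ins, and the 0.8 confidence threshold are all assumptions made for illustration. It shows how a low-confidence suggestion could be routed to a person, and how each correction could be stored as a labeled example for later retraining.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class ReplyLoop:
    """Routes low-confidence suggestions to a human and keeps the corrections."""

    # Hypothetical model: takes an email, returns (suggested reply, confidence).
    suggest: Callable[[str], Tuple[str, float]]
    # Hypothetical human reviewer: takes (email, draft), returns the final reply.
    review: Callable[[str, str], str]
    # Assumed cut-off, not taken from any real product.
    threshold: float = 0.8
    training_examples: List[Tuple[str, str]] = field(default_factory=list)

    def reply(self, email: str) -> str:
        draft, confidence = self.suggest(email)
        if confidence >= self.threshold:
            return draft  # the AI answers on its own
        # Below the threshold, a person reads the email and fixes the draft.
        final = self.review(email, draft)
        # The (email, corrected reply) pair is a labeled example for retraining.
        self.training_examples.append((email, final))
        return final


def toy_model(email: str) -> Tuple[str, float]:
    """Stand-in for a real model: confident only about thank-you notes."""
    if "thanks" in email.lower():
        return "You're welcome!", 0.95
    return "Sounds good.", 0.30


def toy_reviewer(email: str, draft: str) -> str:
    """Stand-in for the human operator who rewrites a weak draft."""
    return "Let me check my calendar and get back to you."


loop = ReplyLoop(suggest=toy_model, review=toy_reviewer)
print(loop.reply("Thanks for the update"))               # handled by the model
print(loop.reply("Can we move the meeting to Friday?"))  # escalated to a human
print(len(loop.training_examples))                       # corrections collected so far
```

Note that in this sketch the reviewer has to read the full email to correct the draft, which is exactly the privacy trade-off the article describes.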

In an interview with the Journal, the company’s engineers made it clear that they had signed agreements not to share anything they read, and they worked on machines that prevented them from downloading anything. It’s nonetheless creepy to know that there’s a human who can read anything that comes in and goes out of your email account.

Furthermore, the problem is that the cycle can continue endlessly, which means there will always be humans sitting in the background and playing the part of the AI.

This isn’t the first time that humans have acted as bots. Known as the “Wizard of Oz technique,” using humans as bots has become a common practice for companies that fail to fulfill their AI promises.

The limits of contemporary artificial intelligence

Some scientists believe that in the next few decades, we’ll create artificial general intelligence, AI that can process information and make decisions like humans. But for the moment, what we have is narrow AI, which is suitable for very specific purposes such as classifying images in Google Photos or recommending movies on Netflix.

In recent years, advances in artificial neural networks and deep learning applications have opened the way for implementing AI in many other use cases that were previously thought to be the exclusive domain of human intelligence. One notable example is computer vision, the branch of AI which enables computers to understand the content of images and video. Another is the capability of deep neural networks to mimic the image and voice of humans with very high accuracy.

While the performance of deep learning algorithms is spectacular, they’re still narrow AI, even if they sound and feel like real humans. But we tend to overhype the capabilities of deep learning, leading to expectations and fears that are misplaced. Deep learning and deep neural networks have very distinct limits, and while they surpass humans in the specific tasks they’re trained for, they fail spectacularly in scenarios that are out of their domain.

They also require huge amounts of quality data and computing power, resources that not all companies have access to. That’s why some companies resort to the Wizard of Oz technique to make up for the shortcomings of their AI.

Companies fail to live up to their promises

The practice revealed by the Wall Street Journal, using human bots to make up for the shortcomings of AI, has become commonplace, even at large tech companies. In 2015, Facebook announced M, the ultimate chatbot assistant that could perform various tasks, such as making purchases, buying gifts, ordering food, calling a taxi and carrying out meaningful conversations. M was powered by Facebook’s AI and backed by a staff of operators who would oversee its performance and intervene when it started going off track.

Facebook initially rolled out M to a limited number of users in the Bay Area to evaluate its performance and train the AI to be less reliant on humans. Eventually the assistant would become available to everyone.

In 2018, Facebook shut down the project, declaring that “we learned a lot.” The company didn’t say whether it ever reached the point where it could remove humans completely, but if it hadn’t, providing the services of M to all 2 billion Facebook users would’ve required hiring a huge staff of human bots.

Other companies such as X.ai use AI to help users manage their work by reading their emails and automatically scheduling tasks, meetings, calls, etc. X.ai uses natural language processing, a branch of AI that analyzes the content and context of human-generated text. However, X.ai is also backed by a squad of human bots, sitting in a secured building in Manila, Philippines, closely monitoring and correcting the AI’s performance. One of the complicated tasks that the human operators take care of is figuring out how to process timing options for meetings, which humans tend to express in various ways.
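The following sketch illustrates why timing expressions are hard to automate. It is a deliberately naive, rule-based parser written for this article, not X.ai’s method: the `parse_meeting_time` function and its two patterns are assumptions. It recognizes clock times such as “3pm” or “15:30” and returns `None` for everything else, which is the point where a system like the one described above would need a human operator.

```python
import re
from datetime import time
from typing import Optional

# Hypothetical, deliberately small set of patterns; real phrasing is far more varied.
_CLOCK_12H = re.compile(r"\b(\d{1,2})(?::(\d{2}))?\s*(am|pm)\b", re.IGNORECASE)
_CLOCK_24H = re.compile(r"\b([01]?\d|2[0-3]):([0-5]\d)\b")


def parse_meeting_time(text: str) -> Optional[time]:
    """Return a clock time if the text matches a known pattern, else None."""
    match = _CLOCK_12H.search(text)
    if match:
        hour = int(match.group(1))
        minute = int(match.group(2) or 0)
        if not (1 <= hour <= 12 and minute < 60):
            return None  # e.g. "13pm" is not a valid 12-hour time
        is_pm = match.group(3).lower() == "pm"
        # 12am maps to hour 0 and 12pm maps to hour 12.
        return time(hour % 12 + (12 if is_pm else 0), minute)
    match = _CLOCK_24H.search(text)
    if match:
        return time(int(match.group(1)), int(match.group(2)))
    return None  # no known pattern: hand the message to a human


for message in ["Let's meet at 3pm", "How about 15:30?", "Sometime after lunch on Tuesday"]:
    parsed = parse_meeting_time(message)
    print(message, "->", parsed if parsed else "needs a human")
```

Every new phrasing (“end of day,” “first thing Monday,” “after lunch”) needs either another rule or more training data, which is why the fallback to a person is so hard to remove.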

In 2017, Expensify, a company that provides an AI tool that automatically scans and extracts data from user-submitted documents to fill out forms, admitted that it had used Mechanical Turk, Amazon’s online data sweatshop, in different capacities for a task that was supposedly being performed by AI algorithms. Users had been submitting sensitive documents to the company’s service, including receipts, reimbursement forms, and benefit claims.

The need for AI transparency

These and many other stories show how far we still have to go before we understand the full capacity of our AI technologies.

In the wake of Google’s Duplex demo, in which the company demoed the use of an AI assistant that made calls and spoke like a real human, I argued that companies need to be transparent about their use of AI. Companies should explicitly inform users that they’re interacting with an AI agent in settings where they naturally expect to be speaking to or interacting with a human operator, because not doing so would likely cause frustration, especially if the AI agent starts acting in dubious ways.

The reverse is also true. If users expect to be interacting with an AI agent, companies should clearly tell them if there’s a human behind the app. Humans are more inclined to disclose information, including both general and intimate details about themselves, when they think they’re interacting with a machine instead of a human. Companies must not abuse that trust.

We also need to recognize both the strengths and weaknesses of our technologies. AI is an augmentation of human intelligence, not its replacement. As long as we try to create AI applications that exactly mimic the behavior and functions of humans, the Wizard of Oz won’t go away and we’ll end up creating more human bots that act as AI that acts as humans.
