Can AI assistants protect you against malicious AI?

November 26, 2018

White robot on blurred background hacking and accessing private — Source: Depositphotos

“Now that we realize our brains can be hacked, we need an antivirus for the brain.” Those were the words of Yuval Noah Harari, famous historian and outspoken critic of Silicon Valley. The sentence, which was part of a recent interview by Wired’s Nick Thompson with Harari and former Google design ethicist Tristan Harris, was a reference to how tech companies use AI algorithms to manipulate user behavior in profitable ways.

For instance, if you’re watching NBA game recap videos, YouTube will recommend more NBA videos. The more videos you watch, the more ads YouTube can show you, and the more money it makes from ad impressions. This is basically the business model that all “free” apps use. They try to keep you glued to the screen with little regard to what the impact will be on your mental and physical health.

And they use the most advanced technologies and the most brilliant minds to achieve that goal. For instance, they use deep learning and other AI techniques to monitor your behavior and compare it to that of millions of other users to provide you with super-personalized recommendations that you can hardly resist.

So yes, your brain can be hacked. But how do you build the antivirus that Harari is speaking about? “It can work on the basis of the same technology,” Harari said. “Let’s say you have an AI sidekick that monitors you all the time, 24 hours a day, what you write, what you see, everything. But this AI is serving you, has this fiduciary responsibility. And it gets to know your weaknesses, and by knowing your weaknesses, it can protect you against other agents trying to hack you into exploiting your weaknesses.”

While Harari was laying out the “AI sidekick” concept, Harris, who is a veteran engineer, nodded in approval, which says something about how realistic the idea is.

For example, if you have a weakness for, say, funny cat videos and you can’t stop yourself from watching them, your AI sidekick should intervene if it “feels” that some malignant artificial intelligence system is trying to exploit that weakness, and it should show you a message about a blocked threat, Harari explains.

To sum it up, Harari’s AI sidekick needs to accomplish the following:

It must be able to monitor all your activities
It must be able to identify your weaknesses and know what’s good for you
It must be able to detect and block an AI agent that is exploiting your weaknesses

In this post, we want to see what it would take to create the AI sidekick Harari suggests and whether it’s possible with contemporary technology.

An AI sidekick that monitors all your activities

Harari’s first requirement for the protective AI sidekick is that it sees everything you do. This is a fair premise since, as we know, contemporary AI is vastly different from human intelligence and heavily reliant on quality data.

A human “sidekick,” say a parent or an older sibling, would be able to tell right from wrong based on their own personal life experiences. They have an abstract model of the world and a general perception of the consequences of human actions. For instance, they will be able to predict what’s going to happen if you watch too much TV and do too little exercise.

Unlike humans, AI algorithms start with a blank slate and have no notion of human experiences. The current state-of-the-art artificial intelligence technology is deep learning, an AI technique that is especially good at finding patterns and correlations in large data sets. As a rule of thumb, the more quality data you give a deep learning algorithm, the better it will become at classifying new data and making predictions.

Now, the question is, how can you create a deep learning system that can monitor everything you do? Currently, there is none. With the explosion of cloud computing and the internet of things (IoT), tech companies, cybercriminals and government agencies have many new ways to open windows into our daily lives, collect data and monitor our activities. Fortunately, however, none of them has access to all our personal data.

Google has a very broad view of your online data, including your search and browsing history, the applications you install on your Android devices, your Gmail data, your Google Docs content and your YouTube viewing history. However, Google doesn’t have access to your Facebook data, which includes your friends, your likes, clicks and other engagement preferences. Facebook has access to some of the sites you visit, but it doesn’t have access to your Amazon shopping and browsing data. Thanks to its popular Echo smart speaker, Amazon knows a lot about your in-home activities and preferences, but it doesn’t have access to your Google data.

The point is, even though you’re giving away a lot of information to tech companies, no single company has access to all of it. Plus, there’s still a lot of information that hasn’t been digitized. For instance, an example that Harari brings up frequently is how AI might be able to quantify your reaction to a certain image by monitoring the changes in your pulse rate when you view the image. But how will they do that? Harari says that tech companies won’t necessarily need a wearable device to capture your heart rate and they can do it with a hi-res video feed of your face and by monitoring the changes to your retina. But that hasn’t happened yet.

Also, a lot of the online activities we perform are influenced by our experiences in the physical world, such as conversations we have with colleagues or things we heard in class. Maybe it was a billboard I saw while waiting for the bus or a conversation between two people that I absently heard while sitting in the metro. It might have to do with the quality of sleep I had the previous night or the amount of carbs I had for breakfast.

Now the question is, how do we give an AI agent all our data? With current technology, you’ll need a combination of hardware and software. For instance, you’ll need a smart watch or fitness tracker to enable your AI sidekick to monitor your vital signs as you carry out different activities. You’ll need eye-tracking headgear that enables your AI sidekick to trace your gaze and scan your field of vision to find correlations between your vital signs and what you’re seeing.

Your AI assistant will also have to live in your computing devices, your smartphone and laptop. It’ll then be able to record relevant data about all the activities you’re carrying out online. Putting all this data together, your AI sidekick will be better positioned to identify problematic patterns of behavior.

There are two problems with these requirements. First, the costs of the hardware will effectively make the AI sidekick only available to a limited audience, likely the rich elite of Silicon Valley who understand the value of such an assistant and are willing to bear the financial costs. However, as studies have shown, the people who are most at risk are not the rich elite, but the poorer people who have access to low-priced mobile screens and internet and are less educated about the adverse effects of screen time. They won’t be able to afford the AI sidekick.

The second problem is storing all the data you collect about the user. Having so much information in one place can give you great insights into your behavior. But it will also give anyone who gains unauthorized access to it incredible leverage to use it for evil purposes. Who will you trust with your most sensitive data? Google? Facebook? Amazon? None of those companies has a strong record of keeping its users’ best interests in mind. Harari does mention that your AI sidekick has a fiduciary duty. But which commercial company is willing to pay for the costs of storing and processing your data without getting something in return?

Should the government hold your data? And what’s to prevent government authorities from using it for evil purposes such as surveillance and manipulation? We might want to try using a combination of blockchain and cloud services to make sure that only you have full control over your data, and we can use decentralized AI models to prevent any single entity from having exclusive access to the data. But that still doesn’t remove the costs of storing the data.

The entity that stores the data can be a non-profit backed by substantial funding from government and the private sector. Alternatively, it can opt for a monetized business model. Basically, this means that you’ll have to pay a subscription fee to have the service store and process your data, but that will make the AI sidekick even more expensive and less accessible to the underprivileged classes that are more vulnerable.

Final verdict: An AI sidekick that can collect all your data is not impossible, but it’s very hard and costly and will not be available to everyone.

An AI sidekick that can detect your weaknesses

This is where Harari’s proposition hits its biggest challenge. How can your sidekick distinguish what’s good or bad for you? The short answer is: It can’t.

Current blends of artificial intelligence are considered narrow AI, which means they’re optimized for performing specific tasks such as classifying images, recognizing voice, detecting anomalous internet traffic or suggesting content to users.

Distinguishing human weaknesses is anything but a narrow task. There are too many parameters, too many moving parts. Every person is unique in their own right, influenced by countless parameters and experiences. A repeat task that might prove harmful for one person might be beneficial to another person. Also, weaknesses might not necessarily present themselves in repeat actions.

Here’s what deep learning can do for you: It can find patterns in your actions and predict your behavior. That’s how AI-powered recommendation systems keep you engaged on Facebook, YouTube and other online applications.

For instance, your AI sidekick can learn that you’re very much interested in diet videos, or that you read too many liberal or conservative news sources. It might even be able to correlate those data points to all the other information, such as the profiles of your classmates or colleagues. It might relate your actions to other experiences you encounter during the day, such as seeing an ad at a bus stop. But distinguishing patterns doesn’t necessarily lead to “detecting weaknesses.” It can’t tell which behavior patterns are harming you, especially since many show themselves only in the long run and can’t necessarily be related to changes in your vital signs or other distinguishable actions.

That’s the kind of stuff that requires human judgement, something that deep learning is sorely lacking. Detecting human weakness is in the domain of general AI, also known as human-level or strong artificial intelligence. But general artificial intelligence is still the stuff of myth and sci-fi novels and movies, even if some parties like to overhype the capabilities of contemporary AI.

Theoretically, you can hire a bunch of humans to label repeat patterns and flag the ones that are proving to be detrimental to the users. But that would require a huge effort involving cooperation between engineers, psychologists, anthropologists and other experts, because mental health trends differ between different populations based on history, culture, religion and many other factors.

What you’ll have at best is an AI agent that can detect your behavior patterns and show them to you, or to a “human sidekick” who will be able to distinguish which ones can harm you. In itself, this is a pretty interesting and productive use of current recommendation systems. In fact, there are several researchers working on AI that can follow ethics codes and rules as opposed to seeking maximum engagement.

An AI sidekick that can prevent other AI from hacking your brain

Blocking AI algorithms that are taking advantage of your weaknesses will be largely contingent on knowing those weaknesses. So, if you can accomplish goal number two, achieving the third goal will not be very hard.

But we’ll have to specify for our assistant what exactly “hacking your brain” is. For instance, if you watch a single cat video, it doesn’t matter, but if you watch three consecutive videos or spend 30 minutes watching cat videos, then your brain has been hacked.

Therefore, blocking brain hacking attempts by malicious AI algorithms might not be as straightforward as blocking malware threats. But for instance, your AI assistant can warn you that you’ve spent the past 30 minutes doing the same thing. Or better yet, it can warn your human assistant and let them decide whether it’s time to interrupt your current activity.
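To make the threshold idea concrete, here is a minimal sketch of such a rule in Python. It is only an illustration under stated assumptions: the `ViewEvent` record, the category labels and the `check_streak` function are hypothetical names, and the defaults (three items in a row, 30 minutes) simply mirror the example above. Deciding what the right thresholds are for a given person is exactly the part that still needs human judgment.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ViewEvent:
    category: str   # hypothetical label, e.g. "cat_video"
    minutes: float  # time spent on this item


def check_streak(events: List[ViewEvent],
                 max_consecutive: int = 3,
                 max_minutes: float = 30.0) -> Optional[str]:
    """Return a warning if the most recent run of same-category items
    reaches either threshold, otherwise None."""
    if not events:
        return None
    category = events[-1].category
    count = 0
    total = 0.0
    # Walk backwards over the trailing run of same-category events.
    for event in reversed(events):
        if event.category != category:
            break
        count += 1
        total += event.minutes
    if count >= max_consecutive or total >= max_minutes:
        return (f"You have viewed {count} '{category}' items in a row "
                f"({total:.0f} minutes). Time for a break?")
    return None
```

The function only reports; it does not block anything. The returned message could just as well be sent to a trusted human, who then decides whether to step in.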

Also, your AI sidekick can inform you, or your trusted human assistant, that it thinks the reason you’ve been searching and reading reviews for a certain gadget for a certain amount of time might somehow be related to several offline or online ads you’ve seen earlier, or a conversation you might’ve had by the water cooler at work.

This could give you insights into influences you’ve absently picked up and might not be aware of. It can also help in areas where influence and brain hacking don’t involve repeat actions. For instance, if you’re going to buy a certain item for the first time, your AI sidekick can warn you that you’ve been bombarded with ads about that specific item in the past few days and suggest that you rethink before you make the purchase.
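The purchase warning could be sketched roughly as follows. This assumes the assistant already keeps a log of the ads you were shown, stored as (timestamp, item) pairs, which is a big assumption in itself. The function name `ad_exposure_warning`, the three-day window and the threshold of five ads are all illustrative choices, not values from Harari or any real product.

```python
from datetime import datetime, timedelta
from typing import Iterable, Optional, Tuple


def ad_exposure_warning(ad_log: Iterable[Tuple[datetime, str]],
                        item: str,
                        now: datetime,
                        window_days: int = 3,
                        threshold: int = 5) -> Optional[str]:
    """Count ads for `item` seen within the last `window_days` days and
    return a warning if the count reaches `threshold`, otherwise None."""
    cutoff = now - timedelta(days=window_days)
    recent = sum(
        1 for seen_at, advertised in ad_log
        if advertised == item and cutoff <= seen_at <= now
    )
    if recent >= threshold:
        return (f"You have seen {recent} ads for '{item}' in the last "
                f"{window_days} days. Want to rethink this purchase?")
    return None
```

Called just before checkout, a check like this would surface the influence without making the decision for you.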

Your AI sidekick can also give you a detailed report of your behavioral patterns, much like iOS’s new Screen Time feature, which tells you how much time you spent staring at your phone and which apps you used. Likewise, your AI assistant can tell you how different topics are occupying your daily activities.

But making the ultimate decision of which activities to block or allow is something that you or a trusted friend or relative will have to do.
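A per-topic report of this kind is straightforward to sketch once activities have been labeled with a topic, which is the hard part and is assumed here. The `topic_report` function and the (topic, minutes) input format are hypothetical; the sketch has no connection to how Screen Time itself is implemented.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def topic_report(activity: Iterable[Tuple[str, float]]
                 ) -> List[Tuple[str, float, float]]:
    """Sum minutes per topic and return (topic, minutes, percent_of_total)
    rows, sorted from most to least time spent."""
    totals: Dict[str, float] = defaultdict(float)
    for topic, minutes in activity:
        totals[topic] += minutes
    grand_total = sum(totals.values())
    if grand_total == 0:
        return []
    rows = [(topic, minutes, 100.0 * minutes / grand_total)
            for topic, minutes in totals.items()]
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows


if __name__ == "__main__":
    day = [("sports", 30), ("news", 10), ("sports", 20)]
    for topic, minutes, share in topic_report(day):
        print(f"{topic}: {minutes:.0f} min ({share:.0f}%)")
```

The output is a plain summary of where the time went. It says nothing about whether that time was well spent.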

Final verdict

Harari’s idea for an AI sidekick is an interesting one. At its heart, it suggests upending current AI-based recommendation models to protect users against brain hacking. However, as we saw, there are some real hurdles to creating such a sidekick.

First, creating an AI system that can monitor all your activities is costly. And second, protecting the human mind against harm is something that requires human intelligence.

That said, I’m not suggesting that AI can’t help protect you against brain hacking. If we look at it from the augmented intelligence perspective, there might be a middle ground that can both be accessible to everyone and help better equip all of us against AI manipulation.

The idea behind augmented intelligence is that AI agents are meant to complement and enhance human skills and decisions, not to fully automate them and remove humans from the loop. This means that your AI assistant is meant to educate you about your habits and let a human (whether it’s yourself, a sibling, friend or parent) decide what is best for you.

With this in mind, you can create an AI agent that needs less data. You can strip the wearables and smart glasses that will record everything you do offline and limit your AI assistant to monitoring online activities on your mobile devices and computers. It can then give you reports on your habits and behavioral patterns and help you make the best decisions. This will make the AI assistant much more affordable and accessible to a broader audience, even though it might not be able to provide as many insights as it could with access to wearable data. You’ll still have to account for the costs of storage and processing, but the costs will be much lower and probably something that can be covered with a government grant focused on population health.

AI assistants can be a good tool in helping detect brain hacking and harmful online behavior. But they can’t replace human judgement. It’ll be up to you and your loved ones to decide what’s best for you.

Discover more from TechTalks