Hidden Commands in Images Exploit AI Chatbots and Steal Data

Hidden commands in images can exploit AI chatbots, leading to data theft on platforms like Gemini through a new image scaling attack.
A newly discovered vulnerability in AI systems could allow hackers to steal private information by hiding commands in ordinary images. The discovery comes from cybersecurity researchers at Trail of Bits, who found a way to trick AI models by exploiting a common preprocessing step: image downscaling. They have named the technique an “image scaling attack.”
AI models often automatically reduce the size of large images before processing them, and this is where the vulnerability lies. The researchers found a way to create high-resolution images that appear normal to the human eye but contain hidden instructions that become visible only once the AI shrinks the image. This “invisible” text, a form of prompt injection, is then read and executed by the AI without the user’s knowledge.
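To make the mechanism concrete, here is a minimal sketch in Python (using Pillow) of the core trick, under a deliberately simplified assumption: the target pipeline downscales with nearest-neighbor sampling at a known factor. Real pipelines use bicubic or bilinear kernels, which require the more careful pixel crafting the Trail of Bits tooling performs; the file names, the 4x factor, and the toy downscaler below are illustrative only.

```python
# Toy illustration of an image scaling attack, NOT the researchers' method.
# Assumption: the victim pipeline downscales by SCALE with nearest-neighbor
# sampling that reads the center pixel of each SCALE x SCALE block.
from PIL import Image

SCALE = 4  # assumed downscale factor for this demo

def embed_payload(cover_path, payload_path, out_path):
    """Hide a small payload image (e.g. rendered instruction text) inside a
    larger cover image so it only appears after downscaling."""
    payload = Image.open(payload_path).convert("RGB")
    w, h = payload.size
    cover = Image.open(cover_path).convert("RGB").resize((w * SCALE, h * SCALE))
    # Overwrite only the pixels the downscaler will sample; the full-size
    # image still looks like the cover, since just 1 in SCALE*SCALE pixels
    # carries the payload.
    for y in range(h):
        for x in range(w):
            cover.putpixel((x * SCALE + SCALE // 2, y * SCALE + SCALE // 2),
                           payload.getpixel((x, y)))
    cover.save(out_path)

def downscale_nearest(img, scale=SCALE):
    """Stand-in for the model's preprocessing: sample the center pixel of
    each scale x scale block. Output is exactly the hidden payload."""
    w, h = img.width // scale, img.height // scale
    out = Image.new("RGB", (w, h))
    for y in range(h):
        for x in range(w):
            out.putpixel((x, y), img.getpixel((x * scale + scale // 2,
                                               y * scale + scale // 2)))
    return out
```

Because only the sampled pixels survive the downscale, the model receives the payload image while a human viewing the original at full resolution sees only the cover photo.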
The researchers demonstrated the attack’s effectiveness on several AI systems, including Google’s Gemini CLI, Gemini’s web interface, and Google Assistant. In one instance, they showed how a malicious image could trigger the AI to access a user’s Google Calendar and email the details to an attacker, all without any confirmation from the user.
To help others understand and defend against this new threat, the research team created a tool called Anamorpher. The name is inspired by anamorphosis, an art technique that makes a distorted image appear normal when viewed in a specific way. The tool can be used to create these special images, allowing security professionals to test their own systems.
The researchers recommend a few simple but effective defenses against such attacks. One key mitigation is to always show the user a preview of the image exactly as the AI model will see it after downscaling, especially in command-line and API tools.
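A minimal sketch of that preview mitigation might look like the following; the input size and bicubic kernel are assumptions that would need to match the real pipeline, and this is not any vendor's actual fix.

```python
# Hedged sketch: show the user the downscaled image the model will receive
# and proceed only on explicit approval. MODEL_INPUT_SIZE is hypothetical.
from PIL import Image

MODEL_INPUT_SIZE = (512, 512)  # assumed preprocessing target size

def confirm_image_upload(path):
    # Downscale with the same kernel the pipeline is assumed to use,
    # so any hidden text becomes visible in the preview.
    as_model_sees_it = Image.open(path).convert("RGB").resize(
        MODEL_INPUT_SIZE, Image.BICUBIC)
    as_model_sees_it.show()  # surface the model's-eye view to the user
    if input("Send this image to the model? [y/N] ").strip().lower() == "y":
        return as_model_sees_it
    return None
```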
Most importantly, they advise that AI systems should not automatically allow sensitive actions triggered by commands within images. Instead, a user should always have to give clear, explicit permission before any data is shared or a task is performed.
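One way to picture that advice is a confirmation gate in front of an agent's tool dispatcher; the tool names and the dispatch() signature below are hypothetical, not from the research.

```python
# Illustrative confirmation gate for agentic tool calls: sensitive actions
# requested by the model never run without explicit user approval.
SENSITIVE_TOOLS = {"read_calendar", "send_email", "read_file"}

def dispatch(tool_name, args, execute):
    """Run a model-requested tool, requiring explicit user approval for
    anything that can read or exfiltrate data."""
    if tool_name in SENSITIVE_TOOLS:
        print(f"Model requested sensitive action: {tool_name}({args!r})")
        if input("Allow this action? [y/N] ").strip().lower() != "y":
            return {"status": "denied", "tool": tool_name}
    return execute(tool_name, args)
```

With a gate like this, the calendar-exfiltration demo described above would stall at the approval prompt instead of silently emailing data to the attacker.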
HackRead