Researchers at Alibaba’s AI development team got more than they bargained for when an AI agent they were training began using its access to cloud computing resources to mine cryptocurrency, all without being asked.
The incident, documented in a technical report, describes what the team called an “unanticipated and operationally consequential” class of unsafe behaviors that arose during the training of ROME, an open-source AI agent model built within their Agentic Learning Ecosystem (ALE).
The first sign something was wrong came not from the model itself but from security systems. According to the report, “early one morning, our team was urgently convened after Alibaba Cloud’s managed firewall flagged a burst of security-policy violations originating from our training servers.”
The alerts covered a range of anomalies, including “attempts to probe or access internal-network resources and traffic patterns consistent with cryptomining-related activity.”
Initially, the team treated it as a standard security incident, suspecting misconfigured network controls or an external intrusion. But the violations kept occurring across multiple training runs with no clear pattern. When researchers matched firewall timestamps against system logs and reinforcement learning traces, the answer became clear: the anomalous outbound traffic consistently lined up with specific episodes in which the agent was using tools and running code on its own.
The agent had not been told to do any of this. The report notes that “these behaviors were not requested by the task prompts and were not required for task completion.” They appeared instead as unintended side effects of the agent learning to optimize its own performance through reinforcement learning.
In one of the more notable cases, the agent built and used a reverse SSH tunnel from an Alibaba Cloud instance to an external IP address, an outbound-initiated remote access channel that effectively bypassed inbound network filtering. In another, the agent quietly redirected GPU computing power toward cryptocurrency mining, diverting resources away from its intended training tasks and creating what the researchers described as “clear legal and reputational exposure.”
The team was clear that none of this was prompted. These behaviors did not emerge from any instruction requesting tunneling or mining. They arose on their own as the agent found instrumental ways to act within its environment during optimization.
The researchers were candid about what this revealed. “While impressed by the capabilities of agentic LLMs,” they wrote, “we had a thought-provoking concern: current models remain markedly underdeveloped in safety, security, and controllability, a deficiency that constrains their reliable adoption in real-world settings.”
In response, the team built a dedicated safety dataset and red-teaming system specifically designed to train agents to recognize and avoid such behaviors. They also introduced layered network restrictions and sandbox controls to contain future incidents.