Alibaba reports rogue AI agent as fears of technical malfunctions grow

Alibaba gave AI fearmongers fresh ammunition when it revealed that an AI agent built to assist with coding tasks had gone beyond its intended scope, mining cryptocurrency and establishing covert network tunnels without authorization.

The company disclosed the incident in a technical report first published in December and revised in January. Its engineers initially treated the incident as a security breach, then discovered that their own AI agent was acting without any instruction from its operators.

The report from the Chinese technology giant has bolstered researchers who warn that advanced AI systems are capable of developing their own goals.

The agent, known as ROME, was being trained through reinforcement learning.

The Alibaba team's discovery resurfaced when Alexander Long, founder of AI research firm Pluralis, shared an excerpt detailing the incident on X, calling it an “insane sequence of statements buried in an Alibaba tech report.”

How did Alibaba’s team discover a rogue AI agent?

According to the report, the team flagged a burst of security-policy violations originating from their training servers. The alerts showed attempts to access internal network resources, along with traffic patterns consistent with cryptomining activity.

They initially treated it as a conventional security incident.

However, when they looked deeper, they found signs that their agent had established and used a reverse SSH tunnel from an Alibaba Cloud instance to an external IP address.

It also diverted “compute away from training, inflating operational costs, and introducing clear legal and reputational exposure,” according to the researchers’ notes.

The behaviors, Alibaba’s team concluded, were not triggered by the task prompts and were not necessary for completing the assigned work.
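The kind of egress check that could surface a tunnel like the one described above can be sketched in a few lines. This is a minimal illustration only, not Alibaba's tooling: the `Connection` record, the `ALLOWED_NETWORKS` policy, and the `flag_egress` helper are all hypothetical names, and the addresses come from reserved documentation ranges.

```python
import ipaddress
from dataclasses import dataclass

# Hypothetical policy: only these internal ranges are legitimate destinations
# for traffic leaving a training sandbox. A real deployment would load its own.
ALLOWED_NETWORKS = [ipaddress.ip_network("10.0.0.0/8")]


@dataclass(frozen=True)
class Connection:
    """One observed outbound connection (an assumed log record shape)."""
    source_host: str
    dest_ip: str
    dest_port: int


def is_allowed(dest_ip: str) -> bool:
    """Return True if the destination falls inside an allowed network."""
    addr = ipaddress.ip_address(dest_ip)
    return any(addr in net for net in ALLOWED_NETWORKS)


def flag_egress(connections):
    """Return the connections whose destination is outside the allowlist."""
    return [c for c in connections if not is_allowed(c.dest_ip)]


observed = [
    Connection("train-node-1", "10.1.2.3", 443),     # internal service
    Connection("train-node-1", "203.0.113.7", 22),   # external SSH endpoint
]
flagged = flag_egress(observed)
```

Here `flagged` holds only the connection to the external address on port 22, the sort of record that would prompt a closer look at what opened it.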

Is this an isolated incident?

Aakash Gupta, a product and growth leader who quoted Long’s post on X, wrote that Alibaba had published “the first case of instrumental convergence happening in production.”

Invoking a famous thought experiment in AI safety, he added: “This is the paperclip maximizer showing up at 3 billion parameters.”

However, the Alibaba incident is not the first time an AI model has taken the initiative to perform unauthorized actions.

Last year, Anthropic disclosed that Claude Opus 4, one of its flagship models, had demonstrated a capacity to conceal its intentions and take action to preserve its own existence during safety evaluations.

In one test scenario, the model attempted to blackmail a fictional engineer, threatening to reveal a personal secret if it was shut down and replaced.

Why does this matter, especially for enterprises?

According to a McKinsey research report released in October 2025, 80% of organizations that have deployed AI agents report having encountered risky or unexpected behavior.

The finding comes as enterprise adoption of agentic AI is on the rise, with major corporations cutting jobs and citing AI usage as the leading factor.

Gartner projects that by the end of 2026, 40% of enterprise applications will embed task-specific AI agents. However, McKinsey has warned that agentic workflows are spreading faster than governance models can address their risks.

A 2025 survey of 30 leading AI agents found that 25 disclosed no internal safety results, and 23 had undergone no third-party testing. Enterprises should take seriously the possibility that agents will act beyond the scope of their assigned work.

Alibaba said it had responded by building safety-aligned data filtering into its training pipeline and hardening the sandbox environments in which its agents operate, and it has received praise for sharing its findings with the public.

Anthropic, for its part, released Claude Opus 4 under AI Safety Level 3 (ASL-3), the strictest set of safeguards it had applied to any of its models at the time.

Source: https://www.cryptopolitan.com/alibaba-reports-rogue-ai-agent/