
Anthropic Study Reveals AI Agents Run 45 Minutes Autonomously as Trust Builds



Felix Pinkston Feb 18, 2026 20:03

New Anthropic research shows Claude Code autonomy nearly doubled in 3 months, with experienced users granting more independence while maintaining oversight.


AI agents are working independently for significantly longer periods as users develop trust in their capabilities, according to new research from Anthropic published February 18, 2026. The study, which analyzed millions of human-agent interactions, found that the longest-running Claude Code sessions nearly doubled from under 25 minutes to over 45 minutes between October 2025 and January 2026.

The findings arrive as Anthropic rides a wave of investor confidence, having just closed a $30 billion Series G round that valued the company at $380 billion. That valuation reflects growing enterprise appetite for AI agents—and this research offers the first large-scale empirical look at how humans actually work with them.

Trust Builds Gradually, Not Through Capability Jumps

Perhaps the most striking finding: the increase in autonomous operation time was smooth across model releases. If autonomy were purely about capability improvements, you'd expect sharp jumps when new models dropped. Instead, the steady climb suggests users are gradually extending trust as they gain experience.

The data backs this up. Among new Claude Code users, roughly 20% of sessions use full auto-approve mode. By the time users hit 750 sessions, that number exceeds 40%. But here's the counterintuitive part—experienced users also interrupt Claude more frequently, not less. New users interrupt in about 5% of turns; veterans interrupt in roughly 9%.
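The shift those numbers describe can be illustrated with a small aggregation sketch. The field names and tenure cutoffs below are invented for illustration; they are not Anthropic's actual schema, only one way the auto-approve share and per-turn interrupt rate might be computed from session logs.

```python
from dataclasses import dataclass

# Hypothetical session records; field names are illustrative,
# not Anthropic's actual data schema.
@dataclass
class Session:
    lifetime_sessions: int  # user's session count at this point
    auto_approve: bool      # full auto-approve mode enabled
    turns: int
    interruptions: int      # human-initiated interruptions

def tenure_bucket(n: int) -> str:
    # Cutoffs are assumptions for this sketch.
    if n < 50:
        return "new"
    if n >= 750:
        return "veteran"
    return "intermediate"

def summarize(sessions: list[Session]) -> dict[str, dict[str, float]]:
    """Auto-approve share and per-turn interrupt rate, by tenure."""
    out: dict[str, dict[str, float]] = {}
    for bucket in ("new", "intermediate", "veteran"):
        group = [s for s in sessions if tenure_bucket(s.lifetime_sessions) == bucket]
        if not group:
            continue
        total_turns = sum(s.turns for s in group)
        out[bucket] = {
            "auto_approve_share": sum(s.auto_approve for s in group) / len(group),
            "interrupt_rate": sum(s.interruptions for s in group) / max(1, total_turns),
        }
    return out
```

Run over real logs, the study's pattern would show up as the veteran bucket having both a higher auto-approve share and a higher interrupt rate than the new bucket.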

What's happening? Users aren't abandoning oversight. They're shifting strategy. Rather than approving every action upfront, experienced users let Claude run and step in when something needs correction. It's the difference between micromanaging and monitoring.

Claude Knows When to Ask

The research revealed something unexpected about Claude's own behavior. On complex tasks, the AI stops to ask clarifying questions more than twice as often as humans interrupt it; on the most difficult work, Claude-initiated pauses actually outnumber human-initiated interruptions.

Common reasons Claude stops itself include presenting users with a choice between approaches (35% of pauses), gathering diagnostic information (21%), and clarifying vague requests (13%). Humans, meanwhile, typically interrupt to supply missing technical context (32%) or because Claude was working too slowly or doing more than asked (17%).
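A breakdown like the one above reduces to a simple tally over labeled pause events. The label names and sample counts below are invented to mirror the article's figures; they are not Anthropic's actual annotation scheme.

```python
from collections import Counter

def reason_shares(events: list[str]) -> dict[str, float]:
    """Fraction of pause events per reason label, summing to 1.0."""
    counts = Counter(events)
    total = sum(counts.values())
    return {reason: n / total for reason, n in counts.items()}

# Synthetic sample constructed to match the reported percentages.
pauses = (
    ["present_choice"] * 35
    + ["gather_diagnostics"] * 21
    + ["clarify_request"] * 13
    + ["other"] * 31
)
shares = reason_shares(pauses)
```

With this sample, `shares["present_choice"]` comes out to 0.35, matching the 35% figure.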

This suggests Anthropic's training for uncertainty recognition is working. Claude appears calibrated to its own limitations—though the researchers caution it may not always stop at the right moments.

Software Dominates, But Riskier Domains Emerge

Software engineering accounts for nearly 50% of all agentic tool calls on Anthropic's public API. That concentration makes sense—code is testable, reviewable, and relatively low-stakes if something breaks.

But the researchers found emerging usage in healthcare, finance, and cybersecurity. Most actions remain low-risk and reversible—only 0.8% of observed actions appeared irreversible, like sending customer emails. Still, the highest-risk clusters involved sensitive security operations, financial transactions, and medical records.
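Screening actions for reversibility, as the study does, can be sketched as a rule-based tagger. The action names and the rule set here are invented for illustration; the study's actual classification pipeline is not public.

```python
# Toy reversibility screen. These action labels and the rule set are
# assumptions for this sketch, not Anthropic's real taxonomy.
IRREVERSIBLE = {"send_email", "execute_payment", "delete_production_data"}

def irreversible_fraction(actions: list[str]) -> float:
    """Return the fraction of logged actions tagged as irreversible."""
    if not actions:
        return 0.0
    return sum(a in IRREVERSIBLE for a in actions) / len(actions)

# A synthetic log sized to reproduce the reported ~0.8% figure.
log = ["read_file"] * 992 + ["send_email"] * 8
```

On this synthetic log, `irreversible_fraction(log)` returns 0.008, the 0.8% share the researchers observed.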

The team acknowledges limitations: many high-risk actions may actually be red-team evaluations rather than production deployments. They can't always tell the difference from their vantage point.

What This Means for the Industry

Anthropic's researchers argue against mandating specific oversight patterns like requiring human approval for every action. Their data suggests such requirements would create friction without safety benefits—experienced users naturally develop more efficient monitoring strategies.

Instead, they're calling for better post-deployment monitoring infrastructure across the industry. Pre-deployment testing can't capture how humans actually interact with agents in practice. The patterns they observed—trust building over time, shifting oversight strategies, agents limiting their own autonomy—only emerge in real-world usage.

For enterprises evaluating AI agent deployments, the research offers a concrete benchmark: even power users at the extreme end of the distribution are running Claude autonomously for under an hour at a stretch. The gap between what models can theoretically handle (METR estimates five hours for comparable tasks) and what users actually permit suggests significant headroom remains—and that trust, not capability, may be the binding constraint on adoption.

