Together AI Upgrades GPU Clusters With Autoscaling and Self-Healing Features

2026/03/11 01:34

Lawrence Jengar Mar 10, 2026 17:34

Together AI adds enterprise-grade autoscaling, RBAC, observability dashboards, and self-healing node repair to GPU Clusters as company pursues $1B funding round.

Together AI has rolled out a significant infrastructure upgrade to its GPU Clusters platform, adding autoscaling, role-based access control (RBAC), full-stack observability, and self-healing node repair. The enhancements arrive as the AI cloud company pursues $1 billion in fresh funding, according to reports from earlier this month.

The timing isn't coincidental. Enterprise customers running distributed training workloads across hundreds of GPUs need more than raw compute—they need infrastructure that doesn't require babysitting.

Autoscaling Targets GPU Waste

The new autoscaling feature, powered by the Kubernetes Cluster Autoscaler, monitors for GPU-constrained workloads and automatically provisions or decommissions nodes based on real-time demand. For teams running variable inference workloads or bursty training jobs, this means no more paying for idle hardware during quiet periods.
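
The scale-up and scale-down decision can be pictured with a minimal sketch. It is a simplified model, not Together's implementation and not the Kubernetes Cluster Autoscaler's actual scheduling simulation; the `ClusterState` fields, the `desired_nodes` function, and all numbers are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class ClusterState:
    gpus_per_node: int  # GPUs on each worker node (assumed uniform)
    nodes: int          # nodes currently provisioned
    gpus_in_use: int    # GPUs requested by running workloads
    gpus_pending: int   # GPUs requested by workloads that cannot be scheduled


def desired_nodes(state: ClusterState, min_nodes: int, max_nodes: int) -> int:
    """Return the node count that fits all GPU requests, clamped to the bounds."""
    demand = state.gpus_in_use + state.gpus_pending
    needed = math.ceil(demand / state.gpus_per_node)
    return max(min_nodes, min(max_nodes, needed))


# Hypothetical snapshots: a burst with unschedulable work, then a quiet period.
busy = ClusterState(gpus_per_node=8, nodes=2, gpus_in_use=16, gpus_pending=12)
quiet = ClusterState(gpus_per_node=8, nodes=4, gpus_in_use=5, gpus_pending=0)

print(desired_nodes(busy, min_nodes=1, max_nodes=8))   # more than 2: scale up
print(desired_nodes(quiet, min_nodes=1, max_nodes=8))  # fewer than 4: scale down
```

The minimum and maximum bounds matter in practice: the floor keeps a warm node available, and the ceiling caps spend during a runaway burst.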

Static GPU provisioning has been a persistent pain point. Organizations either overprovision (expensive) or underprovision (performance bottlenecks during demand spikes). Together's approach lets clusters expand during peak load and contract when demand subsides.

Self-Healing Addresses Hardware Reality

GPU hardware fails. In large fleets, it's not a question of if but when. For distributed training, a single unstable node can invalidate hours of compute time.

Together's solution: self-serve health checks that users can trigger before launching major training jobs. Tests range from basic DCGM diagnostics to multi-node NCCL and InfiniBand bandwidth tests. When a node does fail, a three-click self-repair process automatically cordons, drains, and recreates the node—bringing clusters back to healthy status within minutes rather than hours.
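
The cordon, drain, and recreate sequence can be sketched against an in-memory model. This is an illustration of the order of operations only; the `Node` class, the `repair` function, and the node and pod names are hypothetical, and a real repair would go through the platform's control plane rather than mutate objects directly.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    name: str
    healthy: bool = True
    schedulable: bool = True
    pods: List[str] = field(default_factory=list)


def repair(node: Node, spare: Node) -> List[str]:
    """Cordon, drain, and recreate a failed node; return the steps taken."""
    steps = []

    # Cordon: stop the scheduler from placing new work on the node.
    node.schedulable = False
    steps.append("cordon")

    # Drain: evict existing workloads (modeled here as a move to a spare node).
    spare.pods.extend(node.pods)
    node.pods.clear()
    steps.append("drain")

    # Recreate: stand-in for reprovisioning; the node returns healthy and empty.
    node.healthy = True
    node.schedulable = True
    steps.append("recreate")
    return steps


bad = Node("gpu-node-3", healthy=False, pods=["train-0", "train-1"])
spare = Node("gpu-node-4")
steps = repair(bad, spare)
print(steps, spare.pods)
```

Cordoning before draining is the important ordering: it prevents evicted work from being rescheduled back onto the node that is about to be replaced.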

Acceptance tests now run automatically during provisioning. Clusters won't be marked ready until they pass.
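
A readiness gate of this kind reduces to "every check must pass." The sketch below assumes the acceptance suite resembles the health checks named above; the article does not list the actual acceptance tests, so the check names and the `cluster_ready` function are hypothetical.

```python
from typing import Callable, Dict


def cluster_ready(checks: Dict[str, Callable[[], bool]]) -> bool:
    """Return True only if every acceptance check passes."""
    failures = [name for name, check in checks.items() if not check()]
    for name in failures:
        print(f"acceptance check failed: {name}")
    return not failures


# Hypothetical check results; real checks would run diagnostics on the nodes.
passing = {
    "dcgm_diagnostics": lambda: True,
    "nccl_multi_node": lambda: True,
    "infiniband_bandwidth": lambda: True,
}
one_bad_link = dict(passing, infiniband_bandwidth=lambda: False)

print(cluster_ready(passing))
print(cluster_ready(one_bad_link))
```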

Enterprise Access Controls

The RBAC implementation introduces "Projects" as isolation boundaries for teams. Two default roles split responsibilities cleanly: Admins get full control plane access for cluster creation and deletion, while Members can access GPU worker nodes and run workloads without touching infrastructure provisioning.
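
The two-role split can be modeled as a permission lookup scoped to a project. This is a minimal sketch, not Together's API: the action names, the `is_allowed` function, and the assumption that Admins can also do everything Members can are all illustrative.

```python
from enum import Enum
from typing import Dict, Tuple


class Role(Enum):
    ADMIN = "admin"
    MEMBER = "member"


# Assumed action names; Admin is modeled as a superset of Member.
MEMBER_ACTIONS = {"access_worker_nodes", "run_workload"}
PERMISSIONS = {
    Role.ADMIN: MEMBER_ACTIONS | {"create_cluster", "delete_cluster"},
    Role.MEMBER: MEMBER_ACTIONS,
}


def is_allowed(
    memberships: Dict[Tuple[str, str], Role],
    user: str,
    project: str,
    action: str,
) -> bool:
    """Check an action against the user's role in one project."""
    role = memberships.get((user, project))
    return role is not None and action in PERMISSIONS[role]


# Hypothetical users and projects.
memberships = {
    ("platform-eng", "vision"): Role.ADMIN,
    ("researcher", "vision"): Role.MEMBER,
}

print(is_allowed(memberships, "researcher", "vision", "run_workload"))
print(is_allowed(memberships, "researcher", "vision", "delete_cluster"))
print(is_allowed(memberships, "researcher", "speech", "run_workload"))
```

Keying memberships on the (user, project) pair is what makes a Project an isolation boundary: a role in one project grants nothing in another.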

This matters for organizations where platform engineers need to lock down infrastructure while giving ML researchers freedom to experiment.

Observability Gets Native

Every GPU Cluster project now includes a dedicated Grafana instance with pre-built dashboards. Telemetry covers GPU utilization via DCGM metrics, InfiniBand and NIC-level networking data, storage I/O performance, and Kubernetes orchestration health. The feature is currently in private preview.
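
To give a sense of what the GPU utilization telemetry looks like underneath a dashboard, the sketch below parses Prometheus-style text as emitted by NVIDIA's dcgm-exporter and flags idle GPUs. The article does not say how metrics reach Grafana, so the exposition format is an assumption; the sample values, host names, threshold, and the `idle_gpus` function are hypothetical.

```python
import re

# Hypothetical scrape output; DCGM_FI_DEV_GPU_UTIL is the dcgm-exporter
# utilization gauge, reported as a percentage per GPU.
SAMPLE = """# HELP DCGM_FI_DEV_GPU_UTIL GPU utilization (in %).
# TYPE DCGM_FI_DEV_GPU_UTIL gauge
DCGM_FI_DEV_GPU_UTIL{gpu="0",Hostname="node-a"} 97
DCGM_FI_DEV_GPU_UTIL{gpu="1",Hostname="node-a"} 3
DCGM_FI_DEV_GPU_UTIL{gpu="0",Hostname="node-b"} 88
"""

LINE = re.compile(r'^DCGM_FI_DEV_GPU_UTIL\{(?P<labels>[^}]*)\}\s+(?P<value>\S+)$')
LABEL = re.compile(r'(\w+)="([^"]*)"')


def idle_gpus(text, threshold=10.0):
    """Return (hostname, gpu index) pairs whose utilization is below threshold."""
    idle = []
    for line in text.splitlines():
        match = LINE.match(line)
        if not match:
            continue  # skip comments and other metrics
        labels = dict(LABEL.findall(match.group("labels")))
        if float(match.group("value")) < threshold:
            idle.append((labels.get("Hostname"), labels.get("gpu")))
    return idle


print(idle_gpus(SAMPLE))
```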

Market Context

Together AI has been building momentum in the GPU-as-a-service space. The company introduced Instant GPU Clusters at NVIDIA GTC in March 2025 and launched self-service GPU infrastructure in September 2025. The platform supports NVIDIA Hopper (H100) and Blackwell (B200) GPUs, with Instant Clusters scaling up to 64 GPUs and Dedicated Clusters reaching 1,000 GPUs.

With a reported $7.5 billion valuation and a potential billion-dollar funding round in progress, Together is positioning itself as a serious alternative to hyperscaler GPU offerings, targeting teams that want bare-metal performance without the operational overhead of managing their own hardware.

The new features are available immediately to existing Together GPU Clusters customers, except for the observability dashboards, which remain in private preview.
