When less is more: why small language models deserve a bigger role in enterprise AI

AI has become central to how organisations improve their customer experience and operational performance. While large language models (LLMs) have proved their value across many enterprise use cases, their scale, cost and complexity mean they are not necessarily the right answer to every problem. 

Small language models (SLMs), particularly those trained on proprietary enterprise data, offer a compelling alternative. They enable organisations to build AI solutions that are differentiated while being more sustainable, easier to govern and better aligned with regulatory expectations. Not only that, they are more cost-effective to run and often more accurate for focused tasks, making them a practical way to accelerate AI adoption without overengineering. 

That does not mean SLMs will replace LLMs. The point is to recognise that different models suit different needs. In practice, this often means fine-tuning smaller, domain-specific models for particular functions, workflows or decisions. By embedding domain knowledge directly into the model, organisations can deliver more precise, business-relevant outcomes without sacrificing the flexibility of larger models elsewhere. 
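
To make the idea concrete, here is a minimal sketch of domain fine-tuning using the Hugging Face transformers and datasets libraries. The model checkpoint, the support_tickets.jsonl file and the hyperparameters are placeholders chosen for illustration; in practice the data would come from the organisation's own systems and the settings would be tuned to it.

```python
# Minimal sketch: fine-tune a small open-weight causal LM on proprietary text.
# The checkpoint name, the file "support_tickets.jsonl" and the hyperparameters
# are illustrative assumptions, not recommendations.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

MODEL_NAME = "microsoft/phi-2"  # any small causal LM could stand in here

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
tokenizer.pad_token = tokenizer.eos_token  # this checkpoint has no pad token by default
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)

# One JSON object per line, e.g. {"text": "Customer asked about a delayed refund ..."}
dataset = load_dataset("json", data_files="support_tickets.jsonl", split="train")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset.map(tokenize, batched=True, remove_columns=dataset.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="slm-domain-tuned",
        per_device_train_batch_size=4,
        num_train_epochs=3,
        learning_rate=2e-5,
    ),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
trainer.save_model("slm-domain-tuned")
```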

The many benefits of SLMs 

One of the clearest advantages of SLMs is how well they support privacy-sensitive tasks. Their smaller size and lower compute requirements mean they can be deployed on local infrastructure or private servers, rather than relying on external cloud providers. This reduces the need to move sensitive data outside the organisation, lowering the risk of exposure and giving teams greater control over access and usage. For highly regulated sectors such as healthcare, financial services and government, where confidentiality is essential, SLMs can be a smart alternative to larger, cloud-dependent LLMs. 
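
As a sketch of that deployment pattern, the example below serves a small open-weight model behind a private HTTP endpoint, so prompts containing sensitive records are processed entirely on infrastructure the organisation controls. The model name and route are assumptions made for illustration; any locally hosted SLM would fit the same shape.

```python
# Minimal sketch: serve a small model entirely on local infrastructure so that
# prompts containing sensitive data never leave the organisation's network.
# The model checkpoint and endpoint path are illustrative placeholders.
from fastapi import FastAPI
from pydantic import BaseModel
from transformers import pipeline

app = FastAPI()

# Loaded once at startup; all inference happens in this process, with no
# calls to external API providers.
generator = pipeline("text-generation", model="Qwen/Qwen2.5-1.5B-Instruct")

class Query(BaseModel):
    prompt: str

@app.post("/generate")
def generate(query: Query):
    # The prompt is handled in memory on the private server only.
    result = generator(query.prompt, max_new_tokens=200, do_sample=False)
    return {"completion": result[0]["generated_text"]}
```

Run behind a standard ASGI server such as uvicorn inside the private network, this keeps both the model weights and the data on-premises.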

SLMs also offer a more sustainable approach to AI. As AI workloads grow, large models are placing increasing strain on energy and water resources, with training alone consuming vast amounts of electricity. Smaller, task-specific models provide a far more efficient alternative. Research from UNESCO and UCL shows that SLMs can reduce energy consumption by up to 90% without sacrificing performance, thanks to their lower parameter counts and reduced compute requirements. 

Governance is another area where smaller models stand out. SLMs are easier to audit, monitor and explain, making it simpler for organisations to meet regulatory requirements such as the GDPR in Europe and HIPAA in the US. Because they can be trained for specific tasks, SLMs also allow organisations to embed their own policies and controls directly into model behaviour. 
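
Embedding controls in this sense happens at training time, but lightweight runtime controls are also straightforward to layer around a locally hosted SLM. The sketch below is one illustrative approach, assuming naive regex-based redaction rules and a local audit_log.jsonl file; real policies would be defined by the organisation's compliance team.

```python
# Illustrative sketch of a runtime governance layer around a local SLM:
# redact obvious PII before it reaches the model and record an audit trail.
# The patterns and log location are placeholders for organisation-specific rules.
import json
import re
import time

PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "card": re.compile(r"\b(?:\d[ -]?){13,16}\b"),  # deliberately naive example
}

def apply_policy(text: str) -> str:
    """Redact obvious PII before it reaches the model or the audit log."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[REDACTED {label.upper()}]", text)
    return text

def audited_generate(prompt: str, generate_fn) -> str:
    """Run the model on a redacted prompt and append an audit record."""
    safe_prompt = apply_policy(prompt)
    completion = generate_fn(safe_prompt)
    with open("audit_log.jsonl", "a") as log:
        log.write(json.dumps({
            "ts": time.time(),
            "prompt": safe_prompt,
            "completion": completion,
        }) + "\n")
    return completion
```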

In addition to these clear wins, SLMs bring technical benefits that any organisation can appreciate: lower training and hardware costs, for example, as well as improved accuracy when trained on focused datasets. 

Do all these check marks for SLMs mean we should throw out LLMs? Absolutely not. 

The case for a hybrid approach 

A hybrid, multi-model strategy brings together the strengths of both model types. LLMs remain well suited to complex, open-ended tasks that require broad contextual understanding, while SLMs excel at narrow, clearly defined problems. Used together, they allow organisations to optimise performance, control costs and reduce environmental impact. 

As enterprises scale their AI programmes, these trade-offs are becoming more visible. Sharing proprietary data with third-party LLM providers may feel excessive for simple tasks, while hosting large models internally is costly and can quickly undermine return on investment. At the same time, sustainability commitments are harder to maintain as AI workloads grow. Many organisations are also discovering that some of their most valuable use cases are narrow in scope but critical to the business, making them ill-suited to general-purpose models. 

This is where SLMs add real value. When blended thoughtfully with LLMs, they provide a more focused and efficient way to address these challenges. 

Making SLMs work in practice  

Successfully deploying SLMs requires careful planning across the full AI lifecycle. Access to high-quality, appropriately sized datasets is essential, particularly when tuning models for domain-specific use cases. Strong data and model operations are equally important to ensure models remain accurate, relevant and aligned with changing business needs. 
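
One simple form of those model operations is a recurring evaluation against a held-out, domain-specific test set, with retraining triggered when accuracy drifts. The sketch below assumes a hypothetical domain_eval.jsonl file of labelled examples and an accuracy threshold agreed with the business; both are placeholders.

```python
# Minimal sketch of a recurring model-operations check: score the deployed SLM
# on a held-out domain test set and flag it for retraining when accuracy drifts
# below an agreed threshold. The dataset path and threshold are assumptions.
import json

ACCURACY_THRESHOLD = 0.90  # illustrative value agreed with the workflow owner

def evaluate(model_fn, test_path="domain_eval.jsonl"):
    """model_fn maps an input string to a predicted label string."""
    correct = total = 0
    with open(test_path) as f:
        for line in f:
            example = json.loads(line)  # e.g. {"input": "...", "label": "refund"}
            total += 1
            if model_fn(example["input"]).strip() == example["label"]:
                correct += 1
    return correct / max(total, 1)

def needs_retraining(model_fn):
    accuracy = evaluate(model_fn)
    print(f"Current domain accuracy: {accuracy:.2%}")
    return accuracy < ACCURACY_THRESHOLD
```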

Choosing the right model for each task is also essential. SLMs perform best in focused domains, while LLMs are better suited to broader or more context-rich applications. A hybrid approach allows organisations to match each model type to the problem at hand.  

Effective orchestration is the final piece of the puzzle. Organisations running both SLMs and LLMs need intelligent routing mechanisms that determine how each query should be handled. Deciding whether a request is best served by a specialised SLM or a general-purpose LLM is key to delivering consistent, high-quality AI experiences. 
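
A routing layer can be as simple or as sophisticated as the use case demands. The sketch below uses a deliberately crude keyword heuristic to decide between a domain SLM and a general-purpose LLM; the topic list and the stand-in model functions are assumptions for illustration, and in production a small classifier model typically takes the place of the heuristic.

```python
# Illustrative router: send narrow, in-domain requests to a specialised SLM and
# fall back to a general-purpose LLM for broad or open-ended ones.
# The topic list and the heuristic are placeholders for a real classifier.
from typing import Callable

SLM_TOPICS = ("invoice", "refund", "order status", "warranty")

def route(query: str, slm: Callable[[str], str], llm: Callable[[str], str]) -> str:
    """Return the answer from whichever model the query is routed to."""
    q = query.lower()
    if any(topic in q for topic in SLM_TOPICS) and len(q.split()) < 50:
        return slm(query)   # focused, clearly in-domain request
    return llm(query)       # broad, ambiguous or context-heavy request

# Example usage with stand-in model functions:
print(route(
    "What is the status of order 1042?",
    slm=lambda q: f"[SLM] {q}",
    llm=lambda q: f"[LLM] {q}",
))
```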

Small but mighty 

SLMs offer organisations a practical way to begin scaling enterprise AI. They deliver faster, safer and more cost-efficient performance, while supporting sustainability and responsible AI goals. For business and technology leaders beginning to see the limits of an LLM-only strategy, a hybrid approach that combines the strengths of both model types may prove to be the smarter path forward. 
