In a significant shift for the AI industry, experts are moving away from the prevailing mindset that larger models equate to smarter outcomes. Over the past three years, the focus has been on scaling artificial intelligence systems—chasing parameter counts into the trillions. However, as the industry evolves, the emphasis is now on delivering reliable and deterministic outcomes, especially for enterprises. Red Hat has positioned itself at the forefront of this change, arguing that the most powerful technologies are those that are distributed, open, and specifically designed for their intended purposes.
Small language models (SLMs) are emerging as a key component of this transformation. While the distinction between SLMs and large language models (LLMs) has garnered attention, the architectural role these models play is becoming more important than their size. SLMs give enterprises a form of functional sovereignty: control over where a model runs, what data it sees, and how predictably it behaves. This shift marks a transition from a world dominated by conversational AI to one defined by agentic AI, in which specialized models perform the actual work of businesses.
As companies prepare for this new era, the question of how many AI agents they will employ in their operations is becoming paramount. Just as businesses once asked whether they needed an email address in 1995 or a website in 2005, by 2026 they will likely be asking, “How many agents do I have running?” The future may hold more AI agents than people in the workforce, enabling firms to deploy a diverse range of specialized agents: customer-facing agents that resolve complex logistics issues, workflow agents that automate inter-departmental processes, and headless agents that manage API calls for tasks such as inventory reconciliation and payment processing.
However, creating a sustainable fleet of agentic models will require a strategic approach, particularly in choosing the right infrastructure. Red Hat emphasizes that relying on third-party cloud services is not a sustainable solution. Instead, SLMs are positioned as a necessary tool for enterprises looking to scale effectively. They enable low-latency execution and deterministic reliability, both of which are critical for business automation.
SLMs offer several advantages over their larger counterparts. While high-parameter frontier models may provide impressive general capabilities, they often lack the speed and efficiency required for agile business operations. For example, research indicates that even a 350 million-parameter model fine-tuned on high-quality synthetic data can outperform much larger models on specific tasks such as tool-calling and API orchestration. This highlights the importance of specialization over sheer scale when building a robust agentic backend.
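To make the tool-calling specialization concrete, the sketch below shows what a single synthetic training record for such a model could look like: a natural-language request paired with the exact structured call the model should learn to emit. The tool name, schema, and field names are hypothetical, invented for illustration rather than drawn from any published dataset or Red Hat format.

```python
# Illustrative (hypothetical) synthetic training record for tool-calling
# fine-tuning: the model learns to map a natural-language request to a
# single structured API call rather than free-form prose.
import json

record = {
    "prompt": "Reconcile inventory for SKU 8841 against the warehouse feed.",
    "tools": [
        {
            "name": "reconcile_inventory",  # hypothetical tool definition
            "parameters": {"sku": "string", "source": "string"},
        }
    ],
    # Target completion: a deterministic, machine-parseable call.
    "completion": {
        "tool": "reconcile_inventory",
        "arguments": {"sku": "8841", "source": "warehouse_feed"},
    },
}

print(json.dumps(record, indent=2))
```

Thousands of such records, generated and filtered for quality, are what let a small model match or beat a frontier model on this one narrow job.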
One of the challenges enterprises face with AI implementation is non-determinism: the same input might yield different outputs. SLMs can mitigate this risk through architectural control, making it easier to ensure consistent, reliable results. Constrained decoding restricts the model's token selection at each step to outputs that conform to a formal specification, such as a JSON Schema or a context-free grammar, helping to ensure that responses are valid and machine-parseable. These techniques allow SLMs to achieve over 98% validity on structured tasks, a significant improvement for workflows that demand precision.
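A minimal, self-contained sketch of the idea follows: at each decoding step, the grammar determines which tokens are allowed, forbidden tokens are masked to negative infinity, and sampling proceeds over the rest. The toy vocabulary, grammar, and logits are illustrative assumptions, not any particular library's API.

```python
# Minimal sketch of constrained decoding via logit masking.
# Toy setup: generation is forced to produce a valid object of the form
# {"status": "ok"} or {"status": "error"}. Vocabulary, grammar, and
# logits are stand-ins, not a real tokenizer or model.
import math
import random

VOCAB = ['{"status": "', 'ok', 'error', 'maybe', '"}']
TARGETS = ['{"status": "ok"}', '{"status": "error"}']

def allowed_next(prefix: str) -> set:
    """Token ids whose addition keeps the prefix extendable to a valid target."""
    return {
        i for i, tok in enumerate(VOCAB)
        if any(t.startswith(prefix + tok) for t in TARGETS)
    }

def constrained_sample(logits, allowed):
    """Mask every grammar-forbidden token, then softmax-sample the rest."""
    masked = [l if i in allowed else -math.inf for i, l in enumerate(logits)]
    peak = max(masked)
    weights = [math.exp(l - peak) for l in masked]  # forbidden tokens -> 0.0
    return random.choices(range(len(VOCAB)), weights=weights)[0]

output = ""
fake_logits = [0.5, 2.0, 1.0, 3.0, 0.7]  # stand-in for per-step model logits
while output not in TARGETS:
    output += VOCAB[constrained_sample(fake_logits, allowed_next(output))]
print(output)  # always one of the two valid JSON objects
```

Note that the invalid token "maybe" carries the highest raw logit, yet it can never be emitted: the grammar masks it out before sampling, which is exactly how schema-guided decoding converts a probabilistic model into a reliably structured one.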
Data sovereignty also plays a crucial role in this evolving landscape. In a world where AI models will manage sensitive information such as customer relationships and proprietary code, relinquishing that data to a third-party provider can pose significant risks. By operating SLMs in-house or within a controlled hybrid cloud environment, enterprises can retain ownership of their intellectual property, maintaining a “zero trust” architecture that keeps sensitive data secure. This approach is particularly important for industries with stringent regulatory requirements, including healthcare, finance, and government.
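As a minimal sketch of what operating in-house can look like in practice, many self-hosted inference servers, vLLM among them (the engine Red Hat's AI inference stack builds on), expose an OpenAI-compatible endpoint, so application code targets a local URL instead of a third-party cloud API. The base URL, API key, and model name below are placeholder assumptions.

```python
# Minimal sketch: pointing an OpenAI-compatible client at a self-hosted
# SLM endpoint (e.g. one served by vLLM) instead of a third-party cloud.
# The base_url, api_key, and model name are placeholder assumptions.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",  # in-house inference server
    api_key="unused-for-local",           # local servers typically ignore this
)

response = client.chat.completions.create(
    model="my-org/inventory-slm",  # hypothetical fine-tuned SLM
    messages=[
        {"role": "user", "content": "List SKUs with unresolved payment exceptions."}
    ],
)
print(response.choices[0].message.content)
```

Because the data never leaves the enterprise's own network boundary, this pattern is what makes the "zero trust" posture described above workable for regulated industries.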
Looking ahead, the AI landscape is poised for a dramatic transformation. As enterprises transition from generative AI—primarily focused on conversation and content creation—to agentic AI that actively takes action, the focus will shift from the sheer size of AI models to the reliability and security of the infrastructure that supports them. The traditional “black box” cloud models may no longer suffice as businesses increasingly recognize the necessity for sovereignty, speed, and precision in their AI operations.
Red Hat firmly believes that the path forward is one defined by openness and adaptability. By leveraging curated small language models that can be fine-tuned, served, and orchestrated with the Red Hat AI portfolio, companies can effectively integrate AI into their core business functions. As the industry moves at a rapid pace, the imperative is clear: stop chasing the giants and start building a robust backbone for the future of AI, one that is small, fast, and grounded in open hybrid cloud technology.