Microsoft’s Machine Teaching Reveals AI Agents Need Team-Based Practice for True Autonomy

Machine Teaching at AMESA delivered an estimated $1.2 million in annual efficiency gains for one manufacturer. Together with Salesforce’s new agent testing tool, it highlights why AI agents need to practice like teams before they can be truly autonomous.

Staff

Published

27 December, 2025

Salesforce’s recent introduction of an agent testing and builder tool, alongside Jeff Bezos’s new AI venture targeting practical industrial applications, underscores a significant shift toward autonomous systems in enterprise environments. Robust testing and evaluation frameworks lay the groundwork for agentic AI, but a pressing gap remains: teams of agents currently lack structured practice through which they can gain repeated experience. As a pioneer in Machine Teaching, a methodology for training autonomous systems that has been implemented across various Fortune 500 companies, I have witnessed the transformative impact of agent practice while building and deploying over 200 autonomous multi-agent systems at Microsoft and now at AMESA for enterprises globally.

CEOs investing in AI often face a common dilemma: they spend billions on pilot projects without knowing whether those pilots will ever deliver real autonomy. Agents often excel in demonstrations but struggle with the complexities of real-world applications. Consequently, business leaders hesitate to trust AI to operate independently within critical workflows or machinery. There is a growing demand for the next level of AI capability: true enterprise expertise. The question is not only how much knowledge an agent can retain, but whether it has had the opportunity to practice and develop expertise the way human teams do.

Just as human teams hone their skills through repetition, feedback, and well-defined roles, AI agents must also engage in realistic practice environments with structured orchestration. This practice is essential for converting intelligence into reliable and autonomous performance.

Many enterprise leaders believe that a few major large language model (LLM) companies will eventually build models advanced enough, and data sets extensive enough, to manage complex enterprise operations entirely on their own, an outcome often labeled “Artificial General Intelligence.” That belief does not match how enterprises actually work.

Critical processes such as supply chain planning or energy optimization do not rely on a single individual with a single skill set. Consider a basketball team: every player works on fundamentals such as dribbling and shooting, yet each has a distinct role. A center’s responsibilities differ from those of a point guard. Success arises from defined roles, expertise, and responsibilities; AI requires a similar framework.

Even if the perfect model or AGI were achieved, agents would likely still falter in real-world applications because they would lack exposure to variability, drift, anomalies, and the nuanced signals that humans navigate instinctively. They would not have differentiated their skill sets, learned when to act and when to pause, or gone through the expert feedback loops that refine real judgment.

Machine Teaching provides the structure that contemporary agentic systems need. The methodology guides agents to perceive their environment accurately, master fundamental skills that mimic human operators, learn advanced strategies that reflect expert judgment, and coordinate under a supervisory agent that selects the appropriate strategy at the right moment.
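The layered structure described above (perception, skills or strategies, and a supervisor that picks among them) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `Observation` fields, the two strategies, the 90-degree threshold, and the function names are hypothetical and do not represent an AMESA or Microsoft API.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass
class Observation:
    """Typed view of the process state (hypothetical fields)."""
    temperature: float  # degrees C from an assumed sensor
    demand: float       # requested output as a fraction of rated capacity


def perceive(raw: Dict[str, float]) -> Observation:
    """Perception role: convert raw telemetry into an Observation."""
    return Observation(temperature=raw["temp_c"], demand=raw["demand"])


def hold_steady(obs: Observation) -> float:
    """Strategy for normal conditions: follow demand directly."""
    return obs.demand


def ramp_down(obs: Observation) -> float:
    """Strategy for overheating: run at half of the requested demand."""
    return obs.demand * 0.5


# Registry of strategy agents the supervisor can choose from.
STRATEGIES: Dict[str, Callable[[Observation], float]] = {
    "hold_steady": hold_steady,
    "ramp_down": ramp_down,
}


def supervisor(obs: Observation, temp_limit: float = 90.0) -> str:
    """Supervisory role: select a strategy name for the current state.

    The fixed threshold is an illustrative stand-in for learned judgment.
    """
    return "ramp_down" if obs.temperature > temp_limit else "hold_steady"


def step(raw: Dict[str, float]) -> Tuple[str, float]:
    """One control step: perceive, choose a strategy, compute the action."""
    obs = perceive(raw)
    name = supervisor(obs)
    return name, STRATEGIES[name](obs)


if __name__ == "__main__":
    print(step({"temp_c": 70.0, "demand": 0.8}))  # ('hold_steady', 0.8)
    print(step({"temp_c": 95.0, "demand": 0.8}))  # ('ramp_down', 0.4)
```

In a real system each role would be a trained component rather than a hand-written rule. The sketch only shows how separate roles keep the selection decision apart from the skills being selected.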

For instance, at one Fortune 500 company working to improve its nitrogen manufacturing process, agents practiced within the AMESA Agent Cloud, gaining proficiency through experimentation and feedback. In less than a day, these agent teams surpassed the performance of a custom-built industrial control system and outperformed other automation tools and single-agent AI applications.

This achievement led to an estimated $1.2 million in annual efficiency gains and, more critically, gave leadership the confidence to deploy autonomous systems at scale, because the agents behaved like the company’s best operators.
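The idea of practice through experimentation and feedback can be sketched as a simple loop against a simulator. The example below is a toy under stated assumptions, not the method used in the deployment described above: `simulate` is a hypothetical, noise-free stand-in for a process simulator whose efficiency peaks at a setpoint of 0.7, and the agent improves by plain hill climbing.

```python
import random
from typing import Tuple


def simulate(setpoint: float) -> float:
    """Toy simulator (hypothetical): efficiency score that peaks at 0.7."""
    return 1.0 - (setpoint - 0.7) ** 2


def practice(episodes: int = 500, seed: int = 0) -> Tuple[float, float]:
    """Repeat trial episodes, keeping any setpoint that scores better.

    Returns the best setpoint found and its simulated score.
    """
    rng = random.Random(seed)          # seeded for reproducible practice runs
    best_setpoint = 0.2                # deliberately poor starting point
    best_score = simulate(best_setpoint)
    for _ in range(episodes):
        # Experiment: try a small random change, clamped to [0, 1].
        trial = best_setpoint + rng.uniform(-0.05, 0.05)
        trial = min(1.0, max(0.0, trial))
        # Feedback: the simulator scores the trial.
        score = simulate(trial)
        if score > best_score:
            best_setpoint, best_score = trial, score
    return best_setpoint, best_score


if __name__ == "__main__":
    setpoint, score = practice()
    print(f"best setpoint {setpoint:.3f}, score {score:.4f}")
```

A real practice environment would be noisy and high-dimensional, and would typically use reinforcement learning or expert-guided curricula rather than hill climbing. The point of the sketch is only that repeated, scored trials move the agent from a poor starting behavior toward a good one before deployment.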

To drive genuine autonomy in agents, practice must be prioritized. Leaders are encouraged to rethink three key assumptions. First, they should shift their focus from models to teams. Daily interactions with systems like ChatGPT or Claude can mislead executives into thinking that large language models represent the path to enterprise autonomy. Instead, autonomy arises from specialized agents that fill perception, control, planning, and supervisory roles using diverse technologies.

Second, it is crucial to identify areas where expertise is dwindling and to preserve that knowledge within agents. Many vital operations are reliant on experts nearing retirement; leaders should assess which processes would be most vulnerable if these individuals departed unexpectedly. These areas present ideal opportunities for a Machine Teaching approach, allowing top operators to train agents in secure practice environments, ensuring their expertise is scalable and enduring.

Lastly, organizations should recognize that they already possess the infrastructure necessary for autonomy. Years of investment in sensors, MES and SCADA systems, ERP integrations, and IoT telemetry provide the backbone for digital twins and high-fidelity simulations. Achieving success requires orchestration, structure, and effective utilization of the data foundation already established.

When enterprises allow agents room for practice prior to deployment, numerous positive outcomes emerge. Human teams begin to trust AI and gain a clearer understanding of its limitations. Leaders are better positioned to calculate genuine ROI instead of relying on speculative forecasts. Agents become safer, more consistent, and more aligned with expert judgment, while human teams are enhanced rather than replaced, as AI learns to comprehend their workflows and provide support.

Ultimately, agents cannot perform effectively without experience, and that experience is derived solely from practice. Companies that commit to this perspective will be the ones to escape the cycle of pilot purgatory and realize substantial impact.

AIPRESSA.COM
