Salesforce’s recent introduction of an agent testing and builder tool, alongside Jeff Bezos’s new AI venture targeting practical industrial applications, underscores a significant shift toward autonomous systems in the enterprise. Robust testing and evaluation frameworks lay the groundwork for agentic AI, but a pressing challenge remains: teams of agents currently lack the structured practice needed to gain repeated experience. As a pioneer in Machine Teaching, a methodology for training autonomous systems that has been implemented across Fortune 500 companies, I have witnessed the transformative impact of agent practice while building and deploying over 200 autonomous multi-agent systems at Microsoft and now at AMESA for enterprises worldwide.
CEOs investing in AI face a common dilemma: billions flow into pilot projects with little certainty that they will deliver real autonomy. Agents often excel in demonstrations, yet they struggle with the complexities of real-world applications. Consequently, business leaders hesitate to trust AI to operate independently within critical workflows or machinery. There is a growing demand for the next level of AI capability: true enterprise expertise. The question is not just how much knowledge an agent can retain, but whether it has had the opportunity to practice and develop expertise the way human teams do.
Just as human teams hone their skills through repetition, feedback, and well-defined roles, AI agents must also engage in realistic practice environments with structured orchestration. This practice is essential for converting intelligence into reliable and autonomous performance.
Many enterprise leaders maintain the belief that a few major large language model (LLM) companies will eventually create sufficiently advanced models and extensive data sets capable of managing complex enterprise operations entirely through what is termed “Artificial General Intelligence.” However, this perception fails to align with the intricate workings of enterprises.
Critical processes such as supply chain planning or energy optimization do not rely on a single individual with a singular skill set. Consider a basketball team: each player must work on their skills—be it dribbling or shooting—yet each has a distinct role. A center’s responsibilities differ from those of a point guard. Success arises from defined roles, expertise, and responsibilities; AI requires a similar framework.
Even if the perfect model or AGI were achieved, it is likely that agents would still falter in real-world applications due to their lack of exposure to variability, drift, anomalies, or the nuanced signals that humans instinctively navigate. They would not have differentiated their skill sets or learned when to act or pause, nor been subjected to expert feedback loops that refine real judgment.
Machine Teaching provides the necessary structure that contemporary agentic systems require. This methodology guides agents to accurately perceive their environment, master fundamental skills that mimic human operators, learn advanced strategies that reflect expert judgment, and coordinate effectively under the guidance of a supervisory agent that selects the appropriate strategy at the right moment.
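That orchestration pattern can be illustrated with a minimal sketch: specialist skill agents each handle a distinct situation, and a supervisory agent selects which skill to activate based on the observed state. All names and signals here (the temperature threshold, the anomaly flag, the skill labels) are hypothetical illustrations, not AMESA's actual implementation.

```python
# Hypothetical sketch of Machine Teaching-style orchestration:
# specialist skills plus a supervisor that picks the right one.
from dataclasses import dataclass
from typing import Callable, Dict

@dataclass
class State:
    temperature: float   # illustrative process signal from a perception agent
    anomaly: bool        # anomaly flag raised by a perception agent

# Each skill is a policy mapping the current state to an action.
def steady_state_skill(s: State) -> str:
    return "hold setpoints"

def cooldown_skill(s: State) -> str:
    return "reduce feed rate and increase cooling"

def safe_pause_skill(s: State) -> str:
    return "pause and escalate to human operator"

SKILLS: Dict[str, Callable[[State], str]] = {
    "steady": steady_state_skill,
    "cooldown": cooldown_skill,
    "pause": safe_pause_skill,
}

def supervisor(s: State) -> str:
    """Select the appropriate skill for the current situation."""
    if s.anomaly:
        return "pause"          # cautious judgment: anomalies mean pause, not act
    if s.temperature > 350.0:
        return "cooldown"
    return "steady"

def step(s: State) -> str:
    """One control cycle: supervisor picks a skill, the skill acts."""
    skill = SKILLS[supervisor(s)]
    return skill(s)

print(step(State(temperature=340.0, anomaly=False)))  # hold setpoints
print(step(State(temperature=360.0, anomaly=False)))  # reduce feed rate and increase cooling
print(step(State(temperature=340.0, anomaly=True)))   # pause and escalate to human operator
```

In practice, the value of the methodology is that the supervisor's selection logic and each skill's behavior are not hand-coded as above but refined through repeated practice in simulation with expert feedback.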
For instance, in one Fortune 500 company focused on improving its nitrogen manufacturing process, agents practiced within the AMESA Agent Cloud, gaining proficiency through experimentation and feedback. Remarkably, in less than a day, these agent teams surpassed the performance of a custom-built industrial control system, outshining other automation tools and single-agent AI applications.
This achievement led to an estimated $1.2 million in annual efficiency gains and, more critically, instilled confidence in leadership to deploy autonomous systems at scale, as the agents behaved similarly to the company’s best operators.
To drive genuine autonomy in agents, practice must be prioritized. Leaders are encouraged to reshape several key assumptions: first, to shift their focus from models to teams. Daily interactions with systems like ChatGPT or Claude can mislead executives into thinking that large language models represent the path to enterprise autonomy. Instead, autonomy arises from specialized agents executing perception, control, planning, and supervisory roles through diverse technologies.
Second, it is crucial to identify areas where expertise is dwindling and to preserve that knowledge within agents. Many vital operations are reliant on experts nearing retirement; leaders should assess which processes would be most vulnerable if these individuals departed unexpectedly. These areas present ideal opportunities for a Machine Teaching approach, allowing top operators to train agents in secure practice environments, ensuring their expertise is scalable and enduring.
Lastly, organizations should recognize that they already possess the infrastructure necessary for autonomy. Years of investment in sensors, MES and SCADA systems, ERP integrations, and IoT telemetry provide the backbone for digital twins and high-fidelity simulations. Achieving success requires orchestration, structure, and effective utilization of the data foundation already established.
When enterprises allow agents room for practice prior to deployment, numerous positive outcomes emerge. Human teams begin to trust AI and gain a clearer understanding of its limitations. Leaders are better positioned to calculate genuine ROI instead of relying on speculative forecasts. Agents become safer, more consistent, and more aligned with expert judgment, while human teams are enhanced rather than replaced, as AI learns to comprehend their workflows and provide support.
Ultimately, agents cannot perform effectively without experience, and that experience is derived solely from practice. Companies that commit to this perspective will be the ones to escape the cycle of pilot purgatory and realize substantial impact.