Researchers from Google and UC Santa Barbara have introduced a framework for optimizing tool use in large language model (LLM) agents. The techniques, described in a recently published paper, include a lightweight module named “Budget Tracker” and a more comprehensive framework called “Budget Aware Test-time Scaling” (BATS). Both make AI agents aware of their remaining tool-call and compute budgets, so the agents can spend those budgets more efficiently in real-world applications.
As AI agents increasingly rely on tools to perform tasks such as web browsing, managing cost and latency has become crucial. Traditional test-time scaling focuses on extending the model’s thinking time, but for agents the number of tool calls also significantly affects the quality of results. The researchers noted that adding test-time resources without giving the agent a clear sense of its budget often leads to inefficiency. “In a deep research task, if the agent has no sense of budget, it often goes down blindly,” stated co-authors Zifeng Wang and Tengxiao Liu.
The Budget Tracker acts as a plug-in that continuously informs the agent about its resource availability. By implementing this module, the researchers aim to help agents internalize budget constraints, enabling them to adjust their strategies in real time without needing additional training. The simplicity of the Budget Tracker allows for straightforward implementation, providing clear policy guidelines and recommendations based on the remaining budget.
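The idea can be illustrated with a minimal Python sketch. It is not the researchers’ implementation: the class name, the method names, the status-line wording, and the 50% and 20% thresholds are all assumptions made for illustration. The sketch counts tool calls against fixed limits and renders a short status line that could be appended to the agent’s prompt at each step.

```python
from dataclasses import dataclass, field


@dataclass
class BudgetTracker:
    """Hypothetical tracker: counts tool calls against per-tool limits."""

    limits: dict[str, int]                      # e.g. {"search": 4, "browse": 2}
    used: dict[str, int] = field(default_factory=dict)

    def remaining(self, tool: str) -> int:
        """Calls still available for the given tool."""
        return self.limits[tool] - self.used.get(tool, 0)

    def record(self, tool: str) -> None:
        """Register one call; refuse if the tool's budget is exhausted."""
        if self.remaining(tool) <= 0:
            raise RuntimeError(f"budget exhausted for tool '{tool}'")
        self.used[tool] = self.used.get(tool, 0) + 1

    def status(self) -> str:
        """Render a status line to append to the agent's prompt."""
        parts = [
            f"{tool}: {self.remaining(tool)}/{limit} calls left"
            for tool, limit in self.limits.items()
        ]
        total = sum(self.limits.values())
        left = sum(self.remaining(tool) for tool in self.limits)
        fraction = left / total if total else 0.0
        # Thresholds below are illustrative assumptions, not values from the paper.
        if fraction > 0.5:
            hint = "budget is ample; explore broadly"
        elif fraction > 0.2:
            hint = "budget is limited; prioritize the most promising lead"
        else:
            hint = "budget is nearly exhausted; verify and answer"
        return "[Budget] " + "; ".join(parts) + ". Guidance: " + hint


tracker = BudgetTracker({"search": 4, "browse": 2})
tracker.record("search")
print(tracker.status())
```

Because the status line is plain text injected into the prompt, a tracker of this kind needs no change to the model itself, which matches the article’s point that no additional training is required.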
To validate the Budget Tracker, the researchers ran experiments with several AI models, including Gemini 2.5 Pro and Claude Sonnet 4. The findings indicated that the module reduces operational costs while maintaining comparable accuracy. “Adding Budget Tracker achieves comparable accuracy using 40.4% fewer search calls, 19.9% fewer browse calls, and reducing overall cost by 31.3%,” the researchers explained.
In addition to the Budget Tracker, the BATS framework offers a more sophisticated method for managing budgets and maximizing performance. This comprehensive approach maintains a continuous signal of remaining resources, allowing for dynamic adjustments in agent behavior. BATS employs a planning module to create structured action plans, while a verification module assesses whether to explore further or pivot based on resource availability.
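The plan-then-verify loop described above might look roughly like the following Python sketch. This is a simplified illustration under stated assumptions, not the BATS algorithm itself: the function name, the three callables, the “answer” / “pivot” / “continue” decision labels, and the choice to return the latest piece of evidence as the answer are all hypothetical.

```python
from typing import Callable, Optional


def run_budget_aware(
    make_plan: Callable[[str, int], list[str]],
    run_step: Callable[[str], str],
    verify: Callable[[str, list[str], int], str],
    question: str,
    budget: int,
) -> tuple[Optional[str], int]:
    """Hypothetical loop: plan, act, then verify with the remaining budget.

    make_plan(question, remaining) returns an ordered list of steps.
    run_step(step) performs one tool call and returns its result.
    verify(question, evidence, remaining) returns "answer", "pivot",
    or "continue".
    """
    remaining = budget
    evidence: list[str] = []
    plan = make_plan(question, remaining)

    while remaining > 0 and plan:
        step = plan.pop(0)
        evidence.append(run_step(step))
        remaining -= 1  # every tool call consumes one unit of budget

        decision = verify(question, evidence, remaining)
        if decision == "answer":
            # Simplification: treat the latest evidence as the answer.
            return evidence[-1], remaining
        if decision == "pivot":
            # Re-plan, telling the planner how much budget is left.
            plan = make_plan(question, remaining)

    # Budget or plan exhausted: fall back to the latest evidence, if any.
    return (evidence[-1] if evidence else None), remaining
```

The point of the sketch is that both the planner and the verifier receive the remaining budget as an input, so the decision to dig deeper or pivot can depend on how many tool calls are left.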
BATS was tested on multiple benchmarks, including BrowseComp and HLE-Search, where it outperformed traditional methods such as ReAct. On BrowseComp, for example, the model reached 24.6% accuracy with BATS, compared with 12.6% using standard ReAct. The framework also yields better cost-performance trade-offs: BATS achieved higher accuracy at a cost of approximately 23 cents, compared with more than 50 cents for a parallel scaling baseline.
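To put those figures in relative terms, the short Python calculation below uses only the numbers quoted above. The helper name is an assumption, and the two comparisons use different baselines (ReAct for accuracy, parallel scaling for cost). Because the baseline cost is reported as “over 50 cents,” the cost result should be read as a lower bound on the savings.

```python
def relative_change(new: float, old: float) -> float:
    """Return the change from old to new as a fraction of old."""
    return (new - old) / old


# BrowseComp accuracy: BATS vs. standard ReAct (percentage points in, fraction out).
accuracy_gain = relative_change(24.6, 12.6)

# Cost in cents: BATS vs. the parallel scaling baseline (taken at 50 cents).
cost_change = relative_change(23.0, 50.0)

print(f"Relative accuracy gain: {accuracy_gain:.1%}")   # 95.2%
print(f"Relative cost change:   {cost_change:.1%}")     # -54.0%
```

In other words, the reported accuracy nearly doubles relative to ReAct, and the reported cost is at least 54% lower than the parallel scaling baseline.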
This efficiency unlocks new possibilities for enterprises looking to deploy AI agents capable of self-managing resources. As industries increasingly adopt AI technologies, balancing accuracy and cost will become paramount. Wang and Liu emphasized that “the relationship between reasoning and economics will become inseparable,” indicating that future models must integrate value considerations into their decision-making processes.
With these advancements, the researchers believe they have opened the door for a range of applications, from complex codebase maintenance to compliance audits and document analysis. As enterprises navigate the evolving landscape of AI, the integration of budget-aware scaling techniques will likely play a significant role in shaping the future of intelligent agents.