AI Marketing

Multimodal AI Support Could Reduce Customer Effort by Addressing Hidden Cognitive Load

Mavenoid’s Shan Lilja warns that while AI reduces company effort in customer support, it raises cognitive load for users, risking abandoned sessions and lost trust.

Sofía Méndez

Published

4 days ago

As artificial intelligence (AI) continues to shape the landscape of customer support, businesses are generally optimistic about its impact. Costs have decreased, coverage has expanded, and chatbots now manage inquiries that previously resulted in lengthy waits for human agents. However, the experience for customers is more complex, raising questions about whether AI has genuinely improved service quality.

Many customers find themselves wading through long generated text, judging whether the information is accurate, and often rephrasing their queries when the AI misinterprets them. As a result, the effort required of users in customer service interactions has increased rather than decreased. Shan Lilja, Co-Founder of Mavenoid, recently highlighted this issue in a discussion with CX Today, stating, “Company effort has gone down, but customer effort has risen.”

This disconnect between organizational efficiency and customer experience points to a critical challenge in AI-assisted support. Lilja suggests that multimodal approaches, which integrate several types of media, could bridge this gap. With the introduction of large language models, the nature of customer effort has changed. Previously, customers were frustrated by poor routing or slow responses; now they bear the mental strain of engaging with AI that projects confidence but may not always provide accurate information.

Lilja describes this cognitive burden as a “hidden tax on every AI interaction, paid by customers.” The added effort may not show up immediately in performance metrics such as containment rates or Customer Satisfaction (CSAT) scores, but it can lead to abandoned sessions and diminished customer confidence. A particularly concerning part of this tax is what Lilja refers to as “AI slop”: low-quality, generic content generated by AI. When such content is inaccurate, it can hinder customer service operations and potentially lead to product damage or safety issues.

Recent incidents illustrate these risks. Earlier this year, Woolworths was compelled to modify its AI chatbot after it mistakenly claimed to have personal family experiences, referencing an “angry mother.” In another case, Air Canada faced compensation claims after its chatbot provided incorrect refund information. Similarly, a customer persuaded DPD’s chatbot to write a derogatory poem about the company. These cases underscore the need for AI that is grounded in more than language alone.

Addressing Structural Issues

Lilja asserts that the fundamental problem with text-only AI is rooted in language’s inherent complexity. He compares language to “a tree of possibilities,” where each sentence can be interpreted in various ways. When an AI selects an incorrect interpretation, the support interaction can quickly derail, placing the onus on the customer to rectify the misunderstanding. In contrast, visual context—such as photographs, real-time visuals, or instructional videos—offers a more constrained framework for interpretation. As Lilja notes, “It’s harder to b******t a human with a false image than with false words.” By grounding AI interactions in visual elements, organizations can reduce the incidence of erroneous guidance.

This approach, which Lilja describes as “visual grounding,” is one of six properties he considers essential for effective multimodal support. The six are enhanced context, reduced ambiguity, cross-modal consistency, state awareness, real-time feedback, and visual grounding itself. Together, these properties target the structural failures of text-only AI, moving beyond simple fixes to address the core challenges.

One particularly troublesome failure mode is delayed feedback, in which customers follow inaccurate instructions for an extended period before the error surfaces. For instance, a customer might diligently follow a chatbot’s directions for cleaning a washing machine’s drain filter, only to discover after several minutes that an earlier step was incorrect. The delay compounds frustration and erodes trust in the AI system. In contrast, real-time visual feedback, such as video guidance that provides instant corrections, can catch such errors as they happen, allowing customers to fix mistakes before they escalate.

Lilja’s comment that “a picture is worth a thousand words” sums up the advantages of multimodal support, especially when weighed against the risks of inaccurate textual instructions. Brands that adopt these approaches could move beyond incremental gains in Net Promoter Scores (NPS) toward dependable support that instills confidence in users. As organizations continue to navigate the complexities of AI in customer service, multimodal strategies may pave the way for a more reliable and effective support experience.

AIPRESSA.COM
