AI Can Read the Room Better Than You Think

Every business says it listens to its customers. But when those voices come as tens of thousands of scattered online reviews, social posts, and survey comments, listening can become nearly impossible.

A new study by Kamel Jedidi, the Jerome A. Chazen Professor of Global Business at Columbia Business School, offers a breakthrough: using large language models not just to read customer reviews but to understand them. The study is a collaboration among Jedidi, Cornell University Professor Khaled Boughanmi, and University of Waterloo PhD candidate Nour Jedidi.

Their approach transforms unstructured feedback into a detailed map of what people talk about, how they feel about it, and which specific actions companies should take. The result is a tool that turns messy human language into clear, data-driven strategy.

How the Research Was Done

The researchers analyzed about 20,000 Yelp reviews of Starbucks stores across 722 locations in the U.S. and Canada, posted between 2005 and 2022. They designed a three-step LLM-based pipeline to extract insights.

First, the model scanned reviews to generate candidate lists of attributes (broad categories such as customer service or coffee quality) and features (specific, actionable aspects such as order accuracy or staff friendliness). Next, it analyzed each review sentence by sentence, tagging which attributes and features were mentioned and rating sentiment on a five-point scale. Finally, it connected those sentiment scores to store ratings, identifying which features most strongly predicted satisfaction.
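
The second step can be pictured with a minimal Python sketch. It is an illustration only, not the researchers' code: the feature list, the cue words, and the tag_sentence function are all hypothetical, and a simple keyword rule stands in for the LLM call so the example runs on its own. What it shows is the shape of the process: split each review into sentences, tag features and a 1-to-5 sentiment score per sentence, then average the scores by store and feature.

```python
import re
from collections import defaultdict
from statistics import mean

# Hypothetical output of step 1: features and cue words that signal them.
# These names are invented for illustration, not taken from the study.
FEATURES = {
    "staff friendliness": ["friendly", "rude", "barista"],
    "order accuracy": ["wrong", "correct", "mistake"],
    "wait time": ["wait", "line", "slow", "quick"],
}
POSITIVE = {"friendly", "quick", "correct", "great"}
NEGATIVE = {"rude", "slow", "wrong", "mistake"}


def split_sentences(review):
    """Split a review into sentences at ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", review) if s.strip()]


def tag_sentence(sentence):
    """Toy stand-in for the LLM: return {feature: sentiment on a 1-5 scale}."""
    text = sentence.lower()
    words = set(re.findall(r"[a-z]+", text))
    # Start at neutral (3), move up or down with positive or negative words.
    score = 3 + len(words & POSITIVE) - len(words & NEGATIVE)
    score = max(1, min(5, score))
    return {
        name: score
        for name, cues in FEATURES.items()
        if any(cue in text for cue in cues)
    }


def store_feature_sentiment(reviews):
    """Average sentence-level scores per store and feature."""
    collected = defaultdict(lambda: defaultdict(list))
    for store_id, text in reviews:
        for sentence in split_sentences(text):
            for feature, score in tag_sentence(sentence).items():
                collected[store_id][feature].append(score)
    return {
        store: {feature: mean(scores) for feature, scores in feats.items()}
        for store, feats in collected.items()
    }


# Made-up reviews for two hypothetical stores.
reviews = [
    ("store_a", "The barista was friendly. The line was slow."),
    ("store_b", "They got my order wrong. Quick service though!"),
]
summary = store_feature_sentiment(reviews)
print(summary)
```

In a real pipeline the keyword rule would be replaced by a prompt to a language model, but the per-store, per-feature averages it produces are the kind of input the third step needs.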

To check accuracy, the team compared the AI’s coding with that of human annotators. The LLM’s results were nearly as reliable, and while a human coder needed about six minutes per review, the AI completed the same task in under two seconds.

What the Researchers Found

From this data, the AI identified ten major attributes that customers discuss most—ranging from customer service and coffee quality to store ambiance and digital experience. Within each of these broad categories, it uncovered three to six specific features that companies could act on, such as staff professionalism, order accuracy, and store cleanliness.

Strikingly, the model came close to matching human coders in accuracy while operating hundreds of times faster.

The analysis revealed that customers care more about the human side of service than about price or convenience. Interactions with baristas, wait times, and service quality dominated the reviews. Sentiment toward these attributes turned out to be a strong predictor of success: the model explained more than 70 percent of the variation in overall store ratings. Even modest improvements mattered. A one-point boost in how customers felt about staff professionalism translated into an average 0.19-point increase in store ratings—corresponding to a 1–2 percent average revenue gain per store.
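
To see how a figure like the 0.19-point increase is read, here is a small Python sketch of a one-predictor regression. It rests on loud assumptions: the five data points are made up and were chosen so that the fitted slope lands on the 0.19 reported above, and a single predictor is a simplification, since the article does not describe the study's exact model. The slope is the expected change in store rating for a one-point change in sentiment, and R-squared is the share of rating variation the line explains.

```python
# Synthetic, made-up data (NOT the study's data): average sentiment toward
# staff professionalism (1-5) and the store's star rating, for five stores.
sentiment = [1.0, 2.0, 3.0, 4.0, 5.0]
rating = [3.29, 3.18, 3.77, 3.56, 4.05]


def fit_line(x, y):
    """Ordinary least squares for one predictor: returns (slope, intercept)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(x, y, slope, intercept):
    """Share of the variation in y explained by the fitted line."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


slope, intercept = fit_line(sentiment, rating)
r2 = r_squared(sentiment, rating, slope, intercept)

# The slope is the predicted rating change per one-point sentiment change.
print(f"slope={slope:.2f}, intercept={intercept:.2f}, R^2={r2:.2f}")
```

On these invented numbers the slope comes out at about 0.19 and R-squared at about 0.72, so a store that raised sentiment on this feature by one point would be predicted to gain roughly a fifth of a star.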

Interestingly, the researchers also observed a decline in sentiment after 2016, suggesting that customer expectations continue to rise even as service quality plateaus. The finding underscores how quickly consumer standards evolve and how easily brands can fall behind when they fail to keep listening.

Listening Smarter

When every brand collects data, the winners will be those that can interpret it the fastest and act on it most effectively. LLMs can convert a flood of unstructured text into precise insight about where a business is succeeding and where it’s falling short. Rather than relying on vague measures like overall customer satisfaction, business leaders can see exactly which features drive delight or frustration.

The framework also opens the door to real-time intelligence. Firms can monitor customer sentiment across locations, regions, or time periods, transforming static reviews into a living feedback system. This turns customer commentary into something operationally valuable: a roadmap for continuous improvement rather than an overwhelming pile of anecdotes.

In this way, AI need not replace human understanding, but it can amplify it. By scaling what humans already do well, like recognizing patterns, identifying pain points, and prioritizing action, LLMs make large-scale listening possible for the first time.

FAQs

What makes this study different from traditional customer feedback analysis?

Traditional methods rely on humans to manually read and categorize reviews, a slow and subjective process. This study uses large language models (LLMs) to automatically extract themes, emotions, and actionable insights from thousands of customer comments. The result is a data-driven map of what customers care about most, built in seconds rather than weeks.

How accurate is the AI compared to human analysis?

The researchers found that the AI’s performance was nearly as accurate as trained human annotators. While a person took about six minutes to analyze each review, the model did it in under two seconds, making it hundreds of times faster without sacrificing reliability.

What can businesses learn from these findings?

The study shows that customers value human interactions—like staff professionalism and friendliness—more than price or convenience. It also demonstrates that improving these “human” features can measurably increase satisfaction and revenue. Beyond that, the framework allows companies to monitor sentiment in real time, turning scattered feedback into a continuous system for improving service and strategy.