Uncovering the Costly Bias in Marketplace Testing

Statistical bias could be misleading your product and feature testing, according to research from Columbia Business School Professor Hannah Li, but solutions might be easier than you think.

Published
April 21, 2025
Publication
Research In Brief
Focus On
AI & Transformative Tech, Marketplace Design
Article Author(s)

Jonathan Sperling

Key Takeaways

Traditional A/B testing assumes a type of user independence, i.e. that the treatment assigned to one individual does not influence the behavior of another.

In platforms like marketplaces or social networks, this assumption often fails because users interact, compete, or influence one another, creating interference bias.

As a result, companies risk wrongly rolling out or rejecting features, all while believing they're making sound, data-driven decisions.

Smarter experimental designs, such as Two-Sided Randomization, can reduce bias.

Other forms of bias can arise in recommendation systems, where users strategically interact with recommendation algorithms by deliberately changing how they engage with content.

Category
Thought Leadership
Topic(s)
Data/Big Data, AI and Transformative Tech, Marketplace

About the Researcher(s)

Hannah Li

Assistant Professor of Business
Decision, Risk, and Operations Division

A/B testing can be a relatively quick and cost-efficient way for leaders and their companies to test new features on a subset of users and understand the impact before broader deployment. However, this testing comes with a serious caveat in many industries.

Imagine you're testing a new feature on your website: say, showing better-quality photos for rental listings on a platform like Airbnb. You randomly split users into two groups in preparation for an A/B test: a treatment group sees new, high-quality photos, while a control group sees the original, standard images.

In a perfect world, each user's behavior would be unaffected by what the other group sees. But that assumption often breaks down in reality, especially in marketplaces or social networks. According to research from Hannah Li, an assistant professor in Columbia Business School's Decision, Risk, and Operations Division, users don't operate in isolation; they interact, compete, and influence each other.

"When you run A/B testing in marketplaces where you have users buying and selling things from each other, the users are no longer going to be independent," Li says.

Preventing Statistical Bias

Li explains that when someone in the treatment group books a listing because of the higher-quality photos, there is now one fewer listing available for someone in the control group. This means the treatment unintentionally affects the control group, violating a core assumption of A/B testing: independence.

That distortion is what Li and her fellow researchers call interference bias, which can bias estimates by as much as 230%, meaning companies might believe an intervention is more than twice as effective as it is. That can lead to false confidence in a product change: launching something you think is a success, only to find it doesn't work in the real world. Worse, it might cause you to kill ideas that would have worked, simply because your experiment didn't account for how users affect one another. All the while, the company believes it is making airtight, data-driven decisions.
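To make the arithmetic concrete, here is a minimal sketch of a deliberately simplified, supply-limited marketplace. Every number in it (1,000 buyers, 350 listings, booking probabilities of 0.3 and 0.5) and both function names are illustrative assumptions, not figures or methods from Li's research. It compares the effect of rolling the feature out to everyone against what a 50/50 buyer-side A/B test would report when both groups draw on the same pool of listings.

```python
def booking_rate(n_buyers, n_listings, book_prob):
    """Bookings per buyer when every buyer gets the same experience."""
    attempts = n_buyers * book_prob          # buyers who want to book
    return min(n_listings, attempts) / n_buyers  # supply caps bookings


def naive_ab_estimate(n_buyers, n_listings, p_control, p_treatment):
    """Treatment-minus-control booking rate in a 50/50 buyer-side A/B test.

    Assumption: both groups compete for the same listings, and scarce
    listings are shared in proportion to booking attempts.
    """
    half = n_buyers / 2
    attempts = half * p_treatment + half * p_control
    fill = min(1.0, n_listings / attempts)   # share of attempts that succeed
    return (p_treatment - p_control) * fill


# Illustrative, made-up numbers
n_buyers, n_listings = 1000, 350
p_control, p_treatment = 0.3, 0.5

# Global treatment effect: everyone treated vs. everyone in control
gte = (booking_rate(n_buyers, n_listings, p_treatment)
       - booking_rate(n_buyers, n_listings, p_control))
naive = naive_ab_estimate(n_buyers, n_listings, p_control, p_treatment)
relative_bias = (naive - gte) / gte

print(f"GTE: {gte:.3f}, naive A/B estimate: {naive:.3f}, "
      f"relative bias: {relative_bias:.0%}")
```

In this toy setup the full rollout can add at most 50 bookings because supply runs out, yet the A/B test credits the treatment group with bookings it simply took from the control group, so the naive estimate overstates the true effect by roughly 250%.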

In their research, Li and her co-researchers found that implementing the right experimental systems can curtail this bias.

Interference in Action

To investigate how interference bias arises in two-sided platforms, the researchers developed a formal marketplace model using continuous-time Markov chains. This mathematical framework allowed them to simulate a dynamic environment where buyers and sellers arrive, interact, and transact over time. 
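For readers unfamiliar with continuous-time Markov chains, the following is a minimal sketch of how such a marketplace can be simulated. It is not the researchers' model: the state (number of available listings), the two event types, the rates, and the function name `simulate_market` are all simplifying assumptions chosen for illustration.

```python
import random


def simulate_market(n_listings, arrival_rate, book_prob, return_rate,
                    horizon, seed=0):
    """Simulate a toy marketplace as a continuous-time Markov chain.

    Buyers arrive at `arrival_rate`; each books with probability
    `book_prob` if a listing is available. Each booked listing becomes
    available again at `return_rate`. Returns bookings per arrival.
    """
    rng = random.Random(seed)
    t = 0.0
    available = n_listings
    arrivals = bookings = 0
    while True:
        booked = n_listings - available
        total_rate = arrival_rate + return_rate * booked
        t += rng.expovariate(total_rate)     # exponential holding time
        if t > horizon:
            break
        # Pick the next event in proportion to its rate
        if rng.random() < arrival_rate / total_rate:
            arrivals += 1
            if available > 0 and rng.random() < book_prob:
                available -= 1
                bookings += 1
        else:
            available += 1                   # a booked listing returns
    return bookings / arrivals if arrivals else 0.0


# Illustrative parameters: the same market with a lower and a higher
# booking probability, each applied to all buyers
low = simulate_market(50, 10.0, 0.3, 0.1, horizon=2000)
high = simulate_market(50, 10.0, 0.5, 0.1, horizon=2000)
print(f"booking rate at p=0.3: {low:.3f}, at p=0.5: {high:.3f}")
```

With these made-up parameters, the higher booking probability pushes the market toward its supply limit, so the realized booking rate rises by less than the booking probability does. That gap is the kind of competition effect a buyer-side A/B test can miss.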

Li and her co-researchers found that this bias can be reduced through a novel experimental design known as Two-Sided Randomization (TSR). Instead of randomizing only sellers or only buyers into treatment and control groups, TSR randomizes both sides of the marketplace simultaneously. This design allows the platform to measure competition effects between sellers and between buyers, the source of the interference bias, and account for these effects in the experiment's estimates.

This leads to far more accurate estimates of an experiment's Global Treatment Effect (GTE) — the metric most companies care about when deciding whether to roll out a feature to all users. Simulations from Li and her co-researchers' paper show that TSR consistently produces lower bias than standard experimental methods, across a wide range of market conditions.
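As a rough illustration of the assignment step only, the sketch below randomizes buyers and listings independently. It assumes the new feature is shown only when a treated buyer views a treated listing; the article does not spell out this rule, so treat it, the 50% shares, and the function names as assumptions. The estimator that combines the resulting groups is not shown.

```python
import random


def tsr_assign(buyers, listings, buyer_share=0.5, listing_share=0.5, seed=0):
    """Independently randomize each side of the market into treatment."""
    rng = random.Random(seed)
    treated_buyers = {b for b in buyers if rng.random() < buyer_share}
    treated_listings = {l for l in listings if rng.random() < listing_share}
    return treated_buyers, treated_listings


def cell(buyer, listing, treated_buyers, treated_listings):
    """Label a buyer-listing pair by the arm of each side, e.g. 'TC'."""
    return (("T" if buyer in treated_buyers else "C")
            + ("T" if listing in treated_listings else "C"))


def sees_treatment(buyer, listing, treated_buyers, treated_listings):
    """Assumed rule: the feature appears only for treated-treated pairs."""
    return cell(buyer, listing, treated_buyers, treated_listings) == "TT"


# Hypothetical IDs for illustration
buyers = [f"buyer_{i}" for i in range(6)]
listings = [f"listing_{j}" for j in range(4)]
tb, tl = tsr_assign(buyers, listings)

for b in buyers[:2]:
    for l in listings[:2]:
        print(b, l, cell(b, l, tb, tl), sees_treatment(b, l, tb, tl))
```

Comparing outcomes across the four cells (TT, TC, CT, CC) is what lets a platform see how much treated buyers or treated listings gain at the expense of their untreated competitors.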

If TSR is not feasible, there are other approaches companies can take, according to Li. Cluster Randomization, for example, groups users (e.g., by region) and randomizes them to minimize cross-group interaction.
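A minimal sketch of cluster-level assignment, assuming each user is already mapped to a region. The region names, user IDs, and the function name `assign_clusters` are hypothetical.

```python
import random


def assign_clusters(user_regions, seed=0):
    """Randomize whole regions, then give each user their region's arm."""
    rng = random.Random(seed)
    regions = sorted(set(user_regions.values()))
    rng.shuffle(regions)
    treated = set(regions[: len(regions) // 2])  # half the regions treated
    return {
        user: ("treatment" if region in treated else "control")
        for user, region in user_regions.items()
    }


# Hypothetical users and regions
user_regions = {
    "u1": "New York", "u2": "New York",
    "u3": "Chicago", "u4": "Chicago",
    "u5": "Austin", "u6": "Seattle",
}
arms = assign_clusters(user_regions)
print(arms)
```

Because the unit of randomization is the region rather than the user, people who compete for the same local listings always share an arm. The trade-off is fewer independent units, which generally means noisier estimates.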

Another technique is Switchback Testing. Instead of splitting users into a control group and a treatment group, alternate the treatment across time periods for the entire platform (e.g., on one day, off the next).
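The schedule for a simple alternating switchback can be sketched as follows. The start date, the one-day period, and the function name `switchback_arm` are assumptions for illustration.

```python
from datetime import date


def switchback_arm(day, start, period_days=1):
    """Return the arm the whole platform is in on a given day."""
    periods_elapsed = (day - start).days // period_days
    return "treatment" if periods_elapsed % 2 == 0 else "control"


# Hypothetical experiment start date
start = date(2025, 4, 21)
for offset in range(4):
    day = date.fromordinal(start.toordinal() + offset)
    print(day.isoformat(), switchback_arm(day, start))
```

Since every user sees the same experience at any given moment, treated and control users never compete with each other directly. The comparison is across time periods instead, so the analysis has to watch for day-of-week patterns and effects that carry over from one period to the next.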

When Users are Strategic

A subsequent paper by Li studies how people strategically interact with online platforms to influence the content they are recommended, another source of bias that can throw companies off.

Typically, platforms like TikTok, Netflix, and Amazon suggest content based on users' past behaviors, assuming user interactions are straightforward reflections of their preferences. However, Li and her co-researchers' study suggests that users often engage in strategic behavior to shape their future recommendations.

For instance, when participants were informed that an algorithm prioritizes "likes" and "dislikes," they used these features almost twice as much as those told the algorithm focuses on viewing time. Through surveys, the researchers found that nearly half of the participants admitted to altering their behavior on platforms to control future recommendations. Some users even reported avoiding content they enjoy to prevent the platform from over-recommending similar content in the future.

"If you watch a video on YouTube, the platform learns that you like it. If you don't watch it, they learn you don't like it. But what we heard is that users are strategizing. They may see a YouTube video and actually like it, but they know that if they click on it, they will get millions of the same videos for the next three weeks. So, they don't watch the video," Li says, adding that "when this happens, the data that's being collected is not representative of the user's true preferences."

Experimental Music

To study how users adapt their behavior in response to recommendation systems, Li and her co-authors created their own music streaming app—essentially a simplified version of Spotify. This gave them total control over what users saw and how the system reacted. By stripping away real-world platform complexities, they could focus entirely on whether users tried to “game” the algorithm.

The study’s 750 participants were randomly assigned to different conditions in a controlled environment. Everyone listened to songs and could “like” or “dislike” them, or just skip ahead. In the first session, participants used the music player naturally, as if they were on a real platform. 

In the following session, participants were randomly told different things about how the recommendation algorithm worked. Some were told the system cared most about likes/dislikes, others were told it prioritized listening time, and a control group got no guidance.

This setup let researchers test how user behavior changed depending on what users believed the algorithm cared about, without changing the actual algorithm. By observing how people’s actions varied under these scenarios, the researchers could see whether users acted strategically, choosing actions based not just on personal enjoyment but also on what they thought would “train” the algorithm in their favor.

The main behavioral metrics tracked were the number of “likes” and “dislikes” and how long users stayed on each song, known as dwell time. The researchers also conducted follow-up surveys to confirm whether users admitted to similar strategic behaviors on real-world platforms like Spotify or TikTok.
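As a rough sketch of how such metrics could be summarized per condition, the snippet below aggregates a handful of made-up event records. The records, condition labels, and the function name `summarize` are invented for illustration and do not reflect the study's data or analysis code.

```python
from collections import defaultdict
from statistics import mean

# Made-up records: (participant, condition, action, dwell_seconds)
events = [
    ("p1", "likes_dislikes", "like", 30.0),
    ("p1", "likes_dislikes", "dislike", 5.0),
    ("p2", "dwell_time", "skip", 4.0),
    ("p2", "dwell_time", "like", 60.0),
    ("p3", "control", "skip", 10.0),
    ("p3", "control", "skip", 20.0),
]


def summarize(events):
    """Per condition: likes+dislikes per participant and mean dwell time."""
    feedback = defaultdict(int)
    dwell = defaultdict(list)
    participants = defaultdict(set)
    for pid, condition, action, seconds in events:
        participants[condition].add(pid)
        dwell[condition].append(seconds)
        if action in ("like", "dislike"):
            feedback[condition] += 1
    return {
        condition: {
            "feedback_per_participant":
                feedback[condition] / len(participants[condition]),
            "mean_dwell_seconds": mean(dwell[condition]),
        }
        for condition in participants
    }


summary = summarize(events)
for condition, metrics in summary.items():
    print(condition, metrics)
```

Comparing these summaries across conditions, while holding the actual algorithm fixed, is what isolates the effect of users' beliefs on their behavior.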

Li suggested that the fact users are strategizing indicates that recommendation systems, such as Instagram’s “Explore” page, may be over-indexing on known user preferences rather than exploring new content. Adjusting the algorithms to be less heavy-handed in pushing familiar content could help address this issue.

She also noted that users would ideally be able to adjust the algorithm behind their personal feed more easily, rather than having to strategize their behavior. Giving users more control over, and transparency into, the recommendation system could help mitigate strategization.

Adapted from “Measuring Strategization in Recommendation,” by Hannah Li of Columbia Business School, Sarah H. Cen of Massachusetts Institute of Technology, Andrew Ilyas of Massachusetts Institute of Technology, Jennifer Allen of Massachusetts Institute of Technology, and Aleksander Mądry of Massachusetts Institute of Technology.

Also adapted from “Experimental Design in Two-Sided Platforms,” by Hannah Li of Columbia Business School, Ramesh Johari of Stanford University, Inessa Liskovich of Stanford University, and Gabriel Y. Weintraub of Stanford University.
