Can the Words We Use Predict the Reliability of Scientific Research?

Professor Oded Netzer and a team of scholars investigate whether the language used in scientific papers can indicate the replicability of the research.

By Stephanie Walden
Based on research by Michal Herzenstein, Oded Netzer, Sanjana Rosario, and Shin Oblander
Published August 27, 2024 in Research In Brief
About the Researcher(s)

Oded Netzer

Arthur J. Samberg Professor of Business
Marketing Division
Vice Dean for Research
Dean's Office

View the Research

The Language of (Non)replicable Social Science


Reproducibility and replicability are core tenets of modern science. But can a given study's findings actually be replicated, or are its results merely a one-off?

Replicable studies are necessary for credible and reliable research. But in recent years, a lack of reproducibility across a range of scientific disciplines has raised questions about the validity of certain findings. Some commentators have called the phenomenon a “replication crisis.”

Oded Netzer, the Arthur J. Samberg Professor of Business at Columbia Business School, and a team of researchers wanted to examine the reproducibility of social science through a linguistic lens. “The language we use is often a signal of our intentions, personality, identity, and state of mind,” says Netzer. “We wondered: Can the language researchers use when they write a scientific paper tell us about the replicability of that work?”

Key takeaways:

  • The language used in academic papers can predict the research’s replicability, even after controlling for various characteristics of the authors, the paper, and study design.
  • Replicable studies often use detailed, complex, and confident language, which aligns with markers of truthful communication.
  • Nonreplicable studies tend to use vague language and exhibit markers of persuasion, such as “positive words” and clout-focused terms.
  • Language analysis could be a useful tool in assessing the credibility of scientific studies, supporting the goals of the open science movement.

How the research was done: The researchers examined 299 papers, all of which had previously been subjected to replication studies conducted as part of large-scale replication projects in psychology and economics. The success or failure of these replication attempts served as the ground truth for Netzer’s study.

The texts of the 299 papers were run through a machine learning model to pinpoint linguistic patterns. The analysis relied on representation learning models, as well as tools like the Linguistic Inquiry and Word Count (LIWC) dictionaries, a set of 92 predefined categories of words that reflect various psychological constructs, cognitive processes, and linguistic dimensions. “We targeted features like whether the paper uses more numeric and quantifiable words. Does it use a more positive tone? Does it use more or less emotional words?” explains Netzer. Another focus area was obfuscation and readability: What level of reading comprehension is needed to understand the text? The analysis also examined the narrative arc of the texts. “We wanted to know whether these papers tell a story,” says Netzer.
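The exact feature pipeline is not published in this summary, but the kinds of signals described (numeric density, positive tone, interrogative words, readability) can be illustrated with a toy extractor. The word lists and the readability proxy below are hypothetical stand-ins, not the actual LIWC dictionaries, which are proprietary and far larger:

```python
import re
from collections import Counter

# Hypothetical stand-ins for dictionary categories; the real LIWC
# dictionaries comprise 92 categories with thousands of entries.
POSITIVE_WORDS = {"novel", "significant", "important", "strong", "robust"}
INTERROGATIVES = {"who", "what", "when", "why", "where", "how"}

def linguistic_features(text: str) -> dict:
    """Compute a few toy text features of the kind the study examined."""
    words = re.findall(r"[a-z']+", text.lower())
    n = len(words) or 1
    counts = Counter(words)
    numerals = len(re.findall(r"\d+(?:\.\d+)?", text))
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    return {
        "positive_ratio": sum(counts[w] for w in POSITIVE_WORDS) / n,
        "interrogative_ratio": sum(counts[w] for w in INTERROGATIVES) / n,
        "numeral_density": numerals / n,
        "avg_sentence_length": n / sentences,  # crude readability proxy
    }

feats = linguistic_features(
    "We found a significant effect (p = 0.03). Why does this matter? "
    "The effect size was 0.42 across 120 participants."
)
print(feats)
```

A real pipeline would substitute validated dictionaries and standard readability formulas, but the per-document output has the same shape: a vector of interpretable linguistic measurements.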

The researchers implemented a strict set of controls to ensure findings weren’t influenced by other factors. “We collected comprehensive metadata, or control variables that are not strictly the text,” says Netzer. “This includes things like the keywords used in the paper, the area within psychology or economics it is focused on, the types of subjects studied, and if the research was conducted in the United States or elsewhere.” The researchers also identified control variables like the number of figures and tables included, sources cited, and authors’ level of seniority, among other metrics.
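Putting the pieces together, the setup is a binary prediction problem: the replication outcome is the label, and text features plus metadata controls are the inputs. The sketch below, with made-up numbers and a from-scratch logistic regression, shows the shape of such a model; the paper's actual statistical model is more sophisticated, and every value here is illustrative:

```python
import math

def sigmoid(z: float) -> float:
    z = max(-60.0, min(60.0, z))  # clamp to avoid math.exp overflow
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Plain stochastic-gradient logistic regression, no regularization."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            err = p - yi  # gradient of log loss w.r.t. the logit
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b

# Made-up rows: [positive_ratio, numeral_density, n_tables (a control)]
X = [[0.08, 0.01, 2], [0.02, 0.05, 6], [0.07, 0.02, 3], [0.01, 0.06, 5]]
y = [0, 1, 0, 1]  # 1 = the replication attempt succeeded
w, b = fit_logistic(X, y)

# Score a new, hypothetical paper
p_new = sigmoid(sum(wj * xj for wj, xj in zip(w, [0.02, 0.05, 5])) + b)
```

Including the controls in the same model is what lets the researchers say language predicts replicability *over and above* author, paper, and design characteristics: the text features must carry signal the metadata does not.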

What the researchers found: Even taking the control variables into account, the study found that the language used in academic papers was a significant predictor of replicability.

In replicable papers, the language tended to be more complex, detailed, and confident-sounding. There were more quantifying and comparative terms, auxiliary verbs, and interrogative words — who, what, when, why, where — indicating a deeper engagement with the data. “The replicable papers had a lot of the markers we find in linguistic work around truth-telling,” says Netzer. “They did a better job at contextualizing the results.”

In contrast, nonreplicable papers used more positive words, future tense, and vague language. They also tended to contain more phrases associated with clout — like we, us, group, and social.

Additionally, nonreplicable studies relied more on abstraction, readability, and skilled storytelling. “Readability was higher and these papers tended to have more of a narrative arc,” says Netzer. He hypothesizes that this is due to the nature of how papers are typically evaluated: “If the science is weak but the writing is good, it may still pass the bar.”

An important caveat, Netzer notes, is that the findings don’t imply that authors of nonreplicable studies are operating with ill intent. “It’s not that the authors are lying or know they’re wrong. But they may have a hunch, or they may simply be less confident in their results,” he explains. That lack of confidence shows up in the paper’s writing style.

Why it matters: Amid a larger “attack on science,” the replicability crisis adds fuel to the fire, undermining public trust in scientific findings.

As such, this research has significant implications for the scientific and academic communities. By highlighting the connection between language and replicability, it provides a new tool for assessing credibility. Netzer also hopes that his team’s work can contribute to the mounting efforts of the open science movement, which aims to increase transparency and reproducibility in research.

He also believes the general public can apply these findings to their critical thinking process when, for instance, listening to science podcasts or reading about a new “breakthrough” on social media. “As we’re listening to or reading science, we should pay attention to signals like, does the research provide sufficient details and elaboration, or does it report the results in an overly positive manner?” says Netzer. “Assessing these things can help us figure out if we can trust the research.”

 

Adapted from “The Language of (Non)replicable Social Science” by Oded Netzer from Columbia Business School, Michal Herzenstein from the University of Delaware, Sanjana Rosario from Columbia Business School, and Shin Oblander from the University of British Columbia. 


