Somewhere in a hiring manager’s inbox right now, two cover letters for the same position await review. Both submissions are eloquently written and suggest the job seeker would be an excellent fit for the role. But there’s an invisible difference between them: One was written entirely by the candidate, while the other was polished to perfection using ChatGPT.
The meteoric rise of generative AI presents evaluators — from hiring managers reviewing job applications to investors weighing startup pitches — with a new challenge: distinguishing genuine expertise from AI mimicry.
Nataliya L. Wright and Bo Cowgill, both assistant professors in the Management Division of Columbia Business School, alongside colleague Pablo Hernandez-Lagos, recently conducted research on this conundrum. The trio sought to understand if AI tools make it harder for evaluators to identify qualified candidates and whether there are any contexts in which AI actually improves screening processes.
“It’s a very complicated story,” says Wright. “We wanted to examine the longer-term dynamic: What happens when everybody is writing pitches or job applications this way and evaluators know it?”
Key Takeaways:
- The rise of generative AI makes it more difficult for evaluators — like hiring managers reviewing job applications or investors assessing startup pitches — to distinguish polished AI-assisted submissions from genuine expertise.
- Overall, generative AI reduces evaluators’ ability to spot expertise by 4 to 9 percent, particularly when it helps less-qualified individuals mimic the language of experts.
- That said, AI can also be helpful in certain screening contexts — like when skilled candidates face language barriers in pitching but have deep subject-matter expertise.
- Organizations may need new methods, like face-to-face interviews or task-based assessments, to counter AI’s impact on written materials.
Game Theory Meets AI Ghostwriting
The researchers took a mixed-methods approach, balancing theoretical modeling with empirical research. For the theoretical portion, they applied game theory — a branch of mathematics that models strategic interactions — to map out how both candidates and evaluators might change their behavior knowing AI tools are part of the equation.
The analysis involved designing a signaling game, which captures the interplay between candidates conveying their abilities and evaluators interpreting those signals. The model needed to account for a complex web of motivations and expectations. “It’s not only about what the evaluator or applicant wants but also what I, as an applicant, think the evaluator wants and vice versa,” says Hernandez-Lagos.
The theoretical component also addressed a more technical challenge — a phenomenon economists call a signal extraction problem. Essentially, how can evaluators separate meaningful indicators of expertise from noise when both qualified and unqualified candidates have access to tools that can mirror expert language?
The modeling exercise revealed a key insight: When AI tools help already-skilled users more than they help novices, they can make screening easier for evaluators. But when AI provides similar benefits to both experts and non-experts, it becomes harder to distinguish true expertise.
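To make that intuition concrete, here is a minimal, hypothetical simulation — not the authors’ model, and the “boost” numbers are invented for illustration only. Expert and non-expert pitches differ in average underlying quality plus noise, an evaluator screens with a simple cutoff, and the scenarios show how screening accuracy falls when AI mostly helps non-experts mimic expert language and rises when it mostly amplifies existing expertise.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 100_000          # simulated pitches per group
NOISE_SD = 1.0       # noise in how a pitch reads to an evaluator

def screening_accuracy(expert_boost, novice_boost):
    """Share of pitches an evaluator classifies correctly when experts
    and novices each receive the given AI writing boost.
    (Illustrative numbers, not estimates from the study.)"""
    experts = 1.0 + expert_boost + rng.normal(0, NOISE_SD, N)
    novices = 0.0 + novice_boost + rng.normal(0, NOISE_SD, N)
    # The evaluator screens with a simple cutoff at the midpoint of the two group means.
    cutoff = (experts.mean() + novices.mean()) / 2
    return ((experts > cutoff).mean() + (novices <= cutoff).mean()) / 2

print("No AI:                ", round(screening_accuracy(0.0, 0.0), 3))
# AI mainly helps novices mimic experts: the signals overlap, accuracy falls.
print("AI aids mimicry:      ", round(screening_accuracy(0.1, 0.8), 3))
# AI mainly amplifies experts: the signals separate, accuracy rises.
print("AI amplifies experts: ", round(screening_accuracy(0.8, 0.1), 3))
```

Under these made-up parameters, the evaluator’s accuracy drops when AI narrows the gap between expert and non-expert writing and improves when AI widens it — the same qualitative pattern the signaling model predicts.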
Cowgill illustrates how AI can amplify expertise in the domain of visual art: “In AI art, AI can help novice artists on the margin. But it gives superpowers to people who actually know and understand art. Trained artists can prompt the AI with detailed requests that novices would never think about. For example, ‘Paint the NYC skyline with the colors of Monet but the brush strokes of Picasso’ or something similarly specific that leverages art history and technique.”
Bringing Theory into the Real World
To complement their theoretical predictions, the researchers also designed a large-scale experiment drawing from a global pool of participants. They recruited both job candidates and entrepreneurs to write pitches about areas in which they had deep knowledge as well as topics in which they didn’t. Each participant produced four pitches: two in their area of expertise (one with ChatGPT, one without) and two outside it (again, one with the aid of AI, one without).
On the evaluation side, the researchers enlisted experienced hiring managers and investors to assess the pitches. Each evaluator reviewed eight different pitches. The researchers measured whether pitches written with AI made it more difficult for evaluators to distinguish between experts and non-experts, compared to pitches written without AI. They also tracked how often evaluators felt they needed additional information beyond the pitch.
Does AI Help or Hinder Screenings?
The study found that, overall, AI reduces evaluators’ ability to identify true expertise by about 4 to 9 percent. But interestingly, this effect isn’t uniform across all contexts.
“We wanted to investigate if AI and similar technologies could be helpful to people with good ideas who aren’t able to communicate them effectively,” explains Wright. “Basically, if you have a great product or idea but don’t have the English language to convey it, does generative AI give you the ability to do that?”
The research confirmed it does: When an expert has strong background knowledge but faces language barriers, AI helps them communicate more effectively — without providing similar benefits to non-experts. “It also implies that generative AI has different effects in different geographies,” notes Wright.
However, among native English speakers, ChatGPT tends to level the playing field in written communications, making it harder to distinguish expertise through writing alone.
Forging Ahead: Fairer Evaluations in an AI-Fueled World
The study’s findings challenge the assumption that AI will universally degrade the ability to evaluate written work. It could even mean more accurate screening processes in certain cases. For example, AI could make it easier for companies to hire for international roles in which non-native English speakers struggle to convey their knowledge fluently. It could also open doors in emerging markets where cultural or linguistic barriers hinder communication.
That said, the research also suggests a need for evaluators to adapt their methods in areas where AI creates homogeneity. Organizations might need to rely less on written materials and more on alternative methods, such as conducting in-depth interviews or verifying qualifications through background checks.
Wright emphasizes the risks of such adjustments, noting that in the absence of reliable screening signals, evaluators may rely more heavily on their own biases. For example, they may falsely associate capability and competence with factors like gender, race, or alma mater. Any new screening processes that emerge in response to AI must carefully guard against such systematic biases.
Ultimately, the researchers hope their work sparks more deliberate innovation in how expertise is assessed. Wright adds that refining evaluative methods can help organizations balance fairness with the realities of an AI-driven landscape across the entire professional spectrum: “At the end of the day, the inherent model and findings are relevant to nearly any field where persuasion and expertise intersect — whether it’s sales, screenwriting, or even politics.”
Adapted from “Does AI Cheapen Talk? Theory and Evidence from Global Entrepreneurship and Hiring” by Bo Cowgill of Columbia Business School, Pablo Hernandez-Lagos of Yeshiva University, and Nataliya L. Wright of Columbia Business School.