Abstract
Language plays a crucial role in marketing, influencing outcomes such as consumer engagement and decision-making. Although prior research has extensively analyzed the relationship between linguistic features and business outcomes, most approaches have been descriptive or predictive, which limits their value for crafting more effective content. Understanding the causal effects of specific linguistic features is essential but challenging because, in real-world settings, the focal textual feature often varies together with confounding factors. This paper builds on recent advances in causal text analysis and introduces an embedding-based causal inference framework that isolates the impact of specific linguistic elements while controlling for both textual confounders and nontextual controls. The approach leverages foundational language models to create text representations optimized for causal inference, enhancing the accuracy of causal estimates. We rigorously validate our methodology using both semi-synthetic experiments and experimental data from large-scale A/B tests of news headlines. The experimental data allow us to offer a first-of-its-kind validation of the causal text approach in a marketing-relevant application. Applying the causal text approach to online donation and crowdfunding settings, we find, for example, that pre-thanking and second-person pronouns have a strong positive causal effect on success rates. However, these effects can be weakened or reversed if textual confounding is not properly controlled.
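
To make the final point concrete, the following is a minimal simulated sketch of how an unadjusted comparison can reverse the sign of a positive effect when a text feature is confounded with the rest of the text. It is not the framework introduced in this paper: the random vectors `Z` merely stand in for text embeddings, the linear adjustment is a deliberately simple substitute for the embedding-based estimator, and all names and parameter values (`simulate`, `naive_estimate`, `adjusted_estimate`, `true_effect=0.5`) are illustrative assumptions.

```python
import numpy as np


def simulate(n=20000, dim=5, true_effect=0.5, seed=0):
    """Simulate texts in which the focal feature T is confounded by Z.

    Z is a hypothetical stand-in for text embeddings, not real model output.
    """
    rng = np.random.default_rng(seed)
    Z = rng.normal(size=(n, dim))
    score = Z.sum(axis=1)
    # Texts with a low confounder score are more likely to carry the feature...
    p_treat = 1.0 / (1.0 + np.exp(score))
    T = rng.binomial(1, p_treat)
    # ...while the outcome rises with the same score, so the confounding bias
    # works against the (positive) true effect of T.
    Y = true_effect * T + score + rng.normal(size=n)
    return Z, T, Y


def naive_estimate(T, Y):
    """Difference in mean outcomes, ignoring the confounders."""
    return Y[T == 1].mean() - Y[T == 0].mean()


def adjusted_estimate(Z, T, Y):
    """OLS coefficient on T after linearly controlling for Z."""
    X = np.column_stack([np.ones(len(T)), T, Z])
    beta, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return beta[1]


Z, T, Y = simulate()
print("naive estimate:   ", round(naive_estimate(T, Y), 3))
print("adjusted estimate:", round(adjusted_estimate(Z, T, Y), 3))
```

In this simulation the naive difference in means comes out negative even though the true effect is positive, while the adjusted estimate lands close to the assumed value of 0.5. The adjustment works here only because the outcome was simulated as linear in `Z`; real text confounding is rarely that simple, which is the motivation for representations built specifically for causal inference.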