Abstract
Firms often rely on randomized experiments to estimate customer-level treatment effects for targeting policies. Standard “test-then-learn” approaches typically sample customers uniformly to optimize estimation accuracy but ignore economic objectives, leading to statistically sound yet suboptimal targeting policies. We propose a policy-aware sampling strategy that directly incorporates the firm’s profit-maximizing objective into the selection of which customers to include in the experiment. Specifically, we introduce expected profit loss (EPL), a criterion that quantifies a customer’s learning value by measuring the financial impact of treatment effect estimation errors on targeting profitability. We show that allocating experimental resources based on EPL achieves near-optimal performance with theoretical guarantees, and we develop a sequential sampling algorithm that, for practical implementation, prioritizes the customers with the highest EPLs. Using simulations and two empirical applications, we show that our approach yields more profitable targeting policies than existing methods. Across both applications, our approach improves targeting profit by 5% to 10% relative to benchmark methods and achieves comparable outcomes with up to 97% fewer experimental samples, highlighting substantial gains in both targeting effectiveness and data efficiency. Our results underscore the importance of aligning experimental design and accuracy considerations with business objectives.