Abstract
Large language model (LLM)-based digital twin simulation, in which LLMs are used to emulate individual human behavior, holds great promise for research in business, artificial intelligence, social science, and digital experimentation. However, progress in this area has been hindered by the scarcity of real individual-level data sets that are both large and publicly available. To address this gap, we introduce a large-scale public data set designed to capture a rich and holistic view of individual human behavior. We survey a representative sample of N = 2,058 participants in the United States across four waves, with more than 500 questions in total and an average of 2.42 hours per person. The survey covers a comprehensive battery of demographic, psychological, economic, personality, and cognitive measures, as well as replications of behavioral economics experiments and a pricing survey. The final wave repeats tasks from earlier waves to establish a test-retest accuracy baseline. Initial analyses suggest that the data are of high quality and show promise for constructing digital twins that predict human behavior well at both the individual and aggregate levels. Beyond LLM applications, the unique breadth and scale of the data set also enable broad social science and business research, including studies of cross-construct correlations and heterogeneous treatment effects.