Abstract
The author proposes a topic model tailored to the study of creative documents (e.g., academic papers, movie scripts), which extends Poisson factorization in two ways. First, because the creativity literature emphasizes the importance of novelty in creative industries, the model introduces a set of residual topics that represent the portion of each document not explained by a combination of common topics. Second, because creative documents are typically accompanied by summaries (e.g., abstracts, synopses), the author jointly models the content of creative documents and their summaries and captures systematic variations in topic intensities between the two. The article validates and illustrates the model in three domains: marketing academic papers, movie scripts, and TV show closed captions. It illustrates how jointly modeling documents and summaries provides insight into how people summarize creative documents and enhances understanding of the significance of each topic. It shows that the model produces new measures of distinctiveness that can inform the perennial debate on the relation between novelty and success in creative industries. Finally, the author shows how the proposed model may form the basis for decision support tools that assist people in writing summaries of creative documents.