Abstract
We develop a flexible content-based search model that links the content preferences of search engine users to query search volume and click-through rates, while allowing content preferences to vary systematically with the context of a search. Content preferences are defined over latent topics that describe the content of search queries and search result descriptions. Compared with existing applications of topic modeling in marketing and recommendation systems, our approach simultaneously captures multiple types of information and multiple aspects of behavioral dynamics in a single framework, yielding interpretable results for business decision making. To make inference efficient and scalable, we develop a fully Bayesian variational inference algorithm. We evaluate the framework on real-world search data for TV shows from the Bing search engine. We illustrate how the model quantifies the content preferences associated with each query and how these preferences vary systematically depending on whether the query is observed before, during, or after a TV show airs. We also show that the model can help the search engine improve its ranking of search results and address the cold-start problem for new page links.