Abstract
We analyze methods for selecting topics in news articles to explain stock returns. We find, through empirical and theoretical results, that supervised Latent Dirichlet Allocation (sLDA) implemented through Gibbs sampling in a stochastic EM algorithm will often overfit returns to the detriment of the topic model. We obtain better out-of-sample performance through a random search of plain LDA models. A branching procedure that reinforces effective topic assignments often performs best. We test these methods on an archive of over 90,000 news articles about S&P 500 firms.
Full Citation
Proceedings of the ACM International Conference on AI in Finance (ICAIF-2020), vol. 1 (October 01, 2020): unknown.