Abstract
We present an end-to-end text mining methodology for extracting adverse drug reaction (ADR) relations from medical forums on the Web. The methodology is novel in combining three major characteristics: (i) a parser based on head-driven phrase structure grammar (HPSG); (ii) domain-specific relation patterns, acquired primarily by unsupervised methods applied to a large, unlabeled text corpus; and (iii) automated post-processing algorithms that enhance the set of extracted relations. We empirically demonstrate that the proposed approach can predict ADRs before they are reported by the Food and Drug Administration (FDA). In other words, we subject the approach to a predictive test, showing that it can credibly point to ADRs that were not uncovered in the clinical trials evaluating new drugs coming to market and were only later reported by the FDA as a label change.