Publications from 2008
Seeded Discovery of Base Relations in Large Corpora
Relationship discovery is the task of identifying salient relationships between named entities in text. We propose novel approaches for two sub-tasks of the problem: identifying the entities of interest, and partitioning and describing the relations based on their semantics. In particular, we show that term frequency patterns can be used effectively instead of supervised NER, and that the p-median clustering objective function naturally uncovers relation exemplars appropriate for describing the partitioning. Furthermore, we introduce a novel application of relationship discovery: the unsupervised identification of protein-protein interaction phrases.
Nicholas Andrews, Naren Ramakrishnan
Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing, 2008