Healthcare Opinion Poll Dataset

We used a set of responses to a public opinion poll about the 2010 U.S. healthcare legislation for a project on viewpoint summarization. This data is useful because it includes labels ('for' or 'against' the legislation) as well as demographic information about the respondents.

The data can be downloaded directly from the Gallup website: both the raw verbatim responses and the article describing the results, from which we created our gold-standard summaries.

We are also releasing the annotated evaluation set we created for the paper below. It consists of pairs of sentences from opposing viewpoints that our annotators identified as contrastive with each other.

  • Michael J. Paul, ChengXiang Zhai and Roxana Girju. Summarizing Contrastive Viewpoints in Opinionated Text. In Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing (EMNLP 2010), pages 65-75, MIT, Cambridge, Massachusetts. October 2010.