PLEASE NOTE: our URL has recently changed from www.cs.pitt.edu/mpqa/ to mpqa.cs.pitt.edu. Please update your bookmarks accordingly.
-
MPQA Opinion Corpus
The MPQA Opinion Corpus contains news articles from a wide variety of news sources, manually annotated for opinions and other private states (e.g., beliefs, emotions, sentiments, and speculations). To download the MPQA Opinion Corpus, click here.
For sample documents and instructions for MPQA annotation in GATE, click here. Updated July 2011.
To learn more about the subjectivity and sentiment research that produced MPQA, please refer to the following publications:
Janyce Wiebe, Theresa Wilson, and Claire Cardie (2005). Annotating expressions of opinions and emotions in language. Language Resources and Evaluation, volume 39, issue 2-3, pp. 165-210.
Theresa Wilson (2008). Fine-Grained Subjectivity Analysis. PhD Dissertation, Intelligent Systems Program, University of Pittsburgh.
Additional training and explanatory materials coming soon.
-
Subjectivity Lexicon
Made available under the terms of the GNU General Public License. The lexicon is distributed without any warranty.
The Subjectivity Lexicon (list of subjectivity clues) that is part of OpinionFinder is also available for separate download. These clues were compiled from several sources (see the enclosed README). This is the version of the lexicon used in:
Theresa Wilson, Janyce Wiebe, and Paul Hoffmann (2005). Recognizing Contextual Polarity in Phrase-Level Sentiment Analysis. Proc. of HLT-EMNLP-2005.
-
Subjectivity Sense Annotations
Made available under the terms of the GNU General Public License. The annotations are distributed without any warranty.
The Subjectivity Sense Annotations used in (Wiebe and Mihalcea, 2006), (Gyamfi et al., 2009), (Akkaya et al., 2009), and (Akkaya et al., 2011) are all available for download. All annotation efforts follow the annotation schema described in (Wiebe and Mihalcea, 2006). Further information on the data can be found in the README of the archive you download.
Janyce Wiebe and Rada Mihalcea (2006). Word Sense and Subjectivity. Joint Conference of the International Committee on Computational Linguistics and the Association for Computational Linguistics (COLING-ACL 2006).
Yaw Gyamfi, Janyce Wiebe, Rada Mihalcea and Cem Akkaya (2009). Integrating Knowledge for Subjectivity Sense Labeling. Joint Conference of the North American Chapter of the Association for Computational Linguistics and the Human Language Technologies Conference (NAACL-HLT 2009).
Cem Akkaya, Janyce Wiebe and Rada Mihalcea (2009). Subjectivity Word Sense Disambiguation. Conference on Empirical Methods in Natural Language Processing (EMNLP 2009).
Cem Akkaya, Janyce Wiebe, Alexander Conrad and Rada Mihalcea (2011). Improving the Impact of Subjectivity Word Sense Disambiguation on Contextual Opinion Analysis. Conference on Computational Natural Language Learning (CoNLL 2011).
-
Product Debate Data
The Product Debate Corpus is available for download. Further information on the data can be found in the README of the archive you download. This corpus was used in:
Swapna Somasundaran and Janyce Wiebe (2009). Recognizing Stances in Online Debates. ACL 2009: Joint conference of the 47th Annual Meeting of the Association for Computational Linguistics and the 4th International Joint Conference on Natural Language Processing of the Asian Federation of Natural Language Processing, August 2-7, 2009, Singapore.
-
Political Debate Data
The Political Debate Corpus is available for download. Further information on the data can be found in the README of the archive you download. This corpus was used in:
Swapna Somasundaran and Janyce Wiebe (2010). Recognizing Stances in Ideological On-line Debates. In Proceedings of the NAACL HLT 2010 Workshop on Computational Approaches to Analysis and Generation of Emotion in Text, pages 116-124, Los Angeles, CA. Association for Computational Linguistics, 2010.
-
OpinionFinder System
OpinionFinder is a system that processes documents and automatically identifies subjective sentences as well as various aspects of subjectivity within sentences, including agents who are sources of opinion, direct subjective expressions and speech events, and sentiment expressions. OpinionFinder was developed by researchers at the University of Pittsburgh, Cornell University, and the University of Utah. In addition to OpinionFinder, we are also releasing the automatic annotations produced by running OpinionFinder on a subset of the Penn Treebank. To go to the OpinionFinder download page click here.
Please note that OpinionFinder currently runs only on Linux.
OpinionFinder 2 coming soon!
