OpinionCloud
Synopsis

In this project we develop OpinionCloud, a new opinion summarization technology for Web comments in general and for comments on YouTube and Flickr in particular. Popular Web items often attract thousands of comments, and getting an idea of the crowd's overall opinion would require reading all of them, which is impractical. Our summarization approach retrieves this information by generating an opinion word cloud for a given set of comments. We operationalize the technology in a browser add-on that summarizes the comments on a YouTube video when the user starts watching it.
Browser Add-on
Install the OpinionCloud Firefox add-on, or the Google Chrome extension.
Project Outline
Our research on opinion summarization of Web comments boils down to two research areas: sentiment analysis and summary visualization. The former deals with the classification of words as positive, negative, or neutral, whereas the latter deals with the design of an accessible visual representation of a set of opinions.
Sentiment Analysis & Opinion Visualization. In sentiment analysis a word's polarity can be identified by measuring its co-occurrence with words whose polarity is known in advance, i.e., if a given word occurs with a high probability in the vicinity of positive (negative) words it can be considered positive (negative) as well. Neutral words, however, tend to occur arbitrarily next to words of both polarities. We use this idea to train a dictionary of opinion words which also contains slang terms that are often used in comments. The dictionary is then used to classify the words of comments into positive, negative, and neutral words. By default, words that are not contained in the dictionary are considered neutral.
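The co-occurrence idea can be illustrated with a minimal Python sketch. It is not the project's implementation: the seed word lists, the window size, the `min_count` and `margin` thresholds, and the function names `train_dictionary` and `classify` are all assumptions made for illustration, and tokenization is reduced to lowercasing and splitting on whitespace.

```python
from collections import Counter

# Hypothetical seed words with known polarity (the project's seeds are not given).
POSITIVE_SEEDS = {"good", "great", "love"}
NEGATIVE_SEEDS = {"bad", "awful", "hate"}


def train_dictionary(comments, window=2, min_count=2, margin=0.5):
    """Label non-seed words by the polarity of the seed words found near them."""
    pos_hits, neg_hits = Counter(), Counter()
    for comment in comments:
        tokens = comment.lower().split()
        for i, word in enumerate(tokens):
            if word in POSITIVE_SEEDS or word in NEGATIVE_SEEDS:
                continue
            # Look at up to `window` tokens on each side of the word.
            start, end = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(start, end):
                if j == i:
                    continue
                if tokens[j] in POSITIVE_SEEDS:
                    pos_hits[word] += 1
                elif tokens[j] in NEGATIVE_SEEDS:
                    neg_hits[word] += 1

    dictionary = {w: "positive" for w in POSITIVE_SEEDS}
    dictionary.update({w: "negative" for w in NEGATIVE_SEEDS})
    for word in set(pos_hits) | set(neg_hits):
        p, n = pos_hits[word], neg_hits[word]
        if p + n < min_count:
            continue  # too little evidence; the word stays out of the dictionary
        score = (p - n) / (p + n)  # +1: only positive neighbours, -1: only negative
        if score >= margin:
            dictionary[word] = "positive"
        elif score <= -margin:
            dictionary[word] = "negative"
        else:
            dictionary[word] = "neutral"  # occurs next to both polarities
    return dictionary


def classify(word, dictionary):
    """Words missing from the dictionary are neutral by default."""
    return dictionary.get(word.lower(), "neutral")


comments = [
    "great epic video love it",
    "epic song great",
    "awful lame video bad",
    "lame hate it",
]
dictionary = train_dictionary(comments)
```

In this toy corpus the slang terms "epic" and "lame" appear only near seeds of one polarity and are labelled accordingly, whereas "video" appears equally often near both and ends up neutral. A real dictionary would need far more data and a proper tokenizer.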
The visualization of the opinions found in a set of comments is done as shown in Figure 1. The words are arranged in a cloud where the color of a word denotes its polarity and the size of a word its frequency in the comments. This visualization is comparable to the well-known tag clouds for folksonomies.
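The mapping from comments to cloud entries might be sketched as follows. The colour names, the font-size range, the linear scaling, and the function name `build_cloud` are assumptions chosen for illustration; the source does not specify how sizes and colours are computed, and the actual placement of words on screen is left out.

```python
from collections import Counter

# Assumed colour scheme; the project's actual colours are not stated.
POLARITY_COLORS = {"positive": "green", "negative": "red", "neutral": "gray"}


def build_cloud(comments, dictionary, min_size=10, max_size=40, top_n=50):
    """Return one entry per word: size reflects frequency, colour reflects polarity."""
    counts = Counter(word for c in comments for word in c.lower().split())
    most_common = counts.most_common(top_n)
    if not most_common:
        return []
    highest, lowest = most_common[0][1], most_common[-1][1]
    span = max(highest - lowest, 1)  # avoid division by zero when all counts are equal
    cloud = []
    for word, count in most_common:
        # Linear interpolation between min_size and max_size.
        size = min_size + (count - lowest) * (max_size - min_size) / span
        polarity = dictionary.get(word, "neutral")  # unknown words are neutral
        cloud.append({"word": word, "size": round(size),
                      "color": POLARITY_COLORS[polarity]})
    return cloud


cloud = build_cloud(
    ["epic epic epic", "lame video", "epic lame"],
    {"epic": "positive", "lame": "negative"},
)
```

Each resulting entry could then be handed to any rendering layer, for example as a styled HTML element in a browser add-on.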
Why YouTube? We have chosen YouTube as a working example for our technology because a YouTube comment usually consists of little more than an exclamation of opinion, and because a large number of comments are available. For a user, reading these comments is time-consuming and boring; put another way, comments on YouTube are neither universally accessible nor useful. For an information retrieval researcher, however, these comments form a unique large-scale corpus of highly opinion-coloured language. For instance, to train our dictionary we have analyzed about 9 million YouTube comments.
© Fakultät Medien 07.02.2012




