Business, College of





A THESIS Presented to the Faculty of The Graduate College at the University of Nebraska In Partial Fulfillment of Requirements For the Degree of Master of Arts. Major: Management Information System. Under the Supervision of Professor Keng L. Siau. Lincoln, Nebraska April, 2012


Sentiment analysis is a technique for classifying people’s opinions in product reviews, blogs, or social networks. It has many uses and has recently received much attention from researchers and practitioners. In this study, we are interested in product feature based sentiment analysis: identifying the opinion polarities (positive, neutral, or negative) expressed on product features rather than the polarities of whole reviews or sentences. Several studies have applied unsupervised learning to calculate sentiment scores of product features. Although many studies have used supervised learning in document-level or sentence-level sentiment analysis, we did not come across any study that applied supervised learning to product feature based sentiment analysis. In this research, we investigated unsupervised and supervised learning, incorporating linguistic rules and constraints that could improve the performance of calculations and classifications. In the unsupervised learning, sentiment scores of product features were calculated by aggregating the opinion polarities of opinion words around the product features. In the supervised learning, feature spaces containing the right features for product feature based sentiment analysis were constructed. To reduce the dimensions of the feature spaces, two feature selection methods, Information Gain (IG) and Mutual Information (MI), were applied and compared. The results show that (i) product features were good indicators in determining the polarity classifications of documents or sentences; (ii) rule based features could perform well in supervised learning; and (iii) IG performed better in document-level analysis, while MI performed better in sentence-level analysis.
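
To make the comparison of the two feature selection methods concrete, the following is a minimal sketch of how IG and MI can be scored for one candidate feature against one polarity class. It assumes the definitions commonly used in text categorization (IG as the reduction in class entropy, MI as the pointwise log ratio of joint to independent probabilities); the abstract does not state the exact formulas used in the thesis, and the function names and the 2x2 count layout are hypothetical.

```python
import math

def entropy(probs):
    """Shannon entropy in bits; zero-probability outcomes contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def information_gain(n11, n10, n01, n00):
    """IG of a binary feature for a binary class (assumed definition).

    n11: texts in the class that contain the feature
    n10: texts outside the class that contain the feature
    n01: texts in the class that lack the feature
    n00: texts outside the class that lack the feature
    """
    n = n11 + n10 + n01 + n00
    h_class = entropy([(n11 + n01) / n, (n10 + n00) / n])
    h_cond = 0.0
    # Weighted class entropy given feature present, then feature absent.
    for in_cls, out_cls in ((n11, n10), (n01, n00)):
        total = in_cls + out_cls
        if total:
            h_cond += (total / n) * entropy([in_cls / total, out_cls / total])
    return h_class - h_cond

def pointwise_mi(n11, n10, n01, n00):
    """Pointwise MI between feature presence and class membership (assumed definition)."""
    n = n11 + n10 + n01 + n00
    p_tc = n11 / n              # feature and class occur together
    p_t = (n11 + n10) / n       # feature occurs
    p_c = (n11 + n01) / n       # class occurs
    if p_tc == 0:
        return float("-inf")    # feature never co-occurs with the class
    return math.log2(p_tc / (p_t * p_c))

# Hypothetical counts for illustration only, not data from the thesis.
counts = (30, 10, 20, 40)
ig_score = information_gain(*counts)
mi_score = pointwise_mi(*counts)
```

Under these definitions IG also credits the absence of a feature, while pointwise MI looks only at co-occurrence and tends to favor rare features. This is a known general difference between the two scores, not a finding taken from the thesis.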

Advisor: Keng L. Siau