Extended Feature Spaces Based Classifier Ensembles for Sentiment Analysis of Short Texts

Zeynep Hilal Kilimci, Sevinç İlhan Omurca

Abstract


Sentiment classification has become a popular way to analyze opinions about events, products, and similar topics, especially on social networks such as Twitter. Because posts on social networks are limited in length, additional techniques are needed to boost classification performance. In this work, the feature space is enhanced with word embedding based features to address the length limitation, and the classification success of sentiment analysis is improved by employing classifier ensembles. The contributions of this paper are fivefold. First, the representative capability of the features is enriched by using a semantic word embedding model, and conventional feature selection techniques are then compared. Second, traditional machine learning algorithms, namely naïve Bayes, support vector machine, and random forest, are evaluated to select a baseline classifier for the proposed ensemble system. Third, three ensemble strategies, namely bagging, boosting, and random subspace, are introduced to ensure the diversity of ensemble learning. Fourth, experiments are conducted to compare the performance of the models with the word embedding baseline. Finally, a wide range of comparative experiments on Twitter datasets demonstrates that the classification performance of the proposed model significantly outperforms state-of-the-art studies.
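As a rough illustration of the two ideas summarized above (extending the feature space with word embedding features, and combining classifiers in an ensemble), the following minimal Python sketch may help. It is not the authors' implementation. The toy embedding table `EMBEDDINGS`, the nearest-centroid base learner, and all function names are assumptions made for illustration only. The sketch shows bagging alone and omits boosting, random subspace, and feature selection.

```python
import random
from collections import Counter

# Hypothetical toy embeddings; a real system would load a trained
# word embedding model instead of this hand-written table.
EMBEDDINGS = {
    "good": [0.9, 0.1],
    "great": [0.8, 0.2],
    "bad": [0.1, 0.9],
    "awful": [0.2, 0.8],
}
DIM = 2


def extended_features(text, vocabulary):
    """Concatenate bag-of-words counts with the mean embedding of known words."""
    tokens = text.lower().split()
    counts = Counter(tokens)
    bow = [float(counts[w]) for w in vocabulary]
    vectors = [EMBEDDINGS[t] for t in tokens if t in EMBEDDINGS]
    if vectors:
        mean = [sum(v[i] for v in vectors) / len(vectors) for i in range(DIM)]
    else:
        mean = [0.0] * DIM  # no known word: fall back to a zero vector
    return bow + mean


def train_centroids(X, y):
    """Base learner (assumed stand-in for NB, SVM, or RF): one centroid per class."""
    centroids = {}
    for label in set(y):
        rows = [x for x, l in zip(X, y) if l == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids


def predict_centroid(centroids, x):
    """Return the label whose centroid is closest in squared Euclidean distance."""
    def dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda label: dist(centroids[label]))


def train_bagging(X, y, n_models=5, seed=0):
    """Bagging: train each base learner on a bootstrap sample of the data."""
    rng = random.Random(seed)
    n = len(X)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(train_centroids([X[i] for i in idx], [y[i] for i in idx]))
    return models


def predict_bagging(models, x):
    """Combine the base learners by majority vote."""
    votes = Counter(predict_centroid(m, x) for m in models)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    vocab = ["good", "bad"]
    texts = ["good great", "great", "bad awful", "awful"]
    labels = ["pos", "pos", "neg", "neg"]
    X = [extended_features(t, vocab) for t in texts]
    models = train_bagging(X, labels)
    print(predict_bagging(models, extended_features("good", vocab)))
```

In this sketch, a short text that shares no vocabulary word with the training data can still receive a meaningful representation through the embedding part of the vector, which is the intuition behind extending the feature space for short texts.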

DOI: http://dx.doi.org/10.5755/j01.itc.47.3.20935


Keywords


Word embedding; ant colony optimization; information gain; sentiment analysis; classifier ensembles; extended spaces

Print ISSN: 1392-124X 
Online ISSN: 2335-884X