Sentiment Classification of Opinions based on Multi-source Transfer Learning Using Structural Correspondence Learning

Document Type: Persian Original Article

Authors

1 Computer Engineering Department, Faculty of Engineering, Yazd University, Safayieh, Yazd, Iran

2 Computer Engineering Department, Faculty of Engineering, Shahrekord University, Rahbar Blvd., Shahrekord, Iran

Abstract

Sentiment classification of opinions is a field of natural language processing that has attracted researchers' attention in recent years, owing to the popularity of online stores and the ability of customers to express opinions about the goods and services they buy. Training a classifier requires labeled data, but labeled samples are scarce in many domains and labeling is difficult and time-consuming, so labeled samples from other domains must be used. This article proposes a new method for binary classification of opinions based on multi-domain transfer learning. The method adapts the different domains to one another using Structural Correspondence Learning; then, following the iterative procedure of the boosting algorithm, it assigns weights to the classified samples of the different domains and determines the class of each opinion by merging the resulting classifiers. The main contribution of this research is weighting the dataset samples according to the AdaBoost algorithm to boost classification and combining this weighting with Structural Correspondence Learning. The Amazon dataset, which covers four domains with 1000 positive and 1000 negative opinions each, is used to train the proposed model. Accuracies of 89.64%, 93.97%, 92.39%, and 90.17% are obtained for the Electronics, DVD, Books, and Kitchen domains, respectively, which indicates that the proposed method is effective compared with similar methods.
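The boosting-based merging step described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each source domain already has a trained classifier whose +1/-1 predictions on a small labeled sample are available, it omits the Structural Correspondence Learning step entirely, and the function names (`boost_source_classifiers`, `vote`) and toy data are hypothetical. It applies the standard discrete AdaBoost update to a fixed pool of classifiers.

```python
import math


def boost_source_classifiers(predictions, labels, rounds=3):
    """AdaBoost-style weighting over a fixed pool of source-domain classifiers.

    predictions: dict mapping a (hypothetical) domain name to a list of
                 +1/-1 predictions on the labeled samples.
    labels:      list of +1/-1 true labels for the same samples.
    Returns a list of (domain name, alpha) pairs.
    """
    n = len(labels)
    weights = [1.0 / n] * n  # start with uniform sample weights
    ensemble = []
    for _ in range(rounds):
        # Weighted error of every classifier under the current sample weights.
        errors = {
            name: sum(w for w, p, y in zip(weights, preds, labels) if p != y)
            for name, preds in predictions.items()
        }
        best = min(errors, key=errors.get)
        # Clamp the error to avoid log(0) or division by zero.
        err = min(max(errors[best], 1e-10), 1 - 1e-10)
        if err >= 0.5:
            break  # no classifier is better than chance on this weighting
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((best, alpha))
        # Increase the weight of misclassified samples, decrease the rest.
        weights = [
            w * math.exp(-alpha * y * p)
            for w, p, y in zip(weights, predictions[best], labels)
        ]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def vote(ensemble, sample_predictions):
    """Merge the classifiers: sign of the alpha-weighted sum of their votes."""
    score = sum(alpha * sample_predictions[name] for name, alpha in ensemble)
    return 1 if score >= 0 else -1


# Toy data (invented for illustration): three source classifiers, six opinions.
labels = [1, 1, 1, -1, -1, -1]
predictions = {
    "books": [1, 1, -1, -1, -1, -1],
    "dvd": [1, 1, 1, 1, -1, -1],
    "kitchen": [-1, 1, 1, -1, -1, 1],
}
ensemble = boost_source_classifiers(predictions, labels)
merged = [
    vote(ensemble, {name: preds[i] for name, preds in predictions.items()})
    for i in range(len(labels))
]
```

In this toy case each classifier makes at least one mistake, but the weighted vote in `merged` agrees with `labels` on all six samples, because each round shifts weight onto the samples the previously selected classifiers got wrong.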

Keywords

