Title: Performing sentiment analysis on microblogging data, including identifying a new opinion term therein
Abstract: There is provided a computer-implemented method of performing sentiment analysis. An exemplary method comprises performing a first sentiment analysis on microblogging data based on a method using an opinion lexicon. The method also includes training a classifier using training data from the first sentiment analysis. Additionally, the method includes identifying a new opinion term in the microblogging data by performing a statistical test. The new opinion term is not in the opinion lexicon. The method also includes identifying new microblogging data based on the new opinion term. Further, the method includes performing a second sentiment analysis on the new microblogging data using the classifier.
Publication number: US9275041(B2) Publication date: 2016.03.01
Application number: US201113280031 Filing date: 2011.10.24
Applicant: Hewlett Packard Enterprise Development LP Inventors: Ghosh Riddhiman; Zhang Lei; Dekhil Mohamed E.; Liu Bing
Classification: G06F17/27; G06F17/30; G06Q30/02; G06N99/00 Main classification: G06F17/27
Agency: (not listed) Agent: Dryja Michael A.
Main claim: 1. A computer-implemented method of performing sentiment analysis, comprising: performing a first sentiment analysis on microblogging data based on a method using an opinion lexicon that includes non-domain-specific opinion terms, to generate first training data; training a classifier using the first training data; identifying a new opinion term in the microblogging data, by performing a statistical test on results of the first sentiment analysis on the microblogging data, wherein the new opinion term is domain-specific to the microblogging data and is not in the opinion lexicon; adding the new opinion term to the opinion lexicon to grow the opinion lexicon so that the opinion lexicon includes domain-specific opinion terms; identifying new microblogging data other than the microblogging data on which the first sentiment analysis has been performed, based on the opinion lexicon to which the new opinion term has been added; performing a second sentiment analysis on the new microblogging data using the classifier to generate second training data; and retraining the classifier as has been trained using the first training data, using the second training data to improve the classifier.
Address: Houston TX US
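
The steps recited in the main claim can be illustrated with a minimal Python sketch. It is not the patented implementation: this record does not name the statistical test or the classifier type, so the sketch assumes a chi-square test on a 2x2 table (term presence versus first-pass label, with 3.84 as the critical value for p = 0.05 at one degree of freedom) and a small add-one-smoothed naive Bayes classifier. All function names, the class name, and the whitespace tokenization are hypothetical.

```python
import math
from collections import Counter

def lexicon_label(tokens, lexicon):
    """First sentiment analysis: sign of summed term polarities (0 = undecided)."""
    score = sum(lexicon.get(t, 0) for t in tokens)
    return (score > 0) - (score < 0)

def chi_square(term, labeled):
    """Chi-square statistic (assumed test) for term presence vs. label."""
    a = sum(1 for toks, y in labeled if term in toks and y == 1)
    b = sum(1 for toks, y in labeled if term in toks and y == -1)
    c = sum(1 for toks, y in labeled if term not in toks and y == 1)
    d = sum(1 for toks, y in labeled if term not in toks and y == -1)
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    stat = (a + b + c + d) * (a * d - b * c) ** 2 / denom if denom else 0.0
    return stat, (1 if a * d > b * c else -1)

def find_new_terms(labeled, lexicon, threshold=3.84):
    """Return terms outside the lexicon that are significantly tied to a label."""
    vocab = {t for toks, _ in labeled for t in toks} - set(lexicon)
    new_terms = {}
    for term in vocab:
        stat, polarity = chi_square(term, labeled)
        if stat >= threshold:
            new_terms[term] = polarity
    return new_terms

class NaiveBayes:
    """Assumed classifier: multinomial naive Bayes with add-one smoothing."""
    def __init__(self):
        self.word_counts = {1: Counter(), -1: Counter()}
        self.doc_counts = Counter()

    def train(self, labeled):
        # Counts accumulate, so calling train() again retrains on added data.
        for toks, y in labeled:
            self.doc_counts[y] += 1
            self.word_counts[y].update(toks)

    def predict(self, tokens):
        vocab_size = len(set(self.word_counts[1]) | set(self.word_counts[-1]))
        n_docs = sum(self.doc_counts.values())
        scores = {}
        for y in (1, -1):
            total = sum(self.word_counts[y].values())
            score = math.log((self.doc_counts[y] + 1) / (n_docs + 2))
            for t in tokens:
                score += math.log((self.word_counts[y][t] + 1) / (total + vocab_size + 1))
            scores[y] = score
        return 1 if scores[1] > scores[-1] else -1

def run_pipeline(tweets, new_tweets, lexicon):
    # 1. First sentiment analysis with the lexicon yields first training data.
    first = [(toks, lexicon_label(toks, lexicon))
             for toks in (t.lower().split() for t in tweets)]
    first = [(toks, y) for toks, y in first if y != 0]
    # 2. Train the classifier on the first training data.
    clf = NaiveBayes()
    clf.train(first)
    # 3-4. Find new opinion terms and grow the lexicon with them.
    grown = {**lexicon, **find_new_terms(first, lexicon)}
    # 5. Select new data that mentions any term of the grown lexicon.
    selected = [toks for toks in (t.lower().split() for t in new_tweets)
                if any(t in grown for t in toks)]
    # 6. Second sentiment analysis with the classifier yields second training data.
    second = [(toks, clf.predict(toks)) for toks in selected]
    # 7. Retrain the classifier with the second training data.
    clf.train(second)
    return clf, grown, second

seed = {"good": 1, "great": 1, "love": 1, "bad": -1, "awful": -1, "hate": -1}
tweets = ["love this phone", "great screen good", "good battery", "great camera",
          "bad laggy phone", "awful laggy screen", "hate laggy battery",
          "bad laggy camera"]
new_tweets = ["so laggy today", "nice weather", "laggy update again"]
clf, grown, second = run_pipeline(tweets, new_tweets, seed)
print(sorted(set(grown) - set(seed)), [label for _, label in second])
```

In this toy data, "laggy" is absent from the seed lexicon but co-occurs only with tweets the lexicon labels negative, so the assumed test adds it as a domain-specific negative term. The grown lexicon then selects the two unseen tweets that mention it, and their classifier-assigned labels feed the retraining step.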