Title of invention |
Method and system for document classification |
Abstract |
A method and system for classifying documents are provided. The method includes receiving a plurality of documents from at least one user, wherein each document includes information relating to a customer support issue or sentiment, and identifying at least one customer support issue or sentiment contained within each document. The method also includes classifying the documents satisfying a confidence threshold using a classifier, clustering the remainder of the plurality of documents into groups using a clustering engine, the clustering engine applying a word analysis, and outputting a frequency of each identified customer support issue or sentiment, the frequency based on the classifying or the clustering. |
Publication number |
US8977620(B1) |
Publication date |
2015.03.10 |
Application number |
US201213531049 |
Filing date |
2012.06.22 |
Applicant |
Google Inc. |
Inventors |
Buryak Kirill; Ben-Artzi Aner; Lewis Glenn M.; Peng Jun |
Classification number |
G06F17/30 |
Main classification number |
G06F17/30 |
Agency |
Armstrong Teasdale LLP |
Agent |
Armstrong Teasdale LLP |
Principal claim |
1. A computer-implemented method of classifying documents including executing instructions stored on a computer-readable medium, said method comprising:
receiving a plurality of documents from at least one user, wherein each document includes at least one of unplanned information relating to a customer support issue and an indication of a sentiment;
identifying at least one customer support issue or sentiment contained within each document by parsing the plurality of documents;
classifying, using a classifier, at least a portion of the plurality of documents that satisfy a confidence threshold into one of a plurality of classes, each class associated with the identified at least one customer support issue or sentiment;
clustering a remainder of the plurality of documents that do not satisfy the confidence threshold for the identified at least one customer support issue or sentiment into a plurality of clustered groups using a clustering engine, the clustering engine applying a word analysis; and
outputting a frequency of each identified customer support issue or sentiment in each of the classes and clustered groups, the frequency based on said classifying or said clustering. |
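The flow recited in claim 1 (classify documents that meet a confidence threshold, cluster the remainder by word analysis, then report a frequency per class and cluster) can be illustrated with a minimal Python sketch. This is not the patented implementation: the record does not say which classifier or clustering algorithm is used, so the keyword-overlap scorer, the Jaccard-based greedy clustering, the `CLASS_KEYWORDS` table, and the names `classify`, `cluster`, and `issue_frequencies` are all assumptions chosen for illustration, as are the threshold values.

```python
from collections import Counter

# Hypothetical keyword lists standing in for a trained classifier (assumption).
CLASS_KEYWORDS = {
    "billing": {"charge", "refund", "invoice", "billing"},
    "login": {"password", "login", "account", "locked"},
}


def tokenize(text):
    """Parse a document into a set of lowercase words without punctuation."""
    return {w.strip(".,!?").lower() for w in text.split()} - {""}


def classify(doc):
    """Return (label, confidence).

    Confidence is the fraction of the document's words that match the
    best-scoring class's keywords (an illustrative stand-in for a real model).
    """
    words = tokenize(doc)
    if not words:
        return None, 0.0
    scores = {
        label: len(words & keywords) / len(words)
        for label, keywords in CLASS_KEYWORDS.items()
    }
    label = max(scores, key=scores.get)
    return label, scores[label]


def jaccard(a, b):
    """Word-overlap similarity between two word sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster(docs, min_overlap=0.2):
    """Greedy single-pass clustering on word overlap.

    A document joins the first cluster whose seed document is similar
    enough; otherwise it starts a new cluster.
    """
    clusters = []  # each entry: (seed_words, [member documents])
    for doc in docs:
        words = tokenize(doc)
        for seed, members in clusters:
            if jaccard(seed, words) >= min_overlap:
                members.append(doc)
                break
        else:
            clusters.append((words, [doc]))
    return [members for _, members in clusters]


def issue_frequencies(docs, threshold=0.3):
    """Classify confident documents, cluster the rest, and count each group."""
    counts = Counter()
    remainder = []
    for doc in docs:
        label, confidence = classify(doc)
        if confidence >= threshold:
            counts[f"class:{label}"] += 1
        else:
            remainder.append(doc)
    for i, members in enumerate(cluster(remainder)):
        counts[f"cluster:{i}"] = len(members)
    return counts
```

Calling `issue_frequencies` on a list of feedback strings returns a `Counter` keyed by `class:<label>` for documents that met the threshold and by `cluster:<index>` for groups formed from the remainder, which corresponds to the frequency output described in the final step of the claim.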
Address |
Mountain View CA US |