Title of invention: SYSTEM FOR EXTRACTING TERM FROM DOCUMENT CONTAINING TEXT SEGMENT
Abstract: Provided are techniques for extracting terms from a document, classifying the extracted terms from a viewpoint useful for understanding the document in summary or in detail, and presenting the classified terms to a user. A computer system extracts noun words from document data containing a text segment by using first text processing information; extracts term candidates for the noun words from the document data, or from a corpus of text data written in the same language as the document data, by using second text processing information; selects, by using third text processing information, the kind of noun words to be given a weight, in order to determine to which of a plurality of kinds the noun words and the term candidates belong; gives the weight to the respective noun words and term candidates according to the selected kind; determines the kind to which the noun words and the term candidates belong according to the given weight; and outputs the noun words and the term candidates in association with the determined kind.
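The processing steps named in the abstract can be illustrated with a minimal Python sketch. It is not the claimed implementation: the noun lexicon, the "run of adjacent nouns" rule for term candidates, the cue words, the kind names, and the function names are all hypothetical stand-ins for the first, second, and third text processing information, whose actual contents the abstract does not specify. The sketch also draws term candidates only from the document itself, not from a separate corpus.

```python
import re
from collections import defaultdict

# Hypothetical "first text processing information": a tiny noun lexicon.
NOUN_LEXICON = {"server", "database", "error", "backup", "log"}

# Hypothetical "second text processing information": a term candidate is a
# run of at least this many adjacent noun words.
MIN_CANDIDATE_LEN = 2

# Hypothetical "third text processing information": cue words that select
# which kind receives weight when they occur in the same sentence.
KIND_CUES = {
    "problem": {"failed", "crashed"},
    "component": {"restart", "install"},
}


def split_sentences(text):
    """Lowercase the text and split it into non-empty sentences."""
    return [s for s in re.split(r"[.!?]\s*", text.lower()) if s]


def extract_nouns(tokens):
    """Step 1: keep the tokens found in the noun lexicon."""
    return [t for t in tokens if t in NOUN_LEXICON]


def extract_candidates(tokens):
    """Step 2: join runs of adjacent noun words into term candidates."""
    candidates, run = [], []
    for token in tokens + [""]:  # empty sentinel flushes the final run
        if token in NOUN_LEXICON:
            run.append(token)
        else:
            if len(run) >= MIN_CANDIDATE_LEN:
                candidates.append(" ".join(run))
            run = []
    return candidates


def classify_terms(text):
    """Steps 3-6: select kinds, add weights, pick one kind per term."""
    weights = defaultdict(lambda: defaultdict(float))
    seen = set()
    for sentence in split_sentences(text):
        tokens = re.findall(r"[a-z]+", sentence)
        terms = set(extract_nouns(tokens)) | set(extract_candidates(tokens))
        seen |= terms
        # Step 3: kinds whose cue words occur in this sentence.
        selected = [k for k, cues in KIND_CUES.items() if cues & set(tokens)]
        # Step 4: give weight to every term for each selected kind.
        for term in terms:
            for kind in selected:
                weights[term][kind] += 1.0
    # Step 5: the kind with the largest weight wins (ties: alphabetical).
    result = {}
    for term in seen:
        per_kind = weights.get(term)
        if per_kind:
            result[term] = max(sorted(per_kind), key=per_kind.get)
        else:
            result[term] = "unclassified"
    # Step 6: terms are returned in association with their kind.
    return result


EXAMPLE = (
    "The database server crashed. Restart the database server. "
    "The error log failed. The database server crashed again. "
    "Install the backup."
)
```

Calling `classify_terms(EXAMPLE)` returns a dictionary mapping each noun word and multi-word term candidate to a single kind. A real system would replace the lexicon with a part-of-speech tagger and the cue sets with whatever rules the third text processing information actually encodes.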
Publication number: WO2010038540(A1)  Publication date: 2010.04.08
Application number: WO2009JP63584  Filing date: 2009.07.30
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION;IKAWA YOHEI;TAKEUCHI HIRONORI;NEGISHI SHIHO  Inventor: IKAWA YOHEI;TAKEUCHI HIRONORI;NEGISHI SHIHO
Classification: G06F17/28;G06F17/21;G06F17/30  Main classification: G06F17/28
Agency:  Agent:
Principal claim:
Address: