Title of invention Computer equipment, method, and computer program for extracting terms from document data including text segments
Abstract Object: To provide a method that uses the linguistic and structural characteristics of a document, particularly a technical document, to extract terms, automatically classifies the extracted terms in a way that is useful for an overall or detailed understanding of the document, and presents the classified terms to a user. Solving Means: The present invention provides a computer system for extracting a term from document data including a text segment. The computer system includes: a first extraction unit that uses first text processing information to extract a noun word from the document data; a second extraction unit that uses second text processing information to extract a term candidate related to the extracted noun word from either the document data or a corpus including text data written in the same language as the document data; a weight assignment unit that, in order to determine which of a plurality of noun word types the extracted noun word and the extracted term candidate each belong to, uses third text processing information to select the type to be weighted from the plurality of types and assigns a weight to the selected type for each of the extracted noun word and the extracted term candidate; a determination unit that determines, based on the assigned weights, the type to which the extracted noun word and the extracted term candidate each belong; and an output unit that, following the determination, outputs the extracted noun word and the extracted term candidate, each in association with its determined type. The present invention also provides a method and a computer program for extracting a term from a document including a text segment.
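The five units named in the abstract can be read as a small pipeline. The Python sketch below is only an illustration under stated assumptions, not the patented implementation: the noun lexicon, the stop-word list, the "modifier + noun" candidate pattern, the cue-word weighting, and the type names `component` and `parameter` are all hypothetical stand-ins for the first, second, and third text processing information, which the abstract does not specify.

```python
import re

# ASSUMPTION: toy stand-ins for the three kinds of "text processing information".
NOUN_LEXICON = {"database", "server", "timeout", "cache"}  # first (hypothetical)
STOPWORDS = {"the", "a", "an", "to", "before", "you"}      # second (hypothetical)
TYPE_CUES = {                                              # third (hypothetical)
    "component": {"install", "configure", "start"},
    "parameter": {"set", "default", "value"},
}


def split_sentences(text):
    """Return each sentence as a list of lowercase word tokens."""
    sentences = []
    for raw in re.split(r"[.!?]+", text):
        tokens = re.findall(r"[a-z]+", raw.lower())
        if tokens:
            sentences.append(tokens)
    return sentences


def extract_nouns(sentences):
    """First extraction unit: collect lexicon nouns in order of first appearance."""
    nouns = []
    for tokens in sentences:
        for tok in tokens:
            if tok in NOUN_LEXICON and tok not in nouns:
                nouns.append(tok)
    return nouns


def extract_candidates(nouns, corpus_sentences):
    """Second extraction unit: collect 'modifier + noun' two-word candidates."""
    candidates = []
    for tokens in corpus_sentences:
        for prev, cur in zip(tokens, tokens[1:]):
            if cur in nouns and prev not in STOPWORDS:
                term = f"{prev} {cur}"
                if term not in candidates:
                    candidates.append(term)
    return candidates


def assign_weights(term, sentences):
    """Weight assignment unit: add one per cue word in each sentence containing the term."""
    weights = {type_name: 0 for type_name in TYPE_CUES}
    for tokens in sentences:
        joined = " " + " ".join(tokens) + " "
        if f" {term} " not in joined:
            continue
        for type_name, cues in TYPE_CUES.items():
            weights[type_name] += sum(1 for tok in tokens if tok in cues)
    return weights


def determine_type(weights):
    """Determination unit: pick the heaviest type, or 'unclassified' if all weights are zero."""
    best = max(weights, key=weights.get)
    return best if weights[best] > 0 else "unclassified"


def classify_terms(document, corpus=None):
    """Output unit: map each noun and term candidate to its determined type."""
    doc_sentences = split_sentences(document)
    # Candidates come from the corpus if one is given, otherwise from the document.
    corpus_sentences = split_sentences(corpus) if corpus else doc_sentences
    nouns = extract_nouns(doc_sentences)
    terms = nouns + extract_candidates(nouns, corpus_sentences)
    # Weights are computed over the document sentences only.
    return {term: determine_type(assign_weights(term, doc_sentences)) for term in terms}


if __name__ == "__main__":
    doc = (
        "Install the database server. "
        "Set the connection timeout to the default value. "
        "Configure the cache before you start the database server."
    )
    for term, type_name in classify_terms(doc).items():
        print(term, "->", type_name)
```

A real system would presumably replace the lexicon lookup with morphological analysis and derive the weights from richer structural cues; the sketch only shows how extraction, weighting, determination, and output fit together.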
Publication number BRPI0913815(A2) Publication date 2015.10.20
Application number BR2009PI13815 Filing date 2009.07.30
Applicant INTERNATIONAL BUSINESS MACHINES CORPORATION Inventors HIRONORI TAKEUCHI;SHIHO NEGISHI;YOHEI IKAWA
Classification G06F17/21;G06F17/28;G06F17/30 Main classification G06F17/21
Agency Agent
Main claim
Address