Title of invention |
EXTRACTING TERMS FROM DOCUMENT DATA INCLUDING TEXT SEGMENT |
Abstract |
A computer system, method, and article of manufacture for extracting a term from electronic document data that includes a text segment. The system includes: a first extraction unit that uses first text processing information to extract a noun word from the document data; a second extraction unit that uses second text processing information to extract a term candidate in relation to the noun word or to a corpus that includes text data written in the same language as the document data; a weight assignment unit that uses third text processing information to select, from a plurality of types, the type to which a weight is to be assigned, and that assigns the weight to the selected type for each noun word and term candidate; a determination unit that determines the type to which each noun word and term candidate belongs; and an output unit that outputs the noun word and term candidate. |
Publication number |
US2013253916(A1) |
Publication date |
2013.09.26 |
Application number |
US201313899020 |
Filing date |
2013.05.21 |
Applicant |
INTERNATIONAL BUSINESS MACHINES CORPORATION |
Inventors |
IKAWA YOHEI;NEGISHI SHIHO;TAKEUCHI HIRONORI |
Classification number |
G06F17/28 |
Main classification number |
G06F17/28 |
Agency |
|
Agent |
|
Principal claim |
|
Address |
|
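
The units named in the abstract (noun extraction, term candidate extraction, weight assignment per type, type determination, output) can be illustrated with a minimal Python sketch. This is not the patented implementation: the record does not say what the three kinds of text processing information contain, so the noun lexicon, the modifier list, the cue rules, the type names "component" and "symptom", and every function name below are hypothetical stand-ins chosen for illustration. The sketch also covers only candidates built from adjacent words in the document, not candidates drawn from a corpus.

```python
import re
from collections import defaultdict

# Hypothetical stand-ins for the three kinds of "text processing information".
NOUN_LEXICON = {"engine", "noise", "brake", "pedal"}   # first information (assumed)
MODIFIERS = {"brake", "engine"}                        # second information (assumed)
CUE_RULES = {                                          # third information (assumed)
    "replaced": ("component", 1.0),
    "heard": ("symptom", 1.0),
    "loud": ("symptom", 0.5),
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def extract_nouns(text):
    """First extraction unit: keep the tokens found in the noun lexicon."""
    return sorted(set(tokenize(text)) & NOUN_LEXICON)


def extract_term_candidates(text):
    """Second extraction unit: a modifier directly followed by a noun."""
    tokens = tokenize(text)
    return sorted({
        f"{first} {second}"
        for first, second in zip(tokens, tokens[1:])
        if first in MODIFIERS and second in NOUN_LEXICON
    })


def assign_weights(text, items):
    """Weight assignment unit: each cue word in a sentence adds its weight
    to the cue's type for every item that occurs in that sentence."""
    weights = {item: defaultdict(float) for item in items}
    for sentence in re.split(r"[.!?]", text):
        tokens = tokenize(sentence)
        padded = " " + " ".join(tokens) + " "
        for item in items:
            if f" {item} " not in padded:
                continue
            for token in tokens:
                if token in CUE_RULES:
                    type_name, weight = CUE_RULES[token]
                    weights[item][type_name] += weight
    return weights


def determine_types(weights):
    """Determination unit: pick the highest-weighted type, or None if unweighted."""
    return {
        item: (max(scores, key=scores.get) if scores else None)
        for item, scores in weights.items()
    }


def extract_terms(text):
    """Run all units and return a mapping from item to its determined type."""
    items = extract_nouns(text) + extract_term_candidates(text)
    return determine_types(assign_weights(text, items))
```

Calling `extract_terms` on a short text returns a dictionary that maps each extracted noun word and term candidate to the type that accumulated the most weight. Items with no cue word in any of their sentences map to `None`.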