Title of invention DOCUMENT ANALYSIS SYSTEM, DOCUMENT ANALYSIS METHOD, AND PROGRAM
Abstract PROBLEM TO BE SOLVED: To reduce unnecessary false reports, which lower the efficiency of checking the results of ambiguous word analysis. SOLUTION: A document analysis system according to the present invention comprises: a document input unit; a document analysis unit for extracting word information on each word and the place where the word is used; a document separation unit for separating the document into a plurality of separated documents; a separated document quality evaluation unit for calculating a quality index for each separated document; an ambiguous example database in which the features of example uses of ambiguous words are collected and accumulated; an example use analysis unit for analyzing the example use of each item of word information and extracting each ambiguous word in the document in association with its example use; a classification accuracy database that collects and accumulates, for each pair of an ambiguous word and an example use, the accuracy with which the applicable example use is correctly extracted from the document; an ambiguous word analysis condition optimization unit for optimizing the analysis condition so that, as far as possible, a condition with poor classification accuracy is not applied to an ambiguous word whose word information is used in a separated document with a good quality index; an ambiguity determination unit for calculating the degree of ambiguity of each ambiguous word on the basis of the altered analysis condition and determining which ambiguous words have high ambiguity; and an ambiguous information output unit.
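The optimization step in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the `Hit` record, the `select_hits` function, the numeric scales, and both thresholds are hypothetical, since the abstract does not say how the quality index or the classification accuracy is computed. The sketch only shows the stated idea that a detection condition with poor classification accuracy is skipped for words found in a separated document with a good quality index.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One detected ambiguous word (hypothetical record layout)."""
    word: str     # the ambiguous word
    usage: str    # the example-use pattern that matched
    section: str  # id of the separated document containing the word


def select_hits(hits, section_quality, accuracy,
                quality_threshold=0.8, accuracy_threshold=0.5):
    """Drop hits whose (word, usage) condition has poor accuracy
    when they occur in a separated document of good quality.

    Both thresholds are illustrative assumptions, not values from the source.
    """
    kept = []
    for hit in hits:
        # Unknown sections are treated as low quality, so their hits are kept.
        good_section = section_quality.get(hit.section, 0.0) >= quality_threshold
        # Unknown conditions are treated as reliable, so their hits are kept.
        poor_condition = accuracy.get((hit.word, hit.usage), 1.0) < accuracy_threshold
        if good_section and poor_condition:
            continue  # likely a false report: suppress it
        kept.append(hit)
    return kept


# Made-up example data for illustration only.
hits = [
    Hit("appropriate", "vague adjective", "sec1"),
    Hit("appropriate", "vague adjective", "sec2"),
    Hit("etc.", "open-ended list", "sec1"),
]
section_quality = {"sec1": 0.9, "sec2": 0.4}
accuracy = {
    ("appropriate", "vague adjective"): 0.3,
    ("etc.", "open-ended list"): 0.9,
}
kept = select_hits(hits, section_quality, accuracy)
```

With this made-up data, the first hit is suppressed because it combines a low-accuracy condition with a high-quality section, while the other two are passed on to the ambiguity determination step.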
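Publication number JP2014235584(A) Publication date 2014.12.15
Application number JP20130116909 Filing date 2013.06.03
Applicant NEC CORP Inventors HIRAO EIJI; GOTO TOMOHISA
Classification G06F17/21 Main classification G06F17/21
Agency Agent
Principal claim
Address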