Title of Invention: APPARATUS AND METHOD FOR EXTRACTING A PAIR OF PLACE NAME AND WORD FROM DOCUMENT, AND PROGRAM
Abstract: PROBLEM TO BE SOLVED: To extract many pairs of place names and words from documents while preventing pairs with a weak semantic relationship from being extracted. SOLUTION: An extraction apparatus calculates the total appearance frequency of each word in documents acquired from document storage means; sets a plurality of extraction references based on document structure; on the basis of the position of a place name in a document, counts a reference coincidence frequency indicating how often a pair of a place name and a word coincides with each extraction reference; stores the reference coincidence frequency in storage means; when a correct pair useful for a place name and a range indicated by the place name is given, acquires the total appearance frequency of the word and the reference coincidence frequency from the storage means; determines a weight for each extraction reference by performing classification on the basis of the extraction references; and extracts the set of pairs for which the determined weights and the pair of a place name and a word satisfy a predetermined condition.
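The steps in the abstract can be illustrated with a minimal Python sketch. Everything concrete here is an assumption for illustration and is not taken from the patent: the three extraction references (same sentence, same paragraph, same document), the feature definition (coincidence frequency divided by total appearance frequency), the use of a simple perceptron as the classifier, the use of labelled negative pairs alongside correct pairs, and the "score greater than zero" extraction condition. All function names are hypothetical.

```python
import re
from collections import Counter, defaultdict

# Hypothetical extraction references derived from document structure.
CRITERIA = ("same_sentence", "same_paragraph", "same_document")

def tokenize(doc):
    """Split a document into paragraphs -> sentences -> lower-cased words."""
    paragraphs = []
    for para in doc.split("\n\n"):
        sentences = [re.findall(r"\w+", s.lower()) for s in para.split(".")]
        paragraphs.append([s for s in sentences if s])
    return paragraphs

def count_statistics(documents, place_names):
    """Count total word frequency and per-pair reference coincidence frequency."""
    total_freq = Counter()
    match = defaultdict(Counter)  # (place, word) -> criterion -> count
    for doc in documents:
        paragraphs = tokenize(doc)
        doc_places = {w for p in paragraphs for s in p for w in s} & place_names
        for para in paragraphs:
            para_places = {w for s in para for w in s} & place_names
            for sentence in para:
                sent_places = set(sentence) & place_names
                for word in sentence:
                    if word in place_names:
                        continue
                    total_freq[word] += 1
                    for place in doc_places:
                        match[(place, word)]["same_document"] += 1
                    for place in para_places:
                        match[(place, word)]["same_paragraph"] += 1
                    for place in sent_places:
                        match[(place, word)]["same_sentence"] += 1
    return total_freq, match

def features(pair, total_freq, match):
    """Assumed feature: coincidence frequency relative to total word frequency."""
    denom = max(total_freq[pair[1]], 1)  # guard against unseen words
    return [match[pair][c] / denom for c in CRITERIA]

def score(pair, total_freq, match, weights, bias):
    x = features(pair, total_freq, match)
    return sum(w * xi for w, xi in zip(weights, x)) + bias

def train_weights(labelled, total_freq, match, epochs=20, lr=1.0):
    """Perceptron stand-in for the classification step; labels are +1 or -1."""
    weights, bias = [0.0] * len(CRITERIA), 0.0
    for _ in range(epochs):
        for pair, label in labelled:
            if label * score(pair, total_freq, match, weights, bias) <= 0:
                x = features(pair, total_freq, match)
                weights = [w + lr * label * xi for w, xi in zip(weights, x)]
                bias += lr * label
    return weights, bias

def extract_pairs(total_freq, match, weights, bias):
    """Assumed condition: keep pairs whose weighted score is positive."""
    return {p for p in list(match) if score(p, total_freq, match, weights, bias) > 0}

documents = [
    "Kyoto has many temples. Visitors love temples.\n\nOsaka is famous for takoyaki.",
    "Osaka takoyaki stands are crowded.\n\nKyoto temples are quiet.",
]
place_names = {"kyoto", "osaka"}
total_freq, match = count_statistics(documents, place_names)
labelled = [(("kyoto", "temples"), 1), (("osaka", "temples"), -1)]
weights, bias = train_weights(labelled, total_freq, match)
extracted = extract_pairs(total_freq, match, weights, bias)
```

In this toy corpus the learned weights penalize pairs that only share a document, so pairs that never co-occur within a sentence or paragraph are dropped. The sketch applies no stop-word filtering, so function words that share a sentence with a place name are also extracted; a practical system would need additional filtering.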
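Publication No.: JP2013257634(A); Publication Date: 2013.12.26
Application No.: JP20120131940; Filing Date: 2012.06.11
Applicant: NIPPON TELEGR & TELEPH CORP <NTT>; Inventors: YASUDA YOSHIHITO; NISHINO MASAAKI; KATAOKA RYOJI
Classification: G06F17/30; Main Classification: G06F17/30
Agency: ; Agent:
Main Claim:
Address: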