Title of invention |
APPARATUS AND METHOD FOR EXTRACTING A PAIR OF PLACE NAME AND WORD FROM DOCUMENT, AND PROGRAM |
Abstract |
PROBLEM TO BE SOLVED: To extract many pairs of place names and words from a document while preventing pairs with a weak semantic relationship from being extracted. SOLUTION: An extraction apparatus calculates the total appearance frequency of each word in documents acquired from document storage means; sets a plurality of extraction references based on document structure; counts, on the basis of the position of a place name in the document, a reference coincidence frequency indicating how often a pair of a place name and a word coincides with each extraction reference; stores the reference coincidence frequency in storage means; when a correct pair that is useful for a place name and the range indicated by the place name is given, acquires the total appearance frequency of the word and the reference coincidence frequency from the storage means; determines a weight for each extraction reference by performing classification on the basis of the extraction references; and extracts a set of pairs for which the determined weights and the pair of a place name and a word satisfy a predetermined condition. |
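The steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the three extraction references (`same_sentence`, `same_paragraph`, `place_in_title`), the document layout, the whitespace tokenization, the use of a perceptron as the classifier, the availability of labelled incorrect pairs, and every function name are assumptions made only for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical extraction references tied to document structure (assumed).
CRITERIA = ("same_sentence", "same_paragraph", "place_in_title")

def count_statistics(docs, place_names):
    """Count total word frequency and per-reference match counts.

    Each document is assumed to look like
    {"title": str, "paragraphs": [[sentence, ...], ...]} with
    whitespace-separated tokens.
    """
    word_freq = Counter()
    match_counts = defaultdict(Counter)  # (place, word) -> reference -> count
    for doc in docs:
        title_places = {t for t in doc["title"].split() if t in place_names}
        for paragraph in doc["paragraphs"]:
            sentences = [s.split() for s in paragraph]
            para_places = {t for s in sentences for t in s if t in place_names}
            for tokens in sentences:
                sent_places = {t for t in tokens if t in place_names}
                for word in tokens:
                    if word in place_names:
                        continue
                    word_freq[word] += 1
                    for p in sent_places:
                        match_counts[(p, word)]["same_sentence"] += 1
                    for p in para_places:
                        match_counts[(p, word)]["same_paragraph"] += 1
                    for p in title_places:
                        match_counts[(p, word)]["place_in_title"] += 1
    return word_freq, match_counts

def features(pair, word_freq, match_counts):
    """Match count per reference, normalized by the word's total frequency."""
    total = word_freq[pair[1]] or 1
    counts = match_counts.get(pair, Counter())
    return [counts[c] / total for c in CRITERIA]

def train_weights(labelled_pairs, word_freq, match_counts, epochs=20, lr=0.1):
    """Learn one weight per reference with a simple perceptron (assumed classifier)."""
    weights = [0.0] * len(CRITERIA)
    bias = 0.0
    for _ in range(epochs):
        for pair, is_correct in labelled_pairs:
            x = features(pair, word_freq, match_counts)
            predicted = sum(w * v for w, v in zip(weights, x)) + bias > 0
            if predicted != is_correct:
                sign = 1 if is_correct else -1
                weights = [w + lr * sign * v for w, v in zip(weights, x)]
                bias += lr * sign
    return weights, bias

def extract_pairs(word_freq, match_counts, weights, bias, threshold=0.0):
    """Keep the pairs whose weighted score exceeds the threshold."""
    result = []
    for pair in list(match_counts):
        x = features(pair, word_freq, match_counts)
        score = sum(w * v for w, v in zip(weights, x)) + bias
        if score > threshold:
            result.append(pair)
    return sorted(result)

# Toy data, invented for this example only.
docs = [
    {"title": "Kyoto guide",
     "paragraphs": [["Kyoto has many temple sites", "The weather was fine"]]},
    {"title": "Trip notes",
     "paragraphs": [["We left Osaka early", "Lunch was fine"],
                    ["Kyoto temple visit"]]},
]
place_names = {"Kyoto", "Osaka"}
labelled = [(("Kyoto", "temple"), True), (("Kyoto", "fine"), False)]

word_freq, match_counts = count_statistics(docs, place_names)
weights, bias = train_weights(labelled, word_freq, match_counts)
extracted = extract_pairs(word_freq, match_counts, weights, bias)
```

In this toy run, "temple" shares a sentence with "Kyoto" every time it appears, whereas "fine" only shares a paragraph or a titled document with it, so the learned weights favor the stronger structural reference. Any classifier that yields per-reference weights could be substituted for the perceptron.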
Publication number |
JP2013257634(A) |
Publication date |
2013.12.26 |
Application number |
JP20120131940 |
Filing date |
2012.06.11 |
Applicant |
NIPPON TELEGR & TELEPH CORP <NTT> |
Inventors |
YASUDA YOSHIHITO;NISHINO MASAAKI;KATAOKA RYOJI |
Classification |
G06F17/30 |
Main classification |
G06F17/30 |
Agency |
|
Agent |
|
Principal claim |
|
Address |
|