Title of invention CORPUS CLUSTERING, CONFIDENCE REFINEMENT, AND RANKING FOR GEOGRAPHIC TEXT SEARCH AND INFORMATION RETRIEVAL
Abstract A computer-implemented method for processing a plurality of toponyms, the method involving: in a large corpus, identifying geo-textual correlations among readings of the toponyms within the plurality of toponyms; and for each toponym selected from the plurality of toponyms, using the identified geo-textual correlations to generate a value for a confidence that the selected toponym refers to a corresponding geographic location. Also disclosed is a method of generating information useful for ranking a document that includes a plurality of toponyms for which there is a corresponding plurality of (toponym, place) pairs, there being associated with each (toponym, place) pair of said plurality a corresponding value for a confidence that the toponym of that pair refers to the place of that pair. This further method includes, for a selected (toponym, place) pair of the plurality of (toponym, place) pairs, (1) determining whether another toponym is present within the document that has an associated place that is geographically related to the place of the selected (toponym, place) pair; and (2) if such a toponym is identified within the document, boosting the value of the confidence for the selected (toponym, place) pair.
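The second method in the abstract (boosting the confidence of a (toponym, place) pair when another toponym in the same document resolves to a geographically related place) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the patent: the `Reading` record, the interpretation of "geographically related" as "within `radius_km` great-circle distance", the additive `boost` step, and the cap at 1.0.

```python
from dataclasses import dataclass, replace
from math import asin, cos, radians, sin, sqrt
from typing import List


@dataclass(frozen=True)
class Reading:
    """One (toponym, place) pair with its confidence (hypothetical record)."""
    toponym: str
    place: str
    lat: float
    lon: float
    confidence: float


def haversine_km(a: Reading, b: Reading) -> float:
    """Great-circle distance between two readings, in kilometres."""
    dlat = radians(b.lat - a.lat)
    dlon = radians(b.lon - a.lon)
    h = sin(dlat / 2) ** 2 + cos(radians(a.lat)) * cos(radians(b.lat)) * sin(dlon / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(h))


def boost_confidences(readings: List[Reading],
                      radius_km: float = 200.0,
                      boost: float = 0.1) -> List[Reading]:
    """Return new readings; a pair is boosted if a different toponym in the
    same document has a place within radius_km (assumed relatedness test)."""
    boosted = []
    for r in readings:
        related = any(
            other.toponym != r.toponym and haversine_km(r, other) <= radius_km
            for other in readings
        )
        new_conf = min(1.0, r.confidence + boost) if related else r.confidence
        boosted.append(replace(r, confidence=new_conf))
    return boosted
```

For a document mentioning "Cambridge" (read as Cambridge, MA) and "Boston", each pair would support the other under this relatedness test, while an isolated mention of a distant place would keep its original confidence. Other relatedness tests, such as containment in the same administrative region, would fit the abstract's wording equally well.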
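Publication number WO2004084099(A2) Publication date 2004.09.30
Application number WO2004US08309 Filing date 2004.03.18
Applicant METACARTA, INC.;FRANK, JOHN, R. Inventor FRANK, JOHN, R.
Classification G06F17/30 Main classification G06F17/30
Agency Agent
Principal claim
Address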