Title of invention Method of text similarity measurement
Abstract In one aspect, the present invention provides a method for estimating the similarity between at least two portions of text, including the steps of: forming a set of syntactic tuples, each tuple including at least two terms and a relation between the two terms; classifying the relation between the terms in the tuples according to a predefined set of relations; establishing the relative agreement between syntactic tuples from the portions of text under comparison according to predefined classes of agreement; calculating a value representative of the similarity between the portions of text for each of the classes of agreement; and establishing a value for the similarity between the portions of text by calculating a weighted sum of the values representative of the similarity between the portions of text for each of the classes of agreement. Preferably, the step of calculating a value representative of the similarity between the portions of text for each of the classes of agreement includes a weighting based upon the number of matched terms occurring in particular parts of speech in which the text occurs. It is also preferred that the step of calculating a value representative of the similarity between the portions of text for each of the classes of agreement include the application of a weighting factor to the estimate of similarity for each of the classes of agreement and the parts of speech in which matched terms occur.
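The steps in the abstract can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration and is not taken from the patent: the three agreement classes (`full`, `terms_only`, `one_term`), their weights, the field names `head`, `relation`, and `dependent`, and the normalization by the larger tuple count. The part-of-speech weighting mentioned in the abstract is left out, and the tuples are supplied by hand rather than produced by a parser.

```python
from typing import Dict, List, NamedTuple, Optional


class SyntacticTuple(NamedTuple):
    """Two terms plus the relation between them (field names are hypothetical)."""
    head: str
    relation: str
    dependent: str


# Hypothetical agreement classes and weights; the patent does not give these values.
CLASS_WEIGHTS: Dict[str, float] = {"full": 1.0, "terms_only": 0.6, "one_term": 0.3}


def agreement_class(a: SyntacticTuple, b: SyntacticTuple) -> Optional[str]:
    """Classify how closely two tuples agree, or return None if they do not."""
    same_terms = a.head == b.head and a.dependent == b.dependent
    if same_terms and a.relation == b.relation:
        return "full"
    if same_terms:
        return "terms_only"
    if a.relation == b.relation and (a.head == b.head or a.dependent == b.dependent):
        return "one_term"
    return None


def class_scores(text_a: List[SyntacticTuple], text_b: List[SyntacticTuple]) -> Dict[str, float]:
    """Per-class similarity: share of tuples in text_a whose best match falls in that class."""
    counts = {name: 0 for name in CLASS_WEIGHTS}
    for a in text_a:
        best: Optional[str] = None
        for b in text_b:
            cls = agreement_class(a, b)
            if cls is not None and (best is None or CLASS_WEIGHTS[cls] > CLASS_WEIGHTS[best]):
                best = cls
        if best is not None:
            counts[best] += 1
    denom = max(len(text_a), len(text_b))
    if denom == 0:
        return {name: 0.0 for name in CLASS_WEIGHTS}
    return {name: counts[name] / denom for name in CLASS_WEIGHTS}


def similarity(text_a: List[SyntacticTuple], text_b: List[SyntacticTuple]) -> float:
    """Weighted sum of the per-class similarity values."""
    scores = class_scores(text_a, text_b)
    return sum(CLASS_WEIGHTS[name] * scores[name] for name in CLASS_WEIGHTS)


if __name__ == "__main__":
    a = [SyntacticTuple("dog", "subj", "bark"), SyntacticTuple("dog", "mod", "big")]
    b = [SyntacticTuple("dog", "subj", "bark"), SyntacticTuple("dog", "obj", "big")]
    print(similarity(a, b))
```

In this sketch each tuple of the first text is assigned to the strongest agreement class it reaches against the second text, so the score is not symmetric when the two texts differ in length. A symmetric variant could average the score computed in both directions.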
Publication number US7346491(B2) Publication date 2008.03.18
Application number US20030250746 Filing date 2003.11.26
Applicant AGENCY FOR SCIENCE, TECHNOLOGY AND RESEARCH Inventors KANAGASABAI RAJARAMAN;PAN HONG
Classification G06F17/27 Main classification G06F17/27
Agency Agent
Main claim
Address