Title of invention Grouping words with equivalent substrings by automatic clustering based on suffix relationships
Abstract A set of words of a natural language is grouped by automatically obtaining suffix relation data that indicate a relation value for each of a set of relationships between suffixes that occur in the natural language, and then by automatically clustering the words in the set using the relation values from the suffix relation data, to obtain group data indicating groups of words. Two or more words in a group have suffixes as in one of the relationships and, preceding the suffixes, equivalent substrings. The relationships can be pairwise relationships, and the relation value can indicate the number of occurrences of a suffix pair. The suffix relation data can be obtained using an inflectional lexicon. Complete link clustering can be used.
Publication number US6308149(B1) Publication date 2001.10.23
Application number US19980213309 Filing date 1998.12.16
Applicant XEROX CORPORATION Inventors GAUSSIER ERIC;GREFENSTETTE GREGORY;CHANOD JEAN-PIERRE
Classification G06F17/28;G06F17/30;(IPC1-7):G06F17/27 Main classification G06F17/28
Agency Agent
Main claim
Address
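Illustrative sketch The abstract describes two steps: counting how often each suffix pair occurs in an inflectional lexicon, and then complete link clustering of words using those counts. The Python below is a minimal sketch of that idea under stated assumptions, not the procedure claimed in the patent. The toy lexicon, the minimum shared-prefix length `MIN_STEM`, the merge `threshold`, and all function names are hypothetical choices made for illustration.

```python
from collections import Counter
from itertools import combinations

MIN_STEM = 3  # assumption: shortest shared prefix accepted as an "equivalent substring"


def split_pair(w1, w2, min_stem=MIN_STEM):
    """Return the sorted suffix pair left after the longest common prefix, or None."""
    n = 0
    for a, b in zip(w1, w2):
        if a != b:
            break
        n += 1
    if n < min_stem:
        return None
    return tuple(sorted((w1[n:], w2[n:])))


def suffix_pair_counts(families):
    """Count suffix pairs over word families (each family: forms of one lexicon entry)."""
    counts = Counter()
    for family in families:
        for w1, w2 in combinations(sorted(set(family)), 2):
            pair = split_pair(w1, w2)
            if pair is not None:
                counts[pair] += 1
    return counts


def similarity(w1, w2, counts):
    """Relation value for two words: the count of their suffix pair, else 0."""
    pair = split_pair(w1, w2)
    return counts.get(pair, 0) if pair is not None else 0


def complete_link(words, counts, threshold=1):
    """Greedy agglomerative clustering with the complete link criterion."""
    clusters = [[w] for w in sorted(set(words))]
    while True:
        best_sim, best_pair = 0, None
        for i, j in combinations(range(len(clusters)), 2):
            # Complete link: two clusters are only as similar as their
            # least similar cross pair of words.
            sim = min(similarity(a, b, counts)
                      for a in clusters[i] for b in clusters[j])
            if sim >= threshold and sim > best_sim:
                best_sim, best_pair = sim, (i, j)
        if best_pair is None:
            return clusters
        i, j = best_pair
        clusters[i] = sorted(clusters[i] + clusters[j])
        del clusters[j]


# Toy stand-in for an inflectional lexicon (hypothetical data).
LEXICON = [
    ["deploy", "deploys", "deployed", "deploying"],
    ["walk", "walks", "walked", "walking"],
    ["jump", "jumps", "jumped", "jumping"],
]

if __name__ == "__main__":
    counts = suffix_pair_counts(LEXICON)
    words = ["report", "reports", "reported", "repose", "talk", "talking"]
    for group in complete_link(words, counts):
        print(group)
```

In this sketch "repose" shares the prefix "repo" with "report", but the resulting suffix pair never occurs in the toy lexicon, so the two words are not merged. Because complete link scores a merge by the weakest cross pair, a word joins a group only if it is related to every member of that group.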