Title: Detecting common prefixes and suffixes in a list of strings
Abstract: A computer-implemented method includes receiving a plurality of character strings. The method determines the number of strings (M) in the plurality of strings having a unique substring of X characters at an extremity of the string, and the number of strings (N) in the plurality of strings having at least X characters. A probability is determined, based on a predetermined model for a distribution of characters in the strings, that the unique substring of X characters would occur M or more times out of the N strings, given that the unique substring occurs at least once. Based on the probability, the number M, and the number N, it is determined that the unique substring is a significant affix in the plurality of character strings, and the unique substring is stored.
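The steps in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the abstract does not specify the "predetermined model", so the sketch assumes characters are drawn independently and uniformly from an alphabet of `alphabet_size` symbols, which makes the count of strings carrying a given affix binomial. The function names, the `alphabet_size` default, and the `threshold` cutoff are all hypothetical choices made for illustration.

```python
from collections import Counter
from math import comb


def affix_counts(strings, x, suffix=False):
    """Count each X-character affix (the M values) and the eligible strings (N)."""
    if x < 1:
        raise ValueError("x must be at least 1")
    # N: only strings with at least X characters can carry an X-character affix.
    eligible = [s for s in strings if len(s) >= x]
    counts = Counter(s[-x:] if suffix else s[:x] for s in eligible)
    return counts, len(eligible)


def tail_probability(m, n, p):
    """P(K >= m | K >= 1) for K ~ Binomial(n, p), with 0 < p < 1."""
    at_least_m = sum(
        comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(m, n + 1)
    )
    at_least_one = 1 - (1 - p) ** n
    return at_least_m / at_least_one


def significant_affixes(strings, x, alphabet_size=26, threshold=1e-3, suffix=False):
    """Return {affix: (M, N, probability)} for affixes unlikely to arise by chance."""
    counts, n = affix_counts(strings, x, suffix)
    # Assumed model: uniform, independent characters, so a specific
    # X-character affix appears on a given string with probability A**-X.
    p = alphabet_size ** -x
    result = {}
    for affix, m in counts.items():
        prob = tail_probability(m, n, p)
        if prob < threshold:
            result[affix] = (m, n, prob)
    return result


if __name__ == "__main__":
    words = ["prefoo", "prebar", "prebaz", "prequx", "apple", "zebra", "ab"]
    for affix, (m, n, prob) in significant_affixes(words, x=3).items():
        print(affix, m, n, prob)
```

Under this assumed model an affix that occurs only once has a conditional probability of about 1, so it is never flagged; only affixes repeated more often than chance would suggest pass the cutoff. A real system would likely replace the uniform model with character frequencies estimated from data.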
Publication number: US8095530(B1) Publication date: 2012.01.10
Application number: US20080177102 Filing date: 2008.07.21
Applicant: LLOYD MATTHEW; GOOGLE INC. Inventor: LLOYD MATTHEW
Classification: G06F7/00; G06F17/00; G06F17/30 Main classification: G06F7/00
Agency: Agent:
Main claim:
Address: