Title of invention: Fast text character set recognition
Abstract: Methods and apparatus, including computer program products, for identifying a language corresponding to a string of data include receiving a data string and dividing the data string into coded character sequences for each of a plurality of languages. For coded character sequences having a particular number of characters, the length of one or more coded character sequences varies among different languages. The coded character sequences are analyzed to calculate, for each of the plurality of languages, a probability that the data string corresponds to that language. The calculated probabilities are compared among the languages, and a language is identified as corresponding to the data string based on the comparison.
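The steps named in the abstract (divide, score per language, compare) can be illustrated with a minimal Python sketch. It is not the patented method: the fixed `bytes_per_char` width, the two-character sequence size, the toy frequency tables, the `floor` value for unseen sequences, and the use of an average log frequency as a stand-in for the probability are all assumptions made for illustration, and every name below is hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class LanguageModel:
    name: str
    bytes_per_char: int  # assumed fixed character width for this language's encoding
    freq: dict           # coded character sequence (bytes) -> assumed relative frequency


def split_sequences(data: bytes, bytes_per_char: int, chars_per_seq: int = 2):
    """Divide data into overlapping sequences of chars_per_seq characters.

    The byte length of each sequence is bytes_per_char * chars_per_seq, so it
    differs between languages even though the character count is the same.
    """
    size = bytes_per_char * chars_per_seq
    return [data[i:i + size] for i in range(0, len(data) - size + 1, bytes_per_char)]


def log_score(data: bytes, model: LanguageModel, floor: float = 1e-6) -> float:
    """Average log frequency of the sequences under one language model."""
    seqs = split_sequences(data, model.bytes_per_char)
    if not seqs:
        return float("-inf")
    # Unseen sequences get a small floor value instead of zero probability.
    total = sum(math.log(model.freq.get(s, floor)) for s in seqs)
    # Averaging keeps languages with different sequence counts comparable.
    return total / len(seqs)


def identify_language(data: bytes, models):
    """Score the data under every model and return the best name plus all scores."""
    scores = {m.name: log_score(data, m) for m in models}
    return max(scores, key=scores.get), scores


# Toy models with made-up frequencies (illustrative only, not trained data).
MODELS = [
    LanguageModel("english", 1, {b"th": 0.3, b"he": 0.3}),
    LanguageModel("japanese", 2, {
        "日本".encode("utf-16-be"): 0.3,
        "本語".encode("utf-16-be"): 0.3,
    }),
]

if __name__ == "__main__":
    for raw in ("the then".encode("ascii"), "日本語".encode("utf-16-be")):
        best, scores = identify_language(raw, MODELS)
        print(raw, "->", best, scores)
```

In this sketch the same byte string is cut differently for each candidate language, which mirrors the abstract's point that sequence length varies by language for a given number of characters. A real system would presumably use trained frequency tables and handle variable-width encodings.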
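Publication number: US2006025988(A1) Publication date: 2006.02.02
Application number: US20040909262 Filing date: 2004.07.30
Applicant: XU MING;MORI NOBUYOSHI Inventor: XU MING;MORI NOBUYOSHI
Classification: G06F17/20 Main classification: G06F17/20
Agency: Agent:
Principal claim:
Address: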