Title: System and method for evaluating character sets of a message containing a plurality of character sets
Abstract: An evaluator system accepts input textual messages in unknown languages and assesses which character sets, corresponding to languages, match each message. Textual messages whose individual characters are encoded in 16-bit Unicode or another universal format are parsed, the character sets that can express each character are identified, and the accumulated correspondence is logged. When the character sets against which the message is being tested provide only partial matches, the invention can determine which offers the best fit, including by way of a weighting function. The evaluation technology of the invention can be applied to multipart documents, and to search engines and indices.
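The evaluation described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the candidate list, the function names (`can_encode`, `evaluate`, `best_fit`), and the simple multiplicative weighting are all assumptions made for illustration, and Python's built-in codecs stand in for the character-set tables the invention would consult.

```python
# Illustrative sketch only; names, candidates, and weighting are assumptions.
from collections import Counter

# Hypothetical candidate character sets, named by their Python codec names.
CANDIDATES = ["ascii", "latin-1", "shift_jis", "koi8-r"]


def can_encode(ch, charset):
    """Return True if the single character ch is expressible in charset."""
    try:
        ch.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


def evaluate(message, candidates=CANDIDATES, weights=None):
    """Log, per character set, the weighted fraction of characters it can express."""
    weights = weights or {}
    hits = Counter({cs: 0 for cs in candidates})
    for ch in message:                      # parse the message character by character
        for cs in candidates:
            if can_encode(ch, cs):
                hits[cs] += 1               # accumulate the correspondence
    total = len(message) or 1               # avoid division by zero on empty input
    return {cs: weights.get(cs, 1.0) * hits[cs] / total for cs in candidates}


def best_fit(message, candidates=CANDIDATES, weights=None):
    """Pick the highest-scoring character set; ties go to the earliest candidate."""
    scores = evaluate(message, candidates, weights)
    return max(scores, key=scores.get)


if __name__ == "__main__":
    for text in ["hello", "café", "日本語"]:
        print(text, best_fit(text), evaluate(text))
```

In this sketch a partial match shows up as a score below 1.0, and the optional `weights` mapping lets a caller bias the choice when several character sets cover the message equally well, for example to prefer a locale-specific set over a generic one.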
Publication number: US6539118(B1) Publication date: 2003.03.25
Application number: US19990384442 Filing date: 1999.08.27
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION Inventors: MURRAY BRENDAN P.;TAKIZAWA KUNIAKI
Classification: G06F17/22;(IPC1-7):G06K9/72;G06F15/00;G06F17/20;G06K9/68 Main classification: G06F17/22
Agency: Agent:
Principal claim:
Address: