Title: Word storage table for natural language determination
Abstract: The language in which a document is written is identified through the use of sets of the most frequently used words in each of a plurality of candidate languages. Each set of most frequently used words is stored in a respective set of word tables for a respective candidate language, according to the letter pairs in the words of that set. In the preferred embodiment, each word table is an NxN bit table, where each bit represents a given letter pair at a particular place in one of the most frequently used words in one of the candidate languages. Words from the document are compared to the most frequently used words stored in the word tables. A count of the number of matches between the words from the document and the words stored in each respective set of word tables is kept for each respective language. The language of the document is identified as the respective candidate language having the greatest number of matches.
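The scheme in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation. It makes several assumptions the abstract does not state: a 26-letter lowercase alphabet (so N = 26), adjacent letter pairs only, one bit table per (word length, pair position), and short placeholder word lists rather than real frequency data. The names `WordTables`, `identify_language`, and `TABLES` are hypothetical.

```python
ALPHABET = "abcdefghijklmnopqrstuvwxyz"  # assumed alphabet, so N = 26
N = len(ALPHABET)


def _valid(word):
    """True if the word has at least one letter pair and only known letters."""
    return len(word) >= 2 and all(ch in ALPHABET for ch in word)


def _pairs(word):
    """Yield (table key, bit index) for each adjacent letter pair in the word."""
    for pos in range(len(word) - 1):
        row = ALPHABET.index(word[pos])
        col = ALPHABET.index(word[pos + 1])
        # Assumption: one N x N table per (word length, pair position).
        yield (len(word), pos), row * N + col


class WordTables:
    """One set of N x N bit tables for a single candidate language."""

    def __init__(self, frequent_words):
        # Each table is an integer used as an N*N-bit field.
        self.tables = {}
        for word in frequent_words:
            word = word.lower()
            if not _valid(word):
                continue  # single letters and unknown characters are skipped
            for key, bit in _pairs(word):
                self.tables[key] = self.tables.get(key, 0) | (1 << bit)

    def matches(self, word):
        """True if every letter pair of the word has its bit set."""
        if not _valid(word):
            return False
        return all((self.tables.get(key, 0) >> bit) & 1
                   for key, bit in _pairs(word))


def identify_language(text, tables_by_language):
    """Count matches per language and return (best language, counts)."""
    counts = {lang: 0 for lang in tables_by_language}
    for raw in text.lower().split():
        word = raw.strip(".,;:!?\"'()")
        for lang, tables in tables_by_language.items():
            if tables.matches(word):
                counts[lang] += 1
    return max(counts, key=counts.get), counts


# Placeholder word lists for illustration only, not real frequency data.
TABLES = {
    "english": WordTables(["the", "of", "and", "to", "in", "is", "that"]),
    "german": WordTables(["der", "die", "und", "in", "den", "von", "zu"]),
}

best, counts = identify_language(
    "the cat is in the garden and that is fine", TABLES)
print(best, counts)
```

Because only letter-pair bits are stored, and not the words themselves, a table set built this way can accept a word that was never stored if all of its pairs happen to be set by other words. The sketch makes no attempt to measure or limit that effect.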
Publication number: US6009382(A) Publication date: 1999.12.28
Application number: US19960723813 Filing date: 1996.09.30
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION Inventors: MARTINO, MICHAEL JOHN;PAULSEN, JR., ROBERT CHARLES
Classification: G06F17/27;G06F17/28;(IPC1-7):G06F17/21 Main classification: G06F17/27