Title of invention Finite state data structures with paths representing paired strings of tags and tag combinations
Abstract A finite state data structure includes paths that represent pairs of strings, with a first string that is a string of tag combinations and a second string that is a string of tags for tokens in a language. The second strings of a set of paths with the same first string include only highly probable strings of tags for the first string. The data structure can be an FST or a bimachine, and can be used for mapping strings of tag combinations to strings of tags. The tags can, for example, indicate parts of speech of words, and the tag combinations can be ambiguity classes or, in a bimachine, reduced ambiguity classes. An FST can be obtained by approximating a Hidden Markov Model. A bimachine can include left-to-right and right-to-left sequential FSTs obtained based on frequencies of tokens in a training corpus.
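The mapping from a string of tag combinations to a string of tags can be illustrated with a toy sequential transducer. The following is only a minimal sketch under stated assumptions, not the patented construction: the lexicon, the tag names, the transition table, and the function names `tag_classes` and `tag_words` are all hypothetical, and the transitions are written by hand rather than derived from a Hidden Markov Model or a training corpus.

```python
from typing import Dict, List, Tuple

# Hypothetical toy lexicon: each word maps to its ambiguity class,
# written here as the set of possible tags joined by "|".
LEXICON: Dict[str, str] = {
    "the": "DET",
    "can": "NOUN|VERB",
    "rusts": "NOUN|VERB",
}

# Hypothetical sequential transducer. The state is the tag emitted last.
# Key: (state, input ambiguity class) -> (output tag, next state).
# These entries are hand-written assumptions for illustration only.
TRANSITIONS: Dict[Tuple[str, str], Tuple[str, str]] = {
    ("START", "DET"): ("DET", "DET"),
    ("START", "NOUN|VERB"): ("NOUN", "NOUN"),
    ("DET", "NOUN|VERB"): ("NOUN", "NOUN"),
    ("NOUN", "NOUN|VERB"): ("VERB", "VERB"),
    ("NOUN", "DET"): ("DET", "DET"),
    ("VERB", "DET"): ("DET", "DET"),
    ("VERB", "NOUN|VERB"): ("NOUN", "NOUN"),
}


def tag_classes(classes: List[str]) -> List[str]:
    """Map a string of ambiguity classes to a string of tags, left to right."""
    state = "START"
    tags: List[str] = []
    for ambiguity_class in classes:
        # Raises KeyError if the toy table has no path for this pair.
        output_tag, state = TRANSITIONS[(state, ambiguity_class)]
        tags.append(output_tag)
    return tags


def tag_words(words: List[str]) -> List[str]:
    """Look up each word's ambiguity class, then run the transducer."""
    return tag_classes([LEXICON[word] for word in words])


if __name__ == "__main__":
    print(tag_words(["the", "can", "rusts"]))
```

In this sketch each path pairs one input string of ambiguity classes with exactly one output string of tags. A single left-to-right pass of this kind sees only left context; the bimachine described in the abstract also uses a right-to-left sequential FST.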
Publication number US6816830(B1) Publication date 2004.11.09
Application number US19990419435 Filing date 1999.10.15
Applicant XEROX CORPORATION Inventor KEMPE ANDRE
Classification G06F17/27;(IPC1-7):G06F17/27;G10L15/00 Main classification G06F17/27
Agency Agent
Main claim
Address