Title: Lattice and method for identifying and normalizing orthographic variations in Japanese text
Abstract: A lattice data structure suitable for storage on a computer-readable medium is provided which represents a plurality of orthographic forms of a Japanese lexical entry. The lattice includes a plurality of data fields, each adapted to hold data representing a word element of the entry. Each data field includes a first subfield containing data representing a primary form of the corresponding word element and a second subfield containing data representing an alternate form of the corresponding word element. Also provided is a method of normalizing Japanese lexical entries to produce a normalized form that includes the primary form of each word-element representation of the lattice and does not include the alternate forms. Also provided are methods of segmenting text using the disclosed lattice.
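The structure described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The class names `WordElement` and `Lattice`, the method names, and the example entry (the common spellings of "uketsuke") are all assumptions chosen for illustration. Each element holds a primary form and zero or more alternate forms, and normalization keeps only the primary forms.

```python
from __future__ import annotations

from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class WordElement:
    """One data field of the lattice (hypothetical name).

    primary    -- first subfield: the primary form of the word element
    alternates -- second subfield: alternate forms of the same element
    """
    primary: str
    alternates: tuple[str, ...] = ()

    def forms(self) -> tuple[str, ...]:
        # Primary form first, then every alternate form.
        return (self.primary, *self.alternates)


@dataclass
class Lattice:
    """Sequence of word elements representing one lexical entry."""
    elements: list[WordElement]

    def normalize(self) -> str:
        # Normalized form: primary form of every element, no alternates.
        return "".join(e.primary for e in self.elements)

    def variants(self) -> list[str]:
        # Every orthographic form obtained by picking one form per element.
        return ["".join(combo)
                for combo in product(*(e.forms() for e in self.elements))]

    def matches(self, text: str) -> bool:
        # True if text is one of the forms this lattice represents.
        return text in self.variants()


# Illustrative entry (an assumption, not taken from the patent):
# "uketsuke" written with or without the okurigana.
uketsuke = Lattice([
    WordElement("受け", ("受",)),
    WordElement("付け", ("付",)),
])

print(uketsuke.normalize())
print(uketsuke.variants())
print(uketsuke.matches("受付"))
```

In this sketch, any surface form accepted by `matches` maps to the single string returned by `normalize`, which is the behavior the abstract attributes to normalization. Enumerating all variants is only practical for short entries; a production system would more likely walk the lattice against the input text instead.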
Publication number: US6731802(B1)  Publication date: 2004.05.04
Application number: US20000563636  Filing date: 2000.05.02
Applicant: MICROSOFT CORPORATION  Inventors: KACMARCIK GARY; BROCKETT CHRISTOPHER J.
Classification: G06F17/27; (IPC1-7): G06K9/18; G06K9/72; G06F7/00  Main classification: G06F17/27
Agency:  Agent:
Principal claim:
Address: