Title Automatic segmentation of a text
Abstract A system 100 is capable of segmenting a connected text, such as a Japanese or Chinese sentence, into words. The system includes means 110 for reading an input string representing the connected text. Segmentation means 120 identifies at least one word sequence in the connected text by iteratively building a tree structure that represents the word sequence(s) in the input string. Initially, the input string is taken as the working string. Each word of a dictionary 122 is compared with the beginning of the working string. Each match is represented by a node in the tree, and the process continues with the remaining part of the input string as the new working string. The system further includes means 130 for outputting at least one of the identified word sequences. A language model may be used to select between candidate sequences. Preferably, the system is used in a speech recognition system to update the lexicon based on representative texts.
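The tree-building step described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `Node`, `build_tree`, `word_sequences`, and `segment` are hypothetical, the dictionary is assumed to be a plain set of strings, and branches whose remaining string matches no dictionary word are simply dropped. A Latin-letter toy dictionary is used so that the candidate sequences are easy to check by hand.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One dictionary word matched at the start of a working string."""
    word: str
    children: list = field(default_factory=list)
    complete: bool = False  # True when the word ends exactly at the end of the input


def build_tree(working, dictionary):
    """Compare every dictionary word with the beginning of the working string.

    Each match becomes a node; the rest of the string becomes the new
    working string for that node's children.
    """
    nodes = []
    for word in sorted(dictionary):  # sorted only to make the output order deterministic
        if word and working.startswith(word):
            node = Node(word)
            rest = working[len(word):]
            if rest:
                node.children = build_tree(rest, dictionary)
            else:
                node.complete = True
            nodes.append(node)
    return nodes


def word_sequences(nodes, prefix=()):
    """Yield every root-to-leaf path that consumes the whole input string."""
    for node in nodes:
        path = prefix + (node.word,)
        if node.complete:
            yield list(path)
        yield from word_sequences(node.children, path)


def segment(text, dictionary):
    """Return all candidate word sequences for the connected text."""
    return list(word_sequences(build_tree(text, dictionary)))


if __name__ == "__main__":
    toy_dictionary = {"a", "ab", "b", "bc", "c"}
    for sequence in segment("abc", toy_dictionary):
        print(" | ".join(sequence))
```

With the toy dictionary above, the input "abc" yields three candidate sequences (a | b | c, a | bc, ab | c). Choosing between such candidates is where the abstract's optional language model would apply; that step is outside this sketch.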
Publication number US6374210(B1) Publication date 2002.04.16
Application number US19990449231 Filing date 1999.11.24
Applicant U.S. PHILIPS CORPORATION Inventor CHU YA-CHERNG
Classification G06F17/27;G06F17/28;(IPC1-7):G06F17/27 Main classification G06F17/27
Agency Agent
Main claim
Address