Title: AUTOMATIC SEGMENTATION OF A TEXT
Abstract: A system (100) is capable of segmenting a connected text, such as a Japanese or Chinese sentence, into words. The system includes means (110) for reading an input string representing the connected text. Segmentation means (120) identify at least one word sequence in the connected text by iteratively building a tree structure that represents the word sequence(s) in the input string. Initially, the input string is taken as the working string. Each word of a dictionary (122) is compared with the beginning of the working string. Each match is represented by a node in the tree, and the process continues with the remaining part of the input string. The system further includes means (130) for outputting at least one of the identified word sequences. A language model may be used to select between candidate sequences. Preferably, the system is used in a speech recognition system to update the lexicon based on representative texts.
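The tree-building step described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `Node`, `build_tree`, and `sequences` are hypothetical, the dictionary is a toy set of Latin strings rather than a real lexicon, and recursion is used where the abstract describes an iterative process. It also omits the language model that the abstract mentions for choosing between candidates.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    """One dictionary word matched at the start of a working string."""
    word: str
    children: list[Node] = field(default_factory=list)
    complete: bool = False  # True when no input remains after this word


def build_tree(working: str, dictionary: set[str]) -> list[Node]:
    """Compare every dictionary word with the beginning of the working string.

    Each match becomes a node; the remainder of the string becomes the
    working string for that node's children.
    """
    nodes = []
    for word in sorted(dictionary):  # sorted only to make the output deterministic
        if word and working.startswith(word):
            rest = working[len(word):]
            node = Node(word, complete=(rest == ""))
            if rest:
                node.children = build_tree(rest, dictionary)
            nodes.append(node)
    return nodes


def sequences(nodes: list[Node], prefix: tuple[str, ...] = ()) -> list[list[str]]:
    """Collect every root-to-leaf path that consumes the whole input string."""
    found = []
    for node in nodes:
        path = prefix + (node.word,)
        if node.complete:
            found.append(list(path))
        found.extend(sequences(node.children, path))
    return found


if __name__ == "__main__":
    toy_dictionary = {"a", "ab", "b", "bc", "c"}  # assumed toy lexicon
    for candidate in sequences(build_tree("abc", toy_dictionary)):
        print(candidate)
```

Paths that end before the input is fully consumed are dropped, so only complete segmentations are reported. With several candidates, a language model as mentioned in the abstract could score each sequence and keep the most likely one. Because this sketch re-expands identical suffixes, a practical version would likely cache results per remaining substring.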
Publication No.: WO0033211(A3)   Publication date: 2000.09.08
Application No.: WO1999EP08942   Filing date: 1999.11.18
Applicant: KONINKLIJKE PHILIPS ELECTRONICS N.V.   Inventor: CHU, YA-CHERNG
Classification: G06F17/27; G06F17/28   Main classification: G06F17/27