Title of invention: METHOD AND SYSTEM OF SELECTING WORD SEQUENCE FOR TEXT WRITTEN IN LANGUAGE WITHOUT WORD BOUNDARY MARKERS
Abstract: The present disclosure describes a method and apparatus for selecting a word sequence for text written in a language without word boundary markers, addressing the excessive computational load that existing technologies incur when selecting an optimal word sequence. The disclosed method includes: segmenting a portion of the text to obtain different word sequences; determining a common word boundary for the word sequences; and performing optimal word sequence selection for the portions of the word sequences prior to the common word boundary. Because selection is performed only on the portions of the word sequences that precede a common word boundary, the text is divided into shorter independent units, which reduces the computational load of word segmentation.
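The three steps named in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the function names (`boundaries`, `common_boundaries`, `split_at`, `select`), the "fewest words wins" scoring rule, and the unspaced English sample are all assumptions made for illustration. The candidate word sequences are taken as given, and every candidate is assumed to spell out the same text.

```python
from itertools import accumulate


def boundaries(seq):
    """Character offsets at which each word of a word sequence ends."""
    return set(accumulate(len(word) for word in seq))


def common_boundaries(sequences):
    """Offsets that are word boundaries in every candidate sequence."""
    return sorted(set.intersection(*(boundaries(s) for s in sequences)))


def split_at(seq, cuts):
    """Split one word sequence into units ending at each common boundary."""
    cut_set = set(cuts)
    units, current, pos = [], [], 0
    for word in seq:
        current.append(word)
        pos += len(word)
        if pos in cut_set:
            units.append(current)
            current = []
    return units


def select(sequences, score=len):
    """Pick the best candidate independently for each unit.

    `score` is a hypothetical cost function (lower is better); the
    default simply prefers the unit with the fewest words.
    """
    cuts = common_boundaries(sequences)
    per_sequence = [split_at(s, cuts) for s in sequences]
    result = []
    for candidates in zip(*per_sequence):
        result.extend(min(candidates, key=score))
    return result


# Two candidate segmentations of the unspaced text "nowhereinsight".
candidates = [
    ["now", "here", "insight"],
    ["nowhere", "in", "sight"],
]
print(common_boundaries(candidates))  # [7, 14]
print(select(candidates))             # ['nowhere', 'insight']
```

Each unit between common boundaries is compared on its own, so the cost of selection grows with the length of the units rather than with the number of whole-text combinations.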
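Publication number: WO2010077572(A2)  Publication date: 2010.07.08
Application number: WO2009US66753  Application date: 2009.12.04
Applicant: ALIBABA GROUP HOLDING LIMITED; DAI, NENG  Inventor: DAI, NENG
Classification: G06F17/22  Main classification: G06F17/22
Agency:  Agent:
Principal claim:
Address: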