Title of invention: METHOD AND APPARATUS FOR BREAKING WORDS IN A STREAM OF TEXT
Abstract: A word breaker uses a lexicon module and a processing module to identify word breaks in a stream of Asian-language (e.g., Japanese, Chinese, or Korean) text. The lexicon module is a dictionary or database containing words native to the language of the input text. The processing module includes a plurality of analysis modules that operate on the input text. In particular, the processing module can include modules that apply heuristic rules and statistical analysis to identify a first set of word breaks, thereby reducing the size of the segments whose word breaks remain undefined. The processing module also includes a database analysis module that identifies the remaining undefined word breaks in the smaller segments produced by the heuristic or statistical analysis.
Publication number: WO1998008169(A1) Publication date: 1998.02.26
Application number: US1997014741 Filing date: 1997.08.21
Applicant: Inventor:
Classification: Main classification:
Agency: Agent:
Independent claim:
Address:
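
The two-stage flow described in the abstract (a first pass that fixes some word breaks and shrinks the undefined segments, followed by a lexicon lookup on what remains) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the patent: the toy `LEXICON`, the punctuation-only heuristic in `heuristic_split`, and the use of greedy forward maximum matching in `lexicon_split` as a stand-in for the database analysis module. The statistical analysis mentioned in the abstract is omitted.

```python
# Hypothetical toy lexicon; a real lexicon module would be a full dictionary.
LEXICON = {"我", "喜欢", "北京", "天安门", "你", "呢"}

# Assumed heuristic rule: punctuation always marks a word break.
PUNCTUATION = set("，。！？、")


def heuristic_split(text):
    """Stage 1: cut the text at punctuation, keeping each mark as its own segment."""
    segments, current = [], ""
    for ch in text:
        if ch in PUNCTUATION:
            if current:
                segments.append(current)
                current = ""
            segments.append(ch)
        else:
            current += ch
    if current:
        segments.append(current)
    return segments


def lexicon_split(segment, lexicon, max_len=4):
    """Stage 2: greedy forward maximum matching against the lexicon.

    At each position the longest lexicon entry (up to max_len characters)
    is taken; a character with no match becomes a one-character word.
    """
    words, i = [], 0
    while i < len(segment):
        for size in range(min(max_len, len(segment) - i), 0, -1):
            candidate = segment[i:i + size]
            if size == 1 or candidate in lexicon:
                words.append(candidate)
                i += size
                break
    return words


def break_words(text, lexicon=LEXICON):
    """Run the heuristic pass, then resolve each remaining segment by lookup."""
    words = []
    for segment in heuristic_split(text):
        if segment in PUNCTUATION:
            words.append(segment)
        else:
            words.extend(lexicon_split(segment, lexicon))
    return words


print(break_words("我喜欢北京天安门，你呢？"))
```

Running the heuristic pass first means the lexicon lookup only ever sees short segments, which is the size reduction the abstract points to; the matching strategy inside each segment could be swapped for any other lookup scheme without changing the overall structure.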