Title of invention IDENTIFICATION OF WORDS IN JAPANESE TEXT BY A COMPUTER SYSTEM
Abstract A word breaking facility operates to identify words within a Japanese text string. The word breaking facility performs morphological processing to identify postfix bound morphemes and prefix bound morphemes. The word breaking facility also performs opheme matching to identify likely stem characters. A scoring heuristic is applied to determine an optimal analysis that includes a postfix analysis, a stem analysis, and a prefix analysis. The morphological analyses are stored in an efficient compressed format to minimize the amount of memory they occupy and maximize the analysis speed. The morphological analyses of postfixes, stems, and prefixes are performed in a right-to-left fashion. The word breaking facility may be used in applications that demand identity of selection granularity, autosummarization applications, content indexing applications, and natural language processing applications.
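The right-to-left analysis and scoring step described in the abstract can be illustrated with a minimal sketch. This is not the patented method: it omits opheme matching and the compressed storage format, and the lexicons `POSTFIXES` and `PREFIXES`, the `Analysis` type, and the scoring rule are all hypothetical, chosen only to show the idea of enumerating postfix, stem, and prefix splits and keeping the highest-scoring one.

```python
from dataclasses import dataclass

# Hypothetical toy lexicons (assumptions for illustration only).
POSTFIXES = ("ました", "ます", "は", "を", "に")
PREFIXES = ("お", "ご")


@dataclass(frozen=True)
class Analysis:
    prefix: str
    stem: str
    postfix: str

    def score(self) -> int:
        # Illustrative heuristic, not the patent's: reward characters
        # explained by bound morphemes, reject an empty stem.
        if not self.stem:
            return -1
        return len(self.prefix) + len(self.postfix)


def candidate_analyses(chunk: str):
    """Enumerate prefix/stem/postfix splits, starting from the right end."""
    # Postfix candidates come first because the analysis runs right to left.
    postfix_options = [""] + [p for p in POSTFIXES if chunk.endswith(p)]
    for postfix in postfix_options:
        rest = chunk[: len(chunk) - len(postfix)]
        prefix_options = [""] + [p for p in PREFIXES if rest.startswith(p)]
        for prefix in prefix_options:
            yield Analysis(prefix, rest[len(prefix):], postfix)


def best_analysis(chunk: str) -> Analysis:
    """Return the highest-scoring analysis of one candidate word."""
    return max(candidate_analyses(chunk), key=Analysis.score)
```

Under these assumptions, `best_analysis("お茶を")` splits the string into the prefix `お`, the stem `茶`, and the postfix `を`.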
Publication number WO9800794(A1) Publication date 1998.01.08
Application number WO1997US11029 Filing date 1997.06.25
Applicant MICROSOFT CORPORATION Inventors HALSTEAD, PATRICK, H., JR.; SUZUKI, HISAMI
Classification G06F17/27; G06F17/28; (IPC1-7): G06F17/28 Main classification G06F17/27
Agency Agent
Main claim
Address