Title of invention APPARATUS AND METHOD FOR TEXT SEGMENTATION BASED ON COHERENT UNITS
Abstract A text segmentation apparatus comprising means for analyzing an electronic text to determine, for each sentence end in the text, the likelihood that it is a segmentation point based on a coherent unit, and means for segmenting said text into text segments based on that likelihood. The apparatus determines the similarity between the text parts contained in a pair of windows set up on the left and right sides of each sentence end position in the text (Step 3) so as to obtain similarity curves (Step 4). The apparatus then determines the segmentation-point likelihood for each sentence end point from the obtained similarity curves (Step 5), and segments the text at the point with the best likelihood (Step 6).
Publication number WO0229547(A1) Publication date 2002.04.11
Application number WO2001US30734 Filing date 2001.10.02
Applicant HEWLETT-PACKARD COMPANY;SHIMIZU, HIROYUKI;NAKAGAWA, SHINYA Inventor SHIMIZU, HIROYUKI;NAKAGAWA, SHINYA
Classification G06F17/21;G06F7/60;G06F12/00;G06F15/00;G06F17/00;G06F17/10;G06F17/22;G06F17/24;G06F17/27;G06K9/00;(IPC1-7):G06F7/60 Main classification G06F17/21
Agency Agent
Principal claim
Address
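The steps named in the abstract (paired windows around each sentence end, a similarity curve, a per-boundary likelihood, and a cut at the best-scoring point) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the patent: the regex sentence splitter, the bag-of-words cosine similarity, the window size of two sentences, the valley-depth score used as the likelihood, and all function names.

```python
import math
import re
from collections import Counter


def split_sentences(text):
    # Naive splitter (assumption): break after ., ! or ? followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def bag_of_words(sentences):
    # Word counts for the text part covered by one window.
    return Counter(w for s in sentences for w in re.findall(r"[a-z']+", s.lower()))


def cosine(a, b):
    # Cosine similarity between two word-count vectors (assumed measure).
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def similarity_curve(sentences, window=2):
    # Entry i compares the windows to the left and right of the
    # sentence end that follows sentence i.
    curve = []
    for i in range(1, len(sentences)):
        left = bag_of_words(sentences[max(0, i - window):i])
        right = bag_of_words(sentences[i:i + window])
        curve.append(cosine(left, right))
    return curve


def boundary_scores(curve):
    # Assumed likelihood: how far the curve dips below the highest
    # similarity on each side of the boundary (a valley-depth score).
    scores = []
    for i, value in enumerate(curve):
        left_peak = max(curve[:i + 1])
        right_peak = max(curve[i:])
        scores.append((left_peak - value) + (right_peak - value))
    return scores


def segment(text, window=2):
    # Cut the text once, at the sentence end with the highest score.
    sentences = split_sentences(text)
    if len(sentences) < 2:
        return [sentences]
    scores = boundary_scores(similarity_curve(sentences, window))
    best = max(range(len(scores)), key=scores.__getitem__)
    return [sentences[:best + 1], sentences[best + 1:]]


sample = (
    "Cats purr. Cats sleep a lot. Cats chase mice. "
    "Stocks fell today. Stocks may rise tomorrow. Stocks are risky."
)
segments = segment(sample)
```

On this sample the windows on either side of the third sentence end share no words, so the sketch cuts between "Cats chase mice." and "Stocks fell today." A fuller treatment would likely vary the window width and allow more than one cut, but the record above does not give those details.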