Title of invention: DOCUMENT ANALYSIS DEVICE, DOCUMENT ANALYSIS PROGRAM AND DOCUMENT ANALYSIS METHOD
Abstract: PROBLEM TO BE SOLVED: To shape the text data of e-mails so that sentence units are recognized more accurately when the e-mails have been exchanged a plurality of times and the quotation hierarchies have become confused. SOLUTION: A line coupling determination threshold calculation part 34 determines the data length of every line from the number of characters in each line character string of the e-mail text data, and calculates, as a line coupling determination threshold, the difference between the maximum value of the data length and a data length value, other than the maximum value, that has a high appearance frequency. When the data length of a read line is within a predetermined range of the maximum value and the difference between the data length of that line and the data length of the next line is within a predetermined range of the line coupling determination threshold, a coupled line determination part 35 imparts repetition information, which is an index of line coupling, to the read line and the next line. When the repetition information imparted to a read line by the coupled line determination part instructs coupling between lines, a data shaping part 36 couples the lines and outputs the coupled character strings to an output device. COPYRIGHT: (C)2011,JPO&INPIT
Publication number: JP2011159311(A) Publication date: 2011.08.18
Application number: JP20110065462 Filing date: 2011.03.24
Applicant: TOSHIBA CORP;TOSHIBA SOLUTIONS CORP Inventor: SHIBUYA TAKASHI;YOSHIMURA YUMIKO;SHINDO MASAKI;SAI ENKO
Classification: G06F17/21;G06F13/00 Main classification: G06F17/21
Agency: Agent:
Main claim:
Address:
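
The three parts named in the abstract (threshold calculation part 34, coupled line determination part 35, data shaping part 36) can be illustrated with a minimal Python sketch. It is not the patented implementation. The function names and the `max_margin` and `diff_margin` parameters are hypothetical, since the abstract only speaks of "predetermined ranges". The sketch also assumes that the "high appearance frequency" length is the single most frequent length other than the maximum, and that coupled lines are joined without a separator, as would suit Japanese text.

```python
from collections import Counter


def coupling_threshold(lengths):
    """Return (max_len, threshold), where threshold is the maximum line
    length minus the most frequent length other than the maximum.
    Assumption: 'high appearance frequency' means the most frequent value."""
    max_len = max(lengths)
    others = [n for n in lengths if n != max_len]
    if not others:
        # Every line has the same length, so the difference is zero.
        return max_len, 0
    mode_len = Counter(others).most_common(1)[0][0]
    return max_len, max_len - mode_len


def mark_coupled(lines, max_margin=2, diff_margin=0):
    """Return one flag per line; True means 'couple with the next line'.
    max_margin and diff_margin are hypothetical stand-ins for the
    'predetermined ranges' mentioned in the abstract."""
    lengths = [len(s) for s in lines]
    if not lengths:
        return []
    max_len, threshold = coupling_threshold(lengths)
    flags = []
    for i in range(len(lines) - 1):
        # Condition 1: the line length is close to the maximum length.
        near_max = max_len - lengths[i] <= max_margin
        # Condition 2: the length difference to the next line is within
        # the threshold (plus the hypothetical margin).
        similar = abs(lengths[i] - lengths[i + 1]) <= threshold + diff_margin
        flags.append(near_max and similar)
    flags.append(False)  # the last line has no next line to couple with
    return flags


def shape(lines, flags):
    """Join each flagged line with the line that follows it."""
    out = []
    buf = ""
    for line, flag in zip(lines, flags):
        buf += line
        if not flag:
            out.append(buf)
            buf = ""
    if buf:
        out.append(buf)
    return out
```

For example, calling `shape(lines, mark_coupled(lines))` on the body lines of a quoted e-mail returns the lines with the flagged neighbours merged. Real mail would also need quotation markers such as `>` stripped before the lengths are measured, which this sketch does not attempt.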