Title of invention: DOCUMENT DIVIDING APPARATUS, DOCUMENT PROCESSING SYSTEM, AND PROGRAM
Abstract:
PROBLEM TO BE SOLVED: To provide a document dividing apparatus, a document processing system, and a program capable of dividing a document in a form that corresponds to the contents intended by the creator of the document.
SOLUTION: The document dividing apparatus comprises: section extraction means for acquiring a document file in which text data are described, detecting partition information of the text data, and extracting a plurality of sections from the text data; phrase importance degree calculation means for calculating an importance degree for each phrase of the text data; weight determination means for extracting layout information of the phrase from the document file and reading weight information corresponding to the layout information from weight information storage means; feature vector creation means for creating, for each section, a feature vector whose coefficients are values generated from the importance degree and the weight information; and group extraction means for extracting a plurality of the sections as one group according to the similarity of the feature vectors of the sections.
COPYRIGHT: (C)2012, JPO&INPIT
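The processing chain in the abstract (sections, phrase importance, layout weights, feature vectors, grouping by similarity) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: blank lines stand in for the partition information, TF-IDF stands in for the importance degree, the `LAYOUT_WEIGHTS` table and the `layout_of` mapping are hypothetical, cosine similarity is the similarity measure, and only adjacent sections are merged.

```python
import math
import re
from collections import Counter

# Hypothetical weight table: the abstract only says that weights are
# looked up from storage according to a phrase's layout information.
LAYOUT_WEIGHTS = {"heading": 3.0, "bold": 2.0, "body": 1.0}


def split_sections(text):
    """Split text on blank lines (assumed stand-in for partition information)."""
    return [s.strip() for s in re.split(r"\n\s*\n", text) if s.strip()]


def tokenize(section):
    """Lowercase alphanumeric tokens; single words stand in for phrases."""
    return re.findall(r"[a-z0-9]+", section.lower())


def feature_vectors(sections, layout_of):
    """Build one sparse vector per section: TF-IDF importance times layout weight.

    layout_of maps a term to a layout label (hypothetical simplification).
    """
    tokens = [tokenize(s) for s in sections]
    n = len(sections)
    doc_freq = Counter(t for toks in tokens for t in set(toks))
    vectors = []
    for toks in tokens:
        vec = {}
        for term, count in Counter(toks).items():
            idf = math.log((1 + n) / (1 + doc_freq[term])) + 1.0
            importance = (count / len(toks)) * idf
            weight = LAYOUT_WEIGHTS.get(layout_of.get(term, "body"), 1.0)
            vec[term] = importance * weight
        vectors.append(vec)
    return vectors


def cosine(a, b):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def group_sections(vectors, threshold=0.3):
    """Merge each section into the previous group if it is similar enough
    to the section directly before it; otherwise start a new group."""
    if not vectors:
        return []
    groups = [[0]]
    for i in range(1, len(vectors)):
        if cosine(vectors[i - 1], vectors[i]) >= threshold:
            groups[-1].append(i)
        else:
            groups.append([i])
    return groups


SAMPLE = (
    "Printer setup\nInstall the printer driver.\n\n"
    "Printer driver options\nSet the printer tray.\n\n"
    "Billing\nInvoices are sent monthly."
)

sections = split_sections(SAMPLE)
vectors = feature_vectors(sections, layout_of={"printer": "heading"})
print(group_sections(vectors))
```

In this sample the first two sections share the heavily weighted term "printer" and fall into one group, while the third shares no terms with its neighbor and forms its own group. The threshold of 0.3 is an arbitrary illustrative value.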
Publication number: JP2012059227(A) Publication date: 2012.03.22
Application number: JP20100204859 Filing date: 2010.09.13
Applicant: RICOH CO LTD Inventor: NAKAOMI MASASHI
Classification: G06F17/21; G06F17/30 Main classification: G06F17/21
Agency: Agent:
Main claim:
Address: