Title of invention DOCUMENT PROCESSING APPARATUS AND DOCUMENT PROCESSING METHOD
Abstract A document processing apparatus comprises: a layout analysis module configured to analyze input image data, divide it into areas by classification, and acquire coordinate information of each text area among the classified areas; a text area information calculation module configured to calculate position information of a partial area for each text area on the basis of the coordinate information acquired by the layout analysis module; a feature extraction module configured to extract features of the text area on the basis of the position information calculated by the text area information calculation module; an analysis executing module configured to analyze semantic information of the partial area using a plurality of kinds of analysis component modules; and a component formation module configured to select and construct one or a plurality of analysis component modules on the basis of the features of the text area extracted by the feature extraction module, and to permit the analysis executing module to execute analysis of the semantic information of the partial area according to the one or plurality of analysis component modules thus constructed.
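The module chain in the abstract can be pictured with a minimal Python sketch. It is an illustration under stated assumptions, not the patented implementation: the `Area` record, the thresholds, the feature names, and the two analysis components (`title_component`, `date_component`) are all hypothetical, and the layout analysis is assumed to have already classified each area and attached recognized text to it.

```python
import re
from dataclasses import dataclass


@dataclass
class Area:
    kind: str       # classification from layout analysis, e.g. "text" or "image"
    box: tuple      # (left, top, right, bottom) in pixels
    text: str = ""  # hypothetical: recognized text of a text area


def text_areas(areas):
    """Layout analysis result: keep only the areas classified as text."""
    return [a for a in areas if a.kind == "text"]


def position_info(area, page_w, page_h):
    """Text area information calculation: position relative to the page."""
    left, top, right, bottom = area.box
    return {"rel_top": top / page_h, "rel_width": (right - left) / page_w}


def extract_features(info):
    """Feature extraction: turn position information into coarse features.
    The 0.2 and 0.5 thresholds are arbitrary illustrative values."""
    feats = set()
    if info["rel_top"] < 0.2:
        feats.add("near_top")
    if info["rel_width"] > 0.5:
        feats.add("wide")
    return feats


def title_component(text):
    """Hypothetical analysis component: short text is labeled a title."""
    return "title" if 0 < len(text.split()) <= 8 else None


def date_component(text):
    """Hypothetical analysis component: detect a YYYY.MM.DD style date."""
    return "date" if re.search(r"\d{4}[./-]\d{1,2}[./-]\d{1,2}", text) else None


def form_components(feats):
    """Component formation: select analysis components from the features."""
    selected = []
    if {"near_top", "wide"} <= feats:
        selected.append(title_component)
    selected.append(date_component)
    return selected


def execute_analysis(text, components):
    """Analysis execution: run the selected components, collect their labels."""
    return [label for c in components if (label := c(text))]


def analyze_page(areas, page_w, page_h):
    results = []
    for area in text_areas(areas):
        feats = extract_features(position_info(area, page_w, page_h))
        labels = execute_analysis(area.text, form_components(feats))
        results.append((area.text, labels))
    return results


page = [
    Area("text", (100, 50, 900, 120), "QUARTERLY REPORT"),
    Area("image", (100, 200, 500, 600)),
    Area("text", (100, 700, 400, 740), "Issued 2008.10.29"),
]
results = analyze_page(page, page_w=1000, page_h=1400)
```

The point of the sketch is the ordering: the components applied to a text area are chosen from features of that area's position, so a wide area near the top of the page is checked for a title while a narrow area lower down is not.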
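Publication number US2009110288(A1) Publication date 2009.04.30
Application number US20080260485 Filing date 2008.10.29
Applicant KABUSHIKI KAISHA TOSHIBA; TOSHIBA TEC KABUSHIKI KAISHA Inventor FUJIWARA AKIHIKO
Classification G06K9/46; G06K9/62 Main classification G06K9/46
Agency Agent
Principal claim
Address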