Title: Document processing device, method, and recording medium for creating and correcting formats for extracting characters strings
Abstract: A document processing device comprises: a document data acquiring part for acquiring document data; a character string extracting part for extracting character strings satisfying a predetermined condition for character string extraction from the document data acquired by the document data acquiring part; a format creating part for deriving the respective features of the character strings extracted by the character string extracting part, and for creating a format containing the derived features in the form of data; a display part on which the character strings extracted by the character string extracting part are displayed in a list form, and on which the format created by the format creating part is displayed; and a format correcting part for correcting the format displayed on the display part. The character string extracting part extracts character strings again to conform to the format corrected by the format correcting part.
Publication number: US8854635(B2)    Publication date: 2014.10.07
Application number: US201012842160    Filing date: 2010.07.23
Applicant: Konica Minolta Business Technologies, Inc.    Inventors: Mishima Nobuhiro; Iwai Hidetaka; Inui Kazuo; Ozawa Kaitaku
Classification: G06F3/12; G06K15/00; G06K9/00; G06F17/30    Main classification: G06F3/12
Agency: Buchanan Ingersoll & Rooney PC    Agent: Buchanan Ingersoll & Rooney PC
Main claim: 1. A document processing device comprising: a document data acquiring part for acquiring document data; a character string extracting part for extracting character strings that correspond to at least one of a heading, a title, or a subtitle as bookmark candidate character strings from said document data acquired by said document data acquiring part; a format creating part for deriving respective features of said bookmark candidate character strings extracted by said character string extracting part, categorizing said bookmark candidate character strings having a common feature into one group, and creating a format for the group based on the common feature of the group; a display part on which said bookmark candidate character strings extracted by said character string extracting part are displayed in a list form, and on which the format corresponding to said bookmark candidate character strings created by said format creating part is displayed; and a format correcting part for receiving a correcting operation of a user given to the format displayed on said display part and correcting the format in response to the correcting operation, wherein said character string extracting part extracts the character strings again that have the common feature of the format corrected by said format correcting part from said document data as said bookmark candidate character strings.
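The flow recited in the claim (extract candidates, derive features, group by common feature into formats, correct a format, extract again) can be illustrated with a minimal Python sketch. It is not the patented implementation. Everything in it is an assumption made for illustration: the `Line` and `Format` types, the three features chosen (font size, bold, leading number), the initial extraction condition "larger than body text or bold", and the convention that a `None` field in a corrected format means "any value".

```python
import re
from collections import defaultdict
from dataclasses import astuple, dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Line:
    """One line of acquired document data (hypothetical representation)."""
    text: str
    font_size: int
    bold: bool


@dataclass(frozen=True)
class Format:
    """Features shared by one group of candidates; None means 'any value'."""
    font_size: Optional[int]
    bold: Optional[bool]
    numbered: Optional[bool]


# Assumed numbering feature: text starts with "1. ", "2.3 ", and so on.
NUMBERING = re.compile(r"^\d+(\.\d+)*\.?\s")


def derive_format(line: Line) -> Format:
    """Derive the features of a single character string."""
    return Format(line.font_size, line.bold, bool(NUMBERING.match(line.text)))


def extract_candidates(lines, body_size):
    """Initial extraction under an assumed condition: larger than body text, or bold."""
    return [ln for ln in lines if ln.font_size > body_size or ln.bold]


def create_formats(candidates):
    """Group candidates with identical features; each group key is its format."""
    groups = defaultdict(list)
    for ln in candidates:
        groups[derive_format(ln)].append(ln.text)
    return dict(groups)


def matches(fmt: Format, line: Line) -> bool:
    """True if every non-None field of fmt equals the line's derived feature."""
    derived = derive_format(line)
    return all(want is None or want == got
               for want, got in zip(astuple(fmt), astuple(derived)))


def extract_by_format(lines, fmt):
    """Re-extraction: scan the whole document for lines conforming to fmt."""
    return [ln.text for ln in lines if matches(fmt, ln)]


doc = [
    Line("1. Introduction", 14, True),
    Line("This device acquires document data.", 10, False),
    Line("Note", 10, True),
    Line("2. Method", 14, True),
    Line("3. Results", 14, False),
    Line("Body text continues here.", 10, False),
]

candidates = extract_candidates(doc, body_size=10)
formats = create_formats(candidates)          # what a display part would list

# Simulated user correction: relax the 'bold' requirement of the heading format.
heading_fmt = Format(14, True, True)
corrected = replace(heading_fmt, bold=None)
bookmarks = extract_by_format(doc, corrected)
print(bookmarks)
```

In this sketch the initial pass splits the 14-point numbered headings into two groups because one of them is not bold. Relaxing the `bold` field of the corrected format lets the second extraction return all three numbered headings while still excluding the bold "Note" line.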
Address: Chiyoda-Ku, Tokyo JP