Title of invention: Reading order determination apparatus, method, and program for determining reading order of characters
Abstract: A method and apparatus for determining a reading order of characters. The method includes preparing a list of character information extracted from image data by character recognition processing, and preparing a list of line information, each entry of which is made up of a line box surrounding a set of characters that are continuously aligned in the same direction in the image data and the alignment direction of the characters in that line box. In response to a request to add character information to the list of character information, the method extracts a line box containing the character region of the character to be added, obtains from the list of character information all character information whose character region is contained in that line box, and rearranges it according to position along the alignment direction of characters for that line box to determine a new reading order of characters.
Publication number: US8989494(B2) Publication date: 2015.03.24
Application number: US201213488645 Filing date: 2012.06.05
Applicant: International Business Machines Corporation Inventors: Itoko Toshinari;Sato Daisuke
Classification: G06K9/32;G06K9/18;G06K9/03;G06K9/72 Main classification: G06K9/32
Agency: Agent: Feighan Patricia B.;Davis Jennifer R.
Principal claim: 1. A reading order determination apparatus for determining a reading order of characters, comprising: a character information storage unit for storing a list of character information, wherein the list containing character information made up of text data and character region data of each character extracted from image data by character recognition processing aligned in a reading order of characters; a line information storage unit for storing a list of line information, wherein the list listing line information made up of a line box that surrounds a set of characters continuously aligned in the same direction and an alignment direction of characters in the line box in the alignment order of lines; a detection unit for, in response to a request for addition of character information to the list of character information, detecting line information having a line box containing a region indicated by character region data of added character information from the list of line information; a subset determination unit for obtaining a subset of character region data from the list of character information, each character region data indicating a region contained in the line box of the line information detected by said detection unit; a rearrangement unit for rearranging each character region data in the subset according to a position with respect to an alignment direction of characters of the line information that has been detected; and an order determination unit for determining a reading order of characters in the list of character information by updating the list of character information based on an alignment order of character region data in the subset.
Address: Armonk NY US
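
The detect, subset, rearrange, and update steps recited in the principal claim can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `CharInfo`, `LineInfo`, `contains`, and `add_character` are hypothetical, boxes are assumed to be axis-aligned `(left, top, right, bottom)` tuples, the alignment direction is assumed to be either `"horizontal"` (left to right) or `"vertical"` (top to bottom), and appending the new character at the end when no line box contains it, or when the detected line holds no characters yet, is a simplification of this sketch only.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Assumed box layout: (left, top, right, bottom) in image coordinates.
Box = Tuple[float, float, float, float]


@dataclass
class CharInfo:
    text: str   # recognized text of one character
    box: Box    # character region


@dataclass
class LineInfo:
    box: Box         # line box surrounding a run of characters
    direction: str   # assumed values: "horizontal" or "vertical"


def contains(outer: Box, inner: Box) -> bool:
    """Return True if `inner` lies entirely within `outer`."""
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and outer[2] >= inner[2] and outer[3] >= inner[3])


def add_character(chars: List[CharInfo], lines: List[LineInfo],
                  new_char: CharInfo) -> List[CharInfo]:
    """Return a new reading-order list with `new_char` placed in its line."""
    # Detection: find the line whose box contains the new character region.
    line = next((ln for ln in lines if contains(ln.box, new_char.box)), None)
    if line is None:
        # Simplification of this sketch: no containing line, so append.
        return chars + [new_char]

    # Subset determination: existing characters inside the detected line box.
    member_idx = [i for i, c in enumerate(chars) if contains(line.box, c.box)]
    subset = [chars[i] for i in member_idx] + [new_char]

    # Rearrangement: sort by position along the line's alignment direction.
    axis = 0 if line.direction == "horizontal" else 1
    subset.sort(key=lambda c: c.box[axis])

    # Order determination: put the sorted subset back where the line began.
    insert_at = member_idx[0] if member_idx else len(chars)
    members = set(member_idx)
    before = [c for i, c in enumerate(chars)
              if i < insert_at and i not in members]
    after = [c for i, c in enumerate(chars)
             if i >= insert_at and i not in members]
    return before + subset + after


if __name__ == "__main__":
    lines = [LineInfo((0, 0, 100, 10), "horizontal"),
             LineInfo((0, 20, 100, 30), "horizontal")]
    chars = [CharInfo("A", (0, 0, 10, 10)),
             CharInfo("C", (20, 0, 30, 10)),
             CharInfo("D", (0, 20, 10, 30))]
    result = add_character(chars, lines, CharInfo("B", (10, 0, 20, 10)))
    print("".join(c.text for c in result))  # prints "ABCD"
```

Only the characters inside the detected line box are re-sorted, so the order of characters in other lines is left untouched, which matches the claim's update of the list based on the alignment order of the subset.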