Title of invention: Paragraph recognition in an optical character recognition (OCR) process
Abstract: An image processing apparatus for detecting paragraphs in a textual image includes an input component for receiving an input image in which textual lines and words have been identified and a page classification component for classifying the input image as a first or second page type. The apparatus also includes a paragraph detection component for classifying all textual lines on the input image as a beginning paragraph line or a continuation paragraph line. The apparatus is also provided with a paragraph creation component for creating paragraphs that include textual lines between two successive beginning paragraph lines, including a first of the two successive beginning paragraph lines. The paragraphs that have been identified may be classified by the type of alignment they exhibit. For instance, paragraphs may be classified according to whether they are left aligned, right aligned, center aligned or justified.
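The paragraph creation and alignment classification steps described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The `TextLine` type, the function names, the pixel tolerance `tol`, and the edge-comparison heuristic are all assumptions made for illustration. The beginning/continuation label is taken as given input, since the abstract does not say how the paragraph detection component derives it.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TextLine:
    """Hypothetical line record; field names are illustrative, not from the patent."""
    text: str
    left: float          # x-coordinate of the line's left edge
    right: float         # x-coordinate of the line's right edge
    is_beginning: bool   # True: beginning paragraph line, False: continuation line


def create_paragraphs(lines: List[TextLine]) -> List[List[TextLine]]:
    """Group lines so each paragraph starts at a beginning line and runs
    up to (but not including) the next beginning line."""
    paragraphs: List[List[TextLine]] = []
    for line in lines:
        # A leading continuation line with no open paragraph starts one anyway.
        if line.is_beginning or not paragraphs:
            paragraphs.append([line])
        else:
            paragraphs[-1].append(line)
    return paragraphs


def _spread(values: List[float]) -> float:
    return max(values) - min(values)


def classify_alignment(paragraph: List[TextLine], tol: float = 2.0) -> str:
    """Assumed heuristic: compare how tightly left edges, right edges,
    and midpoints agree across the paragraph's lines."""
    if len(paragraph) < 2:
        return "unknown"  # a single line gives no alignment evidence
    lefts = [ln.left for ln in paragraph]
    rights = [ln.right for ln in paragraph]
    centers = [(ln.left + ln.right) / 2 for ln in paragraph]

    left_ok = _spread(lefts) <= tol
    right_ok = _spread(rights) <= tol
    # The last line of a justified paragraph is usually short, so ignore it
    # when there are at least three lines to compare.
    right_ok_but_last = len(paragraph) >= 3 and _spread(rights[:-1]) <= tol

    if left_ok and (right_ok or right_ok_but_last):
        return "justified"
    if left_ok:
        return "left"
    if right_ok:
        return "right"
    if _spread(centers) <= tol:
        return "center"
    return "unknown"


if __name__ == "__main__":
    sample = [
        TextLine("First paragraph starts", 10, 100, True),
        TextLine("and continues", 10, 80, False),
        TextLine("then ends.", 10, 50, False),
        TextLine("Second paragraph fills", 10, 100, True),
        TextLine("the full column width", 10, 100, False),
        TextLine("here.", 10, 40, False),
    ]
    for para in create_paragraphs(sample):
        print(len(para), classify_alignment(para))
```

This sketch ignores first-line indentation, which a practical classifier would need to tolerate, and it omits the page classification step that the abstract places before paragraph detection.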
Publication number: US8565474(B2)  Publication date: 2013.10.22
Application number: US20100720992  Filing date: 2010.03.10
Applicant: RADAKOVIC BOGDAN;GALIC SASA;UZELAC ALEKSANDAR;MICROSOFT CORPORATION  Inventor: RADAKOVIC BOGDAN;GALIC SASA;UZELAC ALEKSANDAR
Classification: G06K9/00  Main classification: G06K9/00
Agency:  Agent:
Principal claim:
Address: