Title of invention: Automatic method of identifying sentence boundaries in a document image
Abstract: A method of automatically identifying sentence boundaries in a document image without performing character recognition to generate an ASCII representation of the document text. The identification process begins by selecting a connected component from the multiplicity of connected components of a text line. Next, it is determined, based on its shape, whether the selected connected component might represent a period. If the selected connected component is dot shaped, it is then determined whether it might be part of a colon. Finally, if the selected connected component is dot shaped and not part of a colon, it is labeled as a sentence boundary.
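Publication number: US5892842(A); Publication date: 1999.04.06
Application number: US19950572597; Filing date: 1995.12.14
Applicant: XEROX CORPORATION; Inventor: BLOOMBERG, DAN S.
Classification: G06K9/62; G06K9/32; (IPC1-7): G06K9/34; Main classification: G06K9/62
Agency: ; Agent:
Main claim:
Address: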
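
The decision sequence described in the abstract (select a connected component, test whether it is dot shaped, rule out a colon, then label it) might be sketched roughly as follows. This is only an illustration under stated assumptions, not the patented implementation: the `Component` bounding-box type, the `x_height` parameter, the size and aspect-ratio thresholds, and the rule that a colon is two dot-shaped components overlapping horizontally within the same text line are all hypothetical choices that the record does not specify.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Component:
    """Bounding box of one connected component (hypothetical representation)."""
    x: int  # left edge, in pixels
    y: int  # top edge, in pixels
    w: int  # width, in pixels
    h: int  # height, in pixels


def is_dot_shaped(c: Component, x_height: float,
                  max_rel_size: float = 0.35,
                  max_aspect: float = 1.5) -> bool:
    """Assumed shape test: small relative to the x-height and roughly square."""
    if c.w <= 0 or c.h <= 0:
        return False
    longest, shortest = max(c.w, c.h), min(c.w, c.h)
    return longest <= max_rel_size * x_height and longest / shortest <= max_aspect


def overlaps_horizontally(a: Component, b: Component) -> bool:
    """True if the horizontal extents of the two boxes intersect."""
    return a.x < b.x + b.w and b.x < a.x + a.w


def is_part_of_colon(c: Component, line: List[Component], x_height: float) -> bool:
    """Assumed colon test: another dot-shaped component shares c's column."""
    return any(
        other is not c
        and is_dot_shaped(other, x_height)
        and overlaps_horizontally(c, other)
        for other in line
    )


def find_sentence_boundaries(line: List[Component], x_height: float) -> List[Component]:
    """Label components that are dot shaped and not part of a colon."""
    return [
        c for c in line
        if is_dot_shaped(c, x_height) and not is_part_of_colon(c, line, x_height)
    ]
```

The sketch works purely on bounding boxes, in keeping with the abstract's point that no character recognition is performed. It makes no attempt to handle other dot-shaped marks, such as the dot of an "i", because the record does not say how those are treated.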