Title of Invention: DOCUMENT TYPE CLASSIFICATION FOR SCANNED BITMAPS
Abstract: Systems and methods are described that facilitate determining the original document format of a scanned document by analyzing its bitmap. Text objects are extracted from the document, binarized, and segmented to identify text. Page orientation and text size are used to distinguish a slideshow-type document from a word processing or spreadsheet-type document. To further distinguish between the word processing and spreadsheet types, the text column structure and column count are analyzed.
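The two-stage decision described in the abstract can be sketched roughly as follows. This is an illustrative sketch only, not the claimed method: it assumes text extraction, binarization, and segmentation have already produced per-page measurements, and the `PageFeatures` fields, the `classify_page` function, both thresholds, and the direction of each rule (landscape with large text suggests a slideshow; several aligned columns suggest a spreadsheet) are assumptions that do not appear in the record.

```python
from dataclasses import dataclass


@dataclass
class PageFeatures:
    """Hypothetical per-page measurements taken after text segmentation."""
    width_px: int                 # bitmap width in pixels
    height_px: int                # bitmap height in pixels
    dpi: int                      # scan resolution
    median_text_height_px: float  # median height of segmented text lines
    column_count: int             # number of detected text columns
    columns_aligned: bool         # stand-in for "text column structure"


def classify_page(f: PageFeatures,
                  large_text_pt: float = 18.0,
                  min_spreadsheet_columns: int = 4) -> str:
    """Return 'slideshow', 'spreadsheet', or 'word_processing'.

    Both thresholds are assumed values chosen for illustration.
    """
    # Convert text height from pixels to points (72 points per inch).
    text_pt = f.median_text_height_px * 72.0 / f.dpi
    landscape = f.width_px > f.height_px

    # Stage 1: page orientation and text size separate slideshows
    # from the other two types (assumed rule: landscape + large text).
    if landscape and text_pt >= large_text_pt:
        return "slideshow"

    # Stage 2: column structure and count separate spreadsheets
    # from word processing documents (assumed rule: many aligned columns).
    if f.columns_aligned and f.column_count >= min_spreadsheet_columns:
        return "spreadsheet"
    return "word_processing"


slide = PageFeatures(3300, 2550, 300, 100.0, 1, False)
sheet = PageFeatures(2550, 3300, 300, 42.0, 6, True)
letter = PageFeatures(2550, 3300, 300, 46.0, 2, False)
print(classify_page(slide), classify_page(sheet), classify_page(letter))
```

Ordering the checks this way mirrors the abstract: the slideshow test runs first, and column analysis is consulted only for pages that remain.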
Publication No.: US2010033765(A1) Publication Date: 2010.02.11
Application No.: US20080185904 Filing Date: 2008.08.05
Applicant: XEROX CORPORATION Inventors: FAN ZHIGANG;NAGARAJAN RAMESH
Classification: H04N1/40 Main Classification: H04N1/40
Agency: Agent:
Principal Claim:
Address: