Title of invention Dynamically generating table of contents for printable or scanned content
Abstract Methods and devices receive a document comprising raster images, using an optical scanner. These methods and devices automatically identify topical items within the raster images based on raster content in the raster images, using a processor. Further, these methods and devices automatically associate the topical items with topics in the document based on previously established rules for identifying topical sections, and automatically crop the topical items from the raster images to produce cropped portions of the raster images, using the processor. These methods and devices then automatically create an index for the document by combining the cropped portions of the raster images organized by the topics, using the processor, and output the index from the processor.
Publication number US9454696(B2) Publication date 2016.09.27
Application number US201414254976 Filing date 2014.04.17
Applicant Xerox Corporation Inventors Campanelli Michael R.; Prabhat Saurabh; Srinivasan Raja
Classification G06F3/12; G06K9/00; G06K15/02; H04N1/387 Main classification G06F3/12
Agency Gibb & Riley, LLC Agent Gibb & Riley, LLC
Main claim 1. A method comprising: receiving a document using an optical scanner comprising raster images; automatically identifying topical items within said raster images based only on distinct font styles of images of text characters in said raster images, using a processor, said images of text characters being pixel-based and being distinct from recognized characters produced in optical character recognition processing; automatically ranking said topical items based on previously established rules for identifying topical sections in documents, using said processor; automatically filtering said topical items based on said ranking to identify highest-ranking topical items, using said processor; automatically associating said highest-ranking topical items with topics and subtopics in said document based on said previously established rules, using said processor; automatically cropping said highest-ranking topical items from said raster images by copying pixel patterns of said topical items within said raster images to produce cropped-image portions of said raster images, using said processor; automatically creating a cropped-image index for said document by combining said cropped-image portions of said raster images organized by said topics and subtopics, using said processor, said cropped-image index comprising multiple ones of said cropped-image portions combined together and organized by said topics and subtopics; and outputting said cropped-image index from said processor.
Address Norwalk CT US
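
Illustrative sketch (not part of the patent record). The rank, filter, associate, crop, and combine steps recited in the main claim can be pictured with a small Python example. It is a sketch under stated assumptions, not the patented implementation. Pages are assumed to be plain lists of pixel rows, and the font-style detection step is assumed to have already produced bounding boxes. `TopicalItem`, `glyph_height`, `levels`, and `indent` are hypothetical names introduced here. The rule "taller text ranks higher" merely stands in for the "previously established rules", which the record does not specify. No character recognition is involved: headings are copied as pixels.

```python
from dataclasses import dataclass
from typing import List

Raster = List[List[int]]  # rows of pixel values; 0 = background, 1 = ink


@dataclass
class TopicalItem:
    """A candidate heading region (hypothetical structure, not from the patent)."""
    page: int          # index of the page the region was found on
    top: int           # bounding box in half-open pixel coordinates
    left: int
    bottom: int
    right: int
    glyph_height: int  # assumed font-style feature: text height in pixels


def rank_items(items: List[TopicalItem]) -> List[TopicalItem]:
    # Assumed ranking rule: taller text is more likely to be a heading.
    return sorted(items, key=lambda it: it.glyph_height, reverse=True)


def filter_top(ranked: List[TopicalItem], levels: int = 2) -> List[TopicalItem]:
    # Keep only items whose height is among the `levels` largest distinct heights.
    heights = sorted({it.glyph_height for it in ranked}, reverse=True)[:levels]
    return [it for it in ranked if it.glyph_height in heights]


def crop(pages: List[Raster], item: TopicalItem) -> Raster:
    # Copy the pixel pattern inside the bounding box; no text is recognized.
    rows = pages[item.page][item.top:item.bottom]
    return [row[item.left:item.right] for row in rows]


def build_index(pages: List[Raster], items: List[TopicalItem],
                levels: int = 2, indent: int = 2) -> Raster:
    kept = filter_top(rank_items(items), levels)
    if not kept:
        return []
    # Depth 0 = topic, depth 1 = subtopic, decided by rank of glyph height.
    heights = sorted({it.glyph_height for it in kept}, reverse=True)
    # Restore reading order so the index follows the document.
    kept.sort(key=lambda it: (it.page, it.top, it.left))
    strips = []
    for it in kept:
        pad = [0] * (indent * heights.index(it.glyph_height))
        strips.append([pad + row for row in crop(pages, it)])
    # Stack the cropped strips vertically, padding each row to a common width.
    width = max(len(row) for strip in strips for row in strip)
    return [row + [0] * (width - len(row)) for strip in strips for row in strip]
```

Calling `build_index(pages, items)` returns a single raster in which each kept heading appears as its original pixels, with subtopics shifted right by `indent` columns. A real system would also need the detection step and page-number references, both of which are outside this sketch.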