Title: Creating an electronic book using video-based input
Abstract: Some implementations include using a trained classifier to identify page-turn events in a video. The video may be divided into multiple segments based on the page-turn events, with each segment of the multiple segments corresponding to a pair of adjacent pages in a book. Exemplar frames that provide non-redundant data compared to other frames may be chosen from each segment. The exemplar frames may be cropped to include content portions of pages. The exemplar frames may be aligned such that a pixel is located in a same position in each frame. Optical character recognition (OCR) may be performed on exemplar frames and the OCR results for exemplar frames in each segment may be combined. The exemplar frames in each segment may be combined to create a composite image for each pair of adjacent pages in the book, and OCR may be performed on the composite image.
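The segmentation step described in the abstract can be illustrated with a minimal sketch. It assumes a classifier has already labelled each frame as page-turn or not. The function name `split_into_segments` and the boolean-flag input are hypothetical and are not taken from the patent, which does not specify this representation.

```python
from typing import List, Tuple


def split_into_segments(is_page_turn: List[bool]) -> List[Tuple[int, int]]:
    """Group consecutive non-page-turn frames into segments.

    `is_page_turn[i]` is assumed to be the (hypothetical) classifier
    output for frame i. Each returned (start, end) pair is a half-open
    range of frame indices showing one pair of adjacent pages.
    """
    segments: List[Tuple[int, int]] = []
    start = None  # index where the current segment began, if any
    for i, turning in enumerate(is_page_turn):
        if not turning and start is None:
            start = i  # a stable view of the pages begins
        elif turning and start is not None:
            segments.append((start, i))  # a page turn ends the segment
            start = None
    if start is not None:
        # the video ended while the pages were still stable
        segments.append((start, len(is_page_turn)))
    return segments


flags = [False, False, True, True, False, False, False, True, False]
segments = split_into_segments(flags)
```

With these flags, frames 0-1, 4-6, and 8 form three segments, and the page-turn frames 2-3 and 7 are discarded.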
Publication number: US9191554(B1)  Publication date: 2015.11.17
Application number: US201213677096  Filing date: 2012.11.14
Applicant: Amazon Technologies, Inc.  Inventors: Manohar Vasant; Godavarthy Sridhar; Sankaranarayanan Viswanath
Classification: G06F17/00; H04N5/14; G06F17/21  Main classification: G06F17/00
Agency: Lee & Hayes, PLLC  Agent: Lee & Hayes, PLLC; Niampally Shivshanker S.
Independent claim: 1. A method comprising: under control of one or more processors configured with executable instructions to perform acts comprising: receiving a video comprising a plurality of frames, the video including images of pages in a book being turned; dividing the plurality of frames into one or more segments, each segment of the one or more segments associated with a pair of adjacent pages of the book; determining one or more exemplar frames in each segment of the one or more segments, each of the one or more exemplar frames including non-redundant data as compared to other frames in each segment, the non-redundant data in each exemplar frame comprising data that is: associated with at least a portion of the pair of adjacent pages, and not found in other frames in each segment; in each segment, performing optical character recognition (OCR) on each of the one or more exemplar frames to create one or more OCR results; and based at least in part on processing the one or more OCR results in each segment, creating OCR content for each pair of adjacent pages in the book.
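The final step of the claim, creating OCR content from several OCR results in a segment, can be sketched as follows. The claim does not say how the results are processed. Per-line majority voting is used here purely as an illustrative assumption, and the function name `merge_ocr_results` is hypothetical. Each OCR result is assumed to be a list of text lines already aligned by line index.

```python
from collections import Counter
from itertools import zip_longest
from typing import List


def merge_ocr_results(results: List[List[str]]) -> List[str]:
    """Combine OCR results from the exemplar frames of one segment.

    Assumption (not from the patent): results are lists of text lines
    aligned by index, and the most frequent non-empty reading of each
    line is kept.
    """
    merged: List[str] = []
    # zip_longest pads shorter results so extra lines are not dropped
    for candidates in zip_longest(*results, fillvalue=""):
        votes = Counter(c for c in candidates if c)
        merged.append(votes.most_common(1)[0][0] if votes else "")
    return merged


ocr_results = [
    ["The quick", "brown fox"],
    ["The quick", "brovvn fox"],          # misread in one frame
    ["Tne quick", "brown fox", "jumps"],  # only this frame saw line 3
]
page_text = merge_ocr_results(ocr_results)
```

In this example a misreading that appears in a single frame is outvoted by the other frames, and a line visible in only one exemplar frame is still retained, which reflects the idea of exemplar frames contributing non-redundant data.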
Address: Seattle, WA, US