Title of invention: Recognizing text from frames of image data using contextual information
Abstract: Disclosed are techniques for recognizing text from one or more frames of image data using contextual information. In some implementations, image data including a captured textual item is processed to identify an entity in the image data. A context can be selected using the entity, where the context corresponds to a dictionary. Text in the captured textual item can be identified using the dictionary. The identified text can be output to a display device.
Publication number: US9355336(B1) Publication date: 2016.05.31
Application number: US201414259905 Filing date: 2014.04.23
Applicant: Amazon Technologies, Inc. Inventors: Jahagirdar Sonjeev; Cole Matthew Joseph; Ramos David Paul; Prateek Utkarsh; McConville Emilie Noelle; Datta Ankur; Finney Laura Varnum; Liu Yue; Doshi Bhavesh Anil; Sikka Avnish; Vanne Michael
Classification: G06K9/00; G06K9/62 Main classification: G06K9/00
Agency: Hogan Lovells US LLP Agent: Hogan Lovells US LLP
Principal claim: 1. A non-transitory computer-readable storage medium storing instructions executable by one or more processors of a server to cause a method to be performed for recognizing text from one or more frames of image data, the method comprising: receiving, from a device in communication with the server via a network, image data including a captured textual item, the image data corresponding to at least a portion of a first frame of video data; identifying an entity in the image data; identifying a plurality of contexts using the entity, each context identifying one or more of a document, a product, or a service, each context corresponding to one or more of a plurality of dictionaries stored in a database; generating a respective confidence level for each context according to a comparison of the entity with each context of the plurality of contexts; identifying one or more contexts of the plurality of contexts as being related contexts to the entity, the respective confidence level for each related context of the related contexts satisfying a threshold; selecting one or more of the dictionaries corresponding to the one or more related contexts; determining an identified text in the captured textual item using the selected one or more dictionaries; and sending the identified text to the device.
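Illustrative sketch (not part of the patent record): the server-side steps recited in claim 1 can be approximated in a few lines of Python. Everything concrete here is an assumption made for illustration: the context names, keyword sets, and dictionaries in CONTEXTS, the use of Jaccard overlap as the confidence level, the 0.3 threshold, and fuzzy matching with difflib in place of a real OCR engine. The claim itself does not specify any of these choices, and entity detection and network transport are omitted.

```python
from difflib import get_close_matches

# Hypothetical contexts. Each one names a document, product, or service and
# carries keywords (for comparison with the entity) plus a word dictionary.
CONTEXTS = {
    "pharmacy_label": {
        "keywords": {"pill", "bottle", "prescription"},
        "dictionary": ["ibuprofen", "tablet", "dosage", "refill"],
    },
    "restaurant_menu": {
        "keywords": {"menu", "plate", "restaurant"},
        "dictionary": ["appetizer", "entree", "dessert", "espresso"],
    },
}


def confidence(entity_tags, keywords):
    """Confidence level for one context: Jaccard overlap (an assumed metric)."""
    union = entity_tags | keywords
    return len(entity_tags & keywords) / len(union) if union else 0.0


def related_contexts(entity_tags, threshold=0.3):
    """Return the contexts whose confidence level satisfies the threshold."""
    return [
        name
        for name, ctx in CONTEXTS.items()
        if confidence(entity_tags, ctx["keywords"]) >= threshold
    ]


def select_dictionaries(context_names):
    """Merge the dictionaries that correspond to the related contexts."""
    words = []
    for name in context_names:
        words.extend(CONTEXTS[name]["dictionary"])
    return words


def identify_text(raw_tokens, words, cutoff=0.7):
    """Snap noisy recognized tokens to the closest dictionary word, if any."""
    identified = []
    for token in raw_tokens:
        match = get_close_matches(token.lower(), words, n=1, cutoff=cutoff)
        identified.append(match[0] if match else token)
    return identified


if __name__ == "__main__":
    entity = {"pill", "bottle"}               # tags for the identified entity
    noisy = ["ibuprofan", "tab1et", "200mg"]  # raw text from the frame
    selected = select_dictionaries(related_contexts(entity))
    print(identify_text(noisy, selected))
```

In this sketch only the pharmacy context passes the threshold, so only its dictionary is used to correct the noisy tokens; tokens with no close dictionary match are passed through unchanged.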
Address: Reno, NV, US