Title of invention |
Recognizing text from frames of image data using contextual information |
Abstract |
Disclosed are techniques for recognizing text from one or more frames of image data using contextual information. In some implementations, image data including a captured textual item is processed to identify an entity in the image data. A context can be selected using the entity, where the context corresponds to a dictionary. Text in the captured textual item can be identified using the dictionary. The identified text can be output to a display device. |
Publication number |
US9355336(B1) |
Publication date |
2016.05.31 |
Application number |
US201414259905 |
Filing date |
2014.04.23 |
Applicant |
Amazon Technologies, Inc. |
Inventors |
Jahagirdar Sonjeev;Cole Matthew Joseph;Ramos David Paul;Prateek Utkarsh;McConville Emilie Noelle;Datta Ankur;Finney Laura Varnum;Liu Yue;Doshi Bhavesh Anil;Sikka Avnish;Vanne Michael |
Classification numbers |
G06K9/00;G06K9/62 |
Main classification number |
G06K9/00 |
Agency |
Hogan Lovells US LLP |
Agent |
Hogan Lovells US LLP |
Principal claim |
1. A non-transitory computer-readable storage medium storing instructions executable by one or more processors of a server to cause a method to be performed for recognizing text from one or more frames of image data, the method comprising:
receiving, from a device in communication with the server via a network, image data including a captured textual item, the image data corresponding to at least a portion of a first frame of video data;
identifying an entity in the image data;
identifying a plurality of contexts using the entity, each context identifying one or more of a document, a product, or a service, each context corresponding to one or more of a plurality of dictionaries stored in a database;
generating a respective confidence level for each context according to a comparison of the entity with each context of the plurality of contexts;
identifying one or more contexts of the plurality of contexts as being related contexts to the entity, the respective confidence level for each related context of the related contexts satisfying a threshold;
selecting one or more of the dictionaries corresponding to the one or more related contexts;
determining an identified text in the captured textual item using the selected one or more dictionaries; and
sending the identified text to the device. |
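The context-selection and dictionary-lookup steps recited in the claim can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes the entity has already been identified and is available as a text label, and that a character recognizer has already produced raw, possibly misread tokens. The names `CONTEXTS`, `confidence`, `select_dictionaries`, and `identify_text`, the sample keywords and dictionaries, the string-similarity scoring, and the threshold values are all hypothetical choices made for illustration; the claim does not specify how the comparison or the confidence level is computed.

```python
from difflib import SequenceMatcher, get_close_matches

# Hypothetical context table: each context carries keywords used for the
# comparison with the entity, plus the dictionary it corresponds to.
CONTEXTS = {
    "restaurant_menu": {
        "keywords": ["menu", "fork", "plate"],
        "dictionary": ["burger", "salad", "pasta", "coffee"],
    },
    "business_card": {
        "keywords": ["card", "phone", "email"],
        "dictionary": ["manager", "engineer", "director"],
    },
}


def confidence(entity, keywords):
    """Assumed scoring: best string similarity between entity and any keyword."""
    return max(SequenceMatcher(None, entity.lower(), k).ratio() for k in keywords)


def select_dictionaries(entity, contexts, threshold=0.8):
    """Collect dictionary words from every context whose confidence meets the threshold."""
    selected = []
    for ctx in contexts.values():
        if confidence(entity, ctx["keywords"]) >= threshold:
            selected.extend(ctx["dictionary"])
    return selected


def identify_text(raw_tokens, dictionary, cutoff=0.6):
    """Replace each raw token with its closest dictionary word, if one is close enough."""
    identified = []
    for token in raw_tokens:
        match = get_close_matches(token.lower(), dictionary, n=1, cutoff=cutoff)
        identified.append(match[0] if match else token)
    return identified


if __name__ == "__main__":
    words = select_dictionaries("menu", CONTEXTS)
    print(identify_text(["burqer", "sa1ad"], words))
```

Tokens with no sufficiently close dictionary entry are passed through unchanged, so the selected dictionaries only correct text and never discard it. This is a design choice of the sketch, not something stated in the claim.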
Address |
Reno NV US |