Title: Cohort-based learning from user edits
Abstract: A platform for generating a first character recognition-based work including a first plurality of automatically-made edits, each edit being characterized by a Unicode and a confidence score. The platform may identify at least one edit as being of questionable accuracy based on the confidence score, may determine a unique character signature of the edit, and may receive a manual correction made to the edit. The platform may also store the manual correction in association with the character signature and the Unicode, such that the manual correction is configured for use in generating a second plurality of automatically-made edits in a second character recognition-based work different than the first work.
Publication number: US9286526(B1) Publication date: 2016.03.15
Application number: US201314100819 Filing date: 2013.12.09
Applicant: Amazon Technologies, Inc. Inventor: Manohar Vasant
Classification: G06K9/03;G06K9/00 Main classification: G06K9/03
Agency: Lee & Hayes, PLLC Agent: Lee & Hayes, PLLC
Main claim: 1. A method comprising: generating, by one or more computing devices, a first character recognition-based work including a first plurality of automatically-made edits made by the one or more computing devices, each edit of the first plurality of edits being characterized by a Unicode and a confidence score; comparing the respective confidence scores of the first plurality of automatically-made edits to a confidence score threshold; identifying at least one edit of the first plurality of automatically-made edits as having a respective confidence score below the confidence score threshold; characterizing the at least one edit of the first plurality of automatically-made edits as being of questionable accuracy based at least in part on the respective confidence score of the at least one edit being below the confidence score threshold; determining a character signature of the at least one edit, wherein the character signature comprises one or more of a shape identifier, a boundary identifier, or a location identifier, and wherein the character signature is indicative of a character of the at least one edit; receiving, from a first user of the one or more computing devices, a correction made to the at least one edit, the correction comprising one or more revised characters; storing, at the one or more computing devices, the one or more revised characters in association with the character signature and the Unicode of the at least one edit; and generating, using the one or more revised characters, a second plurality of automatically-made edits in a second character recognition-based work, wherein the second character recognition-based work is different than the first character recognition-based work.
Address: Seattle WA US
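The steps recited in the main claim (threshold comparison, storing a correction against a character signature and Unicode value, and reuse in a second work) can be illustrated with a minimal sketch. It is not the patented implementation. The names `Edit`, `CorrectionStore`, `flag_questionable`, and `apply_learned` are hypothetical, the signature is assumed to be a plain tuple of shape, boundary, and location identifiers, and reapplying a stored correction only to low-confidence edits is an assumption of this sketch rather than something the claim states.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

# Hypothetical signature layout: (shape_id, boundary_id, location_id).
Signature = Tuple[str, str, str]


@dataclass(frozen=True)
class Edit:
    codepoint: int        # Unicode code point the recognizer chose
    confidence: float     # recognizer confidence, assumed to lie in [0, 1]
    signature: Signature  # character signature of the recognized glyph


@dataclass
class CorrectionStore:
    # Maps (signature, codepoint) to the revised characters a user supplied.
    _corrections: Dict[Tuple[Signature, int], str] = field(default_factory=dict)

    def record(self, edit: Edit, revised: str) -> None:
        """Store a manual correction keyed by signature and code point."""
        self._corrections[(edit.signature, edit.codepoint)] = revised

    def lookup(self, edit: Edit) -> Optional[str]:
        """Return the stored correction for this edit, or None if absent."""
        return self._corrections.get((edit.signature, edit.codepoint))


def flag_questionable(edits: List[Edit], threshold: float) -> List[Edit]:
    """Return the edits whose confidence falls below the threshold."""
    return [e for e in edits if e.confidence < threshold]


def apply_learned(edits: List[Edit], store: CorrectionStore,
                  threshold: float) -> List[str]:
    """Produce output characters for a later work, reusing stored corrections.

    Assumption: a stored correction replaces the recognizer's choice only
    when the new edit is itself below the confidence threshold.
    """
    output = []
    for e in edits:
        revised = store.lookup(e) if e.confidence < threshold else None
        output.append(revised if revised is not None else chr(e.codepoint))
    return output
```

In use, a reviewer's correction to a flagged edit in the first work would be passed to `store.record(...)`, and `apply_learned(...)` would then be called on the edits of a second work with the same store and threshold.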