Title: Hierarchical classification in credit card data extraction
Abstract: Embodiments herein provide computer-implemented techniques for allowing a user computing device to extract financial card information using optical character recognition (“OCR”). Extracting financial card information may be improved by applying various classifiers and other transformations to the image data. For example, applying a linear classifier to the image to determine digit locations before applying the OCR algorithm allows the user computing device to use less processing capacity to extract accurate card data. The OCR application may train a classifier to use the wear patterns of a card to improve OCR algorithm performance. The OCR application may apply a linear classifier and then a nonlinear classifier to improve the performance and the accuracy of the OCR algorithm. The OCR application may also use the known digit patterns of typical credit and debit cards to improve the accuracy of the OCR algorithm.
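The linear-then-nonlinear ordering described in the abstract can be illustrated with a small two-stage cascade. This is a minimal sketch under stated assumptions, not the patented implementation: the logistic scoring, the RBF prototype stage, the feature vectors, and every name below (`linear_score`, `nonlinear_score`, `cascade`, both thresholds) are hypothetical choices made for illustration. The only idea taken from the record is that a cheap linear stage filters candidates before a costlier nonlinear stage runs.

```python
import math


def linear_score(features, weights, bias):
    """Cheap first stage: dot product plus bias, squashed to (0, 1)."""
    z = sum(f * w for f, w in zip(features, weights)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def nonlinear_score(features, prototypes, gamma=1.0):
    """Costlier second stage (assumed here to be an RBF similarity
    to the closest stored digit prototype)."""
    best = 0.0
    for proto in prototypes:
        dist_sq = sum((f - p) ** 2 for f, p in zip(features, proto))
        best = max(best, math.exp(-gamma * dist_sq))
    return best


def cascade(windows, weights, bias, prototypes,
            linear_threshold=0.5, nonlinear_threshold=0.5):
    """Return indices of windows that pass both stages.

    The nonlinear stage only runs on windows the linear stage accepts,
    which is where the processing savings would come from.
    """
    survivors = []
    for index, features in enumerate(windows):
        if linear_score(features, weights, bias) < linear_threshold:
            continue  # rejected cheaply; nonlinear stage is skipped
        if nonlinear_score(features, prototypes) >= nonlinear_threshold:
            survivors.append(index)
    return survivors


# Toy feature vectors, made up for the example.
windows = [[1.0, 1.0], [-1.0, -1.0], [1.0, -0.2]]
kept = cascade(windows, weights=[1.0, 1.0], bias=0.0,
               prototypes=[[1.0, 1.0]])
print(kept)
```

In this toy run the second window is dropped by the linear stage alone, and the third passes the linear stage but is rejected by the nonlinear one.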
Publication number: US9213907(B2) Publication date: 2015.12.15
Application number: US201314059071 Filing date: 2013.10.21
Applicant: GOOGLE INC. Inventors: Kumar Sanjiv;Rowley Henry Allan;Wang Xiaohang;Rodrigues Jose Jeronimo Moreira
Classification: G06K9/62;G06K9/18;G06K9/66;G06T3/00;G06K9/00;G06Q20/22;G06Q20/34;G07F7/08;G06K9/32 Main classification: G06K9/62
Agency: Johnson, Marcou & Isaacs, LLC Agent: Johnson, Marcou & Isaacs, LLC
Principal claim: 1. A computer-implemented method to extract card information, comprising: receiving, by one or more computing devices, an image of a card from a camera; identifying, by the one or more computing devices, a first area of the image, the first area being selected as a potential location of a digit on the card in the image and of a size that will encompass not more than a single complete digit, the potential location and the size of the first area being identified from a comparison of the image to a database of card layouts stored on the one or more computing devices; performing, by the one or more computing devices, a linear classification algorithm on data encompassed by the first area; determining, by the one or more computing devices, a confidence level of a first result of the application of the linear classification algorithm to the first area, wherein the confidence level of the first result indicates a likelihood that the first area encompasses the single complete digit; determining, by the one or more computing devices, that the first area does not encompass a single complete digit upon determining that the confidence level of the first result is under a configured threshold; identifying, by the one or more computing devices, a second area of the image, the second area being in a different location from the first area and of a size that will encompass not more than a single complete digit; performing, by the one or more computing devices, a linear classification algorithm on data encompassed by the second area; determining, by the one or more computing devices, a confidence level of a second result of the application of the linear classification algorithm to the second area indicating that the second area encompasses a single complete digit, wherein the confidence level of the second result indicates a likelihood that the second area encompasses the single complete digit; determining, by the one or more computing devices, that the second area encompasses the single complete digit upon determining that the confidence level of the second result is over a configured threshold; and performing, by the one or more computing devices, an optical character recognition algorithm on the second area upon a determination that the second area encompasses the single complete digit.
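The sequence of steps in the claim (try a candidate area, score it with a linear classifier, reject it if the confidence is under the threshold, move to another area, and run OCR only on an accepted area) can be sketched as follows. This is an illustrative reading, not the claimed implementation: the logistic confidence function, the `(top, left, height, width)` area format, the `ocr` callback, and all names are assumptions. The candidate areas are simply passed in; the claim derives them from a comparison against a database of card layouts, which is not modeled here.

```python
import math


def linear_confidence(pixels, weights, bias):
    """Assumed linear classifier: logistic of a weighted pixel sum."""
    z = sum(p * w for p, w in zip(pixels, weights)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def crop(image, area):
    """Flatten the pixels inside area = (top, left, height, width)."""
    top, left, height, width = area
    return [image[r][c]
            for r in range(top, top + height)
            for c in range(left, left + width)]


def find_and_read_digit(image, candidate_areas, weights, bias,
                        threshold, ocr):
    """Try areas in order; run OCR only on the first area whose
    confidence is over the threshold.

    The claim specifies 'under' (reject) and 'over' (accept); treating
    a confidence exactly equal to the threshold as a rejection is an
    assumption of this sketch.
    """
    for area in candidate_areas:
        pixels = crop(image, area)
        if linear_confidence(pixels, weights, bias) > threshold:
            return area, ocr(pixels)
    return None  # no candidate area was accepted


# Toy 2x4 binary image: blank on the left, "ink" on the right.
image = [[0, 0, 1, 1],
         [0, 0, 1, 1]]
areas = [(0, 0, 2, 2), (0, 2, 2, 2)]  # first area, then second area
result = find_and_read_digit(
    image, areas, weights=[1, 1, 1, 1], bias=-2, threshold=0.5,
    ocr=lambda pixels: "1",  # stand-in for a real OCR algorithm
)
print(result)
```

Here the first area scores under the threshold and is skipped, so the stand-in OCR step runs only on the second area, mirroring the order of determinations in the claim.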
Address: Mountain View CA US