Title of invention |
Data extraction confidence attribute with transformations |
Abstract |
A data extraction system for receiving and scanning documents to generate ordered input for storage in a database employs a non-linear statistical model for a data extraction sequence having a plurality of transformations. Each transformation transitions an extracted data value in various forms from a raw data image to a computed data value. For each transformation, a confidence model learns a confidence component for the particular transformation. The learned confidence components, generated from a control set of documents having known values, are employed in a production mode with actual raw data. The confidence component corresponds to a likelihood of transformation accuracy, and the confidence model aggregates the confidence components to compute a confidence for the extracted data value. A database stores the extracted data value labeled with the computed confidence attribute for subsequent use by an application employing the extracted data. |
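The flow described in the abstract (learn one confidence component per transformation from a control set with known values, then aggregate the components into a confidence attribute stored with the extracted value) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented method: the function names, the smoothed-accuracy estimate, and the product aggregation are all hypothetical stand-ins, and the product in particular is a simple substitute for the non-linear statistical model the abstract refers to.

```python
from math import prod


def learn_components(control_results):
    """Estimate one confidence component per transformation.

    control_results maps a (hypothetical) transformation name to a list of
    booleans, one per control document, saying whether that transformation
    produced the known-correct output.
    """
    components = {}
    for name, outcomes in control_results.items():
        # Assumption: smoothed empirical accuracy (add-one smoothing) stands
        # in for the learned likelihood of transformation accuracy.
        components[name] = (sum(outcomes) + 1) / (len(outcomes) + 2)
    return components


def aggregate(components, applied):
    """Combine the components of the transformations applied to one value.

    Assumption: a plain product, i.e. the transformations are treated as
    independent. The source does not specify this aggregation.
    """
    return prod(components[name] for name in applied)


def label_value(value, components, applied):
    """Return the extracted value labeled with its confidence attribute."""
    return {"value": value, "confidence": aggregate(components, applied)}


# Learning mode: control set with known values (illustrative data only).
control = {
    "ocr": [True, True, True, False],
    "parse": [True, True],
}
components = learn_components(control)

# Production mode: label a newly extracted value before storing it.
record = label_value("2011-07-11", components, ["ocr", "parse"])
```

An application reading such a record could, for example, route values whose confidence falls below a chosen threshold to manual review; the threshold itself would be an application-level choice not given in the source.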
Publication number |
US8676731(B1) |
Publication date |
2014.03.18 |
Application number |
US201113180068 |
Filing date |
2011.07.11 |
Applicant |
SATHYANARAYANA VINAYA;PATI PEETA BASA;SIVANANDA SALAKA;T. R. RAJARAJAN;CORELOGIC, INC. |
Inventor |
SATHYANARAYANA VINAYA;PATI PEETA BASA;SIVANANDA SALAKA;T. R. RAJARAJAN |
Classification number |
G06F15/18 |
Main classification number |
G06F15/18 |
Agency |
|
Agent |
|
Main claim |
|
Address |
|