Title of invention: Method and system of extracting structured data from a document
Abstract: This disclosure provides an exemplary method and system for extracting structured data from an unstructured textual document. In the exemplary method, a layout analysis is performed first, yielding one or more alternatives for grouping and ordering the page elements of interest. Next, the content of these page elements is tagged based on application-specific heuristics. Finally, a sequence-based method is applied to the tags to identify repetitive contiguous patterns.
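As a rough illustration of the last two steps in the abstract (tagging, then finding repetitive contiguous patterns in the tag sequence), the following is a minimal sketch and not the method claimed in the patent. The layout analysis step is omitted; the page elements are assumed to arrive already grouped and ordered. The function names `tag_element` and `find_repeated_pattern`, the tag labels, and the regular-expression heuristics are all hypothetical and chosen only for this example.

```python
import re


def tag_element(text):
    """Assign a coarse tag to one page element (hypothetical heuristics)."""
    if re.fullmatch(r"\d{1,2}/\d{1,2}/\d{4}", text):
        return "DATE"
    if re.fullmatch(r"\$?\d+(?:,\d{3})*\.\d{2}", text):
        return "AMOUNT"
    return "TEXT"


def find_repeated_pattern(tags, max_len=5, min_repeats=2):
    """Find the back-to-back repeated tag pattern that covers the most tags.

    Returns (start_index, pattern, repeat_count), or None if no pattern of
    length <= max_len repeats contiguously at least min_repeats times.
    This brute-force search is an assumption of the sketch, not the
    sequence-based method of the patent.
    """
    best = None
    n = len(tags)
    for size in range(1, max_len + 1):
        for start in range(0, n - size + 1):
            pattern = tags[start:start + size]
            repeats = 1
            pos = start + size
            # Count how many times the pattern repeats immediately after itself.
            while tags[pos:pos + size] == pattern:
                repeats += 1
                pos += size
            covered = repeats * size
            if repeats >= min_repeats and (best is None or covered > best[0]):
                best = (covered, start, tuple(pattern), repeats)
    if best is None:
        return None
    return best[1], best[2], best[3]


# Hypothetical page elements, already in reading order.
elements = [
    "Invoice",
    "01/02/2014", "Widget", "10.00",
    "01/03/2014", "Gadget", "25.50",
    "01/04/2014", "Gizmo", "7.25",
    "Total",
]
tags = [tag_element(e) for e in elements]
result = find_repeated_pattern(tags)
```

In this sketch, each repetition of the detected pattern would correspond to one candidate record (for instance, one invoice line), so the structured data can be read off by slicing `elements` at the pattern boundaries.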
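Publication number: EP2884425(A1) Publication date: 2015.06.17
Application number: EP20140195976 Filing date: 2014.12.02
Applicant: XEROX CORPORATION Inventors: DÉJEAN, HERVÉ; SCHROEDER, DARREN S.
Classification: G06K9/00 Main classification: G06K9/00
Agency: Agent:
Main claim:
Address: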