Title of invention: METHODS AND SYSTEM FOR DOCUMENT RECONSTRUCTION
Abstract: Different embodiments of the invention use different techniques for analyzing an unstructured document to define a structured document. The unstructured document includes numerous primitive elements, but does not include structural elements that specify the structural relationship between the primitive elements and/or structural attributes of the document based on these primitive elements. To define the structured document, the primitive elements of the unstructured document are used to identify various geometric attributes of the unstructured document. The identified geometric attributes and other attributes of the primitive elements are used to define structural elements, such as associated primitive elements (e.g., words, paragraphs, joined graphs, etc.), tables, guides, gutters, etc., as well as to define the flow of reading through the primitive and structural elements. Various methods to enhance the efficiency of the geometric analysis and document reconstruction processes (e.g., hierarchical profiling, efficient cluster analysis techniques, efficient data structures) are provided.
Publication number: WO2010078475(A3) Publication date: 2011.04.14
Application number: WO2009US69885 Filing date: 2009.12.31
Applicant: APPLE INC.; MANSFIELD, PHILIP, ANDREW; LEVY, MICHAEL, ROBERT; CLEGG, DEREK, B. Inventor: MANSFIELD, PHILIP, ANDREW; LEVY, MICHAEL, ROBERT; CLEGG, DEREK, B.
Classification: G06F17/27 Main classification: G06F17/27
Agency: Agent:
Principal claim:
Address: