Title of invention |
Method and apparatus for detecting a table of contents and reference determination |
Abstract |
In a method for identifying a table of contents in a document, an ordered sequence of text fragments is derived from the document. A table of contents is selected as a contiguous sub-sequence of the ordered sequence of text fragments satisfying the criteria: (i) entries defined by text fragments of the table of contents each have a link to a target text fragment having textual similarity with the entry; (ii) no target text fragment lies within the table of contents; and (iii) the target text fragments have an ascending ordering corresponding to an ascending ordering of the entries defining the target text fragments. |
Publication number |
US8706475(B2) |
Publication date |
2014.04.22 |
Application number |
US20050032814 |
Filing date |
2005.01.10 |
Applicant |
DEJEAN HERVE;MEUNIER JEAN-LUC;FAMBON OLIVIER;XEROX CORPORATION |
Inventor |
DEJEAN HERVE;MEUNIER JEAN-LUC;FAMBON OLIVIER |
Classification |
G06F17/20;G06F3/00;G06F7/00;G06F17/27 |
Main classification |
G06F17/20 |
Agency |
|
Agent |
|
Principal claim |
|
Address |
|
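
The three criteria stated in the abstract can be illustrated with a minimal Python sketch that checks whether a candidate contiguous sub-sequence qualifies as a table of contents. This is not the patented implementation. The function names, the `links` mapping, the use of `difflib.SequenceMatcher` as the similarity measure, and the 0.6 threshold are all assumptions made for illustration. The sketch also assumes that every fragment in the candidate span is an entry and that "ascending" means strictly increasing target positions.

```python
from difflib import SequenceMatcher


def similar(a: str, b: str, threshold: float = 0.6) -> bool:
    """Hypothetical similarity test; the abstract names no specific measure."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold


def satisfies_toc_criteria(fragments, start, end, links, threshold=0.6):
    """Check criteria (i)-(iii) for the candidate TOC fragments[start:end].

    `links` (an assumed structure) maps the index of each entry to the
    index of its target fragment in the same ordered sequence.
    """
    previous_target = -1
    for entry in range(start, end):
        target = links.get(entry)
        # (i) each entry has a link to a textually similar target fragment
        if target is None:
            return False
        if not similar(fragments[entry], fragments[target], threshold):
            return False
        # (ii) no target fragment lies within the table of contents
        if start <= target < end:
            return False
        # (iii) targets ascend in the same order as the entries
        if target <= previous_target:
            return False
        previous_target = target
    # An empty span is not treated as a table of contents
    return end > start


# Illustrative data only: indices 1..3 form the candidate table of contents.
fragments = [
    "Contents",
    "Introduction", "Methods", "Results",
    "1 Introduction", "Some body text.",
    "2 Methods", "More body text.",
    "3 Results", "Final body text.",
]
links = {1: 4, 2: 6, 3: 8}
print(satisfies_toc_criteria(fragments, 1, 4, links))
```

A full detector would also have to generate the candidate spans and links in the first place; the sketch covers only the validation step described by the criteria.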