Title of invention: Structuring document based on table of contents
Abstract: A document is organized as a plurality of nodes associated with a table of contents. The nodes are clustered into a plurality of clusters based on a similarity criterion. One of the clusters is identified as corresponding to the highest or lowest level of the table of contents based on a selection criterion. That highest or lowest level is assigned to the nodes belonging to the identified cluster. The identifying and assigning are repeated to assign levels to the nodes belonging to each next-highest or next-lowest level of the table of contents. Each repeated identification applies the selection criterion while disregarding nodes that have already been assigned a level. The document is structured based at least in part on the levels assigned to the table-of-contents nodes.
Publication number: US8302002(B2)  Publication date: 2012.10.30
Application number: US20050116100  Filing date: 2005.04.27
Applicant: DEJEAN HERVE; MEUNIER JEAN-LUC; XEROX CORPORATION  Inventor: DEJEAN HERVE; MEUNIER JEAN-LUC
Classification: G06F17/00  Main classification: G06F17/00
Agency:  Agent:
Main claim:
Address:
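
The loop described in the abstract (cluster the nodes, pick one cluster as the current level, assign that level, then repeat on what is left) can be illustrated with a minimal sketch. It is not the patented method. The record does not state the similarity or selection criteria, so both are assumptions here: nodes count as similar when a caller-supplied signature (for example, font size) is identical, and the cluster chosen at each step is the one containing the earliest node still without a level. The names `assign_levels` and `signature`, and the sample entries, are hypothetical.

```python
from collections import defaultdict


def assign_levels(nodes, signature):
    """Assign a level (1 = highest) to each table-of-contents node.

    nodes:     TOC entries in document order.
    signature: function mapping a node to a hashable value; nodes with
               equal signatures land in the same cluster (an assumed
               stand-in for the similarity criterion).
    """
    # Clustering step: group node indices by identical signature.
    clusters = defaultdict(list)
    for index, node in enumerate(nodes):
        clusters[signature(node)].append(index)

    levels = {}
    remaining = dict(clusters)  # clusters whose nodes have no level yet
    level = 1
    while remaining:
        # Assumed selection criterion: choose the cluster holding the
        # earliest unassigned node. Clusters that already received a
        # level were removed, so their nodes are disregarded.
        chosen = min(remaining, key=lambda sig: remaining[sig][0])
        for index in remaining.pop(chosen):
            levels[index] = level
        level += 1

    return [levels[i] for i in range(len(nodes))]


# Hypothetical TOC entries: (text, font size in points).
toc = [
    ("Chapter 1", 14),
    ("1.1 Intro", 12),
    ("1.2 Scope", 12),
    ("Chapter 2", 14),
    ("2.1 Method", 12),
    ("2.1.1 Detail", 10),
]
result = assign_levels(toc, signature=lambda entry: entry[1])
# result == [1, 2, 2, 1, 2, 3]
```

Because each cluster receives its level as a whole, removing the chosen cluster from `remaining` is enough to disregard already-assigned nodes on later passes. Working bottom-up (lowest level first), which the abstract also allows, would need a different selection rule.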