Title: Generating a Table of Contents for Unformatted Text
Abstract: An approach is provided for an information handling system that includes a processor and a memory to generate a table of contents pertaining to a document. The approach semantically analyzes the document to identify semantic relationships of proximate elements of the document. A number of candidate headings corresponding to a semantically related section of the document are identified, and each of the candidate headings is scored. Based on the scores of the candidate headings, a section heading for the semantically related section of the document is selected. The selected heading is then included in the table of contents for the section of the document. The process of identifying candidate headings, scoring candidates, and selecting the section heading is repeated for other semantically related sections of the document.
Publication number: US2015169676(A1)  Publication date: 2015.06.18
Application number: US201314132173  Filing date: 2013.12.18
Applicant: International Business Machines Corporation  Inventors: Bohra Amit P.; Kummamuru Krishna; Pikovsky Alexander; Shivkumar Abhishek
Classification: G06F17/30  Main classification: G06F17/30
Agency:  Agent:
Principal claim: 1. A method, in an information handling system comprising a processor and a memory, of generating a table of contents pertaining to a document, the method comprising: semantically analyzing the document to identify semantic relationships of proximate elements of the document; identifying a plurality of candidate headings corresponding to a semantically related section of the document; scoring each of the plurality of candidate headings; selecting, based on the scores of each of the plurality of candidate headings, a section heading for the semantically related section of the document; including the selected heading in the table of contents for the section of the document; and repeating the identifying, scoring, selecting, and including steps for other semantically related sections of the document.
Address: Armonk, NY, US
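
The steps recited in the claim (analyze, identify candidates, score, select, include, repeat) can be illustrated with a minimal Python sketch. The record does not say how the semantic analysis or the scoring is performed, so everything concrete here is an assumption: bag-of-words cosine similarity between adjacent paragraphs stands in for the semantic analysis, single words and pairs of consecutive content words stand in for candidate headings, and summed term frequency stands in for the score. The names `segment`, `candidate_headings`, `score`, `build_toc`, and the `threshold` parameter are hypothetical and do not come from the patent.

```python
import math
import re
from collections import Counter

# Assumed stopword list; a real system would use a much fuller one.
STOPWORDS = {"a", "an", "and", "are", "as", "at", "by", "for", "in",
             "is", "it", "of", "on", "or", "the", "to", "with"}


def tokens(text):
    """Return lowercase alphabetic words with stopwords removed."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def segment(paragraphs, threshold=0.1):
    """Stand-in for semantic analysis: keep a paragraph in the current
    section when it is similar enough to the paragraph just before it."""
    sections = []
    for para in paragraphs:
        if sections and cosine(Counter(tokens(sections[-1][-1])),
                               Counter(tokens(para))) >= threshold:
            sections[-1].append(para)
        else:
            sections.append([para])
    return sections


def candidate_headings(section):
    """Propose each content word and each pair of consecutive content words."""
    cands = set()
    for para in section:
        words = tokens(para)
        cands.update(words)
        cands.update(" ".join(pair) for pair in zip(words, words[1:]))
    return sorted(cands)  # sorted so that ties resolve deterministically


def score(candidate, section):
    """Score a candidate by the summed section frequency of its words."""
    counts = Counter(w for para in section for w in tokens(para))
    return sum(counts[w] for w in candidate.split())


def build_toc(paragraphs, threshold=0.1):
    """Identify, score, and select one heading per section, in order."""
    toc = []
    for section in segment(paragraphs, threshold):
        cands = candidate_headings(section)
        if cands:
            best = max(cands, key=lambda c: score(c, section))
            toc.append(best.title())
    return toc


sample = [
    "Solar panels convert sunlight into power. Solar panels need sunlight.",
    "Panels lose power in shade, so solar panels face south.",
    "Bread dough needs yeast. Knead the bread dough well.",
    "Let the dough rise, then bake the bread dough.",
]
print(build_toc(sample))  # ['Solar Panels', 'Bread Dough']
```

Comparing each paragraph only with its immediate predecessor mirrors the "proximate elements" wording of the claim. A production system would likely replace the word-overlap measure with a genuine semantic model and a richer scoring function.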