摘要 |
A system, method, and computer program determines section information of a digital volume. Digital volumes include digital representations of human-readable content, such as digitized books. Phrases are extracted from a table of contents of a digital volume. Matching phrases that at least approximately match the extracted phrases are identified in the body of the digital volume. A best matching phrase is determined for each extracted phrase based on the ordering of the extracted phrases and the matching phrases, and based on match scores indicating the quality of the matches. Section information is generated, including section headings and section start locations based on the best matching phrases. The digital volume is presented to users with links from the table of contents to the section headings on the section start pages. The section information is also used to enhance searching of the digital volume by users.
|