Abstract |
PROBLEM TO BE SOLVED: To associate content with a specific position in a document at low cost, without having to define an object area to which the content is attached. SOLUTION: The content retrieval device of the present invention extracts character blocks from a document and outputs each character block to an index DB together with the page identifier and the in-page coordinates at which the block appears in the document. The index DB is searched using query character blocks extracted from an input search query (a partial area of the document), the retrieval results are tallied for each page, and the page on which the largest number of character blocks were retrieved is taken as the hit page. The center of gravity of the in-page coordinates of the character blocks retrieved from the hit page is calculated and taken as the in-page hit position. Using the hit page and the in-page hit position as a query, content associated with a page position near the in-page hit position is retrieved from the content DB. COPYRIGHT: (C)2012,JPO&INPIT |
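
The retrieval flow described in the abstract can be illustrated with a minimal Python sketch. This is not the patented implementation: the function names (`build_index`, `locate`, `find_content`), the in-memory dictionaries standing in for the index DB and content DB, exact string matching of character blocks, the Euclidean `radius` used to decide what counts as a "near" position, and the tie-breaking rule (first page encountered wins) are all assumptions made for illustration. The abstract does not specify any of these details.

```python
from collections import defaultdict
from math import hypot


def build_index(blocks):
    """Build a toy index DB: character block -> list of (page_id, x, y).

    `blocks` is an iterable of (text, page_id, x, y) tuples (hypothetical layout).
    """
    index = defaultdict(list)
    for text, page_id, x, y in blocks:
        index[text].append((page_id, x, y))
    return index


def locate(index, query_blocks):
    """Return (hit_page, (cx, cy)) for the query blocks, or None if nothing matches."""
    hits = defaultdict(list)  # page_id -> coordinates of retrieved blocks
    for text in query_blocks:
        for page_id, x, y in index.get(text, []):
            hits[page_id].append((x, y))
    if not hits:
        return None
    # Hit page: the page with the most retrieved character blocks.
    # Ties go to the page seen first (an assumption, not from the abstract).
    hit_page = max(hits, key=lambda page: len(hits[page]))
    points = hits[hit_page]
    # Hit position: center of gravity (mean) of the retrieved coordinates.
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    return hit_page, (cx, cy)


def find_content(content_db, hit_page, hit_pos, radius):
    """Return content anchored on hit_page within `radius` of hit_pos.

    `content_db` is a list of (page_id, x, y, content) tuples (hypothetical layout).
    """
    cx, cy = hit_pos
    return [
        content
        for page_id, x, y, content in content_db
        if page_id == hit_page and hypot(x - cx, y - cy) <= radius
    ]


# Toy data: six character blocks spread over two pages.
blocks = [
    ("retrieval", 1, 10.0, 20.0),
    ("index", 1, 30.0, 20.0),
    ("centroid", 2, 10.0, 40.0),
    ("query", 2, 30.0, 40.0),
    ("content", 2, 20.0, 70.0),
    ("index", 2, 50.0, 50.0),
]
content_db = [
    (2, 22.0, 48.0, "video_A"),
    (2, 90.0, 90.0, "note_B"),
    (1, 20.0, 50.0, "link_C"),
]

index = build_index(blocks)
result = locate(index, ["centroid", "query", "content"])
hit_page, hit_pos = result
nearby = find_content(content_db, hit_page, hit_pos, radius=10.0)
```

With this toy data, all three query blocks are found on page 2, so page 2 becomes the hit page, and the mean of their coordinates becomes the hit position. Only content anchored on that page and within the chosen radius is returned. Because content is attached to a page position rather than to a predefined object area, no area needs to be set up in advance, which is the cost saving the abstract claims.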