Abstract |
A system and method for page frame detection for pages of a document are disclosed. The method includes receiving a set of document pages for a document, each page having at least one detected object. For each page in the set, the method includes determining the dimensions of a bounding box that encompasses the detected objects of the page and determining margin dimensions based on the position of the bounding box on the page. A page frame is computed as a combination of bounding box dimensions and margin dimensions, based on the frequencies of the bounding box dimensions and margin dimensions computed for the set of pages. The computed page frame is matched to pages of the document. Information based on the matching, such as the content of text objects within the matched page frame, can be output. |
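The steps in the abstract can be illustrated with a minimal Python sketch. It is not the disclosed implementation: the names `Box`, `bounding_box`, `most_frequent`, `compute_page_frame`, and `objects_in_frame` are hypothetical, the coordinate origin is assumed to be the top-left corner of the page, only the left and top margins are used, and the "frequency" step is assumed to be a simple per-dimension mode with a rounding tolerance. The abstract does not say how dimensions are combined or how matching is performed, so strict containment is used here as a stand-in for matching.

```python
from collections import Counter
from typing import List, NamedTuple


class Box(NamedTuple):
    """Axis-aligned rectangle; origin assumed at the top-left of the page."""
    x0: float
    y0: float
    x1: float
    y1: float


def bounding_box(objects: List[Box]) -> Box:
    """Smallest box enclosing every detected object on one page."""
    return Box(
        min(o.x0 for o in objects),
        min(o.y0 for o in objects),
        max(o.x1 for o in objects),
        max(o.y1 for o in objects),
    )


def most_frequent(values: List[float], tol: float = 1.0) -> float:
    """Mode of the values after rounding to multiples of `tol` (assumed tolerance)."""
    counts = Counter(round(v / tol) for v in values)
    return counts.most_common(1)[0][0] * tol


def compute_page_frame(pages: List[List[Box]]) -> Box:
    """Combine the most frequent margins and bounding box dimensions.

    Assumption: each dimension is chosen independently by frequency.
    """
    boxes = [bounding_box(objs) for objs in pages if objs]
    left = most_frequent([b.x0 for b in boxes])           # left margin
    top = most_frequent([b.y0 for b in boxes])            # top margin
    width = most_frequent([b.x1 - b.x0 for b in boxes])   # bounding box width
    height = most_frequent([b.y1 - b.y0 for b in boxes])  # bounding box height
    return Box(left, top, left + width, top + height)


def objects_in_frame(objects: List[Box], frame: Box) -> List[Box]:
    """Objects lying entirely inside the frame (stand-in for the matching step)."""
    return [
        o for o in objects
        if o.x0 >= frame.x0 and o.y0 >= frame.y0
        and o.x1 <= frame.x1 and o.y1 <= frame.y1
    ]
```

In this sketch, a page whose bounding box is enlarged by a running header or page number contributes an outlier value that the per-dimension mode ignores, so the header falls outside the computed frame and is excluded by `objects_in_frame`.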