Title of invention METHOD FOR LAYOUT BASED DOCUMENT ZONE QUERYING
Abstract A method and a system are disclosed for querying a document collection based on the layout of only a fragment of the content of a document, specified as a query zone. The method includes providing an index for a collection of documents. In the index, content of a document page in the collection that has been decomposed into layout blocks is indexed according to representations of the blocks and one or more geometric relations between the blocks. A query is generated which is based on representations of blocks determined to be within the query zone and geometric relations between them. This is used to query the index to retrieve pages of documents in the collection which can each be expected to include a layout zone somewhere in the page that is similar in layout to the query zone.
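The indexing and querying steps in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the `Block` type, the size-quantized block representation, the five coarse relations, the pairwise index terms, and the hit-count ranking are all assumptions chosen only to make the idea concrete. Because the terms ignore absolute position, a matching zone can be found anywhere on a page.

```python
from collections import defaultdict
from itertools import combinations
from typing import NamedTuple


class Block(NamedTuple):
    # Hypothetical layout block: bounding box plus a content kind.
    x0: float
    y0: float
    x1: float
    y1: float
    kind: str  # e.g. "text" or "image"


def represent(block, step=50):
    """Position-independent token: block kind plus quantized width/height."""
    w = int((block.x1 - block.x0) // step)
    h = int((block.y1 - block.y0) // step)
    return f"{block.kind}:w{w}:h{h}"


def relation(a, b):
    """One coarse geometric relation between two blocks (assumed set)."""
    if a.y1 <= b.y0:
        return "above"
    if b.y1 <= a.y0:
        return "below"
    if a.x1 <= b.x0:
        return "left_of"
    if b.x1 <= a.x0:
        return "right_of"
    return "overlaps"


def terms(blocks):
    """Index terms: (representation, relation, representation) per block pair.

    Blocks are sorted first so each pair is emitted in a canonical order
    that is unchanged when the whole group is translated on the page.
    """
    return {(represent(a), relation(a, b), represent(b))
            for a, b in combinations(sorted(blocks), 2)}


def build_index(pages):
    """Inverted index: term -> set of page ids containing that term."""
    index = defaultdict(set)
    for page_id, blocks in pages.items():
        for t in terms(blocks):
            index[t].add(page_id)
    return index


def query(index, page_blocks, zone):
    """Rank pages by how many terms of the query zone they share."""
    zx0, zy0, zx1, zy1 = zone
    # Keep only blocks lying entirely within the query zone.
    inside = [b for b in page_blocks
              if b.x0 >= zx0 and b.y0 >= zy0 and b.x1 <= zx1 and b.y1 <= zy1]
    scores = defaultdict(int)
    for t in terms(inside):
        for page_id in index.get(t, ()):
            scores[page_id] += 1
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Made-up sample collection: p3 repeats p1's header/image zone elsewhere.
pages = {
    "p1": [Block(0, 0, 200, 50, "text"), Block(0, 60, 100, 160, "image"),
           Block(0, 200, 200, 400, "text")],
    "p2": [Block(0, 0, 200, 300, "text")],
    "p3": [Block(300, 400, 500, 450, "text"),
           Block(300, 460, 400, 560, "image"), Block(0, 0, 50, 50, "text")],
}
index = build_index(pages)
results = query(index, pages["p1"], zone=(0, 0, 200, 170))
```

Here the query zone covers only the header and image blocks of page `p1`; page `p3` is retrieved as well because it contains the same pair of blocks in the same relation at a different position, while `p2` shares no term and is not returned. A real system would likely need a richer block representation and a tolerance for small size differences.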
Publication number US2012005225(A1) Publication date 2012.01.05
Application number US20100829553 Filing date 2010.07.02
Applicant CHIDLOVSKII BORIS;XEROX CORPORATION Inventor CHIDLOVSKII BORIS
Classification G06F17/30 Main classification G06F17/30
Agency Agent
Main claim
Address