Title of invention: Comparing text pages using image features based on word positions
Abstract: A signature for a page of text is generated. The signature serves as an identifier of the text page. Positions of words in a text page are determined. Positions of multiple second words in the text page are determined relative to the position of a first word in the text page. A signature value is generated that describes the second word positions relative to the first word position. The signature value is stored. Additional signatures for the text page can be generated, each signature describing positions of other words in the text page relative to a word in the text page for which the signature is being generated. The signatures can be used to compare the text page to another text page and generate a measure of similarity that describes the result of the comparison.
Publication number: US8151187(B1) Publication date: 2012.04.03
Application number: US201113245814 Filing date: 2011.09.26
Applicant: SPASOJEVIC NEMANJA L.;PONCIN GUILLAUME;BLOOMBERG DAN S.;GOOGLE INC. Inventor: SPASOJEVIC NEMANJA L.;PONCIN GUILLAUME;BLOOMBERG DAN S.
Classification: G06F15/177 Main classification: G06F15/177
Agency: Agent:
Main claim:
Address:
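
The abstract describes generating one signature per word from the positions of other words relative to it, then comparing pages by their signatures. This record does not give the actual encoding, so the following is only a minimal sketch under stated assumptions: each word is reduced to a single (x, y) point, the "second words" are taken to be the k nearest neighbors, relative positions are quantized into angle bins, and similarity is measured as the Jaccard overlap of the two signature sets. The names `word_signature`, `page_signatures`, and `similarity`, and the parameters `k` and `bins`, are hypothetical and do not come from the patent.

```python
import math
from typing import List, Set, Tuple

Point = Tuple[float, float]
Signature = Tuple[int, ...]


def word_signature(anchor: Point, others: List[Point],
                   k: int = 4, bins: int = 8) -> Signature:
    """Describe the k nearest words relative to `anchor`.

    Assumption: each neighbor is encoded only by the quantized angle of its
    offset from the anchor, so the signature ignores absolute page position.
    """
    ax, ay = anchor
    # Nearest neighbors first (squared Euclidean distance).
    nearest = sorted(others, key=lambda p: (p[0] - ax) ** 2 + (p[1] - ay) ** 2)[:k]
    codes = []
    for x, y in nearest:
        angle = math.atan2(y - ay, x - ax) % (2 * math.pi)
        codes.append(int(angle / (2 * math.pi) * bins) % bins)
    return tuple(codes)


def page_signatures(words: List[Point], k: int = 4, bins: int = 8) -> Set[Signature]:
    """Generate one signature per word, each relative to the other words."""
    signatures = set()
    for i, word in enumerate(words):
        others = words[:i] + words[i + 1:]
        signatures.add(word_signature(word, others, k, bins))
    return signatures


def similarity(a: Set[Signature], b: Set[Signature]) -> float:
    """Jaccard overlap of two signature sets (an assumed similarity measure)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


if __name__ == "__main__":
    page = [(0, 0), (10, 0), (20, 0), (0, 10), (10, 10)]
    shifted = [(x + 5, y + 7) for x, y in page]  # same layout, moved on the page
    print(similarity(page_signatures(page), page_signatures(shifted)))
```

Because every signature depends only on offsets between words, a page whose layout is shifted as a whole yields the same signature set in this sketch. A production scheme would also need to tolerate scale changes, skew, and word-detection noise, which this example does not attempt.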