Title of invention: Method and apparatus for retrieving text using document signatures
Abstract: A method and apparatus for retrieving similar or identical textual passages among different documents is disclosed. Normal discourse structures along with textual content attributes are used to encode a known passage with "marker sequences" that give a characterizing "signature" to the passage. The encoded known passage is then evaluated against similarly encoded passages appearing in a database of documents. If it is determined that there is a possible match between the encoded known passage and an encoded passage in a database document, a sequential string search is performed to determine whether the two passages are likely to be similar or identical. If the sequential string search records a probable match between the known passage and the database passage, the database passage is displayed for further review.
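The two-stage flow described in the abstract (a cheap signature comparison first, then a sequential string search on the surviving candidates) can be illustrated with a minimal Python sketch. The abstract does not say how marker sequences are built, so everything concrete here is an assumption made for illustration: the `marker` function (sentence-length bucket, digit flag, question flag), the naive sentence splitter, the exact-match signature filter, the use of `difflib.SequenceMatcher` as the string comparison, and the 0.8 threshold. None of it should be read as the patented method.

```python
import re
from difflib import SequenceMatcher


def split_sentences(text):
    # Naive splitter (assumption): break on whitespace that follows ., ! or ?
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def marker(sentence):
    # Hypothetical marker, not taken from the patent:
    # word-count bucket (0-9), 'D' if the sentence contains a digit,
    # 'Q' if it ends with a question mark.
    bucket = min(len(sentence.split()) // 5, 9)
    has_digit = any(ch.isdigit() for ch in sentence)
    is_question = sentence.endswith("?")
    return f"{bucket}{'D' if has_digit else '-'}{'Q' if is_question else '-'}"


def signature(text):
    # A passage's signature is the sequence of its sentence markers.
    return [marker(s) for s in split_sentences(text)]


def candidate_windows(query_sig, doc_sig):
    # Stage 1: positions where the document's marker sequence equals
    # the query signature (exact equality is a simplifying assumption).
    n = len(query_sig)
    return [i for i in range(len(doc_sig) - n + 1) if doc_sig[i:i + n] == query_sig]


def find_matches(passage, document, threshold=0.8):
    # Stage 2: run a character-level comparison only on stage-1 candidates.
    q_sents = split_sentences(passage)
    if not q_sents:
        return []
    d_sents = split_sentences(document)
    q_sig = [marker(s) for s in q_sents]
    d_sig = [marker(s) for s in d_sents]
    results = []
    for i in candidate_windows(q_sig, d_sig):
        candidate = " ".join(d_sents[i:i + len(q_sents)])
        ratio = SequenceMatcher(None, passage.lower(), candidate.lower()).ratio()
        if ratio >= threshold:
            results.append((i, ratio, candidate))
    return results
```

Calling `find_matches(known_passage, database_document)` returns a list of `(sentence_index, similarity, text)` tuples that a caller could display for further review. A looser stage-1 comparison (for example, tolerating one differing marker) would likely be needed to catch passages that are similar rather than identical.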
Publication number: US6820079(B1)  Publication date: 2004.11.16
Application number: US20000483868  Filing date: 2000.01.18
Applicant: CLARITECH CORPORATION  Inventor: EVANS DAVID A.
Classification: G06F17/30;G06F17/22;G06F17/27;(IPC1-7):G06F17/00  Main classification: G06F17/30
Agency:  Agent:
Main claim:
Address: