发明名称 KEYWORD SEARCH OVER HEAVY-TAILED DATA AND MULTI-KEYWORD QUERIES
摘要 Index structures and query processing framework that enforces a given threshold on the overhead of computing conjunctive keyword queries. This includes a keyword processing algorithm, logic to determine which indexes to materialize, and a probabilistic approach to reducing the overhead for determining which indexes to build. The index structures leverage the fact that the frequency distribution of natural-language text follows a power law. Given a document collection, a set of indexes is proposed for materialization so that the time for intersecting keywords does not exceed a given threshold Delta. When considering the associated space requirement, the additional indexes are limited. Materialization of such a set of indexes for reasonable values of Delta (e.g., the time required to scan 20% of the largest inverted index), at least for a collection of short documents is distributed by the power law.
申请公布号 US2009083214(A1) 申请公布日期 2009.03.26
申请号 US20070858920 申请日期 2007.09.21
申请人 MICROSOFT CORPORATION 发明人 KONIG ARND C.;CHAUDHURI SURAJIT;CHURCH KENNETH;SUI LIYING
分类号 G06F17/30 主分类号 G06F17/30
代理机构 代理人
主权项
地址