Title of Invention Classification-based method and apparatus for string selectivity estimation
Abstract Histogram construction and selectivity estimation for string and substring match queries in databases whose data has strings associated with attributes. The histogram construction counts string-attribute pairs in the documents and outputs string-attribute-count triples sorted by count. The collection of triples is partitioned into buckets. A synopsis is generated for each partition, containing an average selectivity or count of the string-attribute-count triples in the partition and summary information representing the set of string-attribute pairs belonging to the bucket. Subsequent queries, for both exact and substring matches, use the synopsis to estimate selectivity from the buckets.
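The construction steps in the abstract (count pairs, sort triples by count, partition into buckets, keep one synopsis per bucket) can be sketched roughly as follows. This is an illustrative sketch only, not the patented method: the function names `build_histogram` and `estimate_count`, the equal-size partitioning rule, and the use of a plain `frozenset` as the per-bucket "summary information" are all assumptions made for the example. It covers exact-match lookups only; substring matching is not shown.

```python
from collections import Counter


def build_histogram(documents, num_buckets):
    """Build a bucketed histogram from documents.

    documents: iterable of lists of (string, attribute) pairs.
    num_buckets: positive target number of buckets (assumed equal-size split).
    """
    # Step 1: count every string-attribute pair across all documents.
    counts = Counter(pair for doc in documents for pair in doc)

    # Step 2: emit string-attribute-count triples sorted by count (descending).
    triples = sorted(
        ((s, a, c) for (s, a), c in counts.items()),
        key=lambda t: (-t[2], t[0], t[1]),
    )

    # Step 3: partition the sorted triples into contiguous buckets.
    size = max(1, -(-len(triples) // num_buckets))  # ceiling division
    buckets = []
    for i in range(0, len(triples), size):
        part = triples[i:i + size]
        # Step 4: one synopsis per bucket: average count plus a summary
        # of which pairs belong to it (a frozenset stands in here for
        # whatever compact summary a real system would use).
        buckets.append({
            "avg_count": sum(c for _, _, c in part) / len(part),
            "members": frozenset((s, a) for s, a, _ in part),
        })
    return buckets


def estimate_count(buckets, string, attribute):
    """Estimate the count of an exact (string, attribute) match."""
    for bucket in buckets:
        if (string, attribute) in bucket["members"]:
            return bucket["avg_count"]
    return 0.0  # pair not covered by any bucket


docs = [
    [("ibm", "company"), ("lim", "author")],
    [("ibm", "company"), ("wang", "author")],
    [("ibm", "company"), ("lim", "author")],
]
histogram = build_histogram(docs, num_buckets=2)
print(estimate_count(histogram, "ibm", "company"))
```

Because each bucket stores only an average, the estimate for a pair is the mean count of its bucket rather than its true count; sorting by count before partitioning keeps pairs with similar counts together, which limits that error.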
Publication No. US7987180(B2) Publication Date 2011.07.26
Application No. US20080057885 Filing Date 2008.03.28
Applicant INTERNATIONAL BUSINESS MACHINES CORPORATION Inventors LIM LIPYEOW;WANG MIN
Classification G06F7/00 Main Classification G06F7/00
Agency Agent
Main Claim
Address