Title: Selection of a set of optimal n-grams for indexing string data in a DBMS system under space constraints introduced by the system
Abstract: The present invention provides a computer-readable medium and system for selecting a set of n-grams for indexing string data in a DBMS system. Aspects of the invention include providing a set of candidate n-grams, each n-gram comprising a sequence of characters; identifying sample queries having character strings containing the candidate n-grams; and, based on the set of candidate n-grams, the sample queries, the database records, and an n-gram space constraint, automatically selecting a minimal set of n-grams from the set of candidate n-grams that, within the space constraint, minimizes the number of false hits the sample queries would produce if executed against the database records.
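The selection step described in the abstract can be illustrated with a small sketch. This is not the patented method: it is a simple greedy heuristic, and it makes several assumptions that do not come from the record. The space constraint is modeled as a maximum number of n-grams (`max_grams`), queries are treated as plain substring searches, a "false hit" is a record that the n-gram index returns as a candidate but that does not actually contain the query string, and all function names are hypothetical.

```python
def build_postings(candidate_grams, records):
    """Map each candidate n-gram to the ids of the records that contain it."""
    return {g: {i for i, r in enumerate(records) if g in r}
            for g in candidate_grams}


def index_candidates(query, selected, postings, n_records):
    """Records the index returns for a query: the intersection of the posting
    lists of every selected n-gram that occurs in the query string."""
    result = set(range(n_records))
    for gram in selected:
        if gram in query:
            result &= postings[gram]
    return result


def count_false_hits(queries, selected, postings, records):
    """Total number of candidate records that do not really match a query."""
    total = 0
    for q in queries:
        for rid in index_candidates(q, selected, postings, len(records)):
            if q not in records[rid]:
                total += 1
    return total


def select_ngrams(candidate_grams, queries, records, max_grams):
    """Greedy heuristic (an assumption, not the patent's algorithm): repeatedly
    add the n-gram that lowers the false-hit count the most, stopping at the
    space budget or when no remaining n-gram helps."""
    postings = build_postings(candidate_grams, records)
    selected = []
    current = count_false_hits(queries, selected, postings, records)
    while len(selected) < max_grams:
        best_gram, best_cost = None, current
        for g in sorted(candidate_grams):  # sorted for deterministic ties
            if g in selected:
                continue
            cost = count_false_hits(queries, selected + [g], postings, records)
            if cost < best_cost:
                best_gram, best_cost = g, cost
        if best_gram is None:  # no n-gram reduces false hits further
            break
        selected.append(best_gram)
        current = best_cost
    return selected, current


# Illustrative data only; not taken from the patent.
records = ["database", "data mining", "index", "string index", "datum"]
queries = ["data", "index"]
candidate_grams = {"da", "at", "ta", "in", "nd", "de", "ex"}

selected, remaining = select_ngrams(candidate_grams, queries, records, max_grams=2)
```

Stopping as soon as no n-gram improves the count keeps the selected set small, which loosely mirrors the "minimal set" wording of the abstract. A greedy pass like this does not guarantee the optimal set in general.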
Publication number: US7478081(B2)  Publication date: 2009.01.13
Application number: US20040981895  Filing date: 2004.11.05
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION  Inventors: HACIGUMUS VAHIT HAKAN; IYER BALAKRISHNA RAGHAVENDRA; MEHROTRA SHARAD
Classification: G06F7/00; G06F17/30  Main classification: G06F7/00
Agency:  Agent:
Main claim:
Address: