Title of invention Server side sampling of databases
Abstract A system and method are provided for use with a data mining application operating on a large database having a large number of records. A selection attribute is chosen from among the plurality of attributes contained in the records of the database. The records are scanned, and a randomizing function is applied to the selection attribute of each record to produce a randomized record value. A selection criterion is then applied: the randomized record value of each record is compared with the criterion, and records whose values satisfy it are included in a subset of records that is smaller than the original data set. This subset approximates the entire database but takes up less memory and can be evaluated or scanned much more quickly.
Publication number US2003005087(A1) Publication date 2003.01.02
Application number US20010864591 Filing date 2001.05.24
Applicant MICROSOFT CORPORATION Inventors BERNHARDT JEFFREY R.; VINARSKY ILYA
Classification G06F7/00; G06F17/30; (IPC1-7): G06F7/00 Main classification G06F7/00
Agency Agent
Principal claim
Address
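
The sampling procedure described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the record does not name a randomizing function or a selection criterion, so the use of SHA-256 as the randomizing function and a "value below a target fraction" test as the criterion are assumptions, as are the names `randomized_value`, `sample_records`, and the `customer_id` attribute. The sketch also runs over an in-memory list rather than inside a database server.

```python
import hashlib


def randomized_value(attribute_value):
    """Map a selection-attribute value to a pseudo-random float in [0, 1).

    Assumption: SHA-256 stands in for the unspecified randomizing function.
    """
    digest = hashlib.sha256(str(attribute_value).encode("utf-8")).digest()
    # Keep the top 53 bits so the quotient is an exact float strictly below 1.
    return (int.from_bytes(digest[:8], "big") >> 11) / 2**53


def sample_records(records, selection_attribute, fraction):
    """Yield records whose randomized value satisfies the selection criterion.

    Assumption: the criterion is "randomized value < fraction", which keeps
    roughly `fraction` of the records when the hash spreads values evenly.
    """
    for record in records:
        if randomized_value(record[selection_attribute]) < fraction:
            yield record


# Hypothetical data set: 10,000 records with two attributes.
records = [{"customer_id": i, "amount": i % 50} for i in range(10_000)]

# Scan once and keep about 10% of the records.
subset = list(sample_records(records, "customer_id", 0.1))
print(len(subset), "of", len(records), "records selected")
```

Because the randomized value depends only on the selection attribute, the same records are selected on every scan, so the subset is reproducible without storing it separately.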