Title of invention: Clustering of databases having mixed data attributes
Abstract: A method and apparatus for clustering data in a database in a computer data processing system. A database containing data records with both discrete and continuous attributes is stored on one or more storage media, which may be connected by a network. The records in the database are scanned so that data records having the same discrete attribute configuration can be tabulated. A first set of configurations is determined, wherein the number of data records of each configuration in the first set exceeds a threshold number of data records. Data records that do not belong to one of the first set of configurations are added to, or tabulated with, a configuration within the first set, producing a subset of records from the database that belong to configurations in the first set. The data in these configurations are then clustered based on the continuous attributes of the records contained within the first set of configurations to produce a clustering model.
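The steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the abstract does not say how an infrequent record is matched to a frequent configuration or which clustering algorithm is used. The sketch assumes nearest configuration by Hamming distance for the merge and a small deterministic k-means for the clustering step. All names (`group_by_configuration`, `kmeans`, `records`) and the sample data are hypothetical.

```python
from collections import Counter, defaultdict


def hamming(a, b):
    """Number of discrete attributes on which two configurations differ."""
    return sum(x != y for x, y in zip(a, b))


def group_by_configuration(records, threshold):
    """Tabulate records by discrete configuration and merge infrequent ones.

    records: list of (discrete_tuple, continuous_tuple) pairs.
    Returns {frequent_configuration: [continuous_tuple, ...]}.
    """
    counts = Counter(disc for disc, _ in records)
    # "First set": configurations whose record count exceeds the threshold.
    frequent = [cfg for cfg, n in counts.items() if n > threshold]
    if not frequent:
        raise ValueError("no configuration exceeds the threshold")
    groups = defaultdict(list)
    for disc, cont in records:
        if counts[disc] > threshold:
            target = disc
        else:
            # Assumption: merge into the closest frequent configuration.
            target = min(frequent, key=lambda cfg: hamming(cfg, disc))
        groups[target].append(cont)
    return dict(groups)


def kmeans(points, k, iters=10):
    """Plain k-means on continuous attributes (stand-in clustering step)."""
    centroids = [list(p) for p in points[:k]]  # deterministic initialization
    for _ in range(iters):
        buckets = [[] for _ in centroids]
        for p in points:
            nearest = min(
                range(len(centroids)),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centroids[i])),
            )
            buckets[nearest].append(p)
        for i, bucket in enumerate(buckets):
            if bucket:  # keep the old centroid if no point was assigned
                centroids[i] = [sum(col) / len(bucket) for col in zip(*bucket)]
    return centroids


# Hypothetical data: three discrete attributes, two continuous attributes.
records = [
    (("a", "x", "p"), (1.0, 1.0)),
    (("a", "x", "p"), (1.2, 0.8)),
    (("b", "y", "r"), (9.0, 9.0)),
    (("b", "y", "r"), (9.2, 8.8)),
    (("a", "x", "q"), (5.0, 5.0)),  # infrequent configuration
]
groups = group_by_configuration(records, threshold=1)
model = {cfg: kmeans(points, k=2) for cfg, points in groups.items()}
```

Here the single record with configuration `("a", "x", "q")` is folded into `("a", "x", "p")`, which differs from it in only one attribute, before the continuous attributes of each frequent configuration are clustered.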
Publication number: US2004010497(A1) Publication date: 2004.01.15
Application number: US20010886771 Filing date: 2001.06.21
Applicant: MICROSOFT CORPORATION Inventors: BRADLEY PAUL S.; WAWRYNIUK MARKUS
Classification: G06F17/30;(IPC1-7):G06F7/00 Main classification: G06F17/30
Agency: Agent:
Principal claim:
Address: