Title of Invention |
Clustering of databases having mixed data attributes |
Abstract |
A method and apparatus for clustering data in a database within a computer data processing system. A database containing data records with both discrete and continuous attributes is stored on one or more storage media, which may be connected by a network. The records in the database are scanned so that data records sharing the same discrete attribute configuration can be tabulated. A first set of configurations is then determined, in which the number of data records of each configuration exceeds a threshold number of data records. Data records that do not belong to any configuration in the first set are added to, or tabulated with, a configuration within that set, producing a subset of records from the database that belong to configurations in the first set. The data in these configurations are then clustered on the continuous attributes of the records they contain to produce a clustering model.
|
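The steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The abstract does not say how an infrequent record is matched to a frequent configuration, nor which clustering algorithm is applied to the continuous attributes. The sketch assumes Hamming distance for the matching and a plain k-means with farthest-point initialization for the clustering. All function names, the sample records, and the `threshold` and `k` values are hypothetical.

```python
from collections import Counter, defaultdict


def sq_dist(p, q):
    """Squared Euclidean distance between two continuous attribute vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def hamming(a, b):
    """Number of discrete attributes on which two configurations differ."""
    return sum(x != y for x, y in zip(a, b))


def frequent_configurations(records, threshold):
    """Tabulate discrete configurations and keep those held by more than `threshold` records."""
    counts = Counter(discrete for discrete, _ in records)
    return {cfg for cfg, n in counts.items() if n > threshold}


def group_by_frequent(records, frequent):
    """Attach every record to a frequent configuration.

    Assumption: an infrequent record joins the frequent configuration with the
    smallest Hamming distance; ties go to the first configuration in sorted order.
    """
    if not frequent:
        raise ValueError("no configuration exceeds the threshold")
    ordered = sorted(frequent)
    groups = defaultdict(list)
    for discrete, continuous in records:
        if discrete in frequent:
            target = discrete
        else:
            target = min(ordered, key=lambda cfg: hamming(cfg, discrete))
        groups[target].append(continuous)
    return dict(groups)


def kmeans(points, k, iterations=20):
    """Plain k-means with deterministic farthest-point initialization (an assumed stand-in)."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    for _ in range(iterations):
        buckets = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: sq_dist(p, centers[i]))
            buckets[nearest].append(p)
        # Recompute each center as the mean of its bucket; keep it if the bucket is empty.
        centers = [
            tuple(sum(col) / len(b) for col in zip(*b)) if b else c
            for b, c in zip(buckets, centers)
        ]
    return centers


def cluster_mixed(records, threshold, k):
    """Return {frequent configuration: cluster centers over continuous attributes}."""
    frequent = frequent_configurations(records, threshold)
    groups = group_by_frequent(records, frequent)
    return {cfg: kmeans(pts, min(k, len(pts))) for cfg, pts in groups.items()}


# Hypothetical records: (discrete configuration, continuous attributes).
records = [
    (("red", "S", "web"), (1.0, 1.0)),
    (("red", "S", "web"), (1.2, 0.8)),
    (("red", "S", "web"), (9.0, 9.0)),
    (("blue", "L", "store"), (5.0, 5.0)),
    (("blue", "L", "store"), (5.2, 4.8)),
    (("red", "S", "store"), (9.2, 8.8)),  # infrequent configuration
]
model = cluster_mixed(records, threshold=1, k=2)
```

In this sample the single `("red", "S", "store")` record falls below the threshold, so it is folded into `("red", "S", "web")`, the frequent configuration that differs from it in only one attribute, before the continuous attributes are clustered.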
Publication Number |
US2004010497(A1) |
Publication Date |
2004.01.15 |
Application Number |
US20010886771 |
Filing Date |
2001.06.21 |
Applicant |
MICROSOFT CORPORATION |
Inventors |
BRADLEY PAUL S.;WAWRYNIUK MARKUS |
Classification |
G06F17/30;(IPC1-7):G06F7/00 |
Main Classification |
G06F17/30 |
Agency |
|
Agent |
|
Principal Claim |
|
Address |
|