Title of invention: Identification of complementary data objects
Abstract: In one aspect, the description relates to identifying complementary data objects, including: providing a plurality of data objects; applying a clustering algorithm for grouping at least some of the data objects into two or more clusters; for each of the clusters, calculating a cluster center; calculating, for at least a first one of the cluster centers, a complementary cluster center; determining a second cluster center of a second cluster, the second cluster center being the one of the cluster centers having the smallest distance to the complementary cluster center; and selecting at least one data object of the determined second cluster. Other features and aspects may be realized, depending upon the particular application.
Publication number: US9501562(B2)  Publication date: 2016.11.22
Application number: US201313779593  Filing date: 2013.02.27
Applicant: International Business Machines Corporation  Inventors: Labenski Marcin; Madduri Hari H.
Classification: G06F17/30; G06K9/62  Main classification: G06F17/30
Agency: Konrad, Raynes, Davda and Victor LLP  Agent: Konrad William K.; Konrad, Raynes, Davda and Victor LLP
Principal claim: 1. A computer-implemented method for identifying complementary data objects, the method comprising:
providing a plurality of data objects, each of the data objects having a plurality of property-value pairs;
applying a clustering algorithm for grouping at least some of the data objects into a set of two or more clusters, the grouping depending on the property-value pairs of the data objects;
for each cluster of the set of clusters, calculating a cluster center, the cluster center comprising a plurality of derivative property-value pairs derived from the property-value pairs of all data objects belonging to said cluster;
calculating, for at least a first one of the cluster centers, a complementary cluster center, the first cluster center being a cluster center of a first one of the set of clusters, the complementary cluster center not being a cluster center of a cluster of the set of clusters and having a maximum possible degree of complementarity in respect to the first cluster center wherein the maximum possible degree of complementarity is a function of a maximum property value less a property value of a property-value pair of a data object of the first one of the clusters;
determining a second cluster center of a second cluster, the second cluster center being determined as the one of the cluster centers having the smallest distance in respect to the complementary cluster center;
selecting at least one data object of the determined second cluster as a data object being complementary to the data objects of the first cluster;
wherein each of the data objects represents a piece of data and wherein at least some of the property-value pairs of each data object are selected, in any combination, from a group comprising: an average CPU utilization; a maximum CPU utilization; an average disc space utilization; a maximum disc space utilization; an average memory utilization; a maximum memory utilization; an average disc I/O utilization; a maximum disc I/O utilization; an average network I/O utilization; and a maximum network disc I/O utilization.
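The steps of the claim can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes a plain k-means as the clustering algorithm, Euclidean distance, the per-property mean as the cluster center, and utilization values expressed as percentages, so that the maximum property value is 100. The names `kmeans`, `complementary_center`, `find_complementary` and `MAX_VALUE` are hypothetical, as is the sample data.

```python
from math import dist
from statistics import fmean

MAX_VALUE = 100.0  # assumption: every property is a utilization percentage


def kmeans(points, k, iterations=20):
    """Plain Lloyd's k-means, seeded with the first k points (an assumption
    made only to keep the sketch deterministic)."""
    centers = [list(p) for p in points[:k]]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: dist(p, centers[i]))
            clusters[nearest].append(p)
        # Cluster center: mean of each property over the cluster's members.
        centers = [
            [fmean(col) for col in zip(*members)] if members else centers[i]
            for i, members in enumerate(clusters)
        ]
    return clusters, centers


def complementary_center(center, max_value=MAX_VALUE):
    """Maximum property value less the center's value, per property."""
    return [max_value - v for v in center]


def find_complementary(points, k, first):
    """Return the cluster whose center lies closest to the complementary
    center of cluster `first`, together with that cluster's index."""
    clusters, centers = kmeans(points, k)
    target = complementary_center(centers[first])
    second = min(range(k), key=lambda i: dist(centers[i], target))
    return clusters[second], second


# Hypothetical data objects: (average CPU utilization, average memory utilization)
points = [(90, 10), (10, 90), (50, 50), (85, 15), (15, 85), (55, 45)]
selected, second = find_complementary(points, k=3, first=0)
print(second, selected)
```

In this sample the first cluster holds the CPU-heavy objects, so its complementary center points at low CPU and high memory utilization, and the memory-heavy cluster is selected. The search covers all cluster centers, as the claim is worded, so a cluster whose center sits near the midpoint of the value range could be returned as its own complement.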
Address: Armonk, NY, US