Title of invention: K-MEANS METHOD FOR CLUSTERING HOMOGENEOUS INFORMATION BY A DISTANCE BASED ON PREFERENCES AND RATIOS.
Abstract: The invention relates to a k-means method that includes innovations intended to customize and homogenize data mining: the user identifies the clustering attributes and sets a preference degree for each (i.e., demanding a higher degree of vicinity among the members of a cluster, or allowing more freedom); the user sets the convergence threshold (i.e., the percentage of satisfaction in the reassignment of members to the clusters); the method provides the initial values that place the centroids in symmetric regions with the same ratio (i.e., the minimum value and the maximum value are symmetric with respect to the centroid); the method estimates the Euclidean distance in a standard and balanced manner, since the accumulation of differences between the centroids and attributes whose values are expressed in heterogeneous units (i.e., hundredths, millions...) is replaced by distance percentages (i.e., the difference between the centroid and the value of an attribute is divided amongst the centroid and the whole set of values existing in the information repository for that attribute), and these percentages are magnified or reduced in proportion to the preference assigned to the attribute (i.e., the higher the relevance of the attribute, the more the percentage ratio grows, for instance to a value above 1.0 such as 1.1 or 1.25; the lower the relevance, the more the percentage value decreases, to a value below 1.0 such as 0.9 or 0.75); the method ends the mining once the threshold defined by the user is satisfied (i.e., avoiding further treatment of members that were recently assigned to a new cluster).
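The steps named in the abstract can be illustrated with a minimal Python sketch. It is not the patented method: the abstract does not give exact formulas, so several choices here are assumptions. The distance percentage is assumed to be the absolute difference divided by the attribute's range (max minus min) over the whole repository; the initial centroids are assumed to sit at the midpoints of k equal-width slices of each attribute's range; and the convergence threshold is assumed to be the fraction of members that kept their cluster between two passes. All function and parameter names (`preference_distance`, `prefs`, `threshold`, and so on) are hypothetical.

```python
import math


def attribute_ranges(records):
    """Return (min, max) for each attribute over the whole repository."""
    return [(min(col), max(col)) for col in zip(*records)]


def initial_centroids(ranges, k):
    """ASSUMPTION: centroid i sits at the midpoint of the i-th of k
    equal-width slices, so each slice's min and max are symmetric
    around its centroid."""
    return [
        [lo + (hi - lo) * (2 * i + 1) / (2 * k) for lo, hi in ranges]
        for i in range(k)
    ]


def preference_distance(record, centroid, ranges, prefs):
    """Euclidean distance over unit-free percentages, each scaled by the
    user's preference weight (> 1.0 magnifies, < 1.0 reduces)."""
    total = 0.0
    for x, c, (lo, hi), w in zip(record, centroid, ranges, prefs):
        span = hi - lo
        # ASSUMPTION: percentage = difference / attribute range.
        pct = 0.0 if span == 0 else abs(x - c) / span
        total += (w * pct) ** 2
    return math.sqrt(total)


def nearest(record, centroids, ranges, prefs):
    """Index of the centroid closest to the record."""
    dists = [preference_distance(record, c, ranges, prefs) for c in centroids]
    return dists.index(min(dists))


def cluster(records, k, prefs, threshold=0.95, max_iter=100):
    """Run k-means until the share of members that kept their cluster
    reaches the user-defined threshold (ASSUMED reading of the abstract)."""
    ranges = attribute_ranges(records)
    centroids = initial_centroids(ranges, k)
    labels = [None] * len(records)
    for _ in range(max_iter):
        new_labels = [nearest(r, centroids, ranges, prefs) for r in records]
        stable = sum(a == b for a, b in zip(labels, new_labels)) / len(records)
        labels = new_labels
        # Recompute each centroid as the mean of its current members.
        for j in range(k):
            members = [r for r, lab in zip(records, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
        if stable >= threshold:
            break
    return labels, centroids


if __name__ == "__main__":
    # Two attributes in very different units (hundreds vs. millions).
    data = [(100, 1_000_000), (110, 1_100_000), (900, 9_000_000), (950, 9_500_000)]
    labels, centroids = cluster(data, k=2, prefs=[1.25, 0.9])
    print(labels)
```

Because each difference is turned into a percentage before it is accumulated, an attribute measured in millions does not dominate one measured in hundreds; the preference weights then tilt the balance deliberately rather than by accident of units.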
Publication number: MX340155(B)  Publication date: 2016.06.17
Application number: MX20130014454  Filing date: 2013.12.09
Applicant: INSTITUTO POLITÉCNICO NACIONAL  Inventors: Alejandro PEÑA AYALA; Leonor Adriana CÁRDENAS ROBLEDO
Classification: G06F17/30  Main classification: G06F17/30