Title of invention CLUSTERING METHOD AND SYSTEM
Abstract A method and system for clustering are disclosed. The method includes: vectorizing a plurality of readable files to obtain a plurality of file vectors corresponding to the readable files; extracting a total characteristic vector from the file vectors; and clustering the readable files based on a ranking of the similarity degree between the total characteristic vector and each of the file vectors. A method and system for clustering webpages are also provided. The described methods and systems reduce the number of similarity comparisons between file vectors and therefore the burden on system resources, which lowers CPU and memory usage, shortens clustering run time, and improves clustering performance.
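The three steps named in the abstract can be illustrated with a minimal sketch. The abstract does not say how the total characteristic vector is extracted, which similarity measure is used, or how the ranking is turned into clusters, so every concrete choice below is an assumption made for illustration only: term-count vectors, an element-wise sum as the total vector, cosine similarity, and a hypothetical max_gap threshold that cuts the ranked list wherever adjacent similarity scores differ by more than that amount. The function names are likewise hypothetical. The sketch shows why only one similarity computation per file is needed, rather than one per pair of files.

```python
import math
from collections import Counter


def vectorize(texts):
    """Turn each text into a term-count vector over a shared vocabulary."""
    counts = [Counter(t.lower().split()) for t in texts]
    vocab = sorted(set().union(*counts))
    return [[c[w] for w in vocab] for c in counts]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def cluster_by_ranking(texts, max_gap=0.1):
    """Group file indices by ranking their similarity to one total vector.

    Assumptions (not taken from the abstract): the total characteristic
    vector is the element-wise sum of all file vectors, and a new cluster
    starts whenever two adjacent ranked scores differ by more than max_gap.
    """
    vectors = vectorize(texts)
    total = [sum(col) for col in zip(*vectors)]

    # One comparison per file: N similarity computations instead of N*(N-1)/2.
    scored = sorted(
        ((cosine(total, v), i) for i, v in enumerate(vectors)),
        key=lambda pair: (-pair[0], pair[1]),
    )

    clusters, current, prev = [], [], None
    for score, index in scored:
        if prev is not None and prev - score > max_gap:
            clusters.append(current)
            current = []
        current.append(index)
        prev = score
    if current:
        clusters.append(current)
    return clusters
```

Calling cluster_by_ranking(["apple banana", "apple banana", "zebra"]) places the two identical files in one cluster and the third in another. Note that in this simplified form, two files with similar scores against the total vector are not guaranteed to be similar to each other; the sketch illustrates only the reduction in comparisons.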
Publication number WO2011059588(A1) Publication date 2011.05.19
Application number WO2010US51069 Filing date 2010.10.01
Applicant ALIBABA GROUP HOLDING LIMITED;ZHANG, TAO;GUO, JIAQING Inventor ZHANG, TAO;GUO, JIAQING
Classification G06F9/46 Main classification G06F9/46
Agency Agent
Principal claim
Address