Independent claim |
1. A computer-implemented method comprising:
receiving, by a computer system comprising at least one processor, a set of data points and similarity values according to a pairwise similarity function, wherein the pairwise similarity function provides similarity values representative of a similarity between each data point and each other data point of the set of data points, wherein the similarity values are determined and the similarity function is estimated using one or more machine learning processes;
clustering, by the computer system, the set of data points into at least one cluster based on the similarity values, the at least one cluster comprising one or more data points of the set of data points;
consolidating, by the computer system, data stored in the one or more data points associated with the at least one cluster to create a consolidated data point, wherein consolidating data stored in the one or more data points associated with the at least one cluster comprises:
extracting a data element from a data point of the at least one cluster;
determining another data element from another data point of the at least one cluster, wherein the other data element is a duplicate of the data element;
selecting one among the data element and the other data element to be added to the consolidated data point; and
storing the selected data element in the consolidated data point; and
using the consolidated data point when providing results responsive to an associated search query to a user. |
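The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the claimed implementation, and several details are assumptions made only for illustration: the similarity values are taken as an already-computed dictionary (the claim leaves the machine learning process unspecified), clustering is done by joining any pair whose similarity meets a hypothetical `threshold`, two data elements are treated as duplicates when they share a field name, and the selection rule (keep the longer value) is invented. The names `cluster`, `consolidate`, `records`, and `similarities` are likewise hypothetical.

```python
from itertools import combinations


def cluster(point_ids, similarities, threshold=0.5):
    """Group point ids whose pairwise similarity meets the threshold.

    Uses a small union-find structure; the threshold is an assumed parameter.
    """
    parent = {p: p for p in point_ids}

    def find(x):
        # Follow parent links to the root, compressing the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(point_ids, 2):
        if similarities.get(frozenset((a, b)), 0.0) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for p in point_ids:
        groups.setdefault(find(p), []).append(p)
    return list(groups.values())


def consolidate(cluster_records):
    """Build one consolidated record from the records of a single cluster.

    Assumption: elements with the same field name are duplicates, and the
    longer value is the one selected and stored.
    """
    consolidated = {}
    for record in cluster_records:
        for field, value in record.items():
            current = consolidated.get(field)
            if current is None or len(str(value)) > len(str(current)):
                consolidated[field] = value
    return consolidated


# Hypothetical data points and precomputed pairwise similarity values.
records = {
    "a": {"name": "Cafe Roma", "phone": "555-0100"},
    "b": {"name": "Cafe Roma Ristorante", "phone": "555-0100"},
    "c": {"name": "Blue Door Books"},
}
similarities = {
    frozenset(("a", "b")): 0.9,
    frozenset(("a", "c")): 0.1,
    frozenset(("b", "c")): 0.2,
}

clusters = cluster(list(records), similarities)
consolidated_points = [consolidate([records[p] for p in c]) for c in clusters]
```

In this sketch, `consolidated_points` is what a search backend would consult in the final step when answering a query, in place of the individual records.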