Abstract
Various systems and methods are provided for identification disambiguation in databases. In one embodiment, a system includes an approximate structural equivalence (ASE) analyzer including logic that obtains a set of records from a database; logic that determines a knowledge homogeneity score (KHS) for a pair of records in the set of records; and logic that determines a condition of ASE for the pair of records based upon the KHS and a predefined KHS threshold. In another embodiment, a method includes determining a plurality of references shared by at least two records in a set of records; determining a weighting value for each shared reference; and determining a KHS for each pair of records in the set of records based upon at least one reference shared by the pair of records and the weighting value corresponding to the at least one shared reference.
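The method embodiment can be illustrated with a minimal sketch. The abstract does not state how the weighting value or the KHS is computed, so the formulas here are assumptions chosen only for illustration: each shared reference is weighted by log(N / n_r), where N is the number of records and n_r is the number of records containing the reference, and the KHS of a pair is the sum of the weights of the references the pair shares. The function names, the sample records, and the threshold of 1.0 are likewise hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import log


def reference_weights(records):
    """Weight each reference shared by at least two records.

    Assumption (not from the abstract): rarer references are more
    informative, so the weight is log(N / n_r).
    """
    n = len(records)
    counts = Counter(ref for refs in records.values() for ref in set(refs))
    return {ref: log(n / c) for ref, c in counts.items() if c >= 2}


def khs_scores(records, weights):
    """Assumed KHS: sum of the weights of the references a pair shares."""
    scores = {}
    for a, b in combinations(sorted(records), 2):
        shared = set(records[a]) & set(records[b])
        scores[(a, b)] = sum(weights.get(ref, 0.0) for ref in shared)
    return scores


def ase_pairs(scores, threshold):
    """Pairs treated as ASE: KHS at or above the predefined threshold.

    Assumption: the comparison is 'greater than or equal to'.
    """
    return {pair for pair, score in scores.items() if score >= threshold}


# Hypothetical records, each mapped to the set of references it contains.
records = {
    "r1": {"ref_a", "ref_b", "ref_c"},
    "r2": {"ref_a", "ref_b"},
    "r3": {"ref_c", "ref_d"},
    "r4": {"ref_e"},
}

weights = reference_weights(records)
scores = khs_scores(records, weights)
ase = ase_pairs(scores, threshold=1.0)
```

With these sample records, only r1 and r2 share two weighted references, so they are the only pair whose assumed KHS reaches the threshold. A reference that appears in every record receives a weight of zero under this weighting, which matches the intuition that a universally shared reference says nothing about whether two records refer to the same entity.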