Abstract |
A system and method for using numbers to query a corpus of documents, particularly but not exclusively for data spaces that have low reflectivity, i.e., data spaces in which, for a point x_i described by one or more numbers, few permutations of those numbers also occur. For each document to be searched, each query number is matched with one and only one document number, preferably using a bipartite graph or a heuristic rule, such that a distance function is minimised. The distance function may, but need not, take into account attribute names and unit names. A limiting algorithm can be used to limit the number of documents that must be searched. |
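
The one-to-one matching described above can be illustrated with a minimal sketch. It is not the claimed implementation: the abstract does not specify the distance function, so the absolute difference between numbers is assumed here, and an exhaustive search over assignments stands in for the bipartite graph or heuristic rule. The names `match_distance`, `rank_documents`, and the sample corpus are hypothetical.

```python
from itertools import permutations


def match_distance(query, document):
    """Smallest total |q - d| over all one-to-one assignments of query
    numbers to distinct document numbers.

    Assumptions (not from the abstract): absolute difference as the
    distance, and brute-force search, which is only practical for a
    handful of numbers per document.
    """
    if len(query) > len(document):
        # Assumption: every query number must find its own partner.
        return float("inf")
    best = float("inf")
    for chosen in permutations(document, len(query)):
        total = sum(abs(q - d) for q, d in zip(query, chosen))
        best = min(best, total)
    return best


def rank_documents(query, corpus):
    """Return document ids ordered by ascending match distance."""
    return sorted(corpus, key=lambda doc_id: match_distance(query, corpus[doc_id]))


# Hypothetical corpus: each document is reduced to the numbers it contains.
corpus = {
    "doc_a": [256, 3.0, 120],
    "doc_b": [512, 2.4, 80],
    "doc_c": [3.0, 250],
}
query = [2.5, 500]
print(rank_documents(query, corpus))
```

With this sample data the ranking is `['doc_b', 'doc_a', 'doc_c']`, since `doc_b` holds numbers close to both query values. For larger inputs, a polynomial-time assignment solver would be the natural replacement for the exhaustive loop, and attribute or unit names could be folded into the per-pair distance.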