Title: Measuring duplication in search results
Abstract: Measuring duplication in search results is described. In one example, duplication between a pair of results provided by an information retrieval system in response to a query is measured. History data for the information retrieval system is accessed and query data retrieved, which describes the number of times that users have previously selected either or both of the pair of results, and a relative presentation sequence of the pair of results when displayed at each selection. From the query data, a fraction of user selections is determined in which a predefined combination of one or both of the pair of results were selected for a predefined presentation sequence. From the fraction, a measure of duplication between the pair of results is found. In further examples, the information retrieval system uses the measure of duplication to determine an overall redundancy value for a result set, and controls the result display accordingly.
Publication number: US8825641(B2) Publication date: 2014.09.02
Application number: US201012942553 Filing date: 2010.11.09
Applicant: Microsoft Corporation Inventors: Radlinski Filip; Bennett Paul Nathan; Yilmaz Emine
Classification: G06F17/30 Main classification: G06F17/30
Agency: Agents: Tapia Pablo; Ross Jim; Minhas Micky
Main claim: 1. A computer-implemented method of measuring duplication between a pair of results provided by an information retrieval system in response to a query, the method comprising: accessing history data for the information retrieval system stored on a memory and retrieving query data describing the number of times that users of the information retrieval system have previously selected either or both of the pair of results, and a relative presentation sequence of the pair of results when displayed by the information retrieval system at each selection; determining from the query data, at a processor, a fraction of user selections in which a predefined combination of one or both of the pair of results were selected for a predefined presentation sequence; determining from the fraction, at the processor, a measure of the duplication between the pair of results; and causing a display of results for the query to indicate the pair of results are duplicates when the measure of the duplication meets a condition, wherein the query data comprises: a count of the number of times that both of the pair of results were selected when the first one of the pair of results was presented ahead of a second one of the pair of results; and a count of the number of times that both of the pair of results were selected when the second one of the pair of results was presented ahead of the first one of the pair of results.
Address: Redmond WA US
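
Illustrative sketch (not part of the patent record): the steps recited in the main claim can be pictured roughly as follows. The record does not give the exact formula, so everything here is an assumption made for illustration: the `Selection` record layout, the labels "A" and "B" for the pair of results, the choice of "both results selected" as the predefined combination, the rule that a low both-selected fraction suggests duplication, and the 0.8 threshold used as the condition.

```python
from dataclasses import dataclass

BOTH = frozenset({"A", "B"})


@dataclass(frozen=True)
class Selection:
    """One logged selection event for the pair (hypothetical layout)."""
    first_shown: str     # "A" or "B": which result was presented ahead of the other
    clicked: frozenset   # non-empty subset of {"A", "B"} that the user selected


def both_selected_counts(log):
    """Count events where both results were selected, split by presentation order."""
    counts = {"A": 0, "B": 0}
    for s in log:
        if s.clicked == BOTH:
            counts[s.first_shown] += 1
    return counts


def fraction_both(log, first_shown):
    """Fraction of selections, for one presentation order, in which both were selected."""
    relevant = [s for s in log if s.first_shown == first_shown and s.clicked]
    if not relevant:
        return None  # no history for this presentation order
    return sum(1 for s in relevant if s.clicked == BOTH) / len(relevant)


def duplication_measure(log):
    """Assumed rule: users rarely select both results when the two are duplicates."""
    fractions = [f for f in (fraction_both(log, "A"), fraction_both(log, "B"))
                 if f is not None]
    if not fractions:
        return None
    return 1.0 - sum(fractions) / len(fractions)


def is_duplicate(log, threshold=0.8):
    """Condition under which a display would flag the pair (threshold is assumed)."""
    measure = duplication_measure(log)
    return measure is not None and measure >= threshold


# Small made-up history: four selections with A shown first, four with B shown first.
example_log = (
    [Selection("A", frozenset({"A"}))] * 3
    + [Selection("A", BOTH)]
    + [Selection("B", frozenset({"B"}))] * 4
)
```

With this made-up log, both results were selected once when A was shown first and never when B was shown first, so the two fractions are 0.25 and 0.0, and the assumed measure is 0.875. Other combinations named in the abstract (for example, only the lower-ranked result selected) could be substituted in `fraction_both` without changing the overall structure.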