主权项 |
1. A method of using a computer to efficiently produce an ordered list of pointers from N inverted lists, said list of pointers containing all and only the pointers that are stored in one of the inverted lists and also in each of the other lists, the method comprising:
storing each inverted list as an ordered sequence of index records; storing each index record with an encoding of N pointers, N being larger than or equal to one, and N being no larger than a total number of pointers in the inverted list, and allowing access to the value of a minimum pointer and of a maximum pointer stored in the index record, each index record being contiguous or non-contiguous, an index record being contiguous when the index record stores all possible pointer values between the minimum and the maximum pointer values extremes included, the index record being non-contiguous when the index record does not store all the possible pointer values between the minimum and the maximum pointer values extremes included; sequentially reading each inverted list through a cursor C, C consisting of a component C.pos identifying the current index record in the ordered list of index records, and of a component C.cont representing a group of pointers in the form <ps, pe>, where C.cont.ps identifies the minimum pointer and C.cont.pe the maximum pointer in the group, C.cont being set during the operation of reading the current index record identified by C.pos and C.cont.ps being set to the minimum pointer in the current index record and C.cont.pe being set to the maximum pointer in the current index record during said operation; and advancing the cursor to the next pointer wherein when the current pointer C.cont.ps is not the last pointer in the current record, C.cont.ps is set to the next pointer in the current record, and C.cont.ps is returned, and otherwise when the current pointer C.cont.ps is the last pointer in the current record, the next record in the inverted list is read, when said record does not exist an end-of-scan condition being returned, otherwise C.cont.ps being returned, each cursor C also supporting an operation that advances said cursor C to a value x, where: a) while x>C.cont.pe, the cursor is advanced to the next record, b) when x≦C.cont.ps, the cursor is not advanced, c) when C.cont.ps<x≦C.cont.pe, the cursor is advanced to the next pointer until x≦C.cont.ps; each list J in said N inverted lists having a cursor C[J], said operation to produce an ordered list of pointers from N inverted lists, said list of pointers containing all and only the pointers that are stored in one of the inverted lists and also in each of the other lists, comprising the following steps: a. compute minp as the minimum value of C[J].cont.pe for all the cursors and maxp as the maximum value of C[J].cont.ps for all the cursors; b. when minp≧maxp then compute the intersection among the lists in the interval <maxp, minp> and advance each cursor to minp+1, otherwise advance each cursor to maxp, iterate in both cases steps a and b until any cursor returns an end-of-scan condition during advancing; wherein: when all cursors are defined on a contiguous index record, the output of the intersection in the interval <maxp, minp> is given by all the possible pointer values in said interval; otherwise, the output of the intersection in the interval <maxp, minp> is computed by ignoring all the cursors defined on a contiguous index record, and outputting all the values in the interval <maxp, minp> extremes included, that appear in all the remaining cursors. |