发明名称 METHOD AND MEANS FOR SEARCHING A COMPRESSED INDEX
摘要 1280485 Information retrieval; calculating INTERNATIONAL BUSINESS MACHINES CORP 30 Dec 1969 [3 Jan 1969] 63204/69 Headings G4A and G4C Compressed keys are searched using a search argument, e.g. for information retrieval. The compressed keys are of the type generated in Specification 1,280,483 (referred to), and each includes none or more key bytes K from a corresponding original uncompressed key, a key byte count field L indicating the number of such key bytes included, and a factor byte count field F indicating the number of key bytes in the uncompressed key to high order of those included. F and L are one-byte long each. A pointer to information associated with the immediately preceding key is also included, together with a length (count) field for the pointer. After setting a count E c to zero, the compressed keys CK are taken in turn for byte-by-byte comparison, in general, with successive bytes of the search argument SA, as follows (Fig. 16). Starting at 202, fields F and L of the current CK are read. E c is compared with F. If it is less than F, any K bytes (number indicated by L) and the pointer, of the current CK, are by-passed 202a, 209a, 211a, and the next CK is entered 202. If E c is greater than F, any K bytes are by-passed 202b, 209b, and the pointer is read 226 for use in accessing the information associated with the key. The original uncompressed key is obtained from this and compared (not shown) with the search argument SA, inequality indicating the SA was not in the original sequence of uncompressed keys. If E c is equal to F, then if there are no more K bytes (L equals 0 at 202c) the pointer is read as before, but otherwise L is decremented by 1 at 206, and the next K byte is read 207 and compared 208 with the corresponding byte A of the search argument SA. If A is greater than K, any remaining K bytes and the pointer are bypassed and the next CK is entered 206a, 209a, 211a, 202. If A is less than K, the pointer is read 226 after by-passing any remaining K bytes 206b, 209c. If A is equal to K and there are no more A bytes 216, then if there are remaining K bytes 206c they are by-passed 209c and the pointer is read 226 whereas if there are no remaining K bytes 206c the pointer is bypassed 211b together with the F and L fields and K bytes of the next compressed key 223, 209c before reading the (next )pointer 226. If A is equal to K and there are more A bytes 216, the next byte is read 217, count E c is incremented by one 218 (indicating the number of equal bytes found and the position of the next to be compared), then if there are no more K bytes in the current CK as indicated by L equal to 0 at 206d the pointer is by-passed 211a and the next CK entered 202, whereas if L is not zero L is decremented by one 206 and the next K byte is read 207 for comparison 208 with the next A byte and so on. Hardware disclosed uses an adder, used for subtraction by complemented addition, comparison by subtraction or fed with a one for incrementing. The adder output can be sign-tested and zero-tested and fed to registers for E c and L (the latter holding L in twoscomplemented form so that decrementing of L is done by incrementing the register contents). A register for an A byte is provided, as well as registers in I/O interface units communicating with a computer providing the SA and an I/O device providing the CKs. The CK bytes are provided from their register in their I/O interface in ones-complement form and are passed through the adder with addition of one to convert to twos-complement form. As a modification, the F and L fields may be a half-byte each and if this is not large enough in any particular CK, the relevant half-byte field stores 1111 to indicate that an extended (one-byte) field follows containing the actual F or L value. In this case, the hardware is modified by permitting the high-order half-byte width of one adder input to be forced to 1111, and providing two half-byte wide zero-testers connected in parallel with this adder input to detect the 1111 halfbyte fields indicating L or F is in an extended field (these 1111 fields, like all CK fields, are provided in ones-complement form, i.e. as 0000). Minor variations in the algorithm for more efficient implementation are disclosed.
申请公布号 GB1280485(A) 申请公布日期 1972.07.05
申请号 GB19690063204 申请日期 1969.12.30
申请人 INTERNATIONAL BUSINESS MACHINES CORPORATION 发明人
分类号 G06F12/00;G06F7/02;G06F13/12;G06F17/30 主分类号 G06F12/00
代理机构 代理人
主权项
地址