Title: Decoding of variable-length data with group formats
Abstract: Embodiments provide methods and systems for encoding and decoding variable-length data, which may include methods for encoding and decoding search engine posting lists. Embodiments may include different encoding formats including group unary, packed unary, and/or packed binary formats. Some embodiments may utilize single instruction multiple data (SIMD) instructions that may perform a parallel shuffle operation on encoded data as part of the decoding processes. Some embodiments may utilize lookup tables to determine shuffle sequences and/or masks and/or shifts to be utilized in the decoding processes. Some embodiments may utilize hybrid formats.
Publication number: US9195675(B2) Publication date: 2015.11.24
Application number: US201113077417 Filing date: 2011.03.31
Applicant: A9.com, Inc. Inventors: Rose Daniel E.; Stepanov Alexander A.; Gangolli Anil Ramesh; Oberoi Paramjit S.; Ernst Ryan Jacob
Classification: G06F7/00; G06F17/30; H03M7/40; H03M7/46 Main classification: G06F7/00
Agency: Hogan Lovells US LLP Agent: Hogan Lovells US LLP
Main claim: 1. A computer-implemented method for decoding encoded differences between document identification numbers in a search engine posting list comprising: under control of one or more computer systems configured with executable instructions, reading one or more descriptors, each descriptor including information regarding a plurality of size information for a group of encoded differences between document identification numbers for the search engine posting list and encoded in at least one of a packed unary or a group unary format; reading data representing the group of encoded differences between document identification numbers linked with the one or more descriptors; identifying one or more shuffle sequences linked with the one or more descriptors from a lookup table; generating shuffled data by performing one or more shuffle operations in parallel on the data representing the group of encoded differences between document identification numbers using the one or more shuffle sequences, wherein the one or more shuffle operations include inserting one or more sequences of zeros into the data; and determining a plurality of decoded differences between document identification numbers from the shuffled data.
Address: Palo Alto, CA, US
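
The decoding steps recited in claim 1 (read a descriptor, look up a shuffle sequence, shuffle the data while inserting zeros, read off the decoded differences) can be illustrated with a minimal sketch. It is not the patented implementation. The descriptor layout is an assumption made for illustration: one descriptor byte per group of 8 data bytes, where bit i (counted from the least significant bit) is 1 when data byte i is the last byte of a little-endian integer of at most 4 bytes. The names `ZERO`, `build_shuffle`, `SHUFFLE_TABLE`, `shuffle`, and `decode_group` are hypothetical, and the SIMD byte shuffle is emulated in plain Python rather than executed as a hardware instruction.

```python
import struct
from itertools import accumulate

ZERO = -1  # hypothetical marker meaning "emit a zero byte at this position"


def build_shuffle(descriptor, group_size=8, width=4):
    """Build the shuffle sequence for one descriptor byte.

    Assumed convention (illustrative only): bit i of the descriptor is 1
    when data byte i ends an integer. Bytes after the last set bit are
    treated as padding. Returns None if an integer would exceed `width`.
    """
    seq, start = [], 0
    for i in range(group_size):
        if (descriptor >> i) & 1:
            length = i - start + 1
            if length > width:
                return None  # not a valid descriptor under this convention
            seq.extend(range(start, i + 1))        # copy the integer's bytes
            seq.extend([ZERO] * (width - length))  # pad to `width` with zeros
            start = i + 1
    return seq


# Lookup table: descriptor byte -> precomputed shuffle sequence.
SHUFFLE_TABLE = [build_shuffle(d) for d in range(256)]


def shuffle(data, seq):
    """Emulate a byte shuffle: pick data[j], or write 0 where j is ZERO."""
    return bytes(0 if j == ZERO else data[j] for j in seq)


def decode_group(descriptor, data):
    """Decode one group of 8 data bytes into a list of 32-bit differences."""
    seq = SHUFFLE_TABLE[descriptor]
    if seq is None:
        raise ValueError("descriptor not valid under the assumed convention")
    shuffled = shuffle(data, seq)
    count = len(shuffled) // 4
    return list(struct.unpack("<%dI" % count, shuffled))


# Differences 3, 300, 70000, 5 occupy 1, 2, 3, and 1 bytes; one padding byte.
group = bytes([0x03, 0x2C, 0x01, 0x70, 0x11, 0x01, 0x05, 0x00])
gaps = decode_group(0b01100101, group)
print(gaps)                    # [3, 300, 70000, 5]
print(list(accumulate(gaps)))  # [3, 303, 70303, 70308]
```

The final `accumulate` call goes one step beyond the claim: it sums the decoded differences to recover document identification numbers. In a SIMD setting, the per-descriptor table lookup replaces per-byte branching, which is the point of precomputing the shuffle sequences.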