Title Merging sorted data arrays based on vector minimum, maximum, and permute instructions
Abstract A method and apparatus are provided to efficiently merge two or more streams of data using SIMD instructions. The streams of data are merged in parallel, with conditional branching reduced or eliminated. The merge operations on the streams of data include Merge AND and Merge OR operations.
Publication number US9298419(B2) Publication date 2016.03.29
Application number US201213601177 Filing date 2012.08.31
Applicant SAP SE Inventors Inoue Hiroshi;Ohara Moriyoshi;Komatsu Hideaki
Classification G06F7/36;G06F9/30;G06F9/38 Main classification G06F7/36
Agency Sterne, Kessler, Goldstein & Fox P.L.L.C. Agent Sterne, Kessler, Goldstein & Fox P.L.L.C.
Principal claim 1. A method comprising: loading a first sorted set of data elements into a first input hardware vector register from a first input stream, wherein the first sorted set of data elements is sorted in ascending order; loading a second sorted set of data elements into a second input hardware vector register from a second input stream, wherein the second sorted set of data elements is sorted in ascending order; merging the first sorted set of data elements and the second sorted set of data elements to generate a third sorted set of data elements, wherein the merging comprises a plurality of stages, and wherein each stage of the plurality of stages comprises: executing a single instruction, multiple data (SIMD) vector minimum instruction on the first input hardware vector register and the second input hardware vector register, wherein the SIMD vector minimum instruction compares values of elements in the first input hardware vector register to corresponding values of elements in the second input hardware vector register and outputs a smaller of the values compared by the SIMD vector minimum instruction; executing a SIMD vector maximum instruction on the first input hardware vector register and the second input hardware vector register, wherein the SIMD vector maximum instruction compares values of elements in the first input hardware vector register to corresponding values of elements in the second input hardware vector register and outputs a larger of the values compared by the SIMD vector maximum instruction; and executing a SIMD permute instruction on the first hardware vector register storing the smaller of the values compared by the SIMD vector minimum instruction and the second hardware vector register storing the larger of the values compared by the SIMD vector maximum instruction, wherein the third sorted set of data elements is sorted in ascending order; placing a set of smaller data elements of the third sorted set of data elements in a first output hardware vector register; and placing contents of the first output hardware vector register into an output stream.
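The staged minimum/maximum/permute merge recited in claim 1 can be illustrated with a small scalar emulation. This is a minimal sketch, not the patented implementation: it assumes 4-lane registers modeled as Python lists, and it uses the textbook bitonic merge network, which needs an extra lane reversal of the second input that the claim does not recite. The names `vmin`, `vmax`, `vperm`, `STAGES`, `merge_registers`, and `merge_streams` are hypothetical, and the stream loop follows the common "reload from the stream with the smaller head" scheme, which still contains a data-dependent branch.

```python
W = 4  # lanes per emulated vector register (assumption for this sketch)

def vmin(x, y):
    """Emulates a SIMD vector minimum: lane-wise smaller value."""
    return [min(p, q) for p, q in zip(x, y)]

def vmax(x, y):
    """Emulates a SIMD vector maximum: lane-wise larger value."""
    return [max(p, q) for p, q in zip(x, y)]

def vperm(x, y, idx):
    """Emulates a SIMD permute: picks lanes from the concatenation x + y."""
    src = x + y
    return [src[k] for k in idx]

# Permute patterns applied after the min/max of each stage.
# Indices 0-3 select from the min result, 4-7 from the max result.
STAGES = [
    ([0, 1, 4, 5], [2, 3, 6, 7]),  # set up compares at lane distance 2
    ([0, 4, 2, 6], [1, 5, 3, 7]),  # set up compares at lane distance 1
    ([0, 4, 1, 5], [2, 6, 3, 7]),  # interleave into two sorted registers
]

def merge_registers(a, b):
    """Merges two ascending 4-lane registers; returns (smaller half, larger half)."""
    x = a
    y = vperm(b, b, [3, 2, 1, 0])  # reverse b so x followed by y is bitonic
    for idx_x, idx_y in STAGES:
        lo = vmin(x, y)
        hi = vmax(x, y)
        x = vperm(lo, hi, idx_x)
        y = vperm(lo, hi, idx_y)
    return x, y

def merge_streams(s1, s2):
    """Merges two ascending lists whose lengths are non-zero multiples of W."""
    out = []
    lo, hi = merge_registers(s1[:W], s2[:W])
    out += lo  # the smaller half goes to the output stream
    i = j = W
    while i < len(s1) or j < len(s2):
        # Load the next block from the stream whose next element is smaller.
        take_first = j >= len(s2) or (i < len(s1) and s1[i] <= s2[j])
        if take_first:
            nxt, i = s1[i:i + W], i + W
        else:
            nxt, j = s2[j:j + W], j + W
        lo, hi = merge_registers(hi, nxt)
        out += lo
    out += hi  # flush the register still holding the largest elements
    return out
```

On real hardware each helper would map to a single vector instruction, so every stage costs three instruction kinds and no branches; only the choice of which stream to reload from remains data dependent in this sketch.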
Address Walldorf DE