Title of invention Providing vector sub-byte decompression functionality
Abstract Methods, apparatus, instructions and logic provide SIMD vector sub-byte decompression functionality. Embodiments include shuffling a first and second byte into the least significant portion of a first vector element, and a third and fourth byte into the most significant portion. Processing continues by shuffling a fifth and sixth byte into the least significant portion of a second vector element, and a seventh and eighth byte into the most significant portion. Then, by shifting the first vector element by a first shift count and the second vector element by a second shift count, sub-byte elements are aligned to the least significant bits of their respective bytes. Processors then shuffle a byte from each of the shifted vector elements' least significant portions into byte positions of a destination vector element, and a byte from each of the shifted vector elements' most significant portions into byte positions of another destination vector element.
Publication number US9405539(B2) Publication date 2016.08.02
Application number US201313956347 Filing date 2013.07.31
Applicant Intel Corporation Inventors Uliel Tal;Ould-Ahmed-Vall Elmoustapha;Willhalm Thomas;Valentine Robert
Classification G06F9/315;G06F9/30 Main classification G06F9/315
Agency Lowenstein Sandler LLP Agent Lowenstein Sandler LLP
Principal claim 1. A processor comprising: a decoder to decode a first instruction into a decoded first instruction, wherein the first instruction specifies a vector sub-byte decompression operation, a destination vector, a source of sub-byte elements and a sub-byte element size; and logic circuitry of an execution unit coupled to the decoder that, responsive to the decoded first instruction, is to: shuffle from the source, a first two bytes containing a first sub-byte element of a first bit alignment into a least significant portion of a first vector element, and a second two bytes containing a second sub-byte element of the first bit alignment into a most significant portion of the first vector element; shuffle from the source, a third two bytes containing a third sub-byte element of a second bit alignment into a least significant portion of a second vector element, and a fourth two bytes containing a fourth sub-byte element of the second bit alignment into a most significant portion of the second vector element; shift the first vector element by a first shift count to align the first sub-byte element to a least significant bit of the least significant portion of the first vector element, and the second sub-byte element to a least significant bit of the most significant portion of the first vector element; shift the second vector element by a second shift count to align the third sub-byte element to a least significant bit of the least significant portion of the second vector element, and the fourth sub-byte element to a least significant bit of the most significant portion of the second vector element; and shuffle the first sub-byte and the third sub-byte from the shifted first and second vector elements into a first destination vector element and the second sub-byte and the fourth sub-byte from the shifted first and second vector elements into a second destination vector element to at least partially restore an original sub-byte order of the sub-byte elements.
Address Santa Clara CA US
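
The shuffle, shift, shuffle sequence described in the abstract and claim 1 can be illustrated with a small software simulation. The following is only a sketch under stated assumptions, not the claimed hardware or any actual instruction: it assumes 16 sub-byte elements packed LSB-first, 32-bit vector elements split into two 16-bit portions, and an element width of 1 to 8 bits. The names `pack` and `unpack_shuffle_shift` are hypothetical helpers introduced for this example. Elements `j` and `j + 8` always start at the same bit offset within a byte, so they can share one vector element and one shift count.

```python
def pack(values, width):
    """Pack `values` (each < 2**width) LSB-first into a bytes object (assumed layout)."""
    acc = 0
    for i, v in enumerate(values):
        acc |= (v & ((1 << width) - 1)) << (i * width)
    nbytes = (len(values) * width + 7) // 8
    return acc.to_bytes(nbytes, "little")


def unpack_shuffle_shift(src, width):
    """Decompress 16 sub-byte elements with a shuffle / shift / shuffle sequence."""
    assert 1 <= width <= 8
    src = bytes(src) + b"\x00"      # pad so every two-byte gather stays in range
    mask = (1 << width) - 1
    dest_lo, dest_hi = [], []
    for j in range(8):
        lo_byte = (j * width) // 8          # first byte holding element j
        hi_byte = ((j + 8) * width) // 8    # first byte holding element j + 8
        shift = (j * width) % 8             # bit alignment shared by both elements

        # First shuffle: two bytes per element, element j into the least
        # significant 16 bits and element j + 8 into the most significant 16 bits.
        lane = (src[lo_byte]
                | src[lo_byte + 1] << 8
                | src[hi_byte] << 16
                | src[hi_byte + 1] << 24)

        # Per-element shift: aligns both sub-byte elements to bit 0 of their portion.
        lane >>= shift

        # Second shuffle: byte 0 of each lane goes to one destination, byte 2 of
        # each lane to the other; the mask drops bits of neighbouring elements.
        dest_lo.append(lane & 0xFF & mask)
        dest_hi.append((lane >> 16) & 0xFF & mask)

    # Concatenating the two destinations restores the original element order.
    return dest_lo + dest_hi


if __name__ == "__main__":
    original = list(range(16))              # fits in 5-bit elements
    packed = pack(original, 5)              # 16 * 5 bits = 10 bytes
    assert unpack_shuffle_shift(packed, 5) == original
```

Because the shift count never exceeds 7 and each element is at most 8 bits wide, the bits that land in the low byte of each 16-bit portion always come from that portion's own two source bytes, which is why a single shift of the whole 32-bit element is safe in this sketch.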