Title of invention |
Providing vector sub-byte decompression functionality |
Abstract |
Methods, apparatus, instructions and logic provide SIMD vector sub-byte decompression functionality. Embodiments include shuffling a first and second byte into the least significant portion of a first vector element, and a third and fourth byte into the most significant portion. Processing continues shuffling a fifth and sixth byte into the least significant portion of a second vector element, and a seventh and eighth byte into the most significant portion. Then by shifting the first vector element by a first shift count and the second vector element by a second shift count, sub-byte elements are aligned to the least significant bits of their respective bytes. Processors then shuffle a byte from each of the shifted vector elements' least significant portions into byte positions of a destination vector element, and from each of the shifted vector elements' most significant portions into byte positions of another destination vector element. |
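For orientation, the result that such a decompression operation is expected to produce can be written as a plain scalar reference. This is only a sketch, not the patented method: it assumes the sub-byte elements are packed contiguously, least significant bit first, and the names `pack_sub_bytes` and `unpack_sub_bytes` are hypothetical helpers introduced here for illustration.

```python
def pack_sub_bytes(values, bits):
    """Pack values into bytes, LSB-first (hypothetical helper, assumed layout)."""
    stream = 0
    for i, v in enumerate(values):
        if not 0 <= v < (1 << bits):
            raise ValueError("value does not fit in the element size")
        stream |= v << (i * bits)
    return stream.to_bytes((len(values) * bits + 7) // 8, "little")


def unpack_sub_bytes(packed, bits, count):
    """Expand `count` packed `bits`-wide elements to one integer (byte) each."""
    if not 1 <= bits <= 8:
        raise ValueError("bits must be in 1..8")
    stream = int.from_bytes(packed, "little")
    mask = (1 << bits) - 1
    # Element i occupies bits [i*bits, (i+1)*bits) of the stream (assumption).
    return [(stream >> (i * bits)) & mask for i in range(count)]


print(unpack_sub_bytes(bytes([0x21, 0x43]), 4, 4))  # [1, 2, 3, 4]
```

A SIMD implementation would need to produce the same bytes, only many elements at a time; the scalar form is useful mainly as a check.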
Publication number |
US9405539(B2) |
Publication date |
2016.08.02 |
Application number |
US201313956347 |
Filing date |
2013.07.31 |
Applicant |
Intel Corporation |
Inventors |
Uliel Tal;Ould-Ahmed-Vall Elmoustapha;Willhalm Thomas;Valentine Robert |
Classification numbers |
G06F9/315;G06F9/30 |
Main classification number |
G06F9/315 |
Patent agency |
Lowenstein Sandler LLP |
Agent |
Lowenstein Sandler LLP |
Main claim |
1. A processor comprising:
  a decoder to decode a first instruction into a decoded first instruction, wherein the first instruction specifies a vector sub-byte decompression operation, a destination vector, a source of sub-byte elements and a sub-byte element size; and
  logic circuitry of an execution unit coupled to the decoder that, responsive to the decoded first instruction, is to:
    shuffle from the source, a first two bytes containing a first sub-byte element of a first bit alignment into a least significant portion of a first vector element, and a second two bytes containing a second sub-byte element of the first bit alignment into a most significant portion of the first vector element;
    shuffle from the source, a third two bytes containing a third sub-byte element of a second bit alignment into a least significant portion of a second vector element, and a fourth two bytes containing a fourth sub-byte element of the second bit alignment into a most significant portion of the second vector element;
    shift the first vector element by a first shift count to align the first sub-byte element to a least significant bit of the least significant portion of the first vector element, and the second sub-byte element to a least significant bit of the most significant portion of the first vector element;
    shift the second vector element by a second shift count to align the third sub-byte element to a least significant bit of the least significant portion of the second vector element, and the fourth sub-byte element to a least significant bit of the most significant portion of the second vector element; and
    shuffle the first sub-byte and the third sub-byte from the shifted first and second vector elements into a first destination vector element and the second sub-byte and the fourth sub-byte from the shifted first and second vector elements into a second destination vector element to at least partially restore an original sub-byte order of the sub-byte elements. |
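The shuffle, shift, shuffle sequence recited in the claim can be modeled in scalar code. The sketch below is an illustration under stated assumptions, not the claimed hardware: it assumes 32-bit vector elements whose two portions are 16 bits each, eight such elements, 16 sub-byte elements packed LSB-first, and a final mask to clear neighbouring bits (the claim itself does not recite a mask). The function name `decompress_16` is hypothetical. Elements j and j + 8 start exactly `bits` bytes apart, so they share one bit alignment and can share one shift count.

```python
def decompress_16(source, bits):
    """Model: unpack 16 `bits`-wide elements via shuffle, per-lane shift, shuffle."""
    if not 1 <= bits <= 8:
        raise ValueError("bits must be in 1..8")
    padded = source + b"\x00"   # a two-byte window may read one byte past the data
    mask = (1 << bits) - 1      # assumed extra step: clear neighbouring elements

    lanes, shifts = [], []
    for j in range(8):
        lo_byte, shift = divmod(j * bits, 8)   # element j: byte offset, bit alignment
        hi_byte = lo_byte + bits               # element j + 8 has the same alignment
        lo = int.from_bytes(padded[lo_byte:lo_byte + 2], "little")
        hi = int.from_bytes(padded[hi_byte:hi_byte + 2], "little")
        lanes.append(lo | (hi << 16))          # first shuffle: two bytes per portion
        shifts.append(shift)

    # Per-lane shift: aligns both elements to bit 0 of their 16-bit portion.
    shifted = [lane >> s for lane, s in zip(lanes, shifts)]

    # Second shuffle: byte 0 of each lane goes to the first destination element,
    # byte 2 of each lane goes to the second destination element.
    first_dest = [lane & 0xFF & mask for lane in shifted]
    second_dest = [(lane >> 16) & 0xFF & mask for lane in shifted]
    return first_dest + second_dest


values = [(5 * i + 3) % 8 for i in range(16)]
stream = sum(v << (3 * i) for i, v in enumerate(values))
print(decompress_16(stream.to_bytes(6, "little"), 3) == values)  # True
```

In this model the claim's first and second sub-byte elements correspond to elements 0 and 8, and the third and fourth to elements 1 and 9, so the final shuffle places elements 0 and 1 next to each other again.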
Address |
Santa Clara, CA, US |