Title of invention: INSTRUCTIONS AND LOGIC TO VECTORIZE CONDITIONAL LOOPS
Abstract: A processing device to provide vectorization of conditional loops includes vector physical registers to store a source vector having a first plurality of n data fields, and a destination vector comprising a second plurality of data fields corresponding to the first plurality of data fields, wherein each of the second plurality of data fields corresponds to a mask value in a vector conditions mask. The processing device includes a decode stage to decode a first processor instruction specifying a vector expand operation and a data partition size, and execution units to set elements of the source vector to n count values, obtain a decisions vector, generate the vector conditions mask according to the decisions vector, and copy data from consecutive vector elements in the source vector into unmasked vector elements of the destination vector, without copying data from the source vector into masked vector elements of the destination vector.
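Publication number: US2017052785(A1)
Publication date: 2017.02.23
Application number: US201615344836
Filing date: 2016.11.07
Applicant: Intel Corporation
Inventors: Uliel Tal; Ould-Ahmed-Vall Elmoustapha; Toll Bret L.
Classification: G06F9/30; G06F15/80
Main classification: G06F9/30
Agency:
Agent: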
Principal claim: 1. A processor comprising:
mask physical registers to store a vector conditions mask;
vector physical registers to store:
  a source vector having a first plurality of n data fields having a variable partition size of m bytes, and
  a destination vector comprising a second plurality of data fields corresponding to the first plurality of data fields, wherein each of the second plurality of data fields in the destination vector corresponds to a mask value in said vector conditions mask;
a decode stage to decode processor instructions, a first processor instruction specifying a vector expand operation and a data partition size; and
one or more execution units, responsive to the decoded processor instructions, to:
  set elements of the source vector to n count values;
  obtain a decisions vector;
  generate said vector conditions mask according to the decisions vector; and
  responsive to executing the first processor instruction, copy data from consecutive vector elements in the source vector, into unmasked vector elements of the destination vector, without copying data from the source vector into masked vector elements of the destination vector.
Address: Santa Clara, CA, US
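
Illustrative sketch (not part of the patent record): the sequence recited in the principal claim can be emulated in scalar Python to show what the vector expand operation computes. This is a minimal sketch under stated assumptions, not the claimed hardware or any real instruction's semantics. The function names `vector_expand`, `vectorized_step`, and `scalar_loop` are hypothetical, the mask convention (True means unmasked, i.e. the element is written) is an assumption, and the data partition size of m bytes is not modeled.

```python
def vector_expand(source, dest, mask):
    """Copy consecutive source elements into the unmasked positions of dest.

    Assumption: mask[i] == True means position i is unmasked (written).
    Masked positions keep their previous destination value.
    """
    result = list(dest)
    next_src = 0  # index of the next consecutive source element to copy
    for i, unmasked in enumerate(mask):
        if unmasked:
            result[i] = source[next_src]
            next_src += 1
    return result


def scalar_loop(decisions, dest, start=0):
    """Reference conditional loop: a counter advances only when the condition holds."""
    out = list(dest)
    count = start
    for i, decision in enumerate(decisions):
        if decision:
            out[i] = count
            count += 1
    return out


def vectorized_step(decisions, dest, start=0):
    """Same result as scalar_loop, following the steps listed in the claim."""
    n = len(decisions)
    source = [start + k for k in range(n)]  # set source elements to n count values
    mask = [bool(d) for d in decisions]     # conditions mask from the decisions vector
    return vector_expand(source, dest, mask)


decisions = [0, 1, 1, 0, 1, 0, 0, 1]
dest = [-1] * len(decisions)
print(vectorized_step(decisions, dest))  # [-1, 0, 1, -1, 2, -1, -1, 3]
```

In this sketch the loop-carried counter of the scalar loop is replaced by a precomputed run of count values, so each unmasked position receives the next count regardless of its own index.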