Abstract |
A SIMD smart memory comprises addressable registers and the functionality of random access memory, together with processing elements built from addressable and internal registers, neighbor-to-neighbor connectivity between the processing elements, and a lattice-like element activation scheme. The memory performs, within itself, those simple parallel operations that either apply uniformly to all elements or involve only neighboring memory elements. Many common algorithms using this memory are discussed. For an array of N items, it reduces the total instruction cycle count of universal operations, such as insertion and match finding, to ~1; of local operations, such as filtering and template matching, to roughly the size of the local operation; and of global operations, such as sum and sorting, to ~sqrt(N). In particular, it eliminates most streaming of data over the system bus for data processing purposes. Yet it is easy to use, pin- and function-compatible with a conventional random access memory, and practical to implement. In addition, some new designs for components, such as an all-line decoder, general decoder, parallel shifter, parallel comparator, parallel adder, and parallel divider, are presented.
|
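To make the ~sqrt(N) figure for global operations concrete, the following is a minimal sketch of how a sum could reach that cycle count using only neighbor-to-neighbor transfers. It is an illustration under stated assumptions, not the design described in the abstract: it assumes the N items are laid out as a square sqrt(N) x sqrt(N) mesh, that one instruction cycle can activate one whole column (or one element) and add each active element into its neighbor, and that N is a perfect square. The function name `mesh_sum` and the cycle accounting are hypothetical.

```python
import math


def mesh_sum(values):
    """Sum values on a simulated side x side mesh.

    Returns (total, cycles), where cycles counts simulated instruction
    cycles. Assumption: one cycle lets every activated element add its
    value into one neighboring element.
    """
    n = len(values)
    side = math.isqrt(n)
    if side == 0 or side * side != n:
        raise ValueError("this sketch assumes N is a nonzero perfect square")

    # Lay the array out row by row on the assumed square mesh.
    grid = [list(values[r * side:(r + 1) * side]) for r in range(side)]
    cycles = 0

    # Phase 1: in each cycle, one column is activated and every element in
    # it adds itself into its left neighbor. All rows proceed in parallel,
    # so this phase costs side - 1 cycles regardless of the number of rows.
    for c in range(side - 1, 0, -1):
        for r in range(side):  # parallel in hardware, sequential here
            grid[r][c - 1] += grid[r][c]
        cycles += 1

    # Phase 2: column 0 now holds the row sums; fold them upward one
    # neighbor at a time, costing another side - 1 cycles.
    for r in range(side - 1, 0, -1):
        grid[r - 1][0] += grid[r][0]
        cycles += 1

    return grid[0][0], cycles


total, cycles = mesh_sum(list(range(16)))
print(total, cycles)  # 120 6
```

Under these assumptions the sum of 16 items takes 2 * (sqrt(16) - 1) = 6 simulated cycles, which grows as ~sqrt(N) rather than ~N.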