Data shuffling in a non-uniform memory access device, Application No. US201414147917

Title of invention	Data shuffling in a non-uniform memory access device
Abstract	Embodiments relate to the orchestration of data shuffling among memory devices of a non-uniform memory access device. An aspect includes a method of orchestrated shuffling of data in a non-uniform memory access device, the method including running an application on a plurality of threads executing on a plurality of processing nodes and identifying data to be shuffled among the plurality of processing nodes. The method includes registering the data to be shuffled and generating a plan for orchestrating the shuffling of the data. The method further includes disabling cache coherency of cache memory associated with the processing nodes and, upon disabling the cache coherency, shuffling the data among all of the memory devices based on the plan for orchestrating the shuffling. The method further includes restoring the cache coherency of the cache memory based on completing the shuffling of the data among all of the memory devices.
Publication No.	US9256534(B2)	Publication date	2016.02.09
Application No.	US201414147917	Filing date	2014.01.06
Applicant	International Business Machines Corporation	Inventors	Li Yinan;Lohman Guy M.;Mueller Rene;Pandis Ippokratis;Raman Vijayshankar
Classification	G06F12/00;G06F12/08	Main classification	G06F12/00
Agency	Cantor Colburn LLP	Agent	Cantor Colburn LLP;Butler Bryan
Main claim	1. A method of orchestrated shuffling of data in a non-uniform memory access device that includes a plurality of processing nodes, each processing node directly connected to at least one memory device and indirectly connected to at least one of the other memory devices via at least one of the other processing nodes, the method comprising: running an application on a plurality of threads executing on the plurality of processing nodes; identifying, by the threads, data to be shuffled from source threads running on source processing nodes among the processing nodes to target threads executing on target processing nodes among the processing nodes; registering, by the plurality of threads, the data to be shuffled among threads by transferring the data from source memory devices directly connected to the source processing nodes to target memory devices directly connected to the target processing nodes; generating a plan for orchestrating the shuffling of the data among all memory devices associated with the plurality of threads; disabling cache coherency of cache memory associated with the plurality of processing nodes; performing the shuffling of the data among all memory devices upon disabling the cache coherency, based on the plan for orchestrating the shuffling; and restoring the cache coherency of the cache memory based on completing the shuffling of the data among all memory devices.
Address	Armonk NY US
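
The register, plan, and shuffle steps named in the main claim can be illustrated with a small simulation. This is only a sketch under stated assumptions: the record does not say what form the plan takes, so the ring-style schedule below (in step s, node i sends to node (i + s) mod N) is one possible choice, not the patented method. The names `make_ring_plan` and `shuffle`, and the nested-dictionary layout of the registered data, are hypothetical. Disabling and restoring cache coherency is hardware-specific and is marked only by comments.

```python
def make_ring_plan(num_nodes):
    """Build a hypothetical shuffle plan as a list of steps.

    Each step is a list of (source, target) node pairs. In step s, node i
    sends to node (i + s) % num_nodes, so within a step every node is a
    source exactly once and a target exactly once.
    """
    return [
        [(src, (src + step) % num_nodes) for src in range(num_nodes)]
        for step in range(1, num_nodes)
    ]


def shuffle(registered, num_nodes):
    """Simulate the shuffle on plain Python lists.

    registered[src][dst] is the list of items that node `src` registered
    for delivery to node `dst` (an assumed layout for illustration).
    Returns a dict mapping each node to the items it holds afterwards.
    """
    memory = {node: [] for node in range(num_nodes)}

    # Items a node registered for itself never leave its local memory.
    for node in range(num_nodes):
        memory[node].extend(registered.get(node, {}).get(node, []))

    plan = make_ring_plan(num_nodes)

    # (A real system would disable cache coherency here; not modeled.)
    for step in plan:
        for src, dst in step:
            memory[dst].extend(registered.get(src, {}).get(dst, []))
    # (A real system would restore cache coherency here; not modeled.)

    return memory


if __name__ == "__main__":
    registered = {
        0: {0: ["a"], 1: ["b"]},
        1: {0: ["c"], 1: ["d"]},
    }
    print(make_ring_plan(3))
    print(shuffle(registered, 2))
```

A schedule of this shape keeps every node sending and receiving in each step, which is one way to avoid several sources targeting the same memory device at once; other plans would satisfy the claim language equally well.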