Title of invention: Deep Learning Training System
Abstract: Described herein is the training of large neural network models by providing training input to model training machines organized as multiple replicas that asynchronously update a shared model via a global parameter server. In at least one embodiment, a system includes a model module storing a portion of a model and a deep learning training module that communicates with the model module, the modules being configured to asynchronously send updates to shared parameters associated with the model. The techniques described herein include receiving and processing a batch of data items to calculate updates. Replicas of training machines communicate asynchronously with a global parameter server to provide updates to a shared model, and the server returns updated weight values. The model may be modified to reflect the updated weight values. The techniques described herein include computation and communication optimizations that improve system efficiency and scaling of large neural networks.
Publication number: US2015324690(A1)  Publication date: 2015.11.12
Application number: US201414492270  Filing date: 2014.09.22
Applicant: Microsoft Corporation  Inventors: Chilimbi Trishul A.;Suzue Yutaka;Apacible Johnson R.;Kalyanaraman Karthik
Classification: G06N3/08;G06N3/063  Main classification: G06N3/08
Agency:  Agent:
Principal claim: 1. A system comprising: a computer-readable media storing at least two modules; a processing unit operably coupled to the computer-readable media, the processing unit adapted to execute the at least two modules, the at least two modules comprising: a model module configured to store a portion of a model; and a deep learning training module configured to communicate with the model module and asynchronously sending updates to parameters shared by the model.
Address: Redmond WA US
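
Illustrative sketch (not part of the patent record): the abstract describes replicas that each process a batch of data items, asynchronously send updates to a global parameter server, and receive updated weight values in return. The Python below is a minimal sketch of that general pattern on a toy linear model, using threads in place of training machines. Every name in it (`ParameterServer`, `push_and_pull`, `replica`, `train`), the learning rate, and the synthetic data are assumptions made for illustration; none of them come from the patent, and the patent's model partitioning and its computation and communication optimizations are not modeled.

```python
import random
import threading


class ParameterServer:
    """Hypothetical global parameter server holding the shared weights."""

    def __init__(self, dim, lr):
        self.weights = [0.0] * dim
        self.lr = lr
        self._lock = threading.Lock()

    def pull(self):
        """Return a copy of the current shared weights."""
        with self._lock:
            return list(self.weights)

    def push_and_pull(self, grad):
        """Apply one replica's update, then return the updated weights."""
        with self._lock:
            for i, g in enumerate(grad):
                self.weights[i] -= self.lr * g
            return list(self.weights)


def batch_gradient(weights, batch):
    """Mean squared-error gradient for a linear model y = w . x."""
    grad = [0.0] * len(weights)
    for x, y in batch:
        err = sum(w * xi for w, xi in zip(weights, x)) - y
        for i, xi in enumerate(x):
            grad[i] += 2.0 * err * xi / len(batch)
    return grad


def replica(server, data, batch_size, steps, seed):
    """One replica: compute an update from a batch, send it, take new weights.

    Replicas never wait for each other, so a gradient may be computed from
    weights that other replicas have already changed (a stale update).
    """
    rng = random.Random(seed)
    local = server.pull()
    for _ in range(steps):
        batch = rng.sample(data, batch_size)
        grad = batch_gradient(local, batch)
        local = server.push_and_pull(grad)


def train(num_replicas=4, steps=200):
    """Train on noise-free synthetic data; returns (learned, true) weights."""
    rng = random.Random(0)
    true_w = [2.0, -3.0]  # assumed target weights for the toy problem
    data = []
    for _ in range(256):
        x = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
        data.append((x, sum(w * xi for w, xi in zip(true_w, x))))

    server = ParameterServer(dim=2, lr=0.05)
    threads = [
        threading.Thread(target=replica, args=(server, data, 16, steps, seed))
        for seed in range(num_replicas)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return server.pull(), true_w
```

Calling `train()` returns the learned and target weights for comparison. The lock only makes each individual update atomic; it does not synchronize the replicas with one another, which is the asynchronous behavior the abstract refers to.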