Title of invention: Deep convex network with joint use of nonlinear random projection, restricted Boltzmann machine and batch-based parallelizable optimization
Abstract: A method is disclosed herein that includes an act of causing a processor to access a deep-structured, layered or hierarchical model, called a deep convex network, retained in a computer-readable medium, wherein the deep-structured model comprises a plurality of layers with weights assigned thereto. This layered model can produce output that serves as scores to be combined with transition probabilities between states in a hidden Markov model and with language model scores to form a full speech recognizer. Batch-based, convex optimization is performed to learn a portion of the deep convex network's weights, rendering it appropriate for parallel computation to accomplish the training. The method can further include the act of jointly substantially optimizing the weights, the transition probabilities, and the language model scores of the deep-structured model using an optimization criterion based on a sequence rather than a set of unrelated frames.
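The batch-based convex step mentioned in the abstract can be illustrated with a minimal sketch. It assumes, since the record does not say so, that the "portion of the weights" learned convexly is the hidden-to-output matrix of one module, that the lower weights are a fixed random projection, and that the criterion is ridge-regularized squared error. The names `fit_output_weights`, `ridge`, `W`, `U`, `H`, and `T` are hypothetical.

```python
import numpy as np

def fit_output_weights(hidden, targets, ridge=1e-3):
    """Closed-form minimizer of ||hidden @ U - targets||^2 + ridge * ||U||^2.

    Assumption: squared-error loss with a ridge term; the patent record
    only says the optimization is batch-based and convex.
    """
    n_hidden = hidden.shape[1]
    gram = hidden.T @ hidden + ridge * np.eye(n_hidden)
    return np.linalg.solve(gram, hidden.T @ targets)

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))               # one batch: 200 frames, 5 features
T = np.eye(3)[rng.integers(0, 3, size=200)]  # one-hot targets for 3 classes
W = rng.normal(size=(5, 16))                # fixed random projection (assumed, not trained)
H = 1.0 / (1.0 + np.exp(-(X @ W)))          # nonlinear (sigmoid) hidden layer
U = fit_output_weights(H, T)                # convex step: solved over the whole batch
```

Because the two products `hidden.T @ hidden` and `hidden.T @ targets` are sums over frames, they could in principle be accumulated on separate workers and added together before the solve, which is one way a batch formulation lends itself to parallel computation.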
Publication number: US9390371(B2) Publication date: 2016.07.12
Application number: US201313919106 Filing date: 2013.06.17
Applicant: Microsoft Technology Licensing, LLC Inventors: Deng Li; Yu Dong; Acero Alejandro
Classification: G06N3/04; G06N3/08; G06N3/02 Main classification: G06N3/04
Agency: Agent: Corie Alin; Swain Sandy; Minhas Micky
Main claim: 1. A method comprising the following computer-executable acts: providing an input sample to a deep convex network, the deep convex network comprising a first module and a second module arranged in a layered configuration, the first module comprising: a first linear input layer; a first nonlinear hidden layer; and a first linear output layer that comprises a first plurality of output units, the second module comprising: a second linear input layer; a second nonlinear hidden layer; and a second output layer that comprises a second plurality of output units, the second linear input layer comprising a first plurality of input units and a second plurality of input units, the first plurality of input units comprising the first plurality of output units of the first module; and recognizing an entity in the input sample based at least in part upon respective outputs of the second plurality of output units.
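The two-module stacking recited in the claim can be sketched as a forward pass. This is an illustration under stated assumptions, not the claimed implementation: the hidden nonlinearity is taken to be a sigmoid, the second plurality of input units is assumed to carry the raw input sample (the claim does not say what it carries), and "recognizing an entity" is reduced to an argmax over the second module's outputs. All names and sizes are hypothetical, and the weights are random placeholders rather than trained values.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def module_forward(inputs, W, U):
    """One module: linear input layer -> nonlinear hidden layer -> linear output layer."""
    hidden = sigmoid(inputs @ W)
    return hidden @ U

rng = np.random.default_rng(0)
n_in, n_hid, n_out = 5, 8, 3
x = rng.normal(size=(4, n_in))              # 4 input samples

# First module.
W1 = rng.normal(size=(n_in, n_hid))
U1 = rng.normal(size=(n_hid, n_out))
y1 = module_forward(x, W1, U1)

# Second module: its input layer holds the first module's output units
# plus (by assumption) the raw input units.
x2 = np.concatenate([x, y1], axis=1)
W2 = rng.normal(size=(n_in + n_out, n_hid))
U2 = rng.normal(size=(n_hid, n_out))
y2 = module_forward(x2, W2, U2)

# Recognition decision based on the second module's output units.
predicted = y2.argmax(axis=1)
```

Further modules could be stacked the same way, each taking the previous module's outputs as part of its input layer.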
Address: Redmond WA US