Title: Identifying Relevant Content Items using a Deep-Structured Neural Network
Abstract: A computer-implemented technique is described herein for identifying one or more content items that are relevant to an input linguistic item (e.g., an input query) using a deep-structured neural network, trained based on a corpus of click-through data. The input linguistic item has a collection of input tokens. The deep-structured neural network includes a first part that produces word embeddings associated with the respective input tokens, a second part that generates state vectors that capture context information associated with the input tokens, and a third part which distinguishes important parts of the input linguistic item from less important parts. The second part of the deep-structured neural network can be implemented as a recurrent neural network, such as a bi-directional neural network. The third part of the deep-structured neural network can generate a concept vector by forming a weighted sum of the state vectors.
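The three parts named in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the patent: the function names, the random untrained weights, the plain tanh recurrence, and the use of a softmax over dot products with a scoring vector `q` to obtain the per-token weights. Only the overall shape (embeddings, bi-directional state vectors, weighted sum into a concept vector) follows the abstract.

```python
import math
import random


def make_matrix(rows, cols, rng):
    """Small random weight matrix (hypothetical, untrained initialisation)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(m, v):
    """Multiply matrix m (list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def rnn_pass(embeddings, w_in, w_rec, hidden):
    """Plain tanh recurrence; returns one state vector per input embedding."""
    state = [0.0] * hidden
    states = []
    for e in embeddings:
        a = matvec(w_in, e)
        b = matvec(w_rec, state)
        state = [math.tanh(x + y) for x, y in zip(a, b)]
        states.append(state)
    return states


def softmax(scores):
    """Numerically stable softmax; the result sums to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def init_params(vocab_size, embed_dim=4, hidden=3, seed=0):
    """Hypothetical parameter set; a real system would learn these values."""
    rng = random.Random(seed)
    return {
        "hidden": hidden,
        "E": make_matrix(vocab_size, embed_dim, rng),   # embedding table
        "Wf": make_matrix(hidden, embed_dim, rng),      # forward input weights
        "Uf": make_matrix(hidden, hidden, rng),         # forward recurrent weights
        "Wb": make_matrix(hidden, embed_dim, rng),      # backward input weights
        "Ub": make_matrix(hidden, hidden, rng),         # backward recurrent weights
        "q": [rng.uniform(-0.5, 0.5) for _ in range(2 * hidden)],  # scoring vector
    }


def concept_vector(tokens, vocab, params):
    # Part 1: look up a word embedding for each input token.
    emb = [params["E"][vocab[t]] for t in tokens]
    # Part 2: run the sequence in both directions and concatenate the states,
    # so each state vector reflects left and right context.
    hidden = params["hidden"]
    fwd = rnn_pass(emb, params["Wf"], params["Uf"], hidden)
    bwd = rnn_pass(emb[::-1], params["Wb"], params["Ub"], hidden)[::-1]
    states = [f + b for f, b in zip(fwd, bwd)]
    # Part 3: score each state (assumed: dot product with q, then softmax)
    # and form the concept vector as the weighted sum of the state vectors.
    scores = [sum(q * s for q, s in zip(params["q"], st)) for st in states]
    weights = softmax(scores)
    dim = len(states[0])
    concept = [sum(w * st[i] for w, st in zip(weights, states)) for i in range(dim)]
    return weights, concept
```

Calling `concept_vector(["cheap", "flights", "to", "seattle"], vocab, init_params(len(vocab)))` with a four-word `vocab` mapping returns one weight per token and a concept vector of length `2 * hidden`. With untrained weights the importance values are arbitrary; the sketch only shows the data flow.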
Publication No.: US2017124447(A1) Publication Date: 2017.05.04
Application No.: US201514926617 Filing Date: 2015.10.29
Applicant: Microsoft Technology Licensing, LLC Inventors: Chang Keng-hao; Zhang Ruofei; Zhai Shuangfei
Classification: G06N3/04; G06N3/08 Main Classification: G06N3/04
Agency: Agent:
Principal Claim: 1. A system for identifying at least one content item, implemented by a processing engine that includes one or more computing devices, comprising: a user interface component configured to receive an input linguistic item from a user computing device, the input linguistic item having a set of input tokens; an interpretation component configured to interpret the input linguistic item using a semantic transformation component that is implemented as a deep-structured neural network having three parts, the first part of the deep-structured neural network being configured to generate word embeddings associated with the respective input tokens; the second part of the deep-structured neural network being configured to generate state vectors based on the respective word embeddings, the state vectors reflecting respective contexts of the input tokens within the input linguistic item; and the third part of the deep-structured neural network providing a noise-identification mechanism that is configured to generate probability information based on the state vectors, the probability information specifying a relative importance measure associated with each input token that conveys an extent to which that input token contributes to an expression of an underlying meaning associated with the input linguistic item; and a response-generating component configured to generate at least one output result item based, at least in part, on the probability information, said at least one output result item identifying at least one content item that is relevant to the input linguistic item, the user interface component being further configured to provide said at least one output result item to the user computing device.
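The claim does not say how the response-generating component selects content items. One plausible reading, offered here only as a hedged sketch, is to compare a vector derived from the input linguistic item against precomputed vectors for candidate content items. The names `cosine` and `rank_items`, the use of cosine similarity, and the `top_k` parameter are all assumptions for illustration and do not come from the patent text.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_items(query_vec, item_vecs, top_k=1):
    """Return the names of the top_k candidate items most similar to query_vec.

    item_vecs is a hypothetical mapping from item name to its vector.
    """
    scored = sorted(
        item_vecs.items(),
        key=lambda kv: cosine(query_vec, kv[1]),
        reverse=True,
    )
    return [name for name, _ in scored[:top_k]]
```

In this reading, the probability information influences the result indirectly: tokens with low importance contribute little to the query vector, so they have little effect on which content items rank highest.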
Address: Redmond, WA, US