Title of Invention: LANGUAGE MODEL TRAINED USING PREDICTED QUERIES FROM STATISTICAL MACHINE TRANSLATION
Abstract: A Statistical Machine Translation (SMT) model is trained using pairs of sentences that include content obtained from one or more content sources (e.g., feeds) with corresponding queries that have been used to access the content. A query click graph may be used to assist in determining candidate pairs for the SMT training data. All or a portion of the candidate pairs may be used to train the SMT model. After training the SMT model using the SMT training data, the SMT model is applied to content to determine predicted queries that may be used to search for the content. The predicted queries are used to train a language model, such as a query language model. The query language model may be interpolated with other language models, such as a background language model, as well as a feed language model trained using the content used in determining the predicted queries.
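The interpolation of the query, feed, and background language models mentioned in the abstract can be illustrated with a minimal sketch. It assumes plain unigram models and linear interpolation with fixed weights; the record does not specify the model order or the interpolation scheme, and the function names `train_unigram` and `interpolate` are hypothetical.

```python
from collections import Counter


def train_unigram(sentences):
    """Maximum-likelihood unigram model: word -> relative frequency."""
    counts = Counter(w for s in sentences for w in s.lower().split())
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}


def interpolate(models, weights):
    """Linear interpolation: P(w) = sum_i lambda_i * P_i(w).

    Assumption: the weights are fixed by hand and sum to 1. In practice
    they would normally be tuned on held-out data.
    """
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("interpolation weights must sum to 1")
    vocab = set().union(*models)  # union of the models' vocabularies
    return {
        w: sum(lam * m.get(w, 0.0) for lam, m in zip(weights, models))
        for w in vocab
    }


# Toy data, invented for illustration only.
query_lm = train_unigram(["seattle weather", "seattle news"])
feed_lm = train_unigram(["heavy rain expected in seattle today"])
background_lm = train_unigram(["the weather is nice", "news of the day"])

mixed = interpolate([query_lm, feed_lm, background_lm], [0.5, 0.3, 0.2])
```

Because each component model sums to 1 over its own vocabulary and the weights sum to 1, the interpolated distribution also sums to 1 over the combined vocabulary.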
Publication Number: US2014350931(A1) Publication Date: 2014.11.27
Application Number: US201313902470 Filing Date: 2013.05.24
Applicant: Microsoft Corporation Inventors: Levit Michael;Hakkani-Tur Dilek;Tur Gokhan
Classification: G10L15/06 Main Classification: G10L15/06
Agency: Agent:
Main Claim: 1. A method for training a language model, comprising: accessing a statistical machine translation (SMT) model trained using pairs that each include a sentence obtained from a content source and a query previously used to access content associated with the sentence; receiving content from a content source; applying the SMT model to the content to determine predicted queries; and training a language model using the predicted queries.
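The steps of the claim (train on sentence/query pairs, apply the model to new content, train a language model on the predicted queries) can be sketched end to end. This is only an illustration under strong assumptions: a simple word co-occurrence table stands in for the SMT model, which is not how a real SMT system works, and the names `train_toy_translator`, `predict_query`, and `train_bigram_counts` are hypothetical.

```python
from collections import Counter, defaultdict


def train_toy_translator(pairs):
    """Stand-in for SMT training (assumption, not a real SMT model).

    pairs: iterable of (sentence, query). For every sentence word, count
    the query words it co-occurred with.
    """
    cooc = defaultdict(Counter)
    for sentence, query in pairs:
        query_words = query.lower().split()
        for s_word in sentence.lower().split():
            cooc[s_word].update(query_words)
    return cooc


def predict_query(cooc, sentence, max_words=3):
    """Predict a query for new content by pooling co-occurrence votes."""
    votes = Counter()
    for s_word in sentence.lower().split():
        votes.update(cooc.get(s_word, {}))
    # Highest vote first; ties broken alphabetically for determinism.
    ranked = sorted(votes.items(), key=lambda kv: (-kv[1], kv[0]))
    return " ".join(word for word, _ in ranked[:max_words])


def train_bigram_counts(queries):
    """Collect bigram counts over the predicted queries (the query LM)."""
    counts = Counter()
    for q in queries:
        tokens = ["<s>"] + q.split() + ["</s>"]
        counts.update(zip(tokens, tokens[1:]))
    return counts


# Toy training pairs, invented for illustration only.
pairs = [
    ("Heavy rain expected in Seattle today", "seattle weather"),
    ("Seattle team wins the final", "seattle sports"),
]
cooc = train_toy_translator(pairs)

new_content = ["rain in seattle"]
predicted = [predict_query(cooc, s, max_words=2) for s in new_content]
query_lm_counts = train_bigram_counts(predicted)
```

The point of the sketch is the data flow: the predicted queries, not the original content, are the training text for the query language model.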
Address: Redmond WA US