Consistent phrase relevance measures, Application No. US201213609257 - 传众专利搜索

Title of invention	Consistent phrase relevance measures
Abstract	Two methods for measuring keyword-document relevance are described. Each method receives a keyword and a document as input and outputs a probability that the keyword is relevant to the document. The first method is a similarity-based approach, which applies techniques for measuring similarity between two short text segments to measure relevance between the keyword and the document. The second method is a regression-based approach, which rests on the assumption that if an out-of-document phrase (the keyword) is semantically similar to an in-document phrase, then the relevance scores of the in-document and out-of-document phrases should be close to each other.
Publication No.	US8996515(B2)	Publication date	2015.03.31
Application No.	US201213609257	Filing date	2012.09.11
Applicant	Microsoft Corporation	Inventors	Yih Wen-tau;Meek Christopher A.
Classification	G06F7/00;G06F17/30;G06Q30/02	Main classification	G06F7/00
Agency		Agent	Swain Sandy;Yee Judy;Minhas Micky
Main claim	1. One or more computer readable media, not comprising a signal, storing information to enable a computing device to perform a process of predicting a probability that an input out-of-document phrase that is not in a document is relevant to the document, the process comprising: applying an in-document phrase relevance measure to the target document to get a list of in-document keywords in the document and respective associated probabilities of relevance, to the document, of the in-document keywords; representing each in-document keyword as a respective term vector, each term vector computed by expansion of its corresponding in-document keyword, wherein each in-document keyword has a respective term vector and probability; computing a term vector for the out-of-document phrase by performing term expansion on the out-of-document phrase, terms of the term vector for the out-of-document phrase having respective weights; and using a regression model to predict the probability of relevance, to the document, of the out-of-document phrase, wherein the regression model uses the term vectors and probabilities of the in-document keywords, respectively, and uses the term vector of the out-of-document phrase to predict the probability of relevance of the out-of-document phrase, wherein the probability of relevance of the out-of-document phrase is consistent with the probabilities of the in-document keywords.
Address	Redmond WA US
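
Illustrative sketch (not part of the patent record). The process in the main claim can be pictured roughly as follows, under several loud assumptions: the in-document keywords and their probabilities are taken as given, the expansion table `TOY_EXPANSIONS` and the helper names `expand`, `cosine`, and `predict_relevance` are hypothetical, and the regression model is a plain similarity-weighted average (a simple kernel regression) swapped in for illustration, since the record does not say which regression model is used.

```python
import math

# Hypothetical term-expansion table: phrase -> weighted related terms.
# A real system would derive these from some expansion source; the
# weights here are invented purely for illustration.
TOY_EXPANSIONS = {
    "digital camera": {"camera": 1.0, "digital": 0.8, "photo": 0.6, "lens": 0.4},
    "memory card": {"memory": 1.0, "card": 0.9, "storage": 0.5, "photo": 0.2},
    "dslr": {"dslr": 1.0, "camera": 0.9, "lens": 0.7, "photo": 0.5},
    "car insurance": {"car": 1.0, "insurance": 1.0, "policy": 0.5},
}


def expand(phrase):
    """Return a weighted term vector for a phrase (assumed expansion step)."""
    fallback = {word: 1.0 for word in phrase.split()}
    return dict(TOY_EXPANSIONS.get(phrase, fallback))


def cosine(u, v):
    """Cosine similarity between two sparse term vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def predict_relevance(out_phrase, in_doc_keywords, prior=0.0):
    """Predict relevance of an out-of-document phrase.

    in_doc_keywords maps each in-document keyword to its probability of
    relevance. The prediction is the similarity-weighted average of those
    probabilities, so a phrase close to an in-document keyword receives a
    score close to that keyword's score. `prior` is returned when the
    phrase shares no terms with any in-document keyword.
    """
    query_vector = expand(out_phrase)
    weighted_sum = 0.0
    total_similarity = 0.0
    for keyword, probability in in_doc_keywords.items():
        similarity = cosine(query_vector, expand(keyword))
        weighted_sum += similarity * probability
        total_similarity += similarity
    if total_similarity == 0.0:
        return prior
    return weighted_sum / total_similarity


# Assumed output of an in-document relevance measure for one document.
in_doc = {"digital camera": 0.9, "memory card": 0.4}
print(predict_relevance("dslr", in_doc))
print(predict_relevance("car insurance", in_doc))
```

In this toy setup, "dslr" shares most of its expanded terms with "digital camera", so its predicted score lands near that keyword's score, while "car insurance" shares no terms with either keyword and falls back to the prior. That is the consistency property the claim describes, shown here only for a weighted-average model.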