Title of Invention: Using core words to extract key phrases from documents
Abstract: Described is a technology, such as for use in information retrieval, by which key phrases (e.g., the phrases most relevant to a document) are extracted from a document based upon core words in that document (e.g., the words most relevant to the document). Various relevance features of each candidate word may be used to score and rank the candidate words relative to one another and thereby determine the core word or core words. The core word or words may be used to filter a document's phrases into candidate phrases, and/or to determine core word feature values associated with each candidate phrase. The features of each candidate phrase, one or more of which may be based on the presence or absence of core words in the candidate phrase, are used to rank the candidate phrases, with the top-ranked candidate phrases being the key phrases associated with the document.
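The pipeline outlined in the abstract (score candidate words, pick core words, filter phrases by core words, rank the surviving phrases) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the patent: term frequency stands in for the unspecified "relevance features", the phrase score is simply frequency plus the number of core words in the phrase, and the names `core_words`, `candidate_phrases`, `key_phrases`, and `STOPWORDS` are hypothetical.

```python
from collections import Counter

# Hypothetical stopword list; a real system would use a fuller one.
STOPWORDS = {"the", "a", "an", "of", "in", "and", "to", "is", "for", "from", "on"}


def tokenize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    tokens = (w.strip(".,;:()").lower() for w in text.split())
    return [t for t in tokens if t]


def core_words(text, top_n=2):
    """Rank candidate words and return the top_n as core words.

    Assumption: term frequency is the only relevance feature used here.
    """
    counts = Counter(w for w in tokenize(text) if w not in STOPWORDS)
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, _ in ranked[:top_n]]


def candidate_phrases(text, max_len=3):
    """Count 2..max_len word n-grams that do not start or end with a stopword.

    Simplification: n-grams may span sentence boundaries.
    """
    tokens = tokenize(text)
    phrases = Counter()
    for n in range(2, max_len + 1):
        for i in range(len(tokens) - n + 1):
            gram = tokens[i:i + n]
            if gram[0] in STOPWORDS or gram[-1] in STOPWORDS:
                continue
            phrases[" ".join(gram)] += 1
    return phrases


def key_phrases(text, top_n_words=2, top_k=3):
    """Filter phrases by core-word presence, then rank them.

    Assumption: score = phrase frequency + number of core words it contains.
    """
    cores = set(core_words(text, top_n_words))
    scored = []
    for phrase, freq in candidate_phrases(text).items():
        n_core = sum(1 for w in phrase.split() if w in cores)
        if n_core == 0:
            continue  # core words act as a filter on candidate phrases
        scored.append((freq + n_core, phrase))
    scored.sort(key=lambda sp: (-sp[0], sp[1]))
    return [phrase for _, phrase in scored[:top_k]]


if __name__ == "__main__":
    doc = ("Solar panels convert sunlight. Solar panels need sunlight. "
           "Cheap solar panels sell well.")
    print(core_words(doc))
    print(key_phrases(doc))
```

In this sketch the core words serve both roles mentioned in the abstract: they filter out phrases that contain none of them, and the count of core words in a phrase contributes a feature value to its ranking score.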
Publication No.: US7895205(B2) Publication Date: 2011.02.22
Application No.: US20080041677 Filing Date: 2008.03.04
Applicant: MICROSOFT CORPORATION Inventors: QIN SHI; YUE PEI
Classification: G06F7/00; G06F17/30 Main Classification: G06F7/00
Agency: Agent:
Main Claim:
Address: