Title of invention: Generating snippets based on content features
Abstract: Systems, methods, and computer storage media having computer-executable instructions embodied thereon that facilitate generation of snippets. In embodiments, text features within a keyword-sentence window are identified. The text features are utilized to determine break features that indicate favorability of breaking at a particular location of the keyword-sentence window. The break features are used to recognize features of partial snippets such that a snippet score to indicate the strength of the partial snippet can be calculated. Snippet scores associated with partial snippets are compared to select an optimal snippet, that is, the snippet having the highest snippet score.
Publication number: US8788260(B2) Publication date: 2014.07.22
Application number: US201012777323 Filing date: 2010.05.11
Applicant: Microsoft Corporation Inventors: Nygaard Valerie Rose; Turchetto Riccardo; Chan Joanna Mun Yee; Biemann Christian; Ahn David Dongjah; Burbank Andrea Ryerson; Pan Feng; Converse Timothy McDonnell; Reinhold James Michael; King Tracy Holloway
Classification: G06F17/27 Main classification: G06F17/27
Agency: Agents: Ream Dave; Barker Doug; Minhas Micky
Main claim: 1. One or more computer media devices having computer-executable instructions embodied thereon, that when executed, cause a computing device to perform a method for facilitating generation of snippets provided in association with search results, the method comprising: referencing a keyword-sentence window comprising a sequence of tokens including one or more keywords that match one or more query terms; identifying a part-of-speech for one or more tokens within the keyword-sentence window; utilizing the part-of-speech corresponding with each of the one or more tokens to identify one or more text features associated with a span including two consecutive tokens, wherein at least one text feature comprises a bigram type that is a sequence of two parts-of-speech identifiers that correspond with the span of the two consecutive tokens, the one or more text features being used to generate at least one breaking indicator for at least one token that indicates an extent to which it is favorable to break the keyword-sentence window following the corresponding token, wherein the extent to which it is favorable to break the keyword-sentence window following the corresponding token is represented using a scale or rating technique; generating a plurality of partial snippets comprising portions of the keyword-sentence window; for each partial snippet, identifying a snippet feature that indicates a relative strength of truncating the keyword-sentence window in accordance with the corresponding partial snippet, wherein the snippet feature comprises a sum of breaking indicators associated with the partial snippet that each indicate an extent to which it is favorable to break the partial snippet at the corresponding break; and using the snippet features to select a partial snippet from the plurality of partial snippets for display in association with a search result.
Address: Redmond WA US
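
Illustrative sketch (not part of the patent record): the method recited in the main claim can be approximated in a few lines of Python. Everything concrete here is an assumption made for illustration: the part-of-speech tags are supplied by hand rather than by a tagger, the 0 to 10 rating table `BIGRAM_BREAK_SCORE`, the `DEFAULT_SCORE` and `EDGE_SCORE` values, the `max_tokens` length limit, and the function names are all hypothetical. The sketch only shows the shape of the idea: rate each possible break by its part-of-speech bigram type, enumerate partial snippets that contain the keywords, sum the breaking indicators at each snippet's two boundaries, and keep the highest-scoring snippet.

```python
# Hypothetical ratings on a 0-10 scale, keyed by bigram type:
# (POS of the token before the break, POS of the token after it).
BIGRAM_BREAK_SCORE = {
    ("NOUN", "CONJ"): 8,
    ("NOUN", "PREP"): 7,
    ("NOUN", "VERB"): 6,
    ("VERB", "DET"): 3,
    ("ADJ", "NOUN"): 1,
    ("PREP", "DET"): 1,
    ("DET", "NOUN"): 0,
    ("DET", "ADJ"): 0,
}
DEFAULT_SCORE = 5  # assumed rating for bigram types missing from the table
EDGE_SCORE = 10    # assumed rating for the natural edges of the window


def breaking_indicators(pos_tags):
    """Return a list where entry i rates breaking the window after token i."""
    scores = [
        BIGRAM_BREAK_SCORE.get(bigram, DEFAULT_SCORE)
        for bigram in zip(pos_tags, pos_tags[1:])
    ]
    return scores + [EDGE_SCORE]  # the last token ends the window


def select_snippet(tokens, pos_tags, keyword_indexes, max_tokens):
    """Pick the partial snippet whose summed breaking indicators are highest.

    Candidates are contiguous spans of at most max_tokens tokens that
    contain every keyword. Returns (snippet_text, score), or None when no
    span satisfies the length limit.
    """
    indicators = breaking_indicators(pos_tags)
    first_kw, last_kw = min(keyword_indexes), max(keyword_indexes)
    best = None
    for start in range(first_kw + 1):
        for end in range(last_kw + 1, len(tokens) + 1):
            if end - start > max_tokens:
                continue
            # Left break follows token start-1; right break follows token end-1.
            left = EDGE_SCORE if start == 0 else indicators[start - 1]
            right = indicators[end - 1]
            score = left + right  # snippet feature: sum of breaking indicators
            if best is None or score > best[0]:
                best = (score, start, end)
    if best is None:
        return None
    score, start, end = best
    return " ".join(tokens[start:end]), score


TOKENS = ["the", "search", "engine", "shows", "a",
          "short", "snippet", "and", "a", "link"]
POS_TAGS = ["DET", "NOUN", "NOUN", "VERB", "DET",
            "ADJ", "NOUN", "CONJ", "DET", "NOUN"]

if __name__ == "__main__":
    # The keyword "snippet" sits at index 6; limit snippets to 5 tokens.
    print(select_snippet(TOKENS, POS_TAGS, [6], 5))
    # prints ('shows a short snippet', 14)
```

With these assumed ratings, the winning span starts after a noun-verb boundary (rated 6) and ends before a conjunction (rated 8), so it avoids cutting between a determiner or adjective and its noun. A real system would derive the ratings from data and combine them with further snippet features, which this sketch omits.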