Title: Generating n-gram clusters associated with events
Abstract: Methods, systems, and apparatus, including computer programs, for receiving a set of content items, each content item including a time reference and digital content that is distributed using one or more computer-implemented services; for each content item: determining an event time based on a respective time reference, identifying one or more n-grams based on text of the content item, associating each of the one or more n-grams with the event time, and including the one or more n-grams in a superset of n-grams, the superset of n-grams including n-grams provided from one or more of the content items in the set of content items; generating one or more n-gram clusters based on the superset of n-grams, each n-gram cluster providing a description of an event and including at least one n-gram and an associated event time; and storing each of the one or more n-gram clusters in a cluster database.
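The per-item steps in the abstract (identify n-grams, associate them with an event time, pool them into a superset, then cluster) can be sketched roughly as follows. This is an illustration only, not the patented method: the record does not say how clusters are formed, so the rule used here (keep n-grams that occur in at least `min_items` content items sharing the same event date) is an assumption, and the names `ngrams`, `build_clusters`, and `min_items` are hypothetical.

```python
from collections import defaultdict
from datetime import date


def ngrams(text, n_max=2):
    """Return all word n-grams of length 1..n_max from the lowercased text."""
    words = text.lower().split()
    return [
        " ".join(words[i:i + n])
        for n in range(1, n_max + 1)
        for i in range(len(words) - n + 1)
    ]


def build_clusters(items, min_items=2):
    """items: iterable of (text, event_date). Returns {event_date: set of n-grams}."""
    # Superset: every (n-gram, event time) pair, with the items that supplied it.
    support = defaultdict(set)
    for idx, (text, event_date) in enumerate(items):
        for gram in set(ngrams(text)):
            support[(gram, event_date)].add(idx)

    # Assumed clustering rule (not from the source): keep n-grams seen in
    # at least `min_items` items and group them by their shared event date.
    clusters = defaultdict(set)
    for (gram, event_date), idxs in support.items():
        if len(idxs) >= min_items:
            clusters[event_date].add(gram)
    return dict(clusters)


items = [
    ("concert tonight downtown", date(2012, 6, 28)),
    ("big concert tonight", date(2012, 6, 28)),
    ("farmers market saturday", date(2012, 6, 30)),
]
clusters = build_clusters(items)
```

In this toy input only the two items dated 2012-06-28 share n-grams, so a single cluster results; a real system would presumably use a more robust similarity measure and a persistent cluster database.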
Publication number: US9152692(B2) | Publication date: 2015.10.06
Application number: US201213536314 | Filing date: 2012.06.28
Applicant: Google Inc. | Inventors: Schueppert Michael Jens; Thakur Kumar Mayur
Classification: G06F17/30 | Main classification: G06F17/30
Agency: Fish & Richardson P.C. | Agent: Fish & Richardson P.C.
Principal claim: 1. A computer-implemented method executed using one or more processors, the method comprising: receiving a set of content items, each content item in the set of content items including a time reference and comprising digital content that is distributed using one or more computer-implemented services; for each content item in the set of content items: identifying i) a timestamp associated with a distribution of the content item and ii) a time reference indicated within the content item; based on i) the timestamp associated with the distribution of the content item and ii) the time reference indicated within the content item, determining an event time; identifying one or more n-grams based on text of the content item, associating each of the one or more n-grams with the event time, and including the one or more n-grams in a superset of n-grams, the superset of n-grams comprising n-grams provided from one or more of the content items in the set of content items; generating one or more n-gram clusters based on the superset of n-grams, each n-gram cluster providing a description of an event and comprising at least one n-gram and an associated event time; storing each of the one or more n-gram clusters in a cluster database; receiving a search query at an input time; determining i) that the search query is included in a first n-gram cluster of the one or more n-gram clusters, and ii) that the input time is within a threshold time period with respect to the associated event time of the first n-gram cluster, and in response to the determining, providing a first set of search results associated with one or more content items related to the first n-gram cluster; and determining i) that the search query is included in a second n-gram cluster of the one or more n-gram clusters and ii) that the input time is not within a threshold time period with respect to the associated event time of the second n-gram cluster, and in response to the determining, providing a second set of search results responsive to the search query, the second set of search results differing from the first set of search results.
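Two steps of the claim lend themselves to a small illustration: deriving an event time from the distribution timestamp plus a time reference inside the content item, and choosing a result set depending on whether the query's input time falls within a threshold of a matching cluster's event time. The sketch below is a loose reading under stated assumptions, not the claimed implementation: the `RELATIVE_DAYS` table, the one-day default threshold, the fallback to the posting timestamp, and the function names `event_time` and `choose_results` are all hypothetical.

```python
from datetime import datetime, timedelta

# Hypothetical mapping from relative time words to day offsets.
RELATIVE_DAYS = {"today": 0, "tonight": 0, "tomorrow": 1, "yesterday": -1}


def event_time(posted_at, text):
    """Resolve a relative time reference in `text` against the posting timestamp."""
    for word in text.lower().split():
        word = word.strip(".,!?")
        if word in RELATIVE_DAYS:
            return posted_at + timedelta(days=RELATIVE_DAYS[word])
    # Assumption: with no recognizable reference, fall back to the timestamp.
    return posted_at


def choose_results(query, input_time, clusters, threshold=timedelta(days=1)):
    """Return "event" if the query is in a cluster whose event time is within
    `threshold` of `input_time`; otherwise return "general"."""
    for cluster in clusters:
        in_cluster = query.lower() in cluster["ngrams"]
        near_event = abs(input_time - cluster["event_time"]) <= threshold
        if in_cluster and near_event:
            return "event"
    return "general"


posted = datetime(2012, 6, 28, 9, 0)
when = event_time(posted, "Concert tomorrow downtown!")
clusters = [{"ngrams": {"concert", "concert tomorrow"}, "event_time": when}]

near = choose_results("concert", datetime(2012, 6, 29, 18, 0), clusters)
far = choose_results("concert", datetime(2012, 7, 15, 12, 0), clusters)
```

Here the same query yields the event-related result set when issued close to the event time and the general result set weeks later, which mirrors the first-cluster and second-cluster branches of the claim.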
Address: Mountain View, CA, US