Title of invention |
Generating n-gram clusters associated with events |
Abstract |
Methods, systems, and apparatus, including computer programs for receiving a set of content items, each content item including a time reference and digital content that is distributed using one or more computer-implemented services, for each content item: determining an event time based on a respective time reference, identifying one or more n-grams based on text of the content item, associating each of the one or more n-grams with the event time, and including the one or more n-grams in a superset of n-grams, the superset of n-grams including n-grams provided from one or more of the content items in the set of content items, generating one or more n-gram clusters based on the superset of n-grams, each n-gram cluster providing a description of an event and including at least one n-gram and an associated event time and storing each of the one or more n-gram clusters in a cluster database. |
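The pipeline in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the record does not say how n-grams are extracted or clustered, so the whitespace tokenization, the `n_max` and `min_count` parameters, the dictionary layout of a content item, and the rule "group n-grams that share an event time and appear in at least `min_count` items" are all assumptions made for illustration. The event time is assumed to be already resolved and stored under a hypothetical `time_reference` key.

```python
from collections import defaultdict
from datetime import datetime


def ngrams(text, n_max=2):
    """Return all word n-grams of length 1..n_max (assumed tokenization: whitespace)."""
    words = text.lower().split()
    return [
        " ".join(words[i:i + n])
        for n in range(1, n_max + 1)
        for i in range(len(words) - n + 1)
    ]


def build_clusters(content_items, n_max=2, min_count=2):
    """Build n-gram clusters keyed by event time.

    Assumed clustering rule (not from the source): n-grams that share an
    event time and occur in at least `min_count` content items form one cluster.
    """
    # superset[event_time][ngram] -> number of content items containing it
    superset = defaultdict(lambda: defaultdict(int))
    for item in content_items:
        event_time = item["time_reference"]  # assumed to be already resolved
        for gram in set(ngrams(item["text"], n_max)):
            superset[event_time][gram] += 1

    clusters = []
    for event_time, gram_counts in superset.items():
        grams = sorted(g for g, c in gram_counts.items() if c >= min_count)
        if grams:
            clusters.append({"event_time": event_time, "ngrams": grams})
    return clusters


# Hypothetical content items; a plain list stands in for the cluster database.
items = [
    {"text": "Jazz festival tonight", "time_reference": datetime(2012, 6, 28)},
    {"text": "jazz festival lineup", "time_reference": datetime(2012, 6, 28)},
    {"text": "road closure", "time_reference": datetime(2012, 7, 1)},
]
cluster_db = build_clusters(items)
```

With these three items, only the two posts that share an event time contribute n-grams that pass the `min_count` filter, so a single cluster is stored.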
Publication number |
US9152692(B2) |
Publication date |
2015.10.06 |
Application number |
US201213536314 |
Filing date |
2012.06.28 |
Applicant |
Google Inc. |
Inventors |
Schueppert Michael Jens;Thakur Kumar Mayur |
Classification |
G06F17/30 |
Main classification |
G06F17/30 |
Agency |
Fish & Richardson P.C. |
Agent |
Fish & Richardson P.C. |
Independent claim |
1. A computer-implemented method executed using one or more processors, the method comprising:
receiving a set of content items, each content item in the set of content items including a time reference and comprising digital content that is distributed using one or more computer-implemented services;
for each content item in the set of content items:
identifying i) a timestamp associated with a distribution of the content item and ii) a time reference indicated within the content item;
based on i) the timestamp associated with the distribution of the content item and ii) the time reference indicated within the content item, determining an event time;
identifying one or more n-grams based on text of the content item,
associating each of the one or more n-grams with the event time, and
including the one or more n-grams in a superset of n-grams, the superset of n-grams comprising n-grams provided from one or more of the content items in the set of content items;
generating one or more n-gram clusters based on the superset of n-grams, each n-gram cluster providing a description of an event and comprising at least one n-gram and an associated event time;
storing each of the one or more n-gram clusters in a cluster database;
receiving a search query at an input time;
determining i) that the search query is included in a first n-gram cluster of the one or more n-gram clusters, and ii) that the input time is within a threshold time period with respect to the associated event time of the first n-gram cluster, and in response to the determining, providing a first set of search results associated with one or more content items related to the first n-gram cluster; and
determining i) that the search query is included in a second n-gram cluster of the one or more n-gram clusters and ii) that the input time is not within a threshold time period with respect to the associated event time of the second n-gram cluster, and in response to the determining, providing a second set of search results responsive to the search query, the second set of search results differing from the first set of search results. |
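Two steps of the claim lend themselves to a short Python sketch: deriving an event time from the distribution timestamp plus the time reference inside the content, and choosing between the two result sets at query time. The claim does not specify how either is computed, so everything concrete here is an assumption: the `RELATIVE_OFFSETS` table of relative day words, the fallback to the posting day for unrecognized references, the two-day default threshold, the `content_ids` field, and the `"event"` / `"generic"` labels returned by the hypothetical `handle_query` function.

```python
from datetime import datetime, timedelta

# Assumed mapping from in-text relative references to day offsets.
RELATIVE_OFFSETS = {"today": 0, "tonight": 0, "tomorrow": 1, "yesterday": -1}


def determine_event_time(timestamp, time_reference):
    """Combine the distribution timestamp with the in-text time reference.

    Unknown references fall back to the posting day (an assumption).
    The result is truncated to midnight of the resolved day.
    """
    days = RELATIVE_OFFSETS.get(time_reference.lower(), 0)
    resolved = timestamp + timedelta(days=days)
    return resolved.replace(hour=0, minute=0, second=0, microsecond=0)


def handle_query(query, input_time, clusters, threshold=timedelta(days=2)):
    """Pick a result mode for a query received at `input_time`.

    Returns ("event", content_ids) when the query is in a cluster whose
    event time is within `threshold` of the input time; otherwise returns
    ("generic", []) to stand in for ordinary search results.
    """
    needle = query.lower()
    for cluster in clusters:
        if needle in cluster["ngrams"]:
            if abs(input_time - cluster["event_time"]) <= threshold:
                return ("event", cluster["content_ids"])
            return ("generic", [])
    return ("generic", [])


# Hypothetical post distributed on 27 June that says "tomorrow".
posted = datetime(2012, 6, 27, 18, 30)
event_time = determine_event_time(posted, "tomorrow")
cluster = {
    "ngrams": {"jazz festival"},
    "event_time": event_time,
    "content_ids": ["post-1", "post-2"],
}
near = handle_query("Jazz Festival", datetime(2012, 6, 28, 9, 0), [cluster])
far = handle_query("Jazz Festival", datetime(2012, 8, 1), [cluster])
```

The same query yields event-related results when issued close to the event time and ordinary results a month later, which mirrors the two determining branches of the claim.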
Address |
Mountain View CA US |