Title of invention: Behavioral word segmentation for use in processing search queries
Abstract: Substrings within strings, such as words within words, are identified based at least in part on recorded behavior of users that have submitted the strings or substrings as search queries. The behavior may relate to actions taken by the users after submitting the search queries. The actions may be actions taken in connection with an electronic marketplace, such as actions related to the consumption of items offered in the electronic marketplace. The identified strings and corresponding substrings are used in connection with processing search queries. The strings and substrings may be used to update a search index and/or to modify received search queries for processing.
Publication number: US8825620(B1) Publication date: 2014.09.02
Application number: US201113159292 Filing date: 2011.06.13
Applicant: A9.com, Inc. Inventors: Fliedner Gerard; Rose Daniel E.; Evans David Kirk
Classification: G06F17/30 Main classification: G06F17/30
Agency: Novak Druce Connolly Bove + Quigg LLP Agent: Novak Druce Connolly Bove + Quigg LLP
Principal claim: 1. A computer-implemented method of processing search queries, comprising: under control of one or more computer systems configured with executable instructions, obtaining, using at least one computing device, behavioral information associated with a plurality of previously-submitted queries, the behavioral information associated with each previously-submitted query indicative of one or more actions taken by one or more of the corresponding searchers in connection with the previously-submitted query; identifying, from the obtained previously-submitted queries, a set of candidate pairs, each candidate pair including a first query and a second query, the first query including a set of separated words and the second query including a single word composed of a connected combination of at least a subset of the set of separated words, wherein the subset includes at least two words; refining, using at least one computing device, the set of candidate pairs by, for each member pair of at least a subset of the set of candidate pairs, at least: obtaining first search results corresponding to the first query of the member pair; obtaining second search results corresponding to the second query of the member pair; based at least in part on the first search results, the second search results, the obtained behavioral information associated with the first query of the member pair, and obtained behavioral information associated with the second query of the member pair, removing the member pair from the set of candidate pairs; updating, based at least in part on the refined set of candidate pairs, a segmentation database that includes a plurality of member pairs, wherein each member pair includes a first member comprising a set of separated words and a second member comprising a single word composed of a connected combination of at least a subset of the set of separated words of the first member; upon receiving a search query, comparing the search query against the plurality of member pairs in the segmentation database; upon identifying a corresponding member pair for the search query in the segmentation database, substituting the search query with the corresponding member pair; and processing the search query using the corresponding member pair.
Address: Palo Alto CA US
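
The steps recited in the principal claim (identify candidate pairs, refine them, build a segmentation database, substitute incoming queries) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and not the patented implementation: the in-memory `log` structure, the function names, joining all words of a query rather than a subset, and in particular the removal criterion (Jaccard overlap with a 0.5 threshold), which the claim leaves unspecified.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def candidate_pairs(log):
    """Pair each multi-word query with the logged query formed by joining its words.

    Simplification: all words are joined, although the claim also allows
    joining only a subset of at least two words.
    """
    pairs = set()
    for query in log:
        words = query.split()
        joined = "".join(words)
        if len(words) >= 2 and joined in log:
            pairs.add((query, joined))
    return pairs


def refine(pairs, log, threshold=0.5):
    """Drop pairs whose search results and user actions both overlap too little.

    The overlap measure and threshold are hypothetical choices.
    """
    kept = set()
    for separated, joined in pairs:
        first, second = log[separated], log[joined]
        results_sim = jaccard(first["results"], second["results"])
        actions_sim = jaccard(first["actions"], second["actions"])
        if max(results_sim, actions_sim) >= threshold:
            kept.add((separated, joined))
    return kept


def expand_query(query, segmentation_db):
    """Return the matching member pair, or the query alone if none matches."""
    for pair in segmentation_db:
        if query in pair:
            return pair
    return (query,)


# Hypothetical query log: items returned, and items users acted on afterwards.
log = {
    "key board": {"results": {"k1", "k2"}, "actions": {"k1"}},
    "keyboard": {"results": {"k1", "k2", "k3"}, "actions": {"k1"}},
    "car pet": {"results": {"c1", "p1"}, "actions": {"p1"}},
    "carpet": {"results": {"r1", "r2"}, "actions": {"r1"}},
}

segmentation_db = refine(candidate_pairs(log), log)
```

With this sample log, "key board"/"keyboard" survives refinement because searchers saw and acted on the same items, while "car pet"/"carpet" is removed because neither results nor actions overlap. A query of "keyboard" is then processed using both forms of the pair.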