Title: SPELL CORRECTION WITH HIDDEN MARKOV MODELS ON ONLINE SOCIAL NETWORKS
Abstract: In one embodiment, a method includes receiving a search query including one or more n-grams, where the n-grams include one or more misspelled n-grams, identifying one or more variant-tokens for each misspelled n-gram, calculating a feature value for each identified variant-token based at least on the identified variant-token, the misspelled n-gram, and one or more variant-tokens corresponding to one or more n-grams preceding the misspelled n-gram, generating one or more unique combinations of the n-grams and variant-tokens, calculating a sequence-score for each unique combination based at least in part on the calculated feature values of the variant-tokens of the unique combination, generating one or more corrected queries, where each corrected query includes a unique combination having a sequence-score greater than a threshold sequence-score, and sending one or more of the corrected queries to a user for display.
Publication No.: US2016299883(A1)  Publication Date: 2016.10.13
Application No.: US201514684137  Filing Date: 2015.04.10
Applicant: Facebook, Inc.  Inventors: Zhu Hongcheng; Bernhardt Daniel
Classification: G06F17/27; H04L29/06  Main Classification: G06F17/27
Agency:  Agent:
Main Claim: 1. A method comprising, by one or more computing devices: receiving, from a client system of a user of an online social network, a search query comprising one or more n-grams, wherein the n-grams comprise one or more misspelled n-grams; identifying, for each misspelled n-gram, one or more variant-tokens; calculating, for each identified variant-token of a misspelled n-gram, a feature value based at least on the identified variant-token, the misspelled n-gram, and one or more variant-tokens corresponding to one or more n-grams preceding the misspelled n-gram; generating one or more unique combinations of the n-grams and variant-tokens, wherein each unique combination comprises a variant-token corresponding to each misspelled n-gram; calculating a sequence-score for each unique combination based at least in part on the calculated feature values of the variant-tokens of the unique combination; generating one or more corrected queries, each corrected query comprising a unique combination having a sequence-score greater than a threshold sequence-score; and sending, to the client system of the user for display in response to receiving the search query, one or more of the corrected queries.
Address: Menlo Park, CA, US
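
Illustrative sketch (not part of the patent record): the steps recited in claim 1 can be approximated in a few lines of Python. Everything named here is an assumption made for illustration: the toy vocabulary `VOCAB`, the bigram table `BIGRAM_PROB`, the fallback `UNSEEN_PROB`, the use of `difflib` string similarity as a stand-in for an emission probability, and the helper names `variant_tokens`, `feature_value`, and `corrected_queries`. The record does not say how the feature values are actually computed, so this is only one plausible reading in the spirit of a hidden Markov model: each feature value combines a similarity term (variant-token versus misspelled n-gram) with a transition term (variant-token given the preceding variant-token).

```python
import itertools
import math
from difflib import SequenceMatcher, get_close_matches

# Hypothetical toy data; a real system would learn these from query logs.
VOCAB = sorted({"hidden", "markov", "market", "model", "models", "modal"})
BIGRAM_PROB = {
    ("<s>", "hidden"): 0.5,
    ("hidden", "markov"): 0.9,
    ("markov", "model"): 0.6,
    ("markov", "models"): 0.3,
}
UNSEEN_PROB = 0.01  # assumed fallback for bigrams missing from the table


def variant_tokens(ngram, max_variants=3, cutoff=0.6):
    """Return candidate variant-tokens; in-vocabulary n-grams are kept as-is."""
    if ngram in VOCAB:
        return [ngram]
    return get_close_matches(ngram, VOCAB, n=max_variants, cutoff=cutoff) or [ngram]


def feature_value(variant, ngram, previous):
    """Log-score from string similarity (emission-like) and bigram (transition-like)."""
    emission = SequenceMatcher(None, ngram, variant).ratio()
    transition = BIGRAM_PROB.get((previous, variant), UNSEEN_PROB)
    return math.log(max(emission, 1e-9)) + math.log(transition)


def corrected_queries(query, threshold):
    """Score every unique combination and keep those above the threshold."""
    ngrams = query.lower().split()
    if not ngrams:
        return []
    candidates = [variant_tokens(g) for g in ngrams]
    results = []
    for combo in itertools.product(*candidates):
        score = 0.0
        previous = "<s>"  # start-of-query marker
        for variant, ngram in zip(combo, ngrams):
            score += feature_value(variant, ngram, previous)
            previous = variant
        if score > threshold:
            results.append((" ".join(combo), score))
    # Highest sequence-score first.
    return sorted(results, key=lambda r: r[1], reverse=True)


if __name__ == "__main__":
    for text, score in corrected_queries("hiden markv modle", threshold=-3.0):
        print(f"{score:7.3f}  {text}")
```

Enumerating every combination mirrors the wording of the claim but grows exponentially with query length; a Viterbi-style dynamic program is the usual way to decode a hidden Markov model efficiently, and could replace the `itertools.product` loop if only the best few sequences are needed.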