Title of Invention |
SPELL CORRECTION WITH HIDDEN MARKOV MODELS ON ONLINE SOCIAL NETWORKS |
Abstract |
In one embodiment, a method includes receiving a search query including one or more n-grams, where the n-grams include one or more misspelled n-grams, identifying one or more variant-tokens for each misspelled n-gram, calculating a feature value for each identified variant-token based at least on the identified variant-token, the misspelled n-gram, and one or more variant-tokens corresponding to one or more n-grams preceding the misspelled n-gram, generating one or more unique combinations of the n-grams and variant-tokens, calculating a sequence-score for each unique combination based at least in part on the calculated feature values of the variant-tokens of the unique combination, generating one or more corrected queries, where each corrected query includes a unique combination having a sequence-score greater than a threshold sequence-score, and sending one or more of the corrected queries to a user for display. |
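The variant-token identification step in the abstract can be illustrated with a minimal sketch. It assumes that candidates are drawn from a fixed vocabulary and ranked by Levenshtein edit distance; the record itself does not say how variant-tokens are found. The vocabulary, the `max_distance` cutoff, and the function names are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a two-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


# Hypothetical vocabulary, used only for illustration.
VOCABULARY = {"friends", "fiends", "photos", "photo", "of", "my"}


def variant_tokens(ngram, vocabulary=VOCABULARY, max_distance=2):
    """Return vocabulary words within max_distance edits of ngram.

    An n-gram that is already in the vocabulary is treated as correctly
    spelled and is returned unchanged as its only candidate.
    """
    if ngram in vocabulary:
        return [ngram]
    scored = sorted((edit_distance(ngram, word), word) for word in vocabulary)
    return [word for distance, word in scored if distance <= max_distance]
```

With this toy vocabulary, the misspelled n-gram `freinds` yields both `friends` and `fiends` as candidates, which is why the later scoring step needs context from the preceding n-grams to choose between them.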
Application Publication Number |
US2016299883(A1) |
Application Publication Date |
2016.10.13 |
Application Number |
US201514684137 |
Filing Date |
2015.04.10 |
Applicant |
Facebook, Inc. |
Inventors |
Zhu Hongcheng; Bernhardt Daniel |
Classification Number |
G06F17/27;H04L29/06 |
Main Classification Number |
G06F17/27 |
Agency |
|
Agent |
|
Principal Claim |
1. A method comprising, by one or more computing devices:
receiving, from a client system of a user of an online social network, a search query comprising one or more n-grams, wherein the n-grams comprise one or more misspelled n-grams;
identifying, for each misspelled n-gram, one or more variant-tokens;
calculating, for each identified variant-token of a misspelled n-gram, a feature value based at least on the identified variant-token, the misspelled n-gram, and one or more variant-tokens corresponding to one or more n-grams preceding the misspelled n-gram;
generating one or more unique combinations of the n-grams and variant-tokens, wherein each unique combination comprises a variant-token corresponding to each misspelled n-gram;
calculating a sequence-score for each unique combination based at least in part on the calculated feature values of the variant-tokens of the unique combination; and
generating one or more corrected queries, each corrected query comprising a unique combination having a sequence-score greater than a threshold sequence-score; and
sending, to the client system of the user for display in response to receiving the search query, one or more of the corrected queries. |
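The scoring steps of the claim can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the feature value is modeled, in the spirit of the hidden Markov model named in the title, as a log emission probability (typed n-gram given variant-token) plus a log transition probability (variant-token given the preceding one), and the sequence-score is the sum of feature values. All probability tables, default values, and function names are hypothetical.

```python
import math
from itertools import product

# Hypothetical illustration data; none of these values come from the patent.
OBSERVED = ["photos", "of", "my", "freinds"]          # query as typed
CANDIDATES = [["photos"], ["of"], ["my"], ["friends", "fiends"]]
EMISSION = {("friends", "freinds"): 0.6, ("fiends", "freinds"): 0.3}
TRANSITION = {("my", "friends"): 0.5, ("my", "fiends"): 0.01}
DEFAULT_TRANSITION = 0.1   # assumed smoothing value for unseen token pairs


def feature_value(variant, typed, previous):
    """Log-probability feature from the variant, the typed n-gram,
    and the variant chosen for the preceding n-gram."""
    if variant == typed:
        emission = 1.0
    else:
        emission = EMISSION.get((variant, typed), 1e-6)
    if previous is None:
        transition = 1.0   # no context for the first n-gram
    else:
        transition = TRANSITION.get((previous, variant), DEFAULT_TRANSITION)
    return math.log(emission) + math.log(transition)


def sequence_score(combination, observed):
    """Sum the feature values along one unique combination."""
    score, previous = 0.0, None
    for variant, typed in zip(combination, observed):
        score += feature_value(variant, typed, previous)
        previous = variant
    return score


def corrected_queries(observed, candidates, threshold):
    """Keep every combination scoring above the threshold, best first."""
    scored = []
    for combination in product(*candidates):
        score = sequence_score(combination, observed)
        if score > threshold:
            scored.append((score, " ".join(combination)))
    return [query for _, query in sorted(scored, reverse=True)]
```

Calling `corrected_queries(OBSERVED, CANDIDATES, math.log(1e-3))` keeps only `photos of my friends`, because the low assumed transition probability from `my` to `fiends` pushes that combination below the threshold. Enumerating every combination with `itertools.product` grows exponentially with the number of misspelled n-grams; a dynamic-programming decoder such as the Viterbi algorithm would be the usual alternative for longer queries.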
Address |
Menlo Park CA US |