Title of Invention MULTILINGUAL SEARCH FOR TRANSLITERATED CONTENT
Abstract The technique described herein enables a user to submit a search query in either a native script or its foreign-script (e.g., Roman-script) transliteration and returns relevant results in both scripts while handling the spelling variations found in transliterated forms. The technique crawls the World Wide Web for data in both native-script and foreign-script transliterated forms. It uses a transliteration engine to generate native-script equivalents of the foreign-script transliterated data and disambiguates the data in native script. The unique native-script word forms are then used to jointly index the data in both scripts. If the query is in native script, it is searched for directly in the index; otherwise, the transliterated query is first converted into its native-script form(s) and then searched for in the indexed database to retrieve and rank results in both scripts.
Publication Number WO2012149500(A2) Publication Date 2012.11.01
Application Number WO2012US35701 Filing Date 2012.04.28
Applicant MICROSOFT CORPORATION Inventors CHOUDHURY, MONOJIT;BALI, KALIKA;GUPTA, KANIKA;DATHA, NARENDRANATH
Classification G06F17/30;G06F17/28 Main Classification G06F17/30
Agency Agent
Main Claim
Address
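
The joint-indexing flow described in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and is not taken from the patent: the toy lookup table `ROMAN_TO_NATIVE` stands in for the transliteration engine, the ASCII check stands in for script detection, and the functions `to_native`, `build_index`, and `search` are hypothetical. Disambiguation and ranking are omitted; a query simply returns the documents that contain every query term in either script.

```python
from collections import defaultdict

# Hypothetical stand-in for a transliteration engine: several Roman-script
# spelling variants map to one native-script (Devanagari) word form.
ROMAN_TO_NATIVE = {
    "pyar": "प्यार",
    "pyaar": "प्यार",
    "dil": "दिल",
}


def to_native(token):
    """Return the native-script form of a token, or None if it is unknown.

    Assumption: any pure-ASCII token is treated as Roman-script
    transliteration; anything else is taken to be native script already.
    """
    if not token.isascii():
        return token
    return ROMAN_TO_NATIVE.get(token.lower())


def build_index(docs):
    """Index documents in both scripts under shared native-script keys."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.split():
            native = to_native(token)
            if native is not None:
                index[native].add(doc_id)
    return index


def search(index, query):
    """Return sorted ids of documents containing every query term."""
    result = None
    for token in query.split():
        native = to_native(token)
        hits = index.get(native, set()) if native is not None else set()
        result = hits if result is None else result & hits
    return sorted(result or [])


docs = {
    "d1": "pyaar dil",                                              # Roman script
    "d2": ROMAN_TO_NATIVE["pyar"] + " " + ROMAN_TO_NATIVE["dil"],  # native script
    "d3": "dil",
}
index = build_index(docs)
print(search(index, "pyar"))                  # Roman-script query, variant spelling
print(search(index, ROMAN_TO_NATIVE["dil"]))  # native-script query
```

Because both scripts are keyed on the same native-script word form, a query spelled `pyar` also matches the document spelled `pyaar` as well as the native-script document, which is the spelling-variation behavior the abstract describes.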