Title MINING BILINGUAL DICTIONARIES FROM MONOLINGUAL WEB PAGES
Abstract Systems and methods for identifying translation pairs from web pages are provided. One disclosed method includes receiving monolingual web page data of a source language and processing the web page data by detecting occurrences of a predefined pattern in the web page data and extracting a plurality of translation pair candidates. Each of the translation pair candidates may include a source language string and a target language string. The method may further include determining whether each translation pair candidate is a valid transliteration. The method may also include, for each translation pair candidate that is determined not to be a valid transliteration, determining whether that candidate is a valid translation. The method may further include adding each translation pair candidate that is determined to be a valid translation or transliteration to a dictionary.
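The flow described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the abstract does not say what the predefined pattern is or how validity is decided, so the pattern here (a run of Chinese characters followed by a parenthesized Latin-script string), the names `PATTERN`, `extract_candidates`, and `mine_pairs`, the `max_len` limit, and the two validator callables are all assumptions made for the example.

```python
import re
from typing import Callable, Dict, List, Tuple

Pair = Tuple[str, str]
Validator = Callable[[str, str], bool]

# Assumed pattern (not from the source): a run of CJK ideographs
# immediately followed by a Latin-script string in ASCII or
# full-width parentheses.
PATTERN = re.compile(
    r"([\u4e00-\u9fff]+)\s*[(\uff08]\s*([A-Za-z][A-Za-z .\-]*?)\s*[)\uff09]"
)


def extract_candidates(text: str, max_len: int = 6) -> List[Pair]:
    """Return (source, target) candidates for every pattern occurrence.

    The source-language term's left boundary is unknown, so each suffix
    of the CJK run (up to max_len characters) becomes one candidate.
    """
    candidates: List[Pair] = []
    for match in PATTERN.finditer(text):
        run, target = match.group(1), match.group(2)
        for length in range(1, min(len(run), max_len) + 1):
            candidates.append((run[-length:], target))
    return candidates


def mine_pairs(
    text: str,
    is_transliteration: Validator,
    is_translation: Validator,
) -> Dict[str, str]:
    """Keep candidates that pass the transliteration check or, failing
    that, the translation check. Both validators are hypothetical
    stand-ins supplied by the caller."""
    dictionary: Dict[str, str] = {}
    for source, target in extract_candidates(text):
        # `or` short-circuits, so the translation check runs only for
        # candidates that were not accepted as transliterations.
        if is_transliteration(source, target) or is_translation(source, target):
            dictionary[source] = target
    return dictionary
```

In practice the two validators would be statistical or rule-based models. For a quick trial they can be simple lookups against small hand-made sets of known pairs, passed in as lambdas.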
Publication No. US2009070095(A1) Publication Date 2009.03.12
Application No. US20070851402 Filing Date 2007.09.07
Applicant MICROSOFT CORPORATION Inventor GAO JIANFENG
Classification G06F17/28 Main Classification G06F17/28
Agency Agent
Main Claim
Address