Title of Invention: CONSTRUCTION OF A LEXICON FOR A SELECTED CONTEXT
Abstract: Various technologies pertaining to constructing a lexicon for a defined context are set forth herein. Social media text is acquired, where the social media text has contextual data that corresponds thereto. The social media text is encoded to form encoded text (in Unicode), and the contextual data is assigned to the encoded text. A text corpus for a defined context is formed by filtering the encoded text based upon contextual data, such as location. Frequency of occurrence of words or phrases in the text corpus is used to identify words or phrases that are to be included in the lexicon.
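The steps named in the abstract (encode to Unicode, attach contextual data, filter by context, count frequencies) can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the sample posts, the function names (`encode_posts`, `build_corpus`, `build_lexicon`), whitespace tokenization, NFKC normalization, and the `min_count` threshold are all assumptions introduced here.

```python
from collections import Counter
import unicodedata

# Hypothetical sample data: (raw bytes, contextual data) pairs.
POSTS = [
    ("add oil everyone".encode("utf-8"), {"location": "Hong Kong"}),
    ("add oil for the exam".encode("utf-8"), {"location": "Hong Kong"}),
    ("good luck everyone".encode("utf-8"), {"location": "Seattle"}),
]


def encode_posts(posts):
    """Decode raw bytes to normalized Unicode text, keeping the contextual data attached."""
    encoded = []
    for raw, context in posts:
        text = unicodedata.normalize("NFKC", raw.decode("utf-8", errors="replace"))
        encoded.append((text, context))
    return encoded


def build_corpus(encoded_posts, location):
    """Form the text corpus for one context by filtering on location."""
    return [text for text, context in encoded_posts
            if context.get("location") == location]


def build_lexicon(corpus, min_count=2):
    """Keep words that occur at least min_count times in the corpus (assumed threshold)."""
    counts = Counter(token for text in corpus for token in text.lower().split())
    return {word for word, count in counts.items() if count >= min_count}


lexicon = build_lexicon(build_corpus(encode_posts(POSTS), "Hong Kong"))
```

With the sample data above, only words repeated within the Hong Kong posts pass the threshold; a real system would likely need language-aware tokenization and phrase (n-gram) counting as well.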
Publication No.: US2016110341(A1) Publication Date: 2016.04.21
Application No.: US201514716244 Filing Date: 2015.05.19
Applicant: Microsoft Technology Licensing, LLC Inventors: Chuang Darren; Li Jingmei; Liu Zhen; Bobby Mak Chiu Chun
Classification: G06F17/27; G06F17/22; G06F5/01 Main Classification: G06F17/27
Agency: Agent:
Main Claim: 1. A computing system comprising: a processor; and memory that comprises a lexicon generator system that is executed by the processor, the lexicon generator system configured to generate a lexicon used in context text, the lexicon generator system configured to include at least one of a mixed language word or a mixed language phrase in the lexicon based upon frequency of occurrence of the mixed language word or the mixed language phrase in the context text.
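As a rough illustration of the claim, the sketch below selects mixed language words by frequency. The record does not define how a mixed language word is recognized, so this example assumes a crude proxy: a token whose alphabetic characters belong to more than one script, where the script is guessed from the first word of each character's Unicode name. The function names, the sample text, and the `min_count` threshold are hypothetical.

```python
from collections import Counter
import unicodedata


def scripts_in(token):
    """Return the set of script labels (e.g. 'LATIN', 'CJK') seen in a token.

    Assumption: the first word of the Unicode character name is used as a
    crude script label; this is a heuristic, not a full script lookup.
    """
    scripts = set()
    for ch in token:
        if ch.isalpha():
            name = unicodedata.name(ch, "")
            if name:
                scripts.add(name.split(" ")[0])
    return scripts


def is_mixed_language(token):
    """Treat a token as mixed language if it combines more than one script."""
    return len(scripts_in(token)) > 1


def mixed_language_entries(context_text, min_count=2):
    """Select mixed language tokens occurring at least min_count times."""
    counts = Counter(
        token
        for line in context_text
        for token in line.split()
        if is_mixed_language(token)
    )
    return {token for token, count in counts.items() if count >= min_count}


# Hypothetical context text mixing Latin and CJK characters.
sample = ["好cute 呀", "真係 好cute", "hea緊 today"]
entries = mixed_language_entries(sample)
```

Here `好cute` appears twice and is kept, while `hea緊` appears only once and falls below the assumed threshold. Mixed language phrases written entirely in one script per word would need a different detector, such as per-word language identification.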
Address: Redmond, WA, US