Title of invention BOOTSTRAPPING TEXT CLASSIFIERS BY LANGUAGE ADAPTATION
Abstract Training data in one language are leveraged to develop classifiers for multiple languages when all of those classifiers perform the same kind of classification task on linguistically different sets of texts, saving the cost of manually labeling a separate set of training data for each language. Classification knowledge is learned for a source language in which training data are available. That knowledge is transferred to a target language's classifier by integrating language transition knowledge. The transferred model is then adjusted to better fit the target language. In one technique, leveraging one language's classification knowledge to generate a classifier for another language involves training a text classifier in a source language, transferring the learned classification knowledge from the source language to a target language using language translation techniques, and further tuning the transferred model to better fit the target-language text.
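The three steps named in the abstract (train on the source language, transfer through translation, tune on the target language) can be illustrated with a minimal sketch. It is not the method claimed in the application, only one possible reading under stated assumptions: the classifier is a bag-of-words log-odds model, a small hand-written bilingual dictionary stands in for "language translation techniques", and simple self-training on unlabeled target text stands in for the tuning step. All function names, the dictionary, and the sample texts are hypothetical.

```python
import math
from collections import Counter, defaultdict


def train_source(docs, labels, smoothing=1.0):
    """Learn per-word log-odds weights (class 1 vs. class 0) from labeled source text."""
    counts = {0: Counter(), 1: Counter()}
    for text, label in zip(docs, labels):
        counts[label].update(text.lower().split())
    vocab = set(counts[0]) | set(counts[1])
    totals = {y: sum(counts[y].values()) + smoothing * len(vocab) for y in (0, 1)}
    return {
        w: math.log((counts[1][w] + smoothing) / totals[1])
        - math.log((counts[0][w] + smoothing) / totals[0])
        for w in vocab
    }


def transfer(weights, dictionary):
    """Carry source-word weights over to target words via a bilingual dictionary.

    Assumption: `dictionary` maps a source word to a list of target words.
    When several source words map to one target word, their weights are averaged.
    """
    sums, hits = defaultdict(float), Counter()
    for src_word, weight in weights.items():
        for tgt_word in dictionary.get(src_word, ()):
            sums[tgt_word] += weight
            hits[tgt_word] += 1
    return {w: sums[w] / hits[w] for w in sums}


def score(weights, text):
    """Sum the weights of known words; unknown words contribute nothing."""
    return sum(weights.get(tok, 0.0) for tok in text.lower().split())


def predict(weights, text):
    """Return 1 if the score is positive, otherwise 0."""
    return 1 if score(weights, text) > 0 else 0


def tune(weights, unlabeled_docs, rounds=3, step=0.5):
    """Self-training (an assumed stand-in for the tuning step).

    Each round pseudo-labels the unlabeled target documents with the current
    model, then nudges every word in a document toward that document's label.
    """
    weights = dict(weights)
    for _ in range(rounds):
        updates = defaultdict(float)
        for text in unlabeled_docs:
            s = score(weights, text)
            if s == 0:
                continue  # no evidence either way, so skip this document
            direction = 1.0 if s > 0 else -1.0
            for tok in set(text.lower().split()):
                updates[tok] += direction
        for tok, delta in updates.items():
            weights[tok] = weights.get(tok, 0.0) + step * delta / len(unlabeled_docs)
    return weights


# Hypothetical toy data: English as source, Spanish as target.
source_docs = ["great movie love", "terrible boring movie",
               "love great acting", "boring terrible plot"]
source_labels = [1, 0, 1, 0]
en_to_es = {"great": ["genial"], "love": ["encanta"], "movie": ["pelicula"],
            "terrible": ["terrible"], "boring": ["aburrido"]}
target_unlabeled = ["pelicula genial fantastica", "pelicula aburrido lenta",
                    "fantastica encanta", "lenta terrible"]

source_model = train_source(source_docs, source_labels)
transferred = transfer(source_model, en_to_es)
tuned = tune(transferred, target_unlabeled)
```

In this toy setup the words "fantastica" and "lenta" are absent from the dictionary, so the transferred model has no weight for them. The tuning step picks them up because they co-occur with words that were transferred, which is the intuition behind adjusting the transferred model to the target language.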
Publication number WO2011100862(A1) Publication date 2011.08.25
Application number WO2010CN00225 Filing date 2010.02.22
Applicant YAHOO! INC.;SHI, LEI;TIAN, MINGJUN Inventor SHI, LEI;TIAN, MINGJUN
Classification G06F17/28 Main classification G06F17/28
Agency Agent
Main claim
Address