METHOD AND SYSTEM OF ADDING PUNCTUATION AND ESTABLISHING LANGUAGE MODEL
Abstract
A method of processing information content based on a language model is performed at a computer. The method includes the following steps: identifying a plurality of expressions in the information content that is queued to be processed; dividing the plurality of expressions into a plurality of characteristic units according to semantic features and predetermined characteristics associated with each of the plurality of characteristic units, each characteristic unit including a subset of the plurality of expressions and the predetermined characteristics at least including a respective integer number of expressions that are included in the characteristic unit; extracting, from the language model, a plurality of probabilities for a plurality of punctuation marks associated with each of the plurality of characteristic units; and in accordance with the extracted probabilities, associating a respective punctuation mark with each of the plurality of characteristic units included in the information content.
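The steps in the abstract can be illustrated with a minimal sketch. It is not the applicant's implementation: the toy model `TOY_MODEL`, the helper names `characteristic_units` and `punctuate`, the maximum unit length of two expressions, and the choice to combine unit probabilities by summation are all assumptions made for illustration, and the probability values are invented. Semantic features, which the abstract also uses when forming units, are left out.

```python
from collections import defaultdict

# Hypothetical toy "language model": maps a characteristic unit (a tuple of
# expressions) to probabilities of the punctuation mark that follows it.
# "" stands for "no punctuation". All values are invented for illustration.
TOY_MODEL = {
    ("hello",): {",": 0.6, "": 0.4},
    ("hello", "world"): {"!": 0.7, "": 0.3},
    ("world",): {"!": 0.5, "": 0.5},
    ("how", "are"): {"": 0.9, ",": 0.1},
    ("are", "you"): {"?": 0.8, "": 0.2},
    ("you",): {"?": 0.6, "": 0.4},
}


def characteristic_units(expressions, end, max_len=2):
    """Return the units of 1..max_len expressions that end at index `end`."""
    return [
        tuple(expressions[end - n + 1 : end + 1])
        for n in range(1, max_len + 1)
        if end - n + 1 >= 0
    ]


def punctuate(expressions, model=TOY_MODEL, max_len=2):
    """Attach the highest-scoring punctuation mark after each expression."""
    out = []
    for i, word in enumerate(expressions):
        scores = defaultdict(float)
        # Look up the probabilities for every unit ending at this expression.
        for unit in characteristic_units(expressions, i, max_len):
            for mark, p in model.get(unit, {}).items():
                scores[mark] += p  # assumption: combine units by summation
        # Units unknown to the model contribute nothing; default to no mark.
        mark = max(scores, key=scores.get) if scores else ""
        out.append(word + mark)
    return " ".join(out)


print(punctuate(["hello", "world", "how", "are", "you"]))
# prints: hello, world! how are you?
```

Here each unit's integer number of expressions (one or two) plays the role of the "predetermined characteristic" named in the abstract; a real system would learn the probabilities from a punctuated training corpus.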
Publication number
WO2014117553(A1)
Publication date
2014.08.07
Application number
WO2013CN86618
Filing date
2013.11.06
Applicant
TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
Inventors
LIU, HAIBO;WANG, ERYU;ZHANG, XIANG;LU, LI;YUE, SHUAI;LIU, QIUGE;CHEN, BO;LIU, JIAN;LI, LU