Title METHOD AND SYSTEM OF ADDING PUNCTUATION AND ESTABLISHING LANGUAGE MODEL
Abstract A method of processing information content based on a language model is performed at a computer. The method includes the following steps: identifying a plurality of expressions in the information content that is queued to be processed; dividing the plurality of expressions into a plurality of characteristic units according to semantic features and predetermined characteristics associated with each of the plurality of characteristic units, each characteristic unit including a subset of the plurality of expressions, and the predetermined characteristics including at least a respective integer number of expressions that are included in the characteristic unit; extracting, from the language model, a plurality of probabilities for a plurality of punctuation marks associated with each of the plurality of characteristic units; and, in accordance with the extracted probabilities, associating a respective punctuation mark with each of the plurality of characteristic units included in the information content.
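The steps named in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the fixed-size grouping, the dictionary standing in for the language model, the empty string meaning "no punctuation", and the names `split_into_units`, `punctuate`, and `toy_model` are all assumptions made for illustration. The abstract also mentions semantic features, which this sketch does not model.

```python
from typing import Dict, List, Tuple

Unit = Tuple[str, ...]
# Hypothetical stand-in for the language model:
# characteristic unit -> {punctuation mark: probability}.
# The empty string "" means "no punctuation after this unit" (an assumption).
ToyModel = Dict[Unit, Dict[str, float]]


def split_into_units(expressions: List[str], size: int) -> List[Unit]:
    """Group expressions into consecutive units of `size` expressions each."""
    return [tuple(expressions[i:i + size]) for i in range(0, len(expressions), size)]


def punctuate(expressions: List[str], model: ToyModel, size: int = 2) -> str:
    """Attach the most probable punctuation mark to each characteristic unit."""
    pieces = []
    for unit in split_into_units(expressions, size):
        # Unseen unit: assume no punctuation follows it.
        probs = model.get(unit, {"": 1.0})
        mark = max(probs, key=probs.get)
        pieces.append(" ".join(unit) + mark)
    return " ".join(pieces)


# Made-up probabilities, for illustration only.
toy_model: ToyModel = {
    ("hello", "everyone"): {",": 0.7, "": 0.2, ".": 0.1},
    ("welcome", "back"): {".": 0.8, ",": 0.1, "": 0.1},
}
result = punctuate(["hello", "everyone", "welcome", "back"], toy_model)
print(result)  # hello everyone, welcome back.
```

Here the unit size plays the role of the "integer number of expressions" in each characteristic unit, and picking the highest-probability mark is one simple way to act "in accordance with the extracted probabilities".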
Publication No. WO2014117553(A1) Publication Date 2014.08.07
Application No. WO2013CN86618 Filing Date 2013.11.06
Applicant TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED Inventors LIU, HAIBO;WANG, ERYU;ZHANG, XIANG;LU, LI;YUE, SHUAI;LIU, QIUGE;CHEN, BO;LIU, JIAN;LI, LU
Classification G10L15/26;G06F17/30;G10L15/02;G10L15/14;G10L15/18 Main Classification G10L15/26
Agency Agent
Principal Claim
Address