Title of invention OPTIMIZATION OF TEXT-BASED TRAINING SET SELECTION FOR LANGUAGE PROCESSING MODULES
Abstract A device and a method provide for selection of a database from a corpus using an optimization function. The method includes defining a size of a database, calculating a distance using a distance function for each pair in a set of pairs, and executing an optimization function using the distance to select each entry saved in the database until the number of saved entries equals the size of the database. Each pair in the set of pairs includes either two entries selected from a corpus, or one entry selected from a set of previously selected entries and another entry selected from the remaining portion of the corpus. The distance function may be a Levenshtein distance function or a generalized Levenshtein distance function.
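The selection procedure described in the abstract can be sketched roughly as follows. This is an illustrative sketch only, not the claimed method: the abstract does not state which optimization function is used, so the greedy max-min criterion here (seed with the two most distant corpus entries, then repeatedly add the remaining entry whose nearest already-selected entry is farthest away) is an assumption. The function names `levenshtein` and `select_database` are hypothetical, and the distance uses unit edit costs; a generalized Levenshtein distance would presumably assign different costs per operation.

```python
from itertools import combinations


def levenshtein(a, b):
    """Plain Levenshtein distance with unit insert/delete/substitute costs."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def select_database(corpus, size):
    """Greedy max-min selection (an assumed optimization function).

    Stops once the number of saved entries equals `size`.
    """
    remaining = list(dict.fromkeys(corpus))  # drop duplicates, keep order
    size = min(size, len(remaining))
    if size <= 0:
        return []
    if size == 1:
        return remaining[:1]

    # Seed: the pair of corpus entries with the largest distance.
    first, second = max(combinations(remaining, 2),
                        key=lambda pair: levenshtein(*pair))
    selected = [first, second]
    remaining.remove(first)
    remaining.remove(second)

    # Each further pair is (previously selected entry, remaining entry).
    while len(selected) < size:
        best = max(remaining,
                   key=lambda cand: min(levenshtein(cand, s) for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected


if __name__ == "__main__":
    words = ["cat", "cart", "dog", "dot"]
    print(select_database(words, 2))
```

The seed step covers the "two entries selected from a corpus" case in the abstract, and the loop covers the "previously selected entry plus remaining entry" case. The pairwise seed search is quadratic in corpus size, so a real training-set selection over a large corpus would likely need caching or sampling.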
Publication number WO2006030302(A1) Publication date 2006.03.23
Application number WO2005IB02752 Filing date 2005.09.17
Applicant NOKIA CORPORATION;NOKIA, INC.;TIAN, JILEI;NURMINEN, JANI Inventor TIAN, JILEI;NURMINEN, JANI
Classification G10L15/18 Main classification G10L15/18
Agency Agent
Principal claim
Address