Title of invention: CLUSTERING OF TEXT FOR STRUCTURING OF TEXT DOCUMENTS AND TRAINING OF LANGUAGE MODELS
Abstract: The present invention relates to a method, a text segmentation system and a computer program product for clustering text into text clusters, each representing a distinct semantic meaning. The text clustering method identifies text portions and assigns them to different clusters in such a way that each text cluster refers to one or several semantic topics. The clustering method incorporates an optimization procedure based on a re-clustering procedure that evaluates a target function indicative of the correlation between a text unit and a cluster. The text clustering method makes use of a text emission model and a cluster transition model, and further makes use of various smoothing techniques.
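The re-clustering loop described in the abstract can be illustrated with a minimal Python sketch. Nothing here is taken from the patent text itself: the function name `recluster`, the add-alpha smoothing, the unigram emission model, the first-order cluster transition model, and the greedy left-to-right reassignment are all assumptions chosen for illustration. The target function is assumed to be the sum of a smoothed log emission probability of a text unit under a cluster and a smoothed log transition probability from the preceding unit's cluster.

```python
import math
from collections import Counter


def recluster(units, assignment, n_clusters, alpha=0.1, iterations=10):
    """Iteratively reassign text units to clusters (illustrative sketch only).

    units:      list of token lists, in document order
    assignment: initial cluster id for each unit
    alpha:      add-alpha smoothing constant (an assumed smoothing technique)
    """
    vocab = {tok for unit in units for tok in unit}
    assignment = list(assignment)

    for _ in range(iterations):
        # Emission model: per-cluster unigram counts from the current assignment.
        word_counts = [Counter() for _ in range(n_clusters)]
        for unit, c in zip(units, assignment):
            word_counts[c].update(unit)
        totals = [sum(wc.values()) for wc in word_counts]

        # Transition model: smoothed counts of cluster-to-cluster transitions.
        trans = [[alpha] * n_clusters for _ in range(n_clusters)]
        for prev_c, cur_c in zip(assignment, assignment[1:]):
            trans[prev_c][cur_c] += 1.0
        trans_totals = [sum(row) for row in trans]

        def score(unit, c, prev_c):
            # Assumed target function: log emission + log transition.
            s = sum(
                math.log((word_counts[c][t] + alpha)
                         / (totals[c] + alpha * len(vocab)))
                for t in unit
            )
            if prev_c is not None:
                s += math.log(trans[prev_c][c] / trans_totals[prev_c])
            return s

        # Greedy re-clustering: each unit takes its best-scoring cluster,
        # given the cluster just chosen for the preceding unit.
        new_assignment = []
        prev_c = None
        for unit in units:
            best = max(range(n_clusters), key=lambda c: score(unit, c, prev_c))
            new_assignment.append(best)
            prev_c = best

        if new_assignment == assignment:
            break  # no unit changed cluster, so the procedure has converged
        assignment = new_assignment

    return assignment
```

A call such as `recluster(units, initial_assignment, n_clusters=2)` returns one cluster id per text unit. The greedy pass is a simplification; a dynamic-programming search over whole cluster sequences would be a natural alternative under the same assumed models.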
Publication number: WO2005050473(A3)    Publication date: 2006.07.20
Application number: WO2004IB52406    Filing date: 2004.11.12
Applicant: PHILIPS INTELLECTUAL PROPERTY & STANDARDS GMBH; KONINKLIJKE PHILIPS ELECTRONICS N. V.; PETERS, JOCHEN    Inventor: PETERS, JOCHEN
Classification: G06F17/27; G06F17/30    Main classification: G06F17/27
Agency:    Agent:
Main claim:
Address: