Title of invention: MODELING TOPICS USING STATISTICAL DISTRIBUTION
Abstract: PROBLEM TO BE SOLVED: A corpus holding large amounts of information makes relevant information difficult to find. Assigning tags to documents can make relevant information easier to search for, but conventional methods of assigning tags to documents are not always effective for finding information. SOLUTION: In one embodiment, modeling topics includes accessing a corpus comprising documents that include words. Words of a document are selected as keywords of the document. The documents are clustered according to the keywords, where each cluster corresponds to a topic. A statistical distribution is generated for a cluster from the words of the documents of the cluster. A topic is modeled using the statistical distribution generated for the cluster corresponding to the topic. COPYRIGHT: (C)2009, JPO&INPIT
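The steps named in the abstract (keyword selection, clustering by keywords, one word distribution per cluster) can be illustrated with a minimal Python sketch. The abstract does not say how keywords are chosen, how documents are clustered, or which statistical distribution is used, so every concrete choice here is an assumption made for illustration: TF-IDF ranking for keywords, a greedy shared-keyword rule for clustering, and normalized word frequencies as the distribution. The function names `select_keywords`, `cluster_by_keywords`, `topic_distribution`, and `model_topics` are hypothetical and do not come from the patent.

```python
import math
from collections import Counter


def select_keywords(docs, k=2):
    """Assumed keyword rule: keep the k highest TF-IDF words of each document."""
    n = len(docs)
    tokenized = [doc.lower().split() for doc in docs]
    # Document frequency: number of documents that contain each word.
    df = Counter(word for words in tokenized for word in set(words))
    keywords = []
    for words in tokenized:
        tf = Counter(words)
        scores = {w: (c / len(words)) * math.log(n / df[w]) for w, c in tf.items()}
        # Highest score first; ties are broken alphabetically for determinism.
        ranked = sorted(scores, key=lambda w: (-scores[w], w))
        keywords.append(set(ranked[:k]))
    return tokenized, keywords


def cluster_by_keywords(keywords):
    """Assumed clustering rule: join the first cluster that shares a keyword."""
    clusters = []  # each entry: (set of cluster keywords, list of document indices)
    for i, kw in enumerate(keywords):
        for cluster_kw, members in clusters:
            if cluster_kw & kw:
                cluster_kw.update(kw)
                members.append(i)
                break
        else:
            # No existing cluster shares a keyword, so start a new one.
            clusters.append((set(kw), [i]))
    return [members for _, members in clusters]


def topic_distribution(tokenized, members):
    """Normalized word frequencies over all documents of one cluster."""
    counts = Counter(w for i in members for w in tokenized[i])
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}


def model_topics(docs, k=2):
    """Return one word distribution per cluster; each cluster stands for a topic."""
    tokenized, keywords = select_keywords(docs, k)
    return [topic_distribution(tokenized, m) for m in cluster_by_keywords(keywords)]


if __name__ == "__main__":
    corpus = [
        "cat dog cat pet",
        "dog pet cat dog",
        "stock market stock price",
        "market price stock market",
    ]
    for topic_id, dist in enumerate(model_topics(corpus)):
        print(topic_id, sorted(dist.items(), key=lambda item: -item[1]))
```

On this toy corpus the two pet documents and the two finance documents fall into separate clusters, and each cluster's distribution sums to 1. A fuller implementation would likely replace the greedy rule with a proper clustering method and might smooth the distribution, but those details are not given in the abstract.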
Publication number: JP2009093651(A)  Publication date: 2009.04.30
Application number: JP20080259631  Filing date: 2008.10.06
Applicant: FUJITSU LTD  Inventors: MARVIT DAVID L; JAIN JAWAHAR; STERGIOU STERGIOS; GILMAN ALEX; ADLER B THOMAS; SIDOROWICH JOHN J; LABROU YANNIS
Classification: G06F17/30  Main classification: G06F17/30
Agency:  Agent:
Principal claim:
Address: