Title of Invention |
LATENT DIRICHLET ALLOCATION-BASED PARAMETER INFERENCE METHOD, CALCULATION DEVICE AND SYSTEM |
Abstract |
<p>Embodiments of the present invention relate to the field of information retrieval and provide a Latent Dirichlet Allocation (LDA)-based parameter inference method, calculation device and system that address the poor precision of LDA model solutions caused by an inaccurate, manually entered number of topics. The method comprises: calculating an LDA model, and obtaining a probability distribution, from a set initial first hyper-parameter, initial second hyper-parameter, initial number of topics, initial global document-topic count matrix and topic-word count matrix; using the expectation-maximization algorithm to obtain the number of topics, the first hyper-parameter and the second hyper-parameter that maximize the log-likelihood function of the probability distribution; and determining whether the number of topics, the first hyper-parameter and the second hyper-parameter have converged and, if not, feeding them back into the LDA model for calculation until the optimal number of topics, optimal first hyper-parameter and optimal second hyper-parameter that maximize the log-likelihood function have converged. The embodiments of the present invention are applicable to document parameter inference.</p> |
Publication Number |
WO2012106885(A1) |
Publication Date |
2012.08.16 |
Application Number |
WO2011CN77097 |
Filing Date |
2011.07.13 |
Applicant |
HUAWEI TECHNOLOGIES CO., LTD.;VLADISLAV, KOPYLOV;WEN, LIUFEI;SHE, GUANGYU |
Inventor |
VLADISLAV, KOPYLOV;WEN, LIUFEI;SHE, GUANGYU |
Classification Number |
G06F17/10 |
Main Classification Number |
G06F17/10 |
Agency |
|
Agent |
|
Principal Claim |
|
Address |
|
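The iterative loop described in the abstract (fit the LDA model, score the log-likelihood, update the number of topics and both hyper-parameters, repeat until nothing changes) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the patent: the function names `gibbs_lda`, `log_likelihood`, `score` and `infer_parameters` are hypothetical, the model is fitted with collapsed Gibbs sampling, and a simple neighbourhood search over the number of topics, `alpha` (standing in for the first hyper-parameter) and `beta` (standing in for the second) replaces the expectation-maximization step named in the abstract.

```python
import math
import random


def gibbs_lda(docs, vocab_size, num_topics, alpha, beta, sweeps=30, seed=0):
    """Run collapsed Gibbs sampling and return the two count matrices."""
    rng = random.Random(seed)
    doc_topic = [[0] * num_topics for _ in docs]                # documents x topics
    topic_word = [[0] * vocab_size for _ in range(num_topics)]  # topics x words
    topic_total = [0] * num_topics
    # Random initial topic assignment for every token.
    assign = [[rng.randrange(num_topics) for _ in doc] for doc in docs]
    for d, doc in enumerate(docs):
        for w, k in zip(doc, assign[d]):
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
    for _ in range(sweeps):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = assign[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Resample its topic from the collapsed conditional.
                weights = [
                    (doc_topic[d][t] + alpha) * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assign[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1
    return doc_topic, topic_word


def log_likelihood(doc_topic, topic_word, alpha, beta):
    """Joint log p(words, topic assignments | alpha, beta) from the counts."""
    num_topics, vocab_size = len(topic_word), len(topic_word[0])
    total = 0.0
    for row in topic_word:
        total += math.lgamma(vocab_size * beta) - vocab_size * math.lgamma(beta)
        total += sum(math.lgamma(n + beta) for n in row)
        total -= math.lgamma(sum(row) + vocab_size * beta)
    for row in doc_topic:
        total += math.lgamma(num_topics * alpha) - num_topics * math.lgamma(alpha)
        total += sum(math.lgamma(n + alpha) for n in row)
        total -= math.lgamma(sum(row) + num_topics * alpha)
    return total


def score(docs, vocab_size, params):
    """Fit the model for one (num_topics, alpha, beta) triple and score it."""
    num_topics, alpha, beta = params
    counts = gibbs_lda(docs, vocab_size, num_topics, alpha, beta)
    return log_likelihood(*counts, alpha, beta)


def infer_parameters(docs, vocab_size, num_topics, alpha, beta, max_rounds=20):
    """Hill-climb over the three parameters (a stand-in for EM, by assumption)."""
    current = (num_topics, alpha, beta)
    best = score(docs, vocab_size, current)
    for _ in range(max_rounds):
        k, a, b = current
        neighbours = [(k + 1, a, b), (k, a * 2, b), (k, a / 2, b),
                      (k, a, b * 2), (k, a, b / 2)]
        if k > 1:
            neighbours.append((k - 1, a, b))
        top_score, top_params = max(
            (score(docs, vocab_size, p), p) for p in neighbours
        )
        if top_score <= best:  # no neighbour improves: treat as converged
            break
        best, current = top_score, top_params
    return current, best
```

A call such as `infer_parameters(docs, vocab_size=6, num_topics=2, alpha=0.5, beta=0.1)`, where `docs` is a list of documents given as lists of integer word ids, returns the final `(num_topics, alpha, beta)` triple and its log-likelihood. The joint log-likelihood of words and sampled topic assignments is only one possible objective; a practical system might prefer a held-out likelihood, since scores from a single Gibbs run are noisy across different topic counts.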