Title of invention: DOCUMENT PROCESSING EMPLOYING PROBABILISTIC TOPIC MODELING OF DOCUMENTS REPRESENTED AS TEXT WORDS TRANSFORMED TO A CONTINUOUS SPACE
Abstract: A set of word embedding transforms is applied to transform text words of a set of documents into K-dimensional word vectors in order to generate sets or sequences of word vectors representing the documents of the set of documents. A probabilistic topic model is learned using the sets or sequences of word vectors representing the documents of the set of documents. The set of word embedding transforms is applied to transform text words of an input document into K-dimensional word vectors in order to generate a set or sequence of word vectors representing the input document. The learned probabilistic topic model is applied to assign probabilities for topics of the probabilistic topic model to the set or sequence of word vectors representing the input document. A document processing operation such as annotation, classification, or similar document retrieval may be performed using the assigned topic probabilities.
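The pipeline in the abstract (embed words, learn a topic model on the word vectors, assign topic probabilities to an input document) can be illustrated with a minimal sketch. The abstract does not name a specific embedding method or topic model, so everything here is an assumption made for illustration: a hand-written toy lookup table `EMBEDDING` with K = 2 stands in for learned word embedding transforms, and a diagonal-covariance Gaussian mixture fit by EM stands in for the probabilistic topic model. The function names `embed`, `fit_topic_model`, and `topic_probabilities` are hypothetical.

```python
import numpy as np

# Hypothetical toy embedding (K = 2); a real system would learn these transforms.
EMBEDDING = {
    "cat": [0.0, 0.1], "dog": [0.1, 0.0], "pet": [0.05, 0.05],
    "stock": [5.0, 5.1], "bond": [5.1, 5.0], "market": [5.05, 5.05],
}

def embed(text):
    """Transform the known words of a document into a sequence of K-dim vectors.
    Out-of-vocabulary words are skipped; at least one known word is assumed."""
    return np.array([EMBEDDING[w] for w in text.lower().split() if w in EMBEDDING])

def responsibilities(X, weights, means, variances):
    """Posterior probability of each topic (mixture component) per word vector."""
    diff = X[:, None, :] - means[None, :, :]            # shape (n, topics, K)
    log_pdf = -0.5 * (np.sum(diff**2 / variances, axis=2)
                      + np.sum(np.log(2 * np.pi * variances), axis=1))
    log_post = np.log(weights) + log_pdf
    log_post -= log_post.max(axis=1, keepdims=True)     # numerical stability
    post = np.exp(log_post)
    return post / post.sum(axis=1, keepdims=True)

def fit_topic_model(X, n_topics, n_iter=30):
    """Fit a diagonal-covariance Gaussian mixture with EM (assumed model)."""
    # Deterministic farthest-point initialization of the topic means.
    means = [X[0]]
    for _ in range(1, n_topics):
        d = np.min([np.sum((X - m) ** 2, axis=1) for m in means], axis=0)
        means.append(X[np.argmax(d)])
    means = np.array(means, dtype=float)
    variances = np.ones((n_topics, X.shape[1])) * X.var(axis=0) + 1e-6
    weights = np.full(n_topics, 1.0 / n_topics)
    for _ in range(n_iter):
        resp = responsibilities(X, weights, means, variances)   # E-step
        nk = resp.sum(axis=0) + 1e-12                           # M-step
        weights = nk / nk.sum()
        means = (resp.T @ X) / nk[:, None]
        variances = np.maximum((resp.T @ X**2) / nk[:, None] - means**2, 1e-6)
    return weights, means, variances

def topic_probabilities(text, model):
    """Average the per-word topic posteriors over the input document."""
    return responsibilities(embed(text), *model).mean(axis=0)

# Learn the model on word vectors pooled from a small training set of documents.
corpus = ["cat dog pet", "stock bond market", "dog pet cat dog", "market stock bond"]
X = np.vstack([embed(doc) for doc in corpus])
model = fit_topic_model(X, n_topics=2)

# Assign topic probabilities to two input documents.
animal_probs = topic_probabilities("cat dog", model)
finance_probs = topic_probabilities("stock market", model)
print(animal_probs, finance_probs)
```

Under these assumptions, each document ends up with one probability per topic, which is the kind of representation that the annotation, classification, or retrieval operations mentioned in the abstract could consume.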
Publication number: US2013204885(A1)
Publication date: 2013.08.08
Application number: US201213364535
Filing date: 2012.02.02
Applicant: CLINCHANT STEPHANE; PERRONNIN FLORENT; XEROX CORPORATION
Inventor: CLINCHANT STEPHANE; PERRONNIN FLORENT
Classification: G06F17/30
Main classification: G06F17/30