Title of invention CONCEPT BASED CROSS MEDIA INDEXING AND RETRIEVAL OF SPEECH DOCUMENTS
Abstract Indexing, searching, and retrieving the content of speech documents (including but not limited to recorded books, audio broadcasts, and recorded conversations) is accomplished by finding and retrieving speech documents that are related to a query term at a conceptual level, even if the speech documents do not contain the spoken (or textual) query terms. Concept-based cross-media information retrieval is used. A term-phoneme/document matrix is constructed from a training set of documents. Further documents are then added to the matrix constructed from the training data. Singular Value Decomposition is used to compute a vector space from the term-phoneme/document matrix. The result is a lower-dimensional numerical space in which term-phoneme and document vectors that are conceptually related lie close together as nearest neighbors. A query engine computes the cosine between the query vector and every other vector in the space and returns a list of the term-phonemes and/or documents with the highest cosine values.
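The pipeline described in the abstract (matrix construction, Singular Value Decomposition, adding a document or query to the space, cosine ranking) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the patent: the toy vocabulary, the `ph:` prefix used to mark phoneme tokens, raw counts as matrix entries, the choice of `k = 2` dimensions, and the helper names `build_matrix`, `fit_space`, `fold_in`, and `cosine_rank`. The projection used for new vectors is the standard latent semantic indexing "fold-in", which may differ from the method actually claimed.

```python
import numpy as np

def build_matrix(docs, vocab):
    """Count matrix: one row per term/phoneme token, one column per document."""
    index = {tok: i for i, tok in enumerate(vocab)}
    A = np.zeros((len(vocab), len(docs)))
    for j, doc in enumerate(docs):
        for tok in doc:
            if tok in index:
                A[index[tok], j] += 1.0
    return A

def fit_space(A, k):
    """Truncated SVD; returns term vectors U_k, singular values s_k, doc vectors V_k."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U[:, :k], s[:k], Vt[:k, :].T

def fold_in(counts, U_k, s_k):
    """Project a new document or query count vector into the k-dimensional space."""
    return (counts @ U_k) / s_k

def cosine_rank(q, vectors):
    """Cosine of q against each row of `vectors`, plus indices sorted best-first."""
    denom = np.linalg.norm(vectors, axis=1) * np.linalg.norm(q)
    sims = (vectors @ q) / np.maximum(denom, 1e-12)
    return np.argsort(-sims), sims

# Hypothetical toy data: text terms plus made-up phoneme tokens (prefix "ph:").
vocab = ["cat", "feline", "ph:K-AE-T", "dog", "canine", "ph:D-AO-G"]
docs = [
    ["cat", "ph:K-AE-T"],            # doc 0: never mentions "feline"
    ["cat", "feline", "ph:K-AE-T"],  # doc 1
    ["dog", "canine", "ph:D-AO-G"],  # doc 2
    ["dog", "canine", "ph:D-AO-G"],  # doc 3
]

A = build_matrix(docs, vocab)
U_k, s_k, V_k = fit_space(A, k=2)   # k=2 is an assumed choice for this toy set

# Query with a term that doc 0 does not contain.
query_counts = build_matrix([["feline"]], vocab)[:, 0]
q = fold_in(query_counts, U_k, s_k)
order, sims = cosine_rank(q, V_k)

for j in order:
    print(f"doc {j}: cosine = {sims[j]:.3f}")
```

In this toy setup, doc 0 ranks alongside doc 1 for the query "feline" even though it does not contain that token, because both documents share the "cat" and phoneme rows and therefore land on the same axis of the reduced space. Term-phoneme vectors (the rows of `U_k`) could be ranked against the query in the same way.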
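Publication number CA2653932(C) Publication date 2013.03.19
Application number CA20072653932 Filing date 2007.06.01
Applicant TELCORDIA TECHNOLOGIES, INC. Inventors BEHRENS, CLIFFORD A.;EGAN, DENNIS;BASSU, DEVASIS
Classification G06F17/30 Main classification G06F17/30
Agency Agent
Principal claim
Address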