Title of Invention |
CONCEPT BASED CROSS MEDIA INDEXING AND RETRIEVAL OF SPEECH DOCUMENTS |
Abstract |
<p num="1">Indexing, searching, and retrieving the content of speech documents (including but not limited to recorded books, audio broadcasts, and recorded conversations) is accomplished by finding and retrieving speech documents that are related to a query term at a conceptual level, even if the speech documents do not contain the spoken (or textual) query terms. Concept-based cross-media information retrieval is used. A term-phoneme/document matrix is constructed from a training set of documents. Further documents are then added to the matrix constructed from the training data. Singular Value Decomposition is used to compute a vector space from the term-phoneme/document matrix. The result is a lower-dimensional numerical space in which term-phoneme and document vectors are related conceptually as nearest neighbors. A query engine computes a cosine value between the query vector and all other vectors in the space and returns a list of those term-phonemes and/or documents with the highest cosine values.
|
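The pipeline summarized in the abstract (matrix construction, Singular Value Decomposition, cosine ranking) can be illustrated with a minimal sketch. This is not the patented implementation. It assumes NumPy, raw counts as matrix entries, and plain strings standing in for term-phoneme units. The function names (`build_matrix`, `train_space`, `fold_in`, `rank_documents`), the toy vocabulary, and the choice of `k=2` are hypothetical. Projecting a query or a new document with the standard "folding-in" formula is also an assumption, since the abstract does not say how documents are added.

```python
import numpy as np

def build_matrix(docs, vocab):
    """Count matrix: one row per term-phoneme unit, one column per document."""
    row = {unit: i for i, unit in enumerate(vocab)}
    matrix = np.zeros((len(vocab), len(docs)))
    for j, doc in enumerate(docs):
        for unit in doc:
            if unit in row:
                matrix[row[unit], j] += 1.0
    return matrix

def train_space(matrix, k):
    """Truncated SVD: keep only the k largest singular triplets."""
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    # Columns of u_k span the unit space; rows of the third result are document vectors.
    return u[:, :k], s[:k], vt[:k, :].T

def fold_in(column, u_k, s_k):
    """Project a count column (query or new document) into the k-dimensional space."""
    return (column @ u_k) / s_k

def cosine(a, b):
    """Cosine of the angle between two vectors; 0.0 if either vector is zero."""
    norm = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / norm) if norm > 0 else 0.0

def rank_documents(query_vec, doc_vecs):
    """Return document indices sorted by descending cosine, plus the raw scores."""
    scores = [cosine(query_vec, d) for d in doc_vecs]
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order, scores

# Hypothetical toy data: document 1 never contains "car".
vocab = ["car", "automobile", "engine", "flower", "petal"]
training_docs = [
    ["car", "engine"],
    ["automobile", "engine"],
    ["car", "automobile", "engine"],
    ["flower", "petal"],
    ["flower", "petal", "petal"],
]

matrix = build_matrix(training_docs, vocab)
u_k, s_k, doc_vecs = train_space(matrix, k=2)

# Treat the query "car" as a one-unit pseudo-document and project it.
query_vec = fold_in(build_matrix([["car"]], vocab)[:, 0], u_k, s_k)
order, scores = rank_documents(query_vec, doc_vecs)
```

In this toy space, document 1 ("automobile", "engine") scores close to the documents that literally contain "car", while the flower documents score near zero. This is the conceptual matching the abstract describes: a document can be retrieved without containing the query term.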
Publication Number |
CA2653932(C) |
Publication Date |
2013.03.19 |
Application Number |
CA20072653932 |
Filing Date |
2007.06.01 |
Applicant |
TELCORDIA TECHNOLOGIES, INC. |
Inventors |
BEHRENS, CLIFFORD A.;EGAN, DENNIS;BASSU, DEVASIS |
Classification Number |
G06F17/30 |
Main Classification Number |
G06F17/30 |
Agency |
|
Agent |
|
Principal Claim |
|
Address |
|