Title of invention: IMPLEMENTATION OF UNSUPERVISED TOPIC SEGMENTATION IN A DATA COMMUNICATIONS ENVIRONMENT
Abstract: A method is provided in one example embodiment and includes extracting a plurality of sentences from data that comprises a speech transcript; tokenizing the plurality of sentences to develop, for each of the plurality of sentences, a sentence vector and at least one feature vector; and performing topic segmentation on the speech transcript using the sentence vectors and feature vectors, the topic segmentation resulting in a listing of segments corresponding to the speech transcript. In certain embodiments, the feature vector may be at least one of a cue word feature vector, a speaker change feature vector, and a scene change feature vector.
Publication number: US2014214402(A1) Publication date: 2014.07.31
Application number: US201313750049 Filing date: 2013.01.25
Applicant: Diao Qian; Gadde Venkata Ramana Rao Inventor: Diao Qian; Gadde Venkata Ramana Rao
Classification: G06F17/21 Main classification: G06F17/21
Agency: Agent:
Principal claim: 1. A method, comprising: extracting a plurality of sentences from data, which comprises a speech transcript; tokenizing the plurality of sentences to develop for each of the plurality of sentences a sentence vector and at least one feature vector; and performing topic segmentation on the speech transcript using the sentence vectors and feature vectors, wherein the topic segmentation is to result in a listing of segments corresponding to the speech transcript.
Address: San Jose, CA, US
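
The three steps of the principal claim (sentence extraction, tokenization into a sentence vector plus feature vectors, and topic segmentation) can be illustrated with a minimal Python sketch. Nothing below comes from the patent text itself: the "SPEAKER: text" transcript layout, the CUE_WORDS list, the bag-of-words sentence vector, the cosine-based boundary score, and the weights and threshold are all assumptions chosen for illustration. The scene change feature is omitted because it would require video data.

```python
import math
import re
from collections import Counter

# Hypothetical cue-word list; the record does not enumerate cue words.
CUE_WORDS = {"okay", "now", "next", "so", "anyway"}


def extract_sentences(transcript):
    """Parse lines of the assumed form 'SPEAKER: text' into (speaker, sentence) pairs."""
    sentences = []
    for line in transcript.strip().splitlines():
        speaker, _, text = line.partition(":")
        for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
            if sentence:
                sentences.append((speaker.strip(), sentence))
    return sentences


def tokenize(sentence):
    """Lowercase the sentence and split it into word tokens."""
    return re.findall(r"[a-z']+", sentence.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def vectorize(sentences):
    """Build one sentence vector and one feature vector per sentence.

    The feature vector is [cue_word, speaker_change], each 0.0 or 1.0.
    """
    sentence_vectors, feature_vectors = [], []
    for i, (speaker, sentence) in enumerate(sentences):
        tokens = tokenize(sentence)
        cue = 1.0 if tokens and tokens[0] in CUE_WORDS else 0.0
        changed = 1.0 if i > 0 and speaker != sentences[i - 1][0] else 0.0
        sentence_vectors.append(Counter(tokens))
        feature_vectors.append([cue, changed])
    return sentence_vectors, feature_vectors


def segment(transcript, threshold=0.9, weights=(0.3, 0.3)):
    """Return a list of (start, end) sentence-index ranges, end exclusive.

    Assumed scoring rule: a boundary is placed before sentence i when the
    lexical dissimilarity to sentence i-1 plus the weighted features
    exceeds the threshold.
    """
    sentences = extract_sentences(transcript)
    if not sentences:
        return []
    sentence_vectors, feature_vectors = vectorize(sentences)
    boundaries = [0]
    for i in range(1, len(sentences)):
        dissimilarity = 1.0 - cosine(sentence_vectors[i - 1], sentence_vectors[i])
        bonus = sum(w * f for w, f in zip(weights, feature_vectors[i]))
        if dissimilarity + bonus > threshold:
            boundaries.append(i)
    boundaries.append(len(sentences))
    return list(zip(boundaries[:-1], boundaries[1:]))
```

Calling segment() on a transcript string returns the listing of segments as index ranges over the extracted sentences. Comparing only adjacent sentences keeps the sketch short; a practical system would more likely compare windows of several sentences and tune the weights on real transcripts.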