Title of invention: Sound and image segment sorting device and method
Abstract: A sound segment sorting unit (103) sorts the sound segments of a video. An image segment sorting unit (104) sorts the image segments of the video. A multiple sorting result generation unit (105) generates a plurality of sound segment sorting results and/or a plurality of image segment sorting results. A sorting result pair generation unit (106) generates a plurality of sorting result pairs from these sorting results as candidates for the optimum segment sorting result of the video. A sorting result comparative score calculation unit (107) calculates a sorting result comparative score for each pair. A sorting result output unit (108) compares these scores and outputs the sound segment sorting result and the image segment sorting result that correspond well to each other. This makes it possible to accurately sort, for each object, the plurality of sound segments and the plurality of image segments contained in the video without adjusting parameters in advance.
Publication No.: US9053751(B2) Publication date: 2015.06.09
Application No.: US201013510811 Filing date: 2010.11.05
Applicant: NEC CORPORATION Inventors: Terao Makoto; Koshinaka Takafumi
Classification: G06F17/30; G11B27/28 Main classification: G06F17/30
Agency: Sughrue Mion, PLLC Agent: Sughrue Mion, PLLC
Main claim: 1. A sound and image segment sorting device comprising: a sound segment sorting unit that sorts a plurality of sound segments contained in a video based on an arbitrary operation condition and generates sound segment sorting results, each of said sound segment sorting results being a collection of sound clusters including at least one sound cluster as an element, and each of said at least one sound cluster being a collection of sound segments determined to be sound phenomena representing a same object; an image segment sorting unit that sorts a plurality of image segments contained in the video based on an arbitrary operation condition and generates image segment sorting results, each of said image segment sorting results being a collection of image clusters including at least one image cluster as an element, and each of said at least one image cluster being a collection of image segments determined to be image phenomena representing a same object; a multiple sorting result generation unit that generates at least one of a plurality of sound segment sorting results and a plurality of image segment sorting results by applying a plurality of different operation conditions to at least one of said sound segment sorting unit and said image segment sorting unit; a sorting result pair generation unit that generates a plurality of sorting result pairs each including one sound segment sorting result and one image segment sorting result based on the plurality of sound segment sorting results and the plurality of image segment sorting results obtained by said multiple sorting result generation unit; a sorting result comparative score calculation unit that calculates, for each sorting result pair, a sorting result comparative score representing a fitness between a sound segment sorting result and an image segment sorting result included in the sorting result pair; and a sorting result output unit that selects a sorting result pair having a high fitness among said plurality of sorting result pairs based on the sorting result comparative score and outputs a sound segment sorting result and an image segment sorting result included in the sorting result pair.
Address: Tokyo, JP
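
The flow recited in the main claim (sort under several operation conditions, pair every sound result with every image result, score each pair, output the best pair) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the `Segment` type with a one-dimensional `feature`, the threshold-based `sort_segments` clusterer (the threshold stands in for an "operation condition"), and above all the scoring formula. The record does not state how the sorting result comparative score is computed; this sketch substitutes the mutual information of the time-overlap table between sound clusters and image clusters.

```python
from dataclasses import dataclass
from itertools import product
from math import log


@dataclass(frozen=True)
class Segment:
    start: float    # start time in seconds
    end: float      # end time in seconds
    feature: float  # hypothetical 1-D feature (e.g. a voice or face embedding)


def sort_segments(segments, threshold):
    """Toy sorter: walk segments in feature order and open a new cluster
    whenever the gap to the previous feature exceeds `threshold`.
    The threshold plays the role of an operation condition (assumption)."""
    clusters, prev = [], None
    for seg in sorted(segments, key=lambda s: s.feature):
        if prev is None or seg.feature - prev.feature > threshold:
            clusters.append([])
        clusters[-1].append(seg)
        prev = seg
    return clusters


def overlap(a, b):
    """Duration for which two segments overlap in time."""
    return max(0.0, min(a.end, b.end) - max(a.start, b.start))


def comparative_score(sound_result, image_result):
    """Stand-in fitness (assumption): mutual information of the
    overlap-duration table between sound clusters and image clusters."""
    joint = [[sum(overlap(s, v) for s in sc for v in ic) for ic in image_result]
             for sc in sound_result]
    total = sum(map(sum, joint))
    if total == 0:
        return 0.0
    rows = [sum(r) for r in joint]
    cols = [sum(c) for c in zip(*joint)]
    return sum((o / total) * log(o * total / (rows[i] * cols[j]))
               for i, r in enumerate(joint) for j, o in enumerate(r) if o > 0)


def select_best_pair(sound_segs, image_segs, sound_conds, image_conds):
    """Generate multiple results, pair them, and return the best-scoring pair."""
    sound_results = [sort_segments(sound_segs, t) for t in sound_conds]
    image_results = [sort_segments(image_segs, t) for t in image_conds]
    pairs = product(sound_results, image_results)
    return max(pairs, key=lambda p: comparative_score(*p))


# Two people alternating on screen; features are made-up numbers.
sound = [Segment(0, 2, 0.1), Segment(2, 4, 0.9),
         Segment(4, 6, 0.2), Segment(6, 8, 1.0)]
image = [Segment(0, 2, 5.0), Segment(2, 4, 9.0),
         Segment(4, 6, 5.5), Segment(6, 8, 9.2)]

best_sound, best_image = select_best_pair(sound, image, [0.3, 2.0], [1.0, 10.0])
print(len(best_sound), "sound clusters,", len(best_image), "image clusters")
```

In this toy data, the loose thresholds merge both people into a single cluster, and any pair containing such a result scores zero, so the pair with two sound clusters and two image clusters is selected. Note that mutual information alone does not penalize over-splitting on one side (a finer sound result can tie with the correct one), so a practical score would need more than this stand-in.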