Title of invention: Video processing apparatus, method and system
Abstract: According to one embodiment, a video processing apparatus includes an acquisition unit, a first extraction unit, a generation unit, a second extraction unit, a computation unit and a selection unit. The acquisition unit is configured to acquire video streams. The first extraction unit is configured to analyze at least one of the moving pictures and the sounds for each video stream and to extract feature values. The generation unit is configured to generate segments by dividing each video stream, and to generate associated segment groups. The second extraction unit is configured to extract, as common video segment groups, the associated segment groups in which the number of associated segments is greater than or equal to a threshold. The computation unit is configured to compute a summarization score. The selection unit is configured to select, from the common video segment groups and based on the summarization score, the segments used for a summarized video as summarization segments.
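The association and common-group steps described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the `Segment` type, the cosine `similarity` measure, the union-find grouping (which links segments transitively, a simplification), and the function names `associate` and `common_groups` are all hypothetical. The counting rule follows the principal claim, which counts the different video streams represented in each group.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Segment:
    stream_id: str   # which video stream the segment came from
    start: float     # segment start time in seconds
    end: float       # segment end time in seconds
    feature: tuple   # hypothetical per-segment feature vector


def similarity(a, b):
    """Cosine similarity; an assumed stand-in for the unspecified measure."""
    dot = sum(x * y for x, y in zip(a.feature, b.feature))
    na = sum(x * x for x in a.feature) ** 0.5
    nb = sum(y * y for y in b.feature) ** 0.5
    return dot / (na * nb) if na and nb else 0.0


def associate(segments, sim_threshold):
    """Group segments of different streams whose similarity meets the threshold."""
    parent = list(range(len(segments)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(segments)), 2):
        a, b = segments[i], segments[j]
        # Only segments from different streams are associated.
        if a.stream_id != b.stream_id and similarity(a, b) >= sim_threshold:
            parent[find(i)] = find(j)

    groups = {}
    for i, seg in enumerate(segments):
        groups.setdefault(find(i), []).append(seg)
    # A lone segment is not associated with anything, so drop singletons.
    return [g for g in groups.values() if len(g) > 1]


def common_groups(groups, min_streams):
    """Keep groups that span at least min_streams different video streams."""
    return [g for g in groups if len({s.stream_id for s in g}) >= min_streams]
```

With three streams recording the same scene and a threshold of three streams, only the group seen by all three cameras would survive `common_groups`.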
Publication number: US8879788(B2)
Publication date: 2014.11.04
Application number: US201113240278
Filing date: 2011.09.22
Applicant: Kabushiki Kaisha Toshiba
Inventors: Yamamoto Koji; Hirohata Makoto
Classification: G06K9/00; H04N21/233; H04N21/8549; H04N21/234; H04N21/25; H04N21/258; H04N21/845
Main classification: G06K9/00
Agency: Ohlandt, Greeley, Ruggiero & Perle, L.L.P.
Agent: Ohlandt, Greeley, Ruggiero & Perle, L.L.P.
Principal claim: 1. A video processing apparatus, comprising:
an acquisition unit configured to acquire a plurality of video streams each including moving picture data items and sound data items;
a first extraction unit configured to analyze at least one of the moving picture data items and the sound data items for each video stream, and to extract a feature value from the analyzed one, the feature value indicating a common feature between the plurality of video streams;
a generation unit configured to generate a plurality of segments by dividing each of the video streams in accordance with change in the feature value, and to generate associated segment groups by associating a plurality of segments between different video streams, each associated segment included in the associated segment groups having a similarity of feature value between the segments greater than or equal to a first threshold value;
a second extraction unit configured to extract, from the associated segment groups, one or more common video segment groups in which number of associated segments is greater than or equal to a second threshold value, the number of the associated segments being number of different video streams each including the associated segment which corresponds each of the associated segment groups;
a computation unit configured to compute a summarization score indicating a degree of suitability for including a segment of the common video segment group in a summarized video created from a part of the video streams, the summarization score varying with time and being based on the feature value extracted at least one of the moving picture data items and the sound data items;
a selection unit configured to select summarization segments to be used for the summarized video from the common video segment groups based on the summarization score;
a detection unit configured to compute likelihood of matching the feature value with feature value models of typical shot patterns indicating a combination of shots creating a predetermined scene, and to detect the feature value in which the likelihood is greater than or equal to a third threshold value; and
a correction unit configured to generate correction value for the summarization score computed from the feature value in which the likelihood is greater than or equal to the third threshold value,
wherein the selection unit selects the summarization segments based on the summarized score in which the correction values are added.
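The detection, correction and selection elements of the claim can be illustrated with a short sketch. Everything here is an assumption made for illustration: the claim does not state how the correction value is computed or how segments are chosen, so a fixed `bonus`, one score per segment (the claim describes a score that varies with time), and a greedy pick under a duration budget `max_duration` are used as placeholders. The function names `corrected_scores` and `select_segments` are hypothetical.

```python
def corrected_scores(scores, likelihoods, likelihood_threshold, bonus):
    """Add a correction value where the shot-pattern likelihood meets the threshold.

    scores:      one summarization score per segment (a simplification)
    likelihoods: likelihood that each segment matches a typical shot pattern
    bonus:       assumed fixed correction value; the claim does not specify it
    """
    return [
        s + (bonus if p >= likelihood_threshold else 0.0)
        for s, p in zip(scores, likelihoods)
    ]


def select_segments(segments, scores, max_duration):
    """Greedily pick the highest-scoring segments that fit within max_duration.

    segments: list of (start, end) pairs in seconds
    Returns the chosen segments in chronological order.
    """
    order = sorted(range(len(segments)), key=lambda i: scores[i], reverse=True)
    chosen, total = [], 0.0
    for i in order:
        start, end = segments[i]
        # Skip any segment that would push the summary over the budget.
        if total + (end - start) <= max_duration:
            chosen.append(segments[i])
            total += end - start
    return sorted(chosen)
```

In this sketch a segment with a modest raw score can still be selected when it matches a typical shot pattern, because the correction value is added before ranking.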
Address: Minato-ku, Tokyo JP