Title: High-confidence labeling of video volumes in a video sharing service
Abstract: A volume identification system identifies a set of unlabeled spatio-temporal volumes within each of a set of videos, each volume representing a distinct object or action. The volume identification system further determines, for each of the videos, a set of volume-level features characterizing the volume as a whole. In one embodiment, the features are based on a codebook and describe the temporal and spatial relationships of different codebook entries of the volume. The volume identification system uses the volume-level features, in conjunction with existing labels assigned to the videos as a whole, to label with high confidence some subset of the identified volumes, e.g., by employing consistency learning or training and application of weak volume classifiers. The labeled volumes may be used for a number of applications, such as training strong volume classifiers, improving video search (including locating individual volumes), and creating composite videos based on identified volumes.
Publication number: US8983192(B2) Publication date: 2015.03.17
Application number: US201213601802 Filing date: 2012.08.31
Applicant: Google Inc. Inventors: Sukthankar Rahul; Yagnik Jay
Classification: G06K9/46; H04N9/82; H04N21/234 Main classification: G06K9/46
Agency: Fenwick & West LLP Agent: Fenwick & West LLP
Principal claim: 1. A computer-implemented method comprising: identifying, in a plurality of digital videos, a plurality of candidate volumes representing spatio-temporal segments of the digital videos, wherein each of the candidate volumes corresponds to a contiguous sequence of spatial portions of video frames of one of the digital videos, has a starting time and an ending time, and potentially represents a discrete object or action within the video frames, wherein identifying the candidate volumes in the digital videos comprises: stabilizing the digital videos using a video stabilization algorithm and identifying, as a stable segment, a contiguous sequence of frames in one of the digital videos in which a degree of background motion is below a threshold, using a measure of background motion produced by the video stabilization algorithm; determining, for each of the identified candidate volumes, features characterizing the candidate volume, wherein the features are determined from visual properties of the spatial portions of the video frames contained in the candidate volumes; and assigning a verified label to each volume of a plurality of the identified candidate volumes using the determined features, the verified label indicating a particular object or action represented by the volume to which the verified label is assigned.
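The stable-segment step recited in the claim (a contiguous sequence of frames whose background motion stays below a threshold) can be illustrated with a minimal sketch. It assumes the stabilization algorithm has already produced one background-motion score per frame; the function name `find_stable_segments`, the `min_length` parameter, and the sample scores are hypothetical and do not come from the patent text.

```python
from typing import List, Sequence, Tuple


def find_stable_segments(
    background_motion: Sequence[float],
    threshold: float,
    min_length: int = 1,  # hypothetical parameter, not recited in the claim
) -> List[Tuple[int, int]]:
    """Return (start_frame, end_frame) pairs, inclusive, for each maximal run
    of consecutive frames whose background-motion score is below `threshold`.

    `background_motion` is assumed to hold one score per frame, as produced
    by some video stabilization algorithm (not implemented here).
    """
    segments: List[Tuple[int, int]] = []
    start = None  # index of the first frame of the current stable run

    for i, motion in enumerate(background_motion):
        if motion < threshold:
            if start is None:
                start = i  # a new stable run begins
        elif start is not None:
            # The run ended at frame i - 1; keep it if it is long enough.
            if i - start >= min_length:
                segments.append((start, i - 1))
            start = None

    # Close a run that extends to the last frame.
    if start is not None and len(background_motion) - start >= min_length:
        segments.append((start, len(background_motion) - 1))

    return segments


# Illustrative per-frame scores (made up for this example).
scores = [0.1, 0.2, 0.9, 0.05, 0.05, 0.05, 0.8]
stable = find_stable_segments(scores, threshold=0.5)
```

Each returned pair marks a stable segment within which candidate volumes could then be sought. How the motion measure is computed and how the threshold is chosen are left open here, as the claim does not specify them.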
Address: Mountain View CA US