Video content-based retrieval, Application No. US201012841078

Title	Video content-based retrieval
Abstract	A method and system for video-content based retrieval is described. A query video depicting an activity is processed using interest point selection to find locations in the video that are relevant to that activity. A set of spatio-temporal descriptors such as self-similarity and 3-D SIFT are calculated within a local neighborhood of the set of interest points. An indexed video database containing videos similar to the query video is searched using the set of descriptors to obtain a set of candidate videos. The videos in the video database are indexed hierarchically using a vocabulary tree or other hierarchical indexing mechanism.
Publication No.	US9361523(B1)	Publication date	2016.06.07
Application No.	US201012841078	Filing date	2010.07.21
Applicant	HRL Laboratories, LLC	Inventors	Chen Yang;Medasani Swarup;Jiang Qin;Allen David L.;Lu Tsai-Ching
Classification	G06K9/03;G06K9/00	Main classification	G06K9/03
Agency	Tope-McKay & Associates	Agent	Tope-McKay & Associates
Independent claim	1. A data processing system for content-based video retrieval, comprising one or more processors configured to perform operations of: receiving a query video clip comprising a sequence of video frames, where the sequence of video frames depicts an activity; performing an interest point selection on the query video to obtain a set of interest points describing locations in the video frames that are relevant to the activity; calculating a set of spatio-temporal descriptors within a local neighborhood of the set of interest points; searching an indexed video database containing video clips of known activities using the set of spatio-temporal descriptors as calculated from the query video clip to obtain a set of candidate videos which contain activities similar to the activity in the query video, whereby the activity in the query video can be identified as a known activity in the candidate videos; wherein the interest point selection comprises an operation of selecting points which have a high motion content, where the motion content is measured by a degree of difference between pixel values in a pair of consecutive image frames, and where high motion content exists when the measured motion content exceeds a predetermined threshold; where the set of spatio-temporal descriptors are of a type selected from the group consisting of a self-similarity descriptor, and a shift-invariant feature transform descriptor; where each candidate video is given a similarity score describing a degree of similarity between the candidate video and the query video, and the similarity score is evaluated based on relevance computed using visual word frequencies; further configured to perform an operation of indexing a video database containing videos of known activities using a hierarchical indexing mechanism; and where the hierarchical indexing mechanism is a vocabulary tree having leaf nodes, and wherein in indexing the video database, all descriptors for the video clips of known activities are computed, with a closest leaf node in the vocabulary tree for each descriptor being found.
Address	Malibu CA US
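
Two steps named in the independent claim, motion-based interest point selection and a similarity score computed from visual word frequencies, can be illustrated with a minimal Python sketch. This is not the patented implementation. It assumes frames are plain 2-D lists of grayscale values, that each descriptor has already been quantized to the integer id of its closest vocabulary-tree leaf (the tree itself is not built here), and that TF-IDF weighting with cosine similarity is an acceptable stand-in for the relevance measure, which this record does not specify. All function names are hypothetical.

```python
import math
from collections import Counter


def select_interest_points(prev_frame, next_frame, threshold):
    """Return (row, col) positions where the absolute pixel difference
    between two consecutive frames exceeds the threshold."""
    points = []
    for r, (row_prev, row_next) in enumerate(zip(prev_frame, next_frame)):
        for c, (p, q) in enumerate(zip(row_prev, row_next)):
            if abs(q - p) > threshold:
                points.append((r, c))
    return points


def tfidf(words, doc_freq, n_docs):
    """Weight visual words by term frequency times inverse document frequency.
    Words that never occur in the database are ignored (assumption)."""
    counts = Counter(w for w in words if w in doc_freq)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {w: (n / total) * math.log(n_docs / doc_freq[w])
            for w, n in counts.items()}


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(x * v.get(w, 0.0) for w, x in u.items())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_videos(query_words, database):
    """Score every database video against the query, best match first.
    `database` maps a video id to its list of visual word (leaf) ids."""
    n_docs = len(database)
    doc_freq = Counter()
    for words in database.values():
        doc_freq.update(set(words))  # count each word once per video
    query_vec = tfidf(query_words, doc_freq, n_docs)
    scores = [(vid, cosine(query_vec, tfidf(words, doc_freq, n_docs)))
              for vid, words in database.items()]
    return sorted(scores, key=lambda s: s[1], reverse=True)


# Toy data: two 2x3 frames and three indexed clips of known activities.
frame_a = [[0, 0, 0], [0, 0, 0]]
frame_b = [[0, 50, 0], [10, 0, 200]]
points = select_interest_points(frame_a, frame_b, threshold=20)

database = {"walk": [1, 1, 2, 3], "run": [4, 4, 5, 3], "wave": [6, 7, 7, 3]}
ranked = rank_videos([1, 2, 2, 3], database)
```

In this toy database, visual word 3 occurs in every clip, so its inverse document frequency is zero and it contributes nothing to any score. Only words that discriminate between activities affect the ranking.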