Title: Multi-view object detection using appearance model transfer from similar scenes
Abstract: View-specific object detectors are learned as a function of scene geometry and object motion patterns. Motion directions are determined for object images extracted from a training dataset and collected from different camera scene viewpoints. The object images are categorized into clusters as a function of similarities of their determined motion directions, with the object images in each cluster acquired from the same camera scene viewpoint. Zenith angles are estimated for object image poses in the clusters relative to a position of a horizon in the cluster camera scene viewpoint, and azimuth angles of the poses are estimated as a function of a relation of the determined motion directions of the clustered images to the cluster camera scene viewpoint. Detectors are thus built for recognizing objects in input video, one for each of the clusters, and associated with the estimated zenith angles and azimuth angles of the poses of the respective clusters.
Publication No.: US8983133(B2) Publication date: 2015.03.17
Application No.: US201313912391 Filing date: 2013.06.07
Applicant: International Business Machines Corporation Inventors: Feris Rogerio S.;Pankanti Sharathchandra U.;Siddiquie Behjat
Classification: G06K9/00;H04N5/225 Main classification: G06K9/00
Agency: Driggs, Hogg, Daugherty & Del Zoppo Co., LPA Agent: Daugherty Patrick J.;Driggs, Hogg, Daugherty & Del Zoppo Co., LPA
Independent claim: 1. A method for determining motion directions for video dataset object images, the method comprising: estimating directions of motion, through an optical flow process executing on a processor, for each of a plurality of object images that are extracted from a source training video dataset input, for each of a plurality of different camera scene viewpoints, wherein the object images are collected from each of the different camera scene viewpoints; representing each space-time point in the estimated optical flow directions of motion of the objects appearing for each respective camera viewpoint by a four-dimensional vector, the vector comprising a location of the each space-time point in an image plane, a magnitude and a direction of its optical flow; and categorizing the plurality of object images into a plurality of optical flow map clusters as a function of similarities of their determined motion directions by: discarding the space-time points that have an optical flow magnitude that is above or below certain respective fixed thresholds as noise; after the discarding the noise points, randomly sub-sampling and clustering a remainder of the space-time points into the optical flow map clusters by clustering that automatically selects a scale of analysis and a total number of the clusters; and representing different values of the directions of motion of the objects appearing in the scene viewpoint of each optical flow map cluster by a dominant direction of motion of the points within each optical flow map cluster and by a location of the cluster in the image plane.
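Illustrative sketch (not part of the patent text): the steps recited in claim 1 can be outlined in Python with NumPy, assuming the optical flow has already been computed and each space-time point is stored as a row (x, y, magnitude, direction in radians). All function names, the threshold parameters, and the bin count are hypothetical. The claim calls for a clustering that automatically selects the scale of analysis and the number of clusters; this record does not name that algorithm, so the sketch substitutes a deliberately simple stand-in (grouping by quantized direction) that does not select scale automatically. The dominant direction is taken here as the circular mean, which is also an assumption.

```python
import numpy as np


def filter_points(points, min_mag, max_mag):
    """Discard rows whose flow magnitude falls outside [min_mag, max_mag].

    points: array of shape (n, 4) with columns x, y, magnitude, direction.
    The two thresholds are hypothetical fixed values chosen by the caller.
    """
    mag = points[:, 2]
    return points[(mag >= min_mag) & (mag <= max_mag)]


def subsample(points, n, rng):
    """Randomly keep at most n rows, sampling without replacement."""
    if len(points) <= n:
        return points
    idx = rng.choice(len(points), size=n, replace=False)
    return points[idx]


def cluster_by_direction(points, n_bins=8):
    """Stand-in clustering (assumption): label each point by its direction bin.

    This is NOT the automatic-scale clustering recited in the claim; it only
    gives the later steps some labels to work with.
    """
    angles = np.mod(points[:, 3], 2 * np.pi)
    return np.floor(angles / (2 * np.pi) * n_bins).astype(int) % n_bins


def dominant_direction(directions):
    """Circular mean of angles in radians (one way to define 'dominant')."""
    return np.arctan2(np.sin(directions).mean(), np.cos(directions).mean())


def summarize_clusters(points, labels):
    """Represent each cluster by its mean image location and dominant direction."""
    summary = {}
    for label in np.unique(labels):
        members = points[labels == label]
        summary[int(label)] = {
            "location": members[:, :2].mean(axis=0),
            "direction": dominant_direction(members[:, 3]),
            "size": len(members),
        }
    return summary
```

A caller would chain the functions in the order the claim lists them: `filter_points`, then `subsample` with a `numpy.random.default_rng()` generator, then a clustering function, then `summarize_clusters`. Replacing `cluster_by_direction` with any routine that returns one integer label per row leaves the other steps unchanged.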
Address: Armonk NY US