Title of invention Multi-view object detection using appearance model transfer from similar scenes
Abstract View-specific object detectors are learned as a function of scene geometry and object motion patterns. Motion directions are determined for object images extracted from a training dataset and collected from different camera scene viewpoints. The object images are categorized into clusters as a function of similarities of their determined motion directions, wherein the object images in each cluster are acquired from the same camera scene viewpoint. Zenith angles are estimated for object image poses in the clusters relative to a position of a horizon in the cluster camera scene viewpoint, and azimuth angles of the poses are estimated as a function of a relation of the determined motion directions of the clustered images to the cluster camera scene viewpoint. Detectors are thus built for recognizing objects in input video, one for each of the clusters, and associated with the estimated zenith angles and azimuth angles of the poses of the respective clusters.
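The clustering step described in the abstract can be illustrated with a minimal sketch. The abstract does not say which clustering algorithm is used, so this example substitutes simple fixed-width angular binning as a stand-in. The names `motion_direction`, `cluster_by_direction`, and `bin_width`, and the `(camera_id, dx, dy)` sample layout, are hypothetical and not taken from the patent.

```python
import math
from collections import defaultdict


def motion_direction(dx, dy):
    """Return the direction of displacement (dx, dy) in degrees, in [0, 360)."""
    return math.degrees(math.atan2(dy, dx)) % 360.0


def cluster_by_direction(samples, bin_width=45.0):
    """Group (camera_id, dx, dy) samples by camera and quantized direction.

    Assumption: fixed angular bins stand in for the similarity-based
    clustering named in the abstract. Keying on camera_id keeps every
    cluster within a single camera scene viewpoint.
    """
    n_bins = int(round(360.0 / bin_width))
    clusters = defaultdict(list)
    for camera_id, dx, dy in samples:
        angle = motion_direction(dx, dy)
        # Center the bins on multiples of bin_width and wrap around at 360.
        bin_index = int((angle + bin_width / 2) // bin_width) % n_bins
        clusters[(camera_id, bin_index)].append((dx, dy))
    return dict(clusters)


samples = [
    ("camA", 1.0, 0.0),
    ("camA", 1.0, 0.1),
    ("camA", 0.0, 1.0),
    ("camB", 1.0, 0.0),
]
clusters = cluster_by_direction(samples)
```

In this sketch the two nearly horizontal `camA` samples fall into one cluster, the vertical `camA` sample into another, and the `camB` sample stays separate even though its direction matches, because clusters never mix viewpoints.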
Publication number US9224046(B2) Publication date 2015.12.29
Application number US201514599616 Filing date 2015.01.19
Applicant International Business Machines Corporation Inventors Feris Rogerio S.;Pankanti Sharathchandra U.;Siddiquie Behjat
Classification G06K9/00;G06K9/62;G06T7/00;G06T7/20;H04N5/225 Main classification G06K9/00
Agency Driggs, Hogg, Daugherty & Del Zoppo Co., LPA Agent Daugherty Patrick J.;Driggs, Hogg, Daugherty & Del Zoppo Co., LPA
Principal claim 1. A computer-implemented method for learning a view-specific object detector, the method comprising executing on a central processing unit the steps of: determining a position of a horizon in a target camera viewpoint scene; determining a motion direction for an object within an image of the target camera viewpoint scene; determining a zenith angle for a pose of the object within the target scene relative to the estimated target camera viewpoint scene horizon; determining an azimuth angle for a pose of the object within the target scene relative to the determined motion direction for the object within the image of the target camera viewpoint scene; and selecting one of a plurality of built detectors for recognizing objects in video data, as a function of the selected built detector having an associated cluster zenith angle that best matches the zenith angle determined for the pose of the object relative to cluster zenith angles that are associated with each of others of the plurality of built detectors, and an associated cluster azimuth angle that best matches the azimuth angle determined for the pose of the object relative to azimuth angles associated with each of the others of the plurality of built detectors.
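The final selection step of the claim can be sketched as a nearest-match lookup over the angles stored with each built detector. The claim does not specify how the zenith and azimuth matches are combined, so this example assumes a plain sum of the two angular differences. The `Detector` class, `circular_diff`, and `select_detector` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Detector:
    """A built detector tagged with the pose angles of its training cluster."""
    name: str
    zenith_deg: float
    azimuth_deg: float


def circular_diff(a, b):
    """Smallest absolute difference between two angles in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def select_detector(detectors, zenith_deg, azimuth_deg):
    """Pick the detector whose cluster angles best match the target pose.

    Assumption: the match score is the sum of the zenith difference and
    the wrap-around azimuth difference; the claim only requires a best match.
    """
    return min(
        detectors,
        key=lambda d: abs(d.zenith_deg - zenith_deg)
        + circular_diff(d.azimuth_deg, azimuth_deg),
    )


detectors = [
    Detector("front_low", 30.0, 0.0),
    Detector("rear_low", 30.0, 180.0),
    Detector("front_high", 60.0, 350.0),
]
best = select_detector(detectors, zenith_deg=32.0, azimuth_deg=355.0)
```

Azimuth is compared with wrap-around so that 355 degrees counts as close to 0 degrees. A weighted sum or a two-stage match would be equally plausible readings of the claim language.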
Address Armonk, NY, US