Title of Invention: Multisensor evidence integration and optimization in object inspection
Abstract: Video image data is acquired from synchronized cameras having overlapping views of objects moving past the cameras through a scene image in a linear array and with a determined speed. Processing units generate one or more object detections associated with confidence scores within frames of the camera video stream data. The confidence scores are modified as a function of constraint contexts including a cross-frame constraint that is defined by other confidence scores of other object detection decisions from the video data that are acquired by the same camera at different times; a cross-view constraint defined by other confidence scores of other object detections in the video data from another camera with an overlapping field-of-view; and a cross-object constraint defined by a sequential context of a linear array of the objects, spatial attributes of the objects and the determined speed of the movement of the objects relative to the cameras.
Publication No.: US9260122(B2) Publication Date: 2016.02.16
Application No.: US201213489489 Filing Date: 2012.06.06
Applicant: International Business Machines Corporation Inventors: Haas Norman; Li Ying; Otto Charles A.; Pankanti Sharathchandra U.; Trinh Hoang
Classification: B61L23/04; G06T7/20 Main Classification: B61L23/04
Agency: Driggs, Hogg, Daugherty & Del Zoppo Co., LPA Agent: Daugherty Patrick J.; Driggs, Hogg, Daugherty & Del Zoppo Co., LPA
Main Claim: 1. A computer-implemented method for video analytics object detection optimization, the method comprising executing on a processing unit the steps of: acquiring video image data over time from a plurality of synchronized cameras having overlapping views of a plurality of objects moving past the cameras and through a scene image in a linear array and with a determined speed; generating for each camera a plurality of object detection states that each have different times of frames of the acquired video image data within a plurality of frames of the camera video stream data, wherein each of the object detection states are associated with a confidence score; selecting ones of the plurality of object detection states for each of the different times that have a highest confidence score optimized by using a global energy function to find maximum unary potentials ($\psi(s_t^k)$) of the object detection states as a function of a cross-frame constraint that is defined by other confidence scores of other object detection states from the video data that are acquired by a same one of the cameras at different times from a time of the object detection state, and of a cross-view constraint ($T(s_t^k, s_t^l)$) that is defined by other confidence scores of other object detection states in the video data from another different one of the cameras that has an overlapping field-of-view with the same one camera and that are also acquired at the different times; and defining an optimal state path for a detection of an object from an initial time to a final time of a duration period comprising the selected ones of the plurality of object detection states that have the highest optimized confidence scores; and wherein the unary potentials $\psi(s_t^k)$ are determined according to: $\psi(s_t^k) = f(s_t^k) \prod_{l \neq k} T(s_t^k, s_t^l)$; where $f(s_t^k)$ is a confidence score of an object state $s_t^k$ returned by an object detector at view $k$; and the processing unit determining the cross-view spatial constraint as a function of the unary potential according to: $T(s_t^k, s_t^l) = \max\left(N(s_t^k - s_t^l;\ \theta_{kl}),\ N(s_t^k - s_t^l + \varepsilon;\ \theta_{kl})\right)$; wherein $\theta_{kl} = [\mu_v(k,l),\ \Sigma_v(k,l)]$ for views $k$ and $l$; "$\mu_v$" is a four-by-four matrix of mean values; "$\Sigma_v$" is a four-by-four covariance matrix; and "$\varepsilon$" is a cross-object spatial constraint that represents an object spacing constant defined by a sequential context of the linear array of the objects determined as a function of spatial attributes of the objects relative to the determined speed of the movement of the cameras relative to the objects.
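The two formulas in the claim can be illustrated with a minimal Python sketch. It is not the patented implementation and rests on several assumptions that the record does not state: each object state is taken to be a length-4 vector (for example a bounding box), N(·; θ) is taken to be a multivariate normal density, the mean is treated as a length-4 vector rather than the four-by-four matrix named in the claim, and ε is treated as a length-4 offset. The function names `gaussian_density`, `cross_view_constraint`, and `unary_potential` are hypothetical.

```python
import numpy as np


def gaussian_density(x, mean, cov):
    """Multivariate normal density N(x; mean, cov) (assumed form of N)."""
    x = np.asarray(x, dtype=float)
    mean = np.asarray(mean, dtype=float)
    cov = np.asarray(cov, dtype=float)
    d = x - mean
    k = x.shape[0]
    norm = np.sqrt(((2.0 * np.pi) ** k) * np.linalg.det(cov))
    # Solve cov * y = d instead of inverting cov explicitly.
    return float(np.exp(-0.5 * d @ np.linalg.solve(cov, d)) / norm)


def cross_view_constraint(s_k, s_l, mean, cov, eps):
    """T(s_k, s_l) = max(N(s_k - s_l; theta), N(s_k - s_l + eps; theta))."""
    diff = np.asarray(s_k, dtype=float) - np.asarray(s_l, dtype=float)
    shifted = diff + np.asarray(eps, dtype=float)
    return max(gaussian_density(diff, mean, cov),
               gaussian_density(shifted, mean, cov))


def unary_potential(k, states, scores, params, eps):
    """psi(s_k) = f(s_k) * product over views l != k of T(s_k, s_l).

    states: {view: length-4 state at one time step}
    scores: {view: detector confidence f(s)}
    params: {(k, l): (mean, cov)} for each ordered pair of views
    """
    psi = scores[k]
    for l, s_l in states.items():
        if l == k:
            continue
        mean, cov = params[(k, l)]
        psi *= cross_view_constraint(states[k], s_l, mean, cov, eps)
    return psi


# Illustrative values only: two views, zero mean offset, unit covariance,
# and a spacing offset of 5 units along the first state component.
mean, cov = np.zeros(4), np.eye(4)
eps = np.array([5.0, 0.0, 0.0, 0.0])
states = {0: [0.0, 0.0, 0.0, 0.0], 1: [5.0, 0.0, 0.0, 0.0]}
scores = {0: 0.8, 1: 0.6}
params = {(0, 1): (mean, cov), (1, 0): (mean, cov)}
psi_view0 = unary_potential(0, states, scores, params, eps)
```

In this example the two detections differ by exactly one spacing offset, so the second term of the max is evaluated at the mean and dominates. Under these assumptions, that is how the ε term keeps a detection of a neighboring object in the linear array from being penalized as a cross-view mismatch.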
Address: Armonk, NY, US