Title: Object tracking in encoded video streams
Abstract: Techniques are provided for tracking objects in an encoded video stream based on data extracted directly from the video stream, eliminating any need for the stream to be fully or partially decoded. Extracted motion vector and DC coefficient data can be used to provide a rough estimate of which macro-blocks are associated with a background motion model and which macro-blocks correspond to a foreground object that is moving with respect to the background motion model. Macro-blocks that are associated with a moving foreground object can be grouped based on connectivity and a similarity measure derived from the extracted DC coefficient data. The grouped macro-blocks can be tracked from frame to frame to identify and eliminate groups having only negligible motion. The resulting validated macro-block groups correspond to a rough object mask associated with a moving region in the analyzed frame.
Publication number: US9589363(B2) Publication date: 2017.03.07
Application number: US201414224611 Filing date: 2014.03.25
Applicant: Intel Corporation Inventors: Lawrence Sean J.; Tapaswi Ankita
Classification: H04N7/12; G06T7/20 Main classification: H04N7/12
Agency: Finch & Maloney PLLC Agent: Finch & Maloney PLLC
Independent claim: 1. A method for tracking a moving object in a compressed video stream, the method comprising: receiving, by a computer system having a processor coupled to a memory device, a compressed video stream that comprises a plurality of frames, each frame including motion vector data and DC coefficient data; using a bit stream parser stored in the memory device to parse the compressed video stream, thereby extracting motion vector data and DC coefficient data for a selected frame of the compressed video stream, the selected frame comprising a plurality of macro-blocks; using an object detection sub-module stored in the memory device and the extracted motion vector data to identify a plurality of foreground macro-blocks from amongst the plurality of macro-blocks, the foreground macro-blocks corresponding to motion that is distinguishable from a background motion model; using a grouping and labeling sub-module stored in the memory device to group a subset of the plurality of foreground macro-blocks based on a feature map that depends on the extracted motion vector data and DC coefficient data that is associated with the grouped subset of foreground macro-blocks; and using a validation and refinement sub-module stored in the memory device to validate the grouped subset of foreground macro-blocks based on a comparison of a cost metric between the selected frame and a temporally adjacent frame, the cost metric depending on frame-to-frame motion and variance of the grouped subset of foreground macro-blocks.
Address: Santa Clara CA US
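
Illustrative sketch (not part of the patent record): the foreground identification and grouping steps described in the abstract and in claim 1 might look roughly like the Python below. Everything in it is an assumption made for illustration: the function names `foreground_mask` and `group_blocks`, the thresholds, the use of a single median translation as the background motion model, and the use of an absolute DC difference between neighbouring blocks as the similarity measure. The bit stream parsing and the frame-to-frame validation step are not covered, and the inputs are taken to be already-extracted per-macro-block grids.

```python
from collections import deque
from statistics import median


def foreground_mask(mvs, thresh=2.0):
    """Flag macro-blocks whose motion deviates from the background model.

    Assumption: the background model is one global translation, estimated
    as the per-component median of all motion vectors in the frame.
    `mvs` is a grid (list of rows) of (dx, dy) tuples.
    """
    bg_dx = median(dx for row in mvs for dx, _ in row)
    bg_dy = median(dy for row in mvs for _, dy in row)
    return [[abs(dx - bg_dx) + abs(dy - bg_dy) > thresh for dx, dy in row]
            for row in mvs]


def group_blocks(mask, dc, dc_thresh=16.0):
    """Group 4-connected foreground blocks that have similar DC values.

    Assumption: two neighbouring blocks are "similar" when their DC values
    differ by at most `dc_thresh`. Returns a list of groups, each a list
    of (row, col) block coordinates.
    """
    rows, cols = len(mask), len(mask[0])
    labels = [[0] * cols for _ in range(rows)]
    groups = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or labels[r][c]:
                continue
            label = len(groups) + 1
            labels[r][c] = label
            members, queue = [], deque([(r, c)])
            # Breadth-first flood fill over connected, similar neighbours.
            while queue:
                y, x = queue.popleft()
                members.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not labels[ny][nx]
                            and abs(dc[ny][nx] - dc[y][x]) <= dc_thresh):
                        labels[ny][nx] = label
                        queue.append((ny, nx))
            groups.append(members)
    return groups
```

A caller would pass the motion vector grid of one frame to `foreground_mask`, then pass the resulting mask together with the DC coefficient grid to `group_blocks`. The groups returned are only candidates; in the claimed method they would still be validated against a temporally adjacent frame using a cost metric based on motion and variance.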