Abstract |
A video processing system is configured to receive training video samples from a plurality of video sensing devices. The training video samples are sets of paired video samples, and the pairs can include both substantially similar subject matter and different subject matter. During the training phase, the system first samples a pool of patches from the videos and then selects the patches with higher saliency. Saliency is represented by the conditional probability density function for the similar subject and the conditional probability for the different subject. During the testing phase, the system applies the patches selected in the training phase and returns the matched subjects. |
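
The patch selection step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the conditional densities are estimated or combined, so everything below is an assumption made for illustration: each patch is summarized by feature distances measured on similar-subject pairs and on different-subject pairs, both conditional densities are modelled as one-dimensional Gaussians, and saliency is scored as a mean log-likelihood ratio. The names `gaussian_pdf`, `patch_saliency`, `select_patches`, and the toy `pool` data are hypothetical.

```python
import math
from statistics import mean, pstdev


def gaussian_pdf(x, mu, sigma):
    """Density of a 1-D Gaussian; sigma is floored to avoid division by zero."""
    sigma = max(sigma, 1e-6)
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def patch_saliency(same_dists, diff_dists):
    """Hypothetical saliency score (an assumption, not the source's formula).

    Fits one Gaussian to the distances from similar-subject pairs and one to
    the distances from different-subject pairs, then averages the log ratio
    of the two densities over the similar-subject distances. A larger value
    means the patch separates the two cases more clearly.
    """
    mu_s, sd_s = mean(same_dists), pstdev(same_dists)
    mu_d, sd_d = mean(diff_dists), pstdev(diff_dists)
    eps = 1e-12  # keeps the ratio finite when a density underflows to zero
    log_ratios = [
        math.log((gaussian_pdf(d, mu_s, sd_s) + eps)
                 / (gaussian_pdf(d, mu_d, sd_d) + eps))
        for d in same_dists
    ]
    return mean(log_ratios)


def select_patches(pool, top_k):
    """Return the ids of the top_k most salient patches in the pool.

    pool maps a patch id to a (same_dists, diff_dists) tuple.
    """
    ranked = sorted(pool, key=lambda pid: patch_saliency(*pool[pid]), reverse=True)
    return ranked[:top_k]


# Toy patch pool with made-up distances, for illustration only.
pool = {
    # Similar-subject and different-subject distances are well separated.
    "patch_a": ([0.10, 0.12, 0.09, 0.11], [0.80, 0.90, 0.85, 0.95]),
    # The two sets of distances overlap heavily.
    "patch_b": ([0.50, 0.55, 0.45, 0.60], [0.52, 0.48, 0.58, 0.50]),
}

selected = select_patches(pool, top_k=1)
print(selected)
```

In this toy pool, `patch_a` ranks above `patch_b` because its similar-subject and different-subject distances barely overlap. In the testing phase, only the selected patches would then be compared across videos to return matched subjects.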