Abstract |
Techniques, systems, and computer program products for parsing objects in a video are provided herein. A method includes producing and storing a plurality of versions of an image of an object derived from a video input, wherein each version of said image has a different resolution; computing an appearance score at each of a plurality of regions on the lowest resolution version of said image for a plurality of semantic attributes with associated parts for said object, said appearance score denoting a probability of each semantic attribute appearing in the region; analyzing versions of increasingly higher resolution than the lowest resolution version to compute a resolution context score for each region in the lowest resolution version; and ascertaining an optimized configuration of body parts and associated semantic attributes in the lowest resolution version, said ascertaining utilizing the appearance scores and the resolution context scores.
|
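The steps named in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the patent: the function names, the 2x2 averaging used to build lower resolution versions, the toy intensity-matching score that stands in for a trained attribute detector, the equal weighting of the two scores, and the choice to pick the best attribute for each region independently rather than searching over joint configurations.

```python
def downsample(image):
    """Halve the resolution by averaging each 2x2 block (image is a list of rows)."""
    h, w = len(image) // 2, len(image[0]) // 2
    return [[(image[2 * r][2 * c] + image[2 * r][2 * c + 1]
              + image[2 * r + 1][2 * c] + image[2 * r + 1][2 * c + 1]) / 4.0
             for c in range(w)] for r in range(h)]


def build_pyramid(image, levels):
    """Return the stored versions ordered from lowest to highest resolution."""
    versions = [image]
    for _ in range(levels - 1):
        versions.append(downsample(versions[-1]))
    return versions[::-1]


def region_mean(image, top, left, height, width):
    """Mean pixel value inside a rectangular region."""
    values = [image[r][c]
              for r in range(top, top + height)
              for c in range(left, left + width)]
    return sum(values) / len(values)


def appearance_score(image, region, target):
    """Toy stand-in for a learned detector (assumption): 1.0 when the region's
    mean intensity equals the attribute's target, falling linearly to 0.0."""
    return max(0.0, 1.0 - abs(region_mean(image, *region) - target))


def resolution_context_score(pyramid, region, target):
    """Average the toy score over the matching region in each higher
    resolution version; pyramid[0] is the lowest resolution version."""
    scores = []
    for level in range(1, len(pyramid)):
        scale = 2 ** level  # each level doubles both image dimensions
        scaled_region = tuple(v * scale for v in region)
        scores.append(appearance_score(pyramid[level], scaled_region, target))
    return sum(scores) / len(scores) if scores else 0.0


def best_configuration(pyramid, regions, attributes):
    """For each region of the lowest resolution version, pick the attribute
    with the highest appearance score plus resolution context score.
    Regions are treated independently, which is a simplification."""
    lowest = pyramid[0]
    config = {}
    for region in regions:
        config[region] = max(
            attributes,
            key=lambda name: appearance_score(lowest, region, attributes[name])
            + resolution_context_score(pyramid, region, attributes[name]),
        )
    return config
```

A call might look like `best_configuration(build_pyramid(frame, 2), regions, {"dark": 0.0, "bright": 1.0})`, where `frame` is a grayscale image with values in [0, 1], each region is a `(top, left, height, width)` tuple in lowest resolution coordinates, and the attribute names and target intensities are hypothetical.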