Title of invention: In-video product annotation with web information mining
Abstract: A system provides product annotation in a video to one or more users. The system receives a video from a user, where the video includes multiple video frames. The system extracts multiple key frames from the video and generates a visual representation of each key frame. The system compares the visual representation of each key frame with a plurality of product visual signatures, where each visual signature identifies a product. Based on the comparison of the visual representation of the key frame and a product visual signature, the system determines whether the key frame contains the product identified by that visual signature. To generate the plurality of product visual signatures, the system collects multiple training images comprising multiple expert product images obtained from an expert product repository, each of which is associated with multiple product images obtained from multiple web resources.
Publication number: US9355330(B2) Publication date: 2016.05.31
Application number: US201214111149 Filing date: 2012.04.11
Applicant: National University of Singapore Inventors: Chua Tat Seng;Li Guangda;Lu Zheng;Wang Meng
Classification: G06K9/00;G06K9/46;G06F17/30 Main classification: G06K9/00
Agency: Kilpatrick Townsend & Stockton LLP Agent: Kilpatrick Townsend & Stockton LLP
Principal claim: 1. A computer method for providing product annotation in a video to one or more users, the method comprising: generating a product visual signature for a product by at least: collecting an unannotated expert product image of the product from an expert product repository, searching for a plurality of unannotated product images from a plurality of web resources different from the expert product repository, the plurality of unannotated product images related to the unannotated expert product image, selecting a subset of the plurality of unannotated product images by filtering the plurality of unannotated product images based on a similarity measure to the unannotated expert product image, and generating the product visual signature from the unannotated expert product image and the subset of the plurality of unannotated product images; receiving a video for product annotation, the video comprising a plurality of video frames; extracting a plurality of key frames from the video frames; and for each key frame: generating a visual representation of the key frame; comparing the visual representation with a plurality of product visual signatures including the product visual signature; and determining, based on the comparison, that the key frame contains the product identified by the product visual signature.
Address: Singapore SG
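
The steps recited in the principal claim (filter web images by similarity to the expert image, build a signature, then compare each key frame against the signatures) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: images and key frames are stood in for by plain feature vectors, the similarity measure is cosine similarity, the signature is a simple mean of the retained vectors, and the function names and threshold values are hypothetical. Key frame extraction and feature extraction are left out.

```python
from math import sqrt


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_signature(expert_vec, web_vecs, min_similarity=0.8):
    """Hypothetical signature builder.

    Keeps only the web image vectors that are similar enough to the expert
    image vector, then averages the expert vector with the kept vectors.
    Both the cosine filter and the averaging are illustrative assumptions.
    """
    selected = [v for v in web_vecs
                if cosine_similarity(expert_vec, v) >= min_similarity]
    pool = [expert_vec] + selected
    return [sum(column) / len(pool) for column in zip(*pool)]


def annotate_key_frames(key_frame_vecs, signatures, threshold=0.9):
    """Return, for each key frame vector, the product names whose signature
    matches it with similarity at or above the (assumed) threshold."""
    annotations = []
    for vec in key_frame_vecs:
        matches = [name for name, sig in signatures.items()
                   if cosine_similarity(vec, sig) >= threshold]
        annotations.append(matches)
    return annotations


if __name__ == "__main__":
    # Toy two-dimensional vectors standing in for real image features.
    expert = [1.0, 0.0]
    web = [[0.9, 0.1], [0.0, 1.0]]  # the second vector is unrelated to the expert image
    signatures = {"camera": build_signature(expert, web)}
    print(annotate_key_frames([[1.0, 0.05], [0.0, 1.0]], signatures))
```

In this toy run the unrelated web vector is filtered out before the signature is built, so only the first key frame vector is annotated with the hypothetical product name `camera`. A real system would replace the vectors with features computed from images and would likely use a more robust signature model than a mean.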