Title of invention: Method and system for generating at least one of: comic strips and storyboards from videos
Abstract: A method, a system, and a computer program product code for generating a series of still images from an input video file are provided. The series of still images may include, but are not limited to, a comic strip and a storyboard. The method includes extracting audio and visual frames from the video file. Thereafter, basic units of the video file are identified. The basic units are exposition (beginning), conflict (middle), and resolution (end). Thereafter, key frames are extracted from the basic units based on at least one of audio frames, visual frames, and a combination of the visual frames and the audio frames. Then, the extracted key frames are manipulated to output a series of still images. Subsequently, narration in the form of audio or text is attached to the still images to generate at least one of comic strips and storyboards.
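The unit-splitting and key-frame steps described in the abstract can be illustrated with a minimal sketch. It is not the patented method. The abstract does not say how the three basic units are located or how key frames are scored, so the equal-thirds split, the per-frame feature vectors, and the names `split_basic_units`, `frame_distance`, and `pick_key_frames` are all assumptions made for illustration.

```python
from typing import Dict, List, Sequence

# Hypothetical per-frame feature vector (for example a colour histogram).
Frame = Sequence[float]


def split_basic_units(frames: List[Frame]) -> Dict[str, List[int]]:
    """Split frame indices into three contiguous units.

    Assumption: the units are equal thirds of the video. The source only
    names the units (exposition, conflict, resolution), not how to find them.
    """
    n = len(frames)
    a, b = n // 3, (2 * n) // 3
    return {
        "exposition": list(range(0, a)),
        "conflict": list(range(a, b)),
        "resolution": list(range(b, n)),
    }


def frame_distance(x: Frame, y: Frame) -> float:
    """Sum of absolute differences between two feature vectors."""
    return sum(abs(p - q) for p, q in zip(x, y))


def pick_key_frames(frames: List[Frame], indices: List[int], k: int = 1) -> List[int]:
    """Return up to k indices in a unit that differ most from the preceding frame.

    Assumption: a large visual change marks a candidate key frame.
    """
    scored = []
    for i in indices:
        prev = frames[i - 1] if i > 0 else frames[i]
        scored.append((frame_distance(frames[i], prev), i))
    # Highest change first; ties broken by earlier index.
    scored.sort(key=lambda t: (-t[0], t[1]))
    return sorted(i for _, i in scored[:k])


if __name__ == "__main__":
    demo = [[0.0], [0.0], [5.0], [5.0], [5.0], [9.0], [9.0], [9.0], [1.0]]
    for name, idx in split_basic_units(demo).items():
        print(name, pick_key_frames(demo, idx))
```

A real system would presumably combine audio cues with the visual score, as the abstract indicates; this sketch uses visual features only to keep the example short.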
Publication number: US9064538(B2) Publication date: 2015.06.23
Application number: US201113311795 Filing date: 2011.12.06
Applicant: Infosys Technologies, Ltd. Inventors: Gupta Puneet;Darbari Akshay;Vinmani Karthik Gopalakrishnan;Sivaramamurthy Venkat Kumar
Classification: H04N5/93;G11B27/034;G11B27/10;G11B27/28;G11B27/34 Main classification: H04N5/93
Agency: LeClairRyan, a Professional Corporation Agent: LeClairRyan, a Professional Corporation
Principal claim: 1. A method for generating a series of still images from a video file, the method comprising: extracting, by a multimedia management computing device, a plurality of audio frames and a corresponding plurality of visual frames from a video file; identifying, by the multimedia management computing device, a plurality of basic units of the video file, each of the plurality of basic units comprising a respective subset of the plurality of visual frames and a corresponding subset of the plurality of audio frames; extracting, by the multimedia management computing device, for each of the plurality of basic units, one or more key visual frames from the subset of the plurality of visual frames and corresponding one or more key audio frames from the subset of the plurality of audio frames; generating, by the multimedia management computing device, a speaker dialogue graph based on the one or more key audio frames; identifying, by the multimedia management computing device, one or more characters based on the generated speaker dialogue graph; converting, by the multimedia management computing device, the one or more key visual frames into a series of still images and the corresponding one or more key audio frames into text corresponding to the one or more characters; and outputting, by the multimedia management computing device, the series of still images and the text corresponding to the one or more characters.
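The claim's speaker dialogue graph, character identification, and text output steps might be pictured with the rough sketch below. The claim does not define the structure of the graph or the rule for picking characters, so everything here is an assumption. Speaker labels and transcribed text are taken as given from upstream steps. The graph is modelled as a directed count of who speaks after whom. A character is any speaker in the graph with at least `min_turns` turns. The names `build_dialogue_graph`, `identify_characters`, and `captions_for` are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Tuple

# (speaker label, transcribed text); both are assumed to be produced upstream
# from the key audio frames by diarization and speech-to-text.
Turn = Tuple[str, str]


def build_dialogue_graph(turns: List[Turn]) -> Dict[str, Dict[str, int]]:
    """Count, for each speaker, how often each other speaker talks next."""
    graph: Dict[str, Counter] = defaultdict(Counter)
    for (prev, _), (cur, _) in zip(turns, turns[1:]):
        if prev != cur:  # ignore a speaker continuing their own turn
            graph[prev][cur] += 1
    return {speaker: dict(nexts) for speaker, nexts in graph.items()}


def identify_characters(
    turns: List[Turn], graph: Dict[str, Dict[str, int]], min_turns: int = 2
) -> List[str]:
    """Treat speakers in the graph with at least min_turns turns as characters.

    Assumption: the threshold rule is illustrative, not taken from the claim.
    """
    counts = Counter(speaker for speaker, _ in turns)
    in_graph = set(graph) | {t for nexts in graph.values() for t in nexts}
    return sorted(s for s in in_graph if counts[s] >= min_turns)


def captions_for(turns: List[Turn], characters: List[str]) -> List[str]:
    """Format the text of each identified character as a caption line."""
    return [f"{speaker}: {text}" for speaker, text in turns if speaker in characters]


if __name__ == "__main__":
    demo = [("A", "Hi"), ("B", "Hello"), ("A", "Ready?"), ("C", "Wait"), ("B", "Yes")]
    g = build_dialogue_graph(demo)
    chars = identify_characters(demo, g)
    for line in captions_for(demo, chars):
        print(line)
```

In a comic-strip setting, each caption line would be paired with the still image derived from the matching key visual frame; that pairing is left out here.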
Address: Bangalore IN