Abstract |
<p>PROBLEM TO BE SOLVED: To provide a technology that facilitates the generation and editing of speech balloons and superimposed subtitles.</p><p>SOLUTION: A face detection means 103 receives moving picture data and detects a face feature quantity and a face position, and a voice identification means 104 receives the moving picture data and detects a voice feature quantity. The detected feature quantities are fed to a speaker identification means 107, which compares them with the feature quantities of speakers registered in a voice/face cross-reference data storage means 106 to determine the position of a particular speaker. A voice recognition means 105 converts the voice of the identified speaker into text data. A speech balloon generating means 112 generates a speech balloon on the basis of the speaker's position and the text data, and a moving picture generating means 114 combines the moving picture data, the voice data, and the balloon data to generate new moving picture data.</p><p>COPYRIGHT: (C)2007,JPO&amp;INPIT</p> |
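The matching step performed by means 107 and the balloon placement performed by means 112 can be illustrated with a minimal sketch. The abstract does not say how feature quantities are compared or how a balloon is positioned, so everything below is an assumption made for illustration: feature quantities are treated as plain numeric vectors, similarity is Euclidean distance, and the names `RegisteredSpeaker`, `DetectedFace`, `identify_speaker`, `make_balloon`, `max_distance`, and `offset` are hypothetical.

```python
from dataclasses import dataclass
from math import dist
from typing import List, Optional, Tuple


@dataclass
class RegisteredSpeaker:
    """One entry of the voice/face cross-reference data (hypothetical layout)."""
    name: str
    face_feature: Tuple[float, ...]
    voice_feature: Tuple[float, ...]


@dataclass
class DetectedFace:
    """A face found in the current frame: its feature vector and position."""
    feature: Tuple[float, ...]
    position: Tuple[int, int]  # (x, y) in frame coordinates


def identify_speaker(
    faces: List[DetectedFace],
    voice_feature: Tuple[float, ...],
    registry: List[RegisteredSpeaker],
    max_distance: float = 1.0,  # assumed acceptance threshold
) -> Optional[Tuple[RegisteredSpeaker, DetectedFace]]:
    """Return the registered speaker and detected face whose combined
    voice + face distance is smallest, or None if nothing is close enough."""
    best = None
    best_score = max_distance
    for speaker in registry:
        voice_d = dist(speaker.voice_feature, voice_feature)
        for face in faces:
            score = voice_d + dist(speaker.face_feature, face.feature)
            if score < best_score:
                best, best_score = (speaker, face), score
    return best


def make_balloon(position: Tuple[int, int], text: str,
                 offset: Tuple[int, int] = (0, -40)) -> dict:
    """Place a balloon at an assumed fixed offset above the speaker's face."""
    x, y = position
    return {"x": x + offset[0], "y": y + offset[1], "text": text}
```

In this sketch, the text passed to `make_balloon` stands in for the output of the voice recognition means 105, and the returned dictionary stands in for the balloon data that the moving picture generating means 114 would combine with the picture and voice data.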