Abstract |
<p>A system, with associated methods and apparatus, for rendering an image together with text converted from audio. The image is captured with a photosensitive film camera or a digital camera, or is created with computer graphics software. Audio is captured either at the time of image capture or at another time. The captured image and audio are stored and associated with each other using a multimedia file format. The audio is converted to text using voice recognition software. A composite image is formed by positioning the converted text on or near the image. The composite image is output on a computer monitor, printer, or other output device.</p> |
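
The flow described in the abstract (associate image and audio, convert the audio to text, position the text near the image) can be illustrated with a minimal Python sketch. Everything named here is hypothetical and not taken from the source: `MediaRecord`, `Caption`, `CompositeLayout`, and `compose` are illustrative names, the `recognize` callable is a stand-in for whatever voice recognition software is used, and the fixed character width and line height are assumed font metrics. The sketch only computes the layout of a caption strip below the image, which is one possible reading of "near the image"; it does not draw pixels.

```python
import textwrap
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class MediaRecord:
    """Hypothetical container associating an image with its audio clip."""
    image_size: Tuple[int, int]  # (width, height) in pixels
    audio: bytes                 # captured audio; encoding left unspecified


@dataclass
class Caption:
    lines: List[str]             # converted text, wrapped to the image width
    origin: Tuple[int, int]      # top-left pixel of the first text line


@dataclass
class CompositeLayout:
    canvas_size: Tuple[int, int]  # size of the image plus the caption strip
    caption: Caption


def compose(
    record: MediaRecord,
    recognize: Callable[[bytes], str],
    char_width: int = 8,    # assumed fixed-width font metric, in pixels
    line_height: int = 16,  # assumed line height, in pixels
    margin: int = 4,
) -> CompositeLayout:
    """Convert the audio to text and place it in a strip below the image."""
    text = recognize(record.audio)
    width, height = record.image_size

    # Wrap the text so each line fits inside the image width.
    max_chars = max(1, (width - 2 * margin) // char_width)
    lines = textwrap.wrap(text, width=max_chars)

    # Extend the canvas downward only if there is text to show.
    strip_height = len(lines) * line_height + 2 * margin if lines else 0
    caption = Caption(lines=lines, origin=(margin, height + margin))
    return CompositeLayout(canvas_size=(width, height + strip_height),
                           caption=caption)


def fake_recognizer(audio: bytes) -> str:
    """Placeholder for voice recognition; returns a canned transcript."""
    return "Grandma at the lake, summer 1998"


if __name__ == "__main__":
    record = MediaRecord(image_size=(160, 100), audio=b"")
    layout = compose(record, fake_recognizer)
    print(layout.canvas_size)
    for line in layout.caption.lines:
        print(line)
```

Placing the caption in a separate strip keeps the original image unobscured; overlaying the text on the image itself would only require changing the origin and leaving the canvas size equal to the image size.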