摘要 |
According to one embodiment, in an apparatus for correcting a speech corresponding to a moving image, a separation unit separates at least one audio component from each audio frame of the speech. An estimation unit estimates a scene including a plurality of image frames related in the moving image, based on at least one of a feature of each image frame of the moving image and a feature of the each audio frame. An analysis unit acquires attribute information of the plurality of image frames by analyzing the each image frame. A correction unit determines a correction method of the audio component corresponding to the plurality of image frames, based on the attribute information, and corrects the audio component by the correction method.
|