Abstract |
A speaker predicting apparatus includes: a speech detector that detects which of a plurality of persons is delivering a speech; a feature extracting portion that extracts a feature from an image in which the persons are captured; a learning portion that learns, from the extracted features, the feature that occurs in the image before the speech is detected by the speech detector; and a predicting portion that uses the result learned by the learning portion to predict the speaker among the plurality of persons from the feature in the image in which the persons are captured.
|
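The flow described in the abstract (detect speech, extract image features, learn the features that precede speech, predict the next speaker) can be illustrated with a minimal sketch. The abstract does not specify the feature representation or the learning method, so everything below is an assumption made for illustration: the function names, the `window` parameter, and the use of a simple mean template with nearest-distance matching are hypothetical stand-ins, not the apparatus's actual method.

```python
def collect_pre_speech_features(features, speaking, window=2):
    """Gather image features from the frames just before each speech onset.

    features: list of per-frame feature vectors for one person (assumed format).
    speaking: list of booleans from a speech detector, one per frame.
    window:   hypothetical number of frames to look back before an onset.
    """
    samples = []
    for t in range(1, len(speaking)):
        # A speech onset is a frame where speech starts after silence.
        if speaking[t] and not speaking[t - 1]:
            start = max(0, t - window)
            # Keep only silent frames inside the look-back window.
            samples.extend(
                features[k] for k in range(start, t) if not speaking[k]
            )
    return samples


def learn_template(samples):
    """Assumed learning step: average the pre-speech feature vectors."""
    dim = len(samples[0])
    return [sum(s[i] for s in samples) / len(samples) for i in range(dim)]


def predict_speaker(template, current_features):
    """Pick the person whose current feature is closest to the template.

    current_features: dict mapping a person label to a feature vector.
    """
    def squared_distance(vector):
        return sum((a - b) ** 2 for a, b in zip(vector, template))

    return min(current_features, key=lambda p: squared_distance(current_features[p]))


# Usage with made-up data: speech starts at frame 4.
frames = [[0.0, 0.0], [0.0, 0.5], [0.5, 1.0], [1.0, 1.0], [0.5, 0.5], [0.0, 0.0]]
detected = [False, False, False, False, True, True]

samples = collect_pre_speech_features(frames, detected, window=2)
template = learn_template(samples)
predicted = predict_speaker(template, {"A": [0.0, 0.0], "B": [0.75, 0.9]})
```

In this sketch the learned result is a single averaged vector; a real system would likely use a trained classifier, but the separation into detection, extraction, learning, and prediction follows the four portions named in the abstract.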